Difficulty in understanding expected output of Project

Hi All,

While solving the project below, I’m not able to understand what the expected output is.

Python Project - Churn Emails - Dataset

We have a text file which records mail activity from various individuals in an open source project development team. Below is the file location:

/cxldata/datasets/project/mbox-short.txt

To see the first 15 lines of mbox-short.txt, please use the command below on the console.

These files are in a standard format for a file containing multiple mail messages. The lines which start with "From " (with a trailing space) separate the messages, and the lines which start with "From:" are part of the messages. For more information about the mbox format, please see this Wikipedia article
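To make the difference between the two kinds of lines concrete, here is a minimal sketch. The sample text reuses the two lines quoted in this post plus a made-up Subject line, and the function name classify is only illustrative; in the exercise the lines would come from mbox-short.txt instead.

```python
# Two lines quoted in this post, plus a hypothetical Subject line
# added purely for illustration.
sample = """From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
From: stephen.marquard@uct.ac.za
Subject: hypothetical subject line
"""

def classify(line):
    """Return 'separator', 'header', or 'other' for one mbox line."""
    if line.startswith("From "):
        return "separator"  # line that marks the start of a new message
    if line.startswith("From:"):
        return "header"     # From: header that is part of the message
    return "other"

kinds = [classify(line) for line in sample.splitlines()]
print(kinds)
```

Note that the only difference between the two prefixes is the character after "From": a space for the separator line and a colon for the header line.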

The e-mail address appears twice for each mail in the given file, e.g.

From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
From: stephen.marquard@uct.ac.za

What should the output look like?

(1) stephen.marquard@uct.ac.za
stephen.marquard@uct.ac.za

(2) stephen.marquard@uct.ac.za (only unique e-mail IDs, without "From:")

(3) From: stephen.marquard@uct.ac.za (only unique e-mail IDs, with "From:")

(4) From stephen.marquard@uct.ac.za
From: stephen.marquard@uct.ac.za

I’m confused, although I’ve completed this exercise by creating a list of the extracted e-mail IDs and then printing the list. The question should state the expected output, which is missing in this case.

Please help.

Regards
Manoj

Hi, Manoj.

Thanks for your suggestions!
The sample output for all the questions has already been given.
I agree with you that there are duplicate e-mails, but you need to write your logic according to the questions and the expected output for it to be accepted.
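
As a rough illustration of how the duplicates can be handled either way, here is a minimal sketch (not the official solution). It assumes the address is the second whitespace-separated token on both kinds of "From" lines, as in the two lines quoted above; the function name extract_addresses is only illustrative. Which of the two lists matches the accepted output depends on the question's sample output.

```python
# The two lines quoted in the question; in the exercise they would be
# read from mbox-short.txt instead.
sample = """From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
From: stephen.marquard@uct.ac.za
"""

def extract_addresses(lines):
    """Collect the address from every 'From ' and 'From:' line, in order."""
    found = []
    for line in lines:
        if line.startswith("From ") or line.startswith("From:"):
            parts = line.split()
            if len(parts) > 1:          # skip a bare 'From' line with no address
                found.append(parts[1])  # assumed position of the address
    return found

# Every occurrence, duplicates included (option 1 in the question).
all_addresses = extract_addresses(sample.splitlines())

# Unique addresses in first-seen order (option 2 in the question).
unique_addresses = list(dict.fromkeys(all_addresses))

print(all_addresses)
print(unique_addresses)
```

To keep only one occurrence per message without de-duplicating across messages, checking for just one of the two prefixes would be enough.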

All the best!