Facing an issue with the second project, "Python Programming - Project - Part 2"

Hi Team,

I am working on the second project, "Python Programming - Project - Part 2", and I am facing an issue.

Kindly provide step-by-step guidance so that I can complete my projects.

Thanks in advance for your help.

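Below is the code I am using:
# Count how many messages each sender wrote, then report the top sender.
counts = dict()
with open('mbox-short.txt', 'r') as fhand:
    for line in fhand:
        if line.startswith('From:'):
            words = line.rstrip().split()
            if len(words) < 2:
                continue  # skip a 'From:' line that has no address
            sender = words[1]
            print(sender)
            counts[sender] = counts.get(sender, 0) + 1

# Find the sender with the highest count once the whole file has been read.
bigcount = None
bigword = None
for word, count in counts.items():
    if bigcount is None or count > bigcount:
        bigword = word
        bigcount = count

print(bigword, bigcount)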

Python Programming - Project - Part 2
Objective: The objective of this project is to analyze the email data you collected in the previous project.

INSTRUCTIONS
Using the data you collected and the concepts you learned in class (NumPy, and DataFrames in pandas), perform the following analyses on the data:

Create the dataframe from the emails using the technique defined in the previous project.
Find the top 5 times of the week when you receive the most emails. Hint: categorise the data by time into slots and find the slots with the most emails.
Who has sent you the maximum number of emails?
To whom have you sent the maximum number of emails?
On the basis of the two analyses above, decide who your closest buddy is. You are free to derive your own formula for choosing your closest buddy. One criterion can be the person with whom you converse most frequently.
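
As a rough illustration of the slot hint above, the counting logic can be sketched in plain Python before it is moved into pandas. This is only a sketch under assumptions: the function name top_slots, the choice of one slot per weekday and hour, and the sample timestamps are all made up for illustration and are not part of the project statement.

```python
from collections import Counter
from datetime import datetime

DAYS = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
        'Friday', 'Saturday', 'Sunday']

def top_slots(timestamps, n=5):
    """Return the n busiest (weekday, hour) slots with their email counts."""
    slots = Counter()
    for ts in timestamps:
        # Assumed slot definition: one slot per weekday and hour of day.
        slots[(DAYS[ts.weekday()], ts.hour)] += 1
    return slots.most_common(n)

# Made-up timestamps standing in for the received dates of real emails.
sample = [
    datetime(2024, 1, 1, 9, 5),    # Monday, 09:05
    datetime(2024, 1, 1, 9, 40),   # Monday, 09:40
    datetime(2024, 1, 8, 9, 15),   # Monday, 09:15
    datetime(2024, 1, 2, 14, 0),   # Tuesday, 14:00
]
print(top_slots(sample, n=2))
```

The same Counter approach works for the sender and recipient questions: count each address and take the most common one.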

Hi, Anil.

I have seen your email and the post in our discussion forum regarding Python projects.

I appreciate your efforts.

You have written code for one file only, but we want to parse all the files present in your email folder.

Kindly use the os.walk() function from the os module for this.
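
A minimal sketch of how os.walk() can collect every email file under a folder is given here. The function name list_email_files and the '.eml' extension are assumptions for illustration; adjust them to match how your emails are actually stored.

```python
import os

def list_email_files(root, extension='.eml'):
    """Collect the paths of all files under root whose name ends with extension."""
    paths = []
    # os.walk yields (dirpath, dirnames, filenames) for root and every subfolder.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(extension):
                paths.append(os.path.join(dirpath, name))
    return sorted(paths)
```

Each returned path can then be opened and parsed in a loop, instead of hardcoding a single file name.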

I also request that you go through the objective of the project again. Your function should return a tuple with the fields ('to', 'from', 'subject'), so you need to extract these fields from the files.

  1. Parse the email and store the result in parsed_eml.

  2. Use parsed_eml['header']['from'] to get the sender of the email.

  3. Use parsed_eml['header']['to'] to get the recipients of the email.

  4. Use parsed_eml['header']['subject'] to get the subject of the email.
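
The four steps above can be sketched as a small helper. This is a sketch under assumptions: the function name extract_fields is hypothetical, and the sample dictionary is hand-written to mimic the parsed_eml['header'] layout described in the steps, so it does not call the parser itself. The assumption that 'to' holds a list of addresses should be checked against your own parsed output.

```python
def extract_fields(parsed_eml):
    """Return the tuple (to, from, subject) from a parsed email dictionary."""
    # Use .get so that a message with a missing field yields None
    # instead of raising KeyError.
    header = parsed_eml.get('header', {})
    return (header.get('to'), header.get('from'), header.get('subject'))

# Hand-written stand-in for one parsed message (not real parser output).
sample = {
    'header': {
        'to': ['me@example.com'],
        'from': 'friend@example.com',
        'subject': 'Hello',
    }
}
print(extract_fields(sample))
```

Calling this helper once per file gives one tuple per email, which is a convenient shape for building the dataframe later.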

Now, you can proceed.

All the best.

Hi Satyajit,

This is regarding Project 1:
I am able to get the JSON for each file using eml_parser and os.walk. I just need to check whether we should hardcode ['header'] or make it more generic, so that the function automatically reaches the header and looks up the passed parameters ('to', 'from', 'subject').

Hi, Abhi.

Yes, you are right: we need to use regular expressions to extract the email from the file.
You also need to give ['header'], as it is the key by which you will get the content of the email.
You can try it in Jupyter.

All the best.