Unable to run a Python MapReduce job

Hello,
I am trying to run a MapReduce job written in Python. The script imports two classes, MRJob and MRStep, from the mrjob package. I tried installing the package, but could not because I do not have root access, so I assumed it was already installed.

So I tried running the Python MapReduce job both from the terminal and on Hadoop, but it failed to execute with the error below.

Error:
Traceback (most recent call last):
  File "filename", line 1, in <module>
    from mrjob.job import MRJob
ImportError: No module named mrjob.job

Can you please advise on this?

Many Thanks!

Sai

Hi @saicharan,

mrjob is installed in the Python 3 environment, which you can access here:

/usr/local/anaconda/bin/python
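
To confirm which interpreter can actually see mrjob, a small check like the one below can be run with each interpreter in turn (plain python, then /usr/local/anaconda/bin/python). This is only a sketch using the standard library; has_module is a hypothetical helper name, not part of mrjob.

```python
import importlib.util
import sys


def has_module(name):
    """Return True if the running interpreter can locate the module `name`."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec raises if a parent package (e.g. "mrjob") is missing.
        return False


# Show which interpreter is running and whether it can import mrjob.job.
print("interpreter:", sys.executable)
print("mrjob.job importable:", has_module("mrjob.job"))
```

If the second line prints False for an interpreter, running the job with that interpreter should fail with the same ImportError as above.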

Thanks Abhinav. So, should all Python MapReduce jobs be executed from the /usr/local/anaconda/bin/ directory? Or should I specify this path whenever I execute a job?

Regards,
Sai

Hello,

Thanks for your help earlier. The issue was resolved by running /usr/local/anaconda/bin/python instead of just python. But now I am facing the new error below. Again, this looks like a configuration issue, if I am not wrong.

Error:
Scanning logs for probable cause of failure...
Can't fetch history log; missing job ID
Can't fetch task logs; missing application ID
Step 1 of 1 failed: Command '['/bin/hadoop', 'jar', '/usr/hdp/current/hadoop-mapreduce-client/hadoop-streaming.jar', '-files', 'hdfs:///user/charantheking16954/tmp/mrjob/udemy_python_map_reduce_get_ratingscount.charantheking16954.20181206.173037.007147/files/mrjob.zip#mrjob.zip,hdfs:///user/charantheking16954/tmp/mrjob/udemy_python_map_reduce_get_ratingscount.charantheking16954.20181206.173037.007147/files/setup-wrapper.sh#setup-wrapper.sh,hdfs:///user/charantheking16954/tmp/mrjob/udemy_python_map_reduce_get_ratingscount.charantheking16954.20181206.173037.007147/files/udemy_python_map_reduce_get_ratingscount.py#udemy_python_map_reduce_get_ratingscount.py', '-input', 'hdfs:///user/charantheking16954/tmp/mrjob/udemy_python_map_reduce_get_ratingscount.charantheking16954.20181206.173037.007147/files/u.data', '-output', 'hdfs:///user/charantheking16954/tmp/mrjob/udemy_python_map_reduce_get_ratingscount.charantheking16954.20181206.173037.007147/output', '-mapper', 'sh -ex setup-wrapper.sh python3 udemy_python_map_reduce_get_ratingscount.py --step-num=0 --mapper', '-reducer', 'sh -ex setup-wrapper.sh python3 udemy_python_map_reduce_get_ratingscount.py --step-num=0 --reducer']' returned non-zero exit status 256.

Can you please help here?

Thanks,
Sai
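
A note on the "exit status 256" in the error above: a process exit code only ranges from 0 to 255, so 256 is probably a raw POSIX wait status (the kind os.waitpid returns) rather than the exit code itself. That is an assumption about how the number was produced, not something the log confirms. Under that assumption, it can be decoded with the standard library on Linux as sketched here; exit_code_from_wait_status is a hypothetical helper name.

```python
import os


def exit_code_from_wait_status(status):
    """Decode a raw POSIX wait status into the child's exit code.

    Returns None if the child did not exit normally (e.g. killed by a signal).
    """
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None


# Assumption: 256 is a raw wait status, so the hadoop command exited with code 1.
print(exit_code_from_wait_status(256))
```

If that reading is right, the hadoop command simply exited with code 1, which is a generic failure; the actual cause would still have to come from the Hadoop job or task logs.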

Hello There,
Could you please let me know how you ran the Python MapReduce program

  1. locally
  2. in the Hadoop environment?

Thanks
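
For reference, here is a minimal sketch of how the two runs might be launched, assuming mrjob's standard -r runner switch (with no -r, mrjob runs the job locally in a single process; -r hadoop submits it to the cluster). The interpreter path, script name and u.data input are taken from the posts above. build_command is a hypothetical helper, not part of mrjob, and the exact arguments the original poster used are not shown in this thread.

```python
import shlex

# Interpreter that has mrjob installed, as given in the reply above.
PYTHON_BIN = "/usr/local/anaconda/bin/python"
# Script and input file names as they appear in the error log above.
SCRIPT = "udemy_python_map_reduce_get_ratingscount.py"
INPUT_FILE = "u.data"


def build_command(script, input_path, runner=None):
    """Build the argv list for running an mrjob script (hypothetical helper)."""
    cmd = [PYTHON_BIN, script]
    if runner is not None:
        cmd += ["-r", runner]  # e.g. "hadoop" to run on the cluster
    cmd.append(input_path)
    return cmd


local_cmd = build_command(SCRIPT, INPUT_FILE)
hadoop_cmd = build_command(SCRIPT, INPUT_FILE, runner="hadoop")

# Print shell-ready command lines; pass the lists to subprocess.run to execute.
print(" ".join(shlex.quote(arg) for arg in local_cmd))
print(" ".join(shlex.quote(arg) for arg in hadoop_cmd))
```

The printed lines can be pasted into a terminal as they are. The only difference between the two runs is the -r hadoop pair.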