Unable to run a Python MapReduce job


#1

Hello,
I am trying to run a MapReduce job written in Python. The script imports MRJob and MRStep from the mrjob package. I tried installing the package myself, but as I do not have root access, I couldn't install it. So I assumed it already exists.

I then tried running the Python MapReduce job both from the terminal and on Hadoop, but it failed to execute with the error below.

Error:
Traceback (most recent call last):
  File "filename", line 1, in <module>
    from mrjob.job import MRJob
ImportError: No module named mrjob.job

Can you please advise on this?

Many Thanks!

Sai


#2

Hi @saicharan,

mrjob is installed in the Python 3 environment, which you can access here:

/usr/local/anaconda/bin/python
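
If it helps, here is a minimal sketch for checking which interpreter a script is running under and whether that interpreter can find mrjob. It assumes Python 3 (importlib.util.find_spec is not available in Python 2), and the helper name has_module is only illustrative, not part of mrjob.

```python
import importlib.util
import sys


def has_module(name):
    """Return True if the running interpreter can locate the module `name`."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised for dotted names such as "mrjob.job" when the parent
        # package itself is not installed.
        return False


if __name__ == "__main__":
    # sys.executable is the path of the interpreter running this script.
    print("interpreter:", sys.executable)
    print("mrjob importable:", has_module("mrjob"))
```

Saving this as, say, check_mrjob.py (a hypothetical file name) and running it with /usr/local/anaconda/bin/python should show whether that environment can see mrjob.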


#3

Thanks Abhinav. So, should all Python MapReduce jobs be executed from the /usr/local/anaconda/bin/ directory? Or should I specify this path whenever I execute a job?

Regards,
Sai


#4

Hello,

Thanks for your help earlier. The issue was resolved by running /usr/local/anaconda/bin/python instead of just python. But now I am facing the new error below. Again, this looks like a configuration issue, if I am not wrong.

Error:
Scanning logs for probable cause of failure...
Can't fetch history log; missing job ID
Can't fetch task logs; missing application ID
Step 1 of 1 failed: Command '['/bin/hadoop', 'jar', '/usr/hdp/current/hadoop-mapreduce-client/hadoop-streaming.jar', '-files', 'hdfs:///user/charantheking16954/tmp/mrjob/udemy_python_map_reduce_get_ratingscount.charantheking16954.20181206.173037.007147/files/mrjob.zip#mrjob.zip,hdfs:///user/charantheking16954/tmp/mrjob/udemy_python_map_reduce_get_ratingscount.charantheking16954.20181206.173037.007147/files/setup-wrapper.sh#setup-wrapper.sh,hdfs:///user/charantheking16954/tmp/mrjob/udemy_python_map_reduce_get_ratingscount.charantheking16954.20181206.173037.007147/files/udemy_python_map_reduce_get_ratingscount.py#udemy_python_map_reduce_get_ratingscount.py', '-input', 'hdfs:///user/charantheking16954/tmp/mrjob/udemy_python_map_reduce_get_ratingscount.charantheking16954.20181206.173037.007147/files/u.data', '-output', 'hdfs:///user/charantheking16954/tmp/mrjob/udemy_python_map_reduce_get_ratingscount.charantheking16954.20181206.173037.007147/output', '-mapper', 'sh -ex setup-wrapper.sh python3 udemy_python_map_reduce_get_ratingscount.py --step-num=0 --mapper', '-reducer', 'sh -ex setup-wrapper.sh python3 udemy_python_map_reduce_get_ratingscount.py --step-num=0 --reducer']' returned non-zero exit status 256.

Can you please help here?

Thanks,
Sai