How to run MapReduce code?

Wright_Jim · May 18, 2017, 6:01pm

Hi,

I have just subscribed to CloudxLab for 3 months.

Congrats on providing a lab for practicing Big Data. I tried setting up a virtual machine in the past, but it was too slow even on a machine with 8 GB of RAM. Overall, the learning experience on virtual machines was not good.

One quick question: could you please let me know how I can write and run MapReduce logic in CloudxLab?

Thanks

Jim

abhinav · May 18, 2017, 4:28pm

Thanks for your kind words, Jim.

I agree with the pain points on virtual machines.

We’ve created a series of videos and assessments on writing and running MapReduce code on CloudxLab. These videos are part of our Big Data with Hadoop and Spark course.

Please access the videos and slides here

In the above videos, we show how to:

Write MapReduce code using Java and Eclipse
Build a MapReduce project using Apache Ant
Run MapReduce code using Hadoop Streaming
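
To give a feel for the Hadoop Streaming approach before you watch the videos, here is a minimal word-count sketch in Python. It is not the code from the course. Hadoop Streaming feeds input lines to the mapper on standard input, sorts the mapper's tab-separated key/value output by key, and feeds the sorted records to the reducer. The function names map_lines and reduce_lines are hypothetical, chosen only for this illustration.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record for every word in the input."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(sorted_records):
    """Reducer: sum the counts of consecutive records that share a key.

    Assumes the records arrive sorted by key, as they do between the
    map and reduce phases of a streaming job.
    """
    pairs = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


# Local dry run: sorted() stands in for the shuffle/sort step.
sample = ["big data big", "data lab"]
for record in reduce_lines(sorted(map_lines(sample))):
    print(record)
```

To run this as a real streaming job, each function would typically live in its own small script that reads sys.stdin and prints its records, and the two scripts would be passed to the Hadoop Streaming jar through its -mapper and -reducer options along with -input and -output. The location of that jar depends on the installation.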

At the end, there are assessments where you write MapReduce code for a set of problems and run it on CloudxLab. Some of the problems are:

Count the frequency of characters in a file stored in HDFS
Find anagrams in a text file stored in HDFS
Find users having the same DNA
Find users having mirror DNA
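
As a rough illustration of how the first two problems map onto the MapReduce model, here is a small in-memory sketch in Python. It is only a way to think about keys and values, not the assessment solution, and it does not touch HDFS or Hadoop at all. The helper run_mapreduce and the mapper/reducer names are hypothetical, and details such as ignoring whitespace and lowercasing words are my assumptions.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Tiny in-memory stand-in for the map, shuffle, and reduce phases."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # shuffle: collect values per key
    return {key: reducer(key, values) for key, values in sorted(groups.items())}


def char_mapper(line):
    """Emit (character, 1) for every non-whitespace character (an assumption)."""
    for ch in line:
        if not ch.isspace():
            yield ch, 1


def count_reducer(key, values):
    """Sum the ones emitted for a character."""
    return sum(values)


def anagram_mapper(line):
    """Emit (sorted letters, word) so that anagrams share the same key."""
    for word in line.split():
        lowered = word.lower()
        yield "".join(sorted(lowered)), lowered


def anagram_reducer(key, values):
    """Return the distinct words that share one letter signature."""
    return sorted(set(values))


print(run_mapreduce(["ab", "ba a"], char_mapper, count_reducer))
print(run_mapreduce(["listen silent", "enlist google"], anagram_mapper, anagram_reducer))
```

The main design decision in both cases is the key: the character itself for the frequency count, and the word's sorted letters for the anagram search. Once the key is chosen, the reducer is usually a short aggregation.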

Please watch the videos and write the code for the above problems. I hope the videos and assessments help you write MapReduce logic correctly.

Happy learning!

Wright_Jim · May 18, 2017, 7:01pm

Thank you so much for the detailed answer, Abhinav.

I will watch the videos and come back to you if I have any questions.

Thanks again

Best,
Jim