Spark code to count the frequency of characters in a file stored in HDFS

Problem

Write a Spark code to count the frequency of characters in a file stored in HDFS.

Dataset

The file is located at

/data/mr/wordcount/big.txt

Sample Output

The output should list each character and its frequency:

a     48839
b     84930
c     84939

Can I get the code for this?

Hi @RANGAREDDYB,

I wrote MapReduce code for the same task a while back. Maybe you can take inspiration from that.

By the way, writing the Spark code for this is not hard. Maybe you can give it a try, and I will help you if you get stuck.
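To get you started, here is a minimal sketch in Scala. It reads the file from the path in your question, flatMaps each line into its individual characters, and then applies the usual map/reduceByKey counting pattern. Note that the output directory (/data/mr/wordcount/char_counts) is just a placeholder I made up; change it to wherever you want the results.

import org.apache.spark.{SparkConf, SparkContext}

object CharCount {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("CharCount"))

    sc.textFile("/data/mr/wordcount/big.txt")        // input path from the question
      .flatMap(_.toCharArray)                        // break each line into characters
      .map(c => (c, 1))                              // word-count pattern, per character
      .reduceByKey(_ + _)                            // sum the counts for each character
      .sortByKey()                                   // sort so output resembles the sample
      .map { case (c, n) => s"$c\t$n" }              // format as "character<TAB>count"
      .saveAsTextFile("/data/mr/wordcount/char_counts")  // placeholder output directory

    sc.stop()
  }
}

If you are trying this in spark-shell, you can skip the object and main boilerplate and chain the same calls on the prebuilt sc.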

Thanks