Query Regarding MapReduce

asmita · December 21, 2018, 5:19pm

First, the raw data (HDFS file blocks of 128 MB each) gets mapped, then reduced, and then saved in HDFS again. I am not able to understand why the data is saved in HDFS again.
Please elaborate.

abhinavsingh · December 23, 2018, 4:30pm

Hi @asmita,

Because the final output of a MapReduce job can be too large to fit on one machine. This is why the output is saved back to HDFS, which spreads it across the machines in the cluster.
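
To make this concrete, here is a minimal Python sketch of the idea. It is a toy word count, not Hadoop code: a local temporary directory stands in for HDFS, and the names `map_phase`, `partition`, `run_job`, and `read_output` are made up for illustration. The point it shows is that each reducer holds only its own share of the keys and writes its own part file (named like Hadoop's `part-r-00000`), so the full result never has to sit on a single machine.

```python
import os
import tempfile
from collections import defaultdict


def map_phase(block):
    # Emit a (word, 1) pair for every word in one input block.
    return [(word, 1) for word in block.split()]


def partition(key, num_reducers):
    # Deterministic partitioner: the same key always goes to the same reducer.
    return sum(key.encode("utf-8")) % num_reducers


def run_job(blocks, num_reducers, output_dir):
    # Shuffle: group mapper output per reducer, then per key.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for block in blocks:
        for key, value in map_phase(block):
            buckets[partition(key, num_reducers)][key].append(value)

    # Reduce: each reducer writes only its own share to its own part file.
    # output_dir is a local stand-in for an HDFS output directory (assumption).
    paths = []
    for r, bucket in enumerate(buckets):
        path = os.path.join(output_dir, f"part-r-{r:05d}")
        with open(path, "w") as f:
            for key in sorted(bucket):
                f.write(f"{key}\t{sum(bucket[key])}\n")
        paths.append(path)
    return paths


def read_output(paths):
    # Read every part file back; a later job or client would do this.
    result = {}
    for path in paths:
        with open(path) as f:
            for line in f:
                key, count = line.rstrip("\n").split("\t")
                result[key] = int(count)
    return result


if __name__ == "__main__":
    blocks = ["a b a", "b c", "a c c"]  # stand-ins for 128 MB HDFS blocks
    with tempfile.TemporaryDirectory() as out:
        paths = run_job(blocks, num_reducers=2, output_dir=out)
        print(sorted(os.path.basename(p) for p in paths))
        print(read_output(paths))
```

In a real cluster the part files are written to HDFS rather than a local folder, so the output is also replicated and can be used directly as the input of the next job.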