First, the raw data, which is stored as HDFS file blocks (128 MB each), gets mapped, then reduced, and then saved in HDFS again. I am not able to understand why the data is saved in HDFS again.
Please elaborate
Hi @asmita,
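Because the final output of a MapReduce job can be too large to fit on a single machine. The reducers also run in parallel on different nodes, and each one writes its own part of the result, so the output needs a distributed store. This is why the output is saved back to HDFS, where it is also replicated and available as input to later jobs.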
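To make this concrete, here is a minimal sketch in plain Python (not the Hadoop API) of a word count in which every reducer writes its own output file. The function name `run_word_count`, the byte-sum partitioner, and the use of a local directory in place of HDFS are all assumptions made for illustration; only the `part-r-NNNNN` naming follows what Hadoop reducers produce.

```python
from collections import defaultdict
from pathlib import Path


def run_word_count(splits, num_reducers, out_dir):
    """Toy word count: each input split is mapped, each reducer writes one file.

    Hypothetical helper for illustration; a local directory stands in for HDFS.
    """
    # Map + shuffle: emit (word, 1) and route it to a reducer partition.
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:
        for word in split.split():
            # Simple deterministic partitioner (an assumption, not Hadoop's hash).
            idx = sum(word.encode("utf-8")) % num_reducers
            partitions[idx][word].append(1)

    # Reduce: every reducer sums its own keys and writes its own part file,
    # so no single machine ever has to hold the complete result.
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, grouped in enumerate(partitions):
        path = out_dir / f"part-r-{i:05d}"
        with path.open("w", encoding="utf-8") as f:
            for word in sorted(grouped):
                f.write(f"{word}\t{sum(grouped[word])}\n")
        paths.append(path)
    return paths
```

Calling `run_word_count(["a b a", "b c"], 2, "output")` leaves two files, `part-r-00000` and `part-r-00001`, in the `output` directory. In a real cluster those part files would sit in an HDFS output directory, spread across DataNodes.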