Is it possible to drop more than one input data file into the input folder here? If so, how will they get processed by M/R?
Yes, as per the Hadoop documentation, it should be possible.
We can use any of these three FileInputFormat functions:
addInputPaths - public static void addInputPaths(JobConf conf, String commaSeparatedPaths)
setInputPaths - public static void setInputPaths(JobConf conf, Path... inputPaths)
addInputPath - public static void addInputPath(JobConf conf, Path path)
Please see the API for more details: https://hadoop.apache.org/docs/r2.7.1/api/org/apache/hadoop/mapred/FileInputFormat.html#setInputPaths(org.apache.hadoop.mapred.JobConf, org.apache.hadoop.fs.Path…)
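To make the difference between the three calls concrete, here is a minimal sketch using the old `org.apache.hadoop.mapred` API. It assumes the Hadoop client libraries are on the classpath; the class name `InputPathsSketch` and all paths under `/data` are made up for illustration.

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.JobConf;

// Hypothetical driver class; only the input-path setup is shown.
public class InputPathsSketch {
    public static void main(String[] args) {
        JobConf conf = new JobConf(InputPathsSketch.class);

        // setInputPaths replaces whatever inputs were configured before.
        // The paths below are illustrative, not from the original thread.
        FileInputFormat.setInputPaths(conf,
                new Path("/data/in/a.txt"),
                new Path("/data/in/b.txt"));

        // addInputPath appends a single path to the existing list.
        FileInputFormat.addInputPath(conf, new Path("/data/in/c.txt"));

        // addInputPaths appends several paths from one comma-separated string.
        // A path may also be a folder, in which case its files are used as input.
        FileInputFormat.addInputPaths(conf, "/data/in/d.txt,/data/extra");

        // Print the inputs the job would read.
        for (Path p : FileInputFormat.getInputPaths(conf)) {
            System.out.println(p);
        }
    }
}
```

So you can either list files one by one, or simply point the job at the folder that contains them.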
Thanks, Sandeep. My question is, what if you have multiple input data files in a single input folder? How will they get processed by M/R?
I would love to understand a little more.
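The mapper is executed once for each input split created from the data in these files. When the input path is a folder, the files in it are all used as input, and an input split is generally created for each block of each file. So a large file is typically processed by several map tasks, and each small file by its own map task.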
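As a rough illustration of the "one split per block of each file" rule, here is a small plain-Java sketch (no Hadoop needed). It is a simplification, not Hadoop's actual split logic: it assumes the split size equals the block size, that files are splittable and non-empty, and it ignores the small slack Hadoop allows on the last split of a file. The file names, sizes, and the 128 MB block size are assumed values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model of how many map tasks a folder of files would produce.
public class SplitCountSketch {

    // One split per block: ceil(fileLength / blockSize).
    static long splitsForFile(long fileLength, long blockSize) {
        return (fileLength + blockSize - 1) / blockSize;
    }

    // Splits are counted per file; files are not merged with each other.
    static long totalSplits(Map<String, Long> files, long blockSize) {
        long total = 0;
        for (long length : files.values()) {
            total += splitsForFile(length, blockSize);
        }
        return total;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 128 * mb; // assumed block size

        // Hypothetical contents of a single input folder.
        Map<String, Long> files = new LinkedHashMap<>();
        files.put("a.txt", 300 * mb); // spans three blocks
        files.put("b.txt", 50 * mb);  // fits in one block
        files.put("c.txt", 128 * mb); // exactly one block

        for (Map.Entry<String, Long> e : files.entrySet()) {
            System.out.println(e.getKey() + " -> "
                    + splitsForFile(e.getValue(), blockSize) + " split(s)");
        }
        System.out.println("map tasks: " + totalSplits(files, blockSize));
    }
}
```

Under these assumptions the three files give 3 + 1 + 1 = 5 splits, so 5 map tasks would run for this folder.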