Mapper-reducer in select count(*) but not in select *

When I run the command below, Hive appears to launch a MapReduce job before showing the result:

  hive>  select  count(*)  from nyse;

but the command below returns its output instantly, as in any other relational database:

 hive>  select  *  from nyse limit 3;

Why do the two commands behave differently? Does Hive not run MapReduce for the second command?

Good observation.

The first query includes an aggregation, so Hive uses the MapReduce framework to compute it.

The second query essentially just reads the files from HDFS and presents their contents. Hive can serve it with a simple fetch task, so it does not require a MapReduce phase.


Thanks, Sandeep!

Does that mean only the aggregation queries invoke map-reduce?

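Aggregation queries do, but they are not the only ones. Joins, GROUP BY, and ORDER BY also launch MapReduce jobs, and depending on the hive.fetch.task.conversion setting, even a SELECT with a WHERE filter may do so.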

Imagine counting the number of people in a country. It is better to ask the officials of each city for their local count and then sum those counts than to visit every city and count everyone yourself.

Therefore, distributed computing matters even for simple aggregation operations such as counting.
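
The counting analogy can be sketched in a few lines of Python. This is only an illustration of the idea, not how Hive is implemented: the `splits` data and the `map_count` and `reduce_sum` names are hypothetical, and on a real cluster the map step would run in parallel on different nodes rather than in a loop.

```python
def map_count(split):
    """Map step: count the rows in one input split (one 'city')."""
    return sum(1 for _ in split)


def reduce_sum(partial_counts):
    """Reduce step: add up the per-split counts into the final total."""
    return sum(partial_counts)


# Hypothetical table data, already divided into three splits.
splits = [
    ["row1", "row2", "row3"],
    ["row4", "row5"],
    ["row6", "row7", "row8", "row9"],
]

# Each split is counted independently; only the small partial
# counts need to be sent to the reducer, not the rows themselves.
partial_counts = [map_count(split) for split in splits]
total = reduce_sum(partial_counts)

print(partial_counts, total)
```

Only the per-split counts travel to the reducer, which is why counting a large table this way scales better than reading every row in one place.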
