Analyzing structured data in different ways in Hadoop

We can analyze structured data in Hadoop in several ways:

  1. by using Hive queries in Hive
    :-> can we store the output of these queries back to HDFS?

  2. by using RDDs in Spark
    :-> can we store the output back to HDFS after applying transformations and actions?

  3. by using DataFrames in Spark
    :-> can we store the output of a DataFrame back to HDFS?

  4. by using Spark SQL
    :-> can we store the output of Spark SQL back to HDFS?

  5. by using HiveContext in Spark
    :-> can we store the output back to HDFS?

  1. By using Hive queries in Hive, can we store the output of these queries back to HDFS?

Yes, like this (note the WHERE keyword): INSERT OVERWRITE DIRECTORY '/my/output/directory' SELECT * FROM mytable WHERE somecol = somevalue;
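A slightly fuller sketch of such a query (table, column, and path names are illustrative; the ROW FORMAT clause on directory inserts requires Hive 0.11 or later):

```sql
-- Write the query result as files under an HDFS directory.
-- Without ROW FORMAT, Hive falls back to its default ^A field delimiter.
INSERT OVERWRITE DIRECTORY '/my/output/directory'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
SELECT * FROM mytable WHERE somecol = 'somevalue';
```

Note that OVERWRITE replaces any existing contents of the target directory, so point it at a path reserved for this output.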

  2. By using RDDs in Spark, can we store the output back to HDFS after applying transformations and actions?

Yes. You can use an action such as rdd.saveAsTextFile("/path/to/output/folder/in/hdfs/").

Note that this method has the same name in Python and Scala.

  3. By using DataFrames in Spark, can we store the output of a DataFrame back to HDFS?

Yes, like this: mydataframe.write.csv("/my_output/folder")

  4. By using Spark SQL, can we store the output of Spark SQL back to HDFS?

Yes. The result of a Spark SQL query is a DataFrame, so it can be saved as in point 3; if the query runs against Hive tables, the Hive approach of point 1 applies as well.

  5. By using HiveContext in Spark, can we store the output back to HDFS?

Yes. A query run through HiveContext returns a DataFrame, so it can be saved exactly as in point 3. (In Spark 2.x and later, HiveContext is superseded by a SparkSession built with enableHiveSupport().)