How to write to HDFS using a Spark RDD?


#1

I wanted to try

studentsRdd.saveAsTextFile("hdfs://ip_addr:9000/myNewFolder")

but what should the value of ip_addr be, i.e. the NameNode IP and port number? How do I get that from Ambari? Also, the NameNode UI link is not opening.


#2

After logging in to the Ambari UI, navigate to the HDFS component and look for “Quick Links”. Clicking on it shows the NameNode details (both active and standby).
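
Once you have the NameNode host, the full URL can be assembled as in the sketch below. It assumes the host and port are taken from the `fs.defaultFS` property of the HDFS configuration, which holds the RPC address that Spark needs (this is not the same port as the NameNode web UI). The helper name `hdfs_uri` and the example host, port, and path are hypothetical placeholders.

```python
def hdfs_uri(host, port, path):
    """Return hdfs://host:port/path with exactly one slash before the path."""
    return "hdfs://{}:{}/{}".format(host, port, path.lstrip("/"))


# Hypothetical values; substitute the NameNode host and RPC port of your cluster.
target = hdfs_uri("namenode.example.com", 8020, "/user/student/myNewFolder")
print(target)

# With a live Spark session, the RDD from the question would then be saved with:
# studentsRdd.saveAsTextFile(target)
```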


#3

On CloudxLab, we have already configured Spark so that it knows where HDFS is, so you don’t need to specify the absolute URL.
You can simply do: studentsRdd.saveAsTextFile("myfolder")
Please note that “myfolder” will be created. If it already exists, the call throws an error.
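
Since the save fails when the folder is already there, one possible way to avoid that on repeated runs is to write to a fresh folder name each time. This is only a sketch, assuming PySpark: the helper names `unique_output_path` and `save_to_fresh_folder` are hypothetical, and `rdd` can be any object that offers `saveAsTextFile`, such as studentsRdd.

```python
from datetime import datetime


def unique_output_path(base, now=None):
    """Append a timestamp to base so each run gets its own output folder."""
    now = now or datetime.now()
    return "{}_{}".format(base, now.strftime("%Y%m%d_%H%M%S"))


def save_to_fresh_folder(rdd, base="myfolder"):
    """Save rdd under a timestamped relative path and return that path."""
    path = unique_output_path(base)
    # A relative path is resolved against the default file system that
    # Spark is configured with, so no hdfs:// prefix is needed here.
    rdd.saveAsTextFile(path)
    return path


# Usage with a live Spark session (not run here):
# out = save_to_fresh_folder(studentsRdd)
# print(out)
```

With a relative path like this, the folder typically ends up under your HDFS home directory, i.e. /user/<username>.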