I want to plot data with matplotlib from PySpark SQL.
I am fetching data from Hive for analysis, and now I want to display it as a bar chart or histogram.
Is there a way to do this? If so, please suggest how.
Follow this guide to access Spark from a Python 3 Jupyter notebook: https://cloudxlab.com/blog/running-pyspark-jupyter-notebook/
Follow the standard tutorial to load data from Hive:
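Once you have loaded the data from Hive, you can use take() or collect() to bring rows from the DataFrame into memory on the driver, and then use matplotlib on the in-memory data. Note that collect() returns every row, so aggregate or limit the DataFrame first if it is large.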
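The steps above could look roughly like the sketch below. It is only an illustration under some assumptions: the table name `sales` and column name `category` are hypothetical placeholders, and the helper names `split_rows`, `plot_bar` and `plot_hive_counts` are made up for this example. The aggregation runs in Spark, so only one small row per category is collected to the driver.

```python
def split_rows(rows):
    """Turn collected (label, value) rows into two parallel lists.

    Works with pyspark Row objects (which are tuples) or plain tuples.
    """
    labels = [str(row[0]) for row in rows]
    values = [float(row[1]) for row in rows]
    return labels, values


def plot_bar(labels, values):
    """Draw a bar chart from in-memory data and return the figure."""
    # Imported here so split_rows() has no third-party dependencies.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(labels, values)
    ax.set_xlabel("category")
    ax.set_ylabel("count")
    return fig


def plot_hive_counts(table="sales", column="category"):
    """Count rows per category in a Hive table and plot the counts.

    `table` and `column` are hypothetical names; replace them with your own.
    """
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()
    # Aggregate in Spark first so collect() returns only a few rows.
    df = spark.sql(
        f"SELECT {column}, COUNT(*) AS n FROM {table} GROUP BY {column}"
    )
    labels, values = split_rows(df.collect())
    return plot_bar(labels, values)
```

In a notebook you would call plot_hive_counts() in a cell and the figure should render inline. For a histogram of a numeric column, the same pattern applies: collect (or take a sample of) the column values into a list and pass it to matplotlib's `ax.hist` instead of `ax.bar`.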