Hi @abhinavsingh & Team,
I am unable to run my Jupyter Notebook, which was previously working with Spark 2.x, after the lab upgrade. May I know how to fix this?
It works now… thank you.
import os
import sys

# Point the notebook at the Spark 2 client installation and its Python libraries
os.environ["SPARK_HOME"] = "/usr/hdp/current/spark2-client"
os.environ["PYLIB"] = os.environ["SPARK_HOME"] + "/python/lib"

# Use the Anaconda Python interpreter for both the driver and the executors
os.environ["PYSPARK_PYTHON"] = "/usr/local/anaconda/bin/python"
os.environ["PYSPARK_DRIVER_PYTHON"] = "/usr/local/anaconda/bin/python"

# Make py4j and pyspark importable inside the notebook
sys.path.insert(0, os.environ["PYLIB"] + "/py4j-0.10.6-src.zip")
sys.path.insert(0, os.environ["PYLIB"] + "/pyspark.zip")
Please find my answers below:
May I know how to submit a PySpark job using spark-submit in the shell?
Please check the Spark documentation for the same.
Should my PySpark script include the above lines while submitting the script via spark-submit?
If you are running spark-submit from the command line, then you do not require the above Python code; spark-submit already sets up the Spark environment for your script.
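For example, a standalone script (hypothetically named my_job.py) could look like the sketch below. It builds its own SparkSession instead of relying on the notebook environment variables, and could then be submitted with something like spark-submit my_job.py.

from pyspark.sql import SparkSession

# spark-submit supplies the Spark environment, so the os.environ / sys.path setup above is not needed here
spark = SparkSession.builder.appName("MySubmitJob").getOrCreate()

# Tiny example workload: count the rows of a 100-row range
df = spark.range(100)
print(df.count())

spark.stop()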
Hope this helps.
Thanks