I launched pyspark with the following command:
pyspark --packages com.databricks:spark-avro_2.10:2.0.1 --master yarn --conf spark.ui.port=11111
I then tried to save my DataFrame in Avro format using the following code:
df.repartition(2).write.format('com.databricks.spark.avro').save("/user/kkgprest27288/wordcountavro", compression="none")
But I am getting the following error:
py4j.protocol.Py4JJavaError: An error occurred while calling o105.save.
: java.lang.ClassNotFoundException: org.apache.spark.sql.sources.HadoopFsRelationProvider was removed in Spark 2.0. Please check if your library is compatible with Spark 2.0
In some cases I also get the following error message:
Failed to find data source: com.databricks.spark.avro. Please find an Avro package at http://spark.apache.org/third-party-projects.html
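Both messages seem to point at a mismatch between the spark-avro package and the Spark version running on the cluster: spark-avro 2.0.1 was built for Spark 1.x, and the first error says the class it needs was removed in Spark 2.0. Below is a minimal sketch of how a matching package could be picked. The helper name `suggest_avro_package`, the default Scala version "2.11", and the version mapping are assumptions for illustration (based on my reading of the spark-avro README), not something confirmed on this cluster.

```python
def suggest_avro_package(spark_version, scala_version="2.11"):
    """Hypothetical helper: map a Spark version string to Avro package
    coordinates and the data source name to pass to .format().

    The mapping is an assumption, not an official compatibility table.
    """
    parts = spark_version.split(".")
    major, minor = int(parts[0]), int(parts[1])

    if (major, minor) >= (2, 4):
        # Spark 2.4+ ships its own Avro module, used as format("avro").
        # Keep only major.minor.patch in case of vendor suffixes.
        release = ".".join(parts[:3])
        coords = "org.apache.spark:spark-avro_%s:%s" % (scala_version, release)
        return coords, "avro"

    if major == 2:
        # Spark 2.0 to 2.3: a 3.x release of the Databricks package.
        coords = "com.databricks:spark-avro_%s:3.2.0" % scala_version
        return coords, "com.databricks.spark.avro"

    # Spark 1.x: the coordinates from the original command.
    return "com.databricks:spark-avro_2.10:2.0.1", "com.databricks.spark.avro"


if __name__ == "__main__":
    # In the pyspark shell the version string could come from sc.version.
    coords, fmt = suggest_avro_package("2.3.0")
    print("pyspark --packages %s --master yarn --conf spark.ui.port=11111" % coords)
    print("format name: %s" % fmt)
```

The Scala suffix (`_2.10`, `_2.11`, `_2.12`) has to match the Scala version the cluster's Spark was built with, so the default here may need changing.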
Could you please help me?