Java Out of Memory Issue with Movie dataframe SQL


I am running the DataFrame operations and Spark SQL queries below, but they fail to complete because of a Java out-of-memory error:
Error:… Name: org.apache.spark.SparkException
Message: Job aborted due to stage failure: Task 1 in stage 40.0 failed 1 times, most recent failure: Lost task 1.0 in stage 40.0 (TID 287, localhost, executor driver): java.lang.OutOfMemoryError: Java heap space…


//val joindf=moviesdf.join(ratingsdf, moviesdf.col("MovieID")===ratingsdf.col("MovieID")).filter($"Genre".like("%War%"))

spark.sql("select Name, avg(rating) avg_rating from movies m join ratings r on m.MovieID=r.MovieID where genre like '%War%Comedy%' group by Name").show(10)
//spark.sql("select name,rating from movies m join rating r on m.movieid=r.ratingid limit 10")


I am not able to post the full code because the forum throws an error saying that new users can include only 2 links.


Hi @sudhanshu_guru,

Can you try running it in YARN mode? Note that if your code uses more than 2 GB of memory, in either local or YARN mode, it will be killed by our bots.
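
For reference, here is a rough configuration sketch of what switching to YARN mode might look like in a standalone Scala application. It assumes a reachable YARN cluster with `HADOOP_CONF_DIR` (or `YARN_CONF_DIR`) set on the machine that launches the job. The application name and the 1g executor size are illustrative choices, not values from this thread.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: HADOOP_CONF_DIR or YARN_CONF_DIR points at the cluster configuration.
val spark = SparkSession.builder()
  .appName("movie-ratings")              // hypothetical application name
  .master("yarn")                        // run on YARN instead of local mode
  .config("spark.executor.memory", "1g") // illustrative size, chosen to stay well below 2 GB
  .getOrCreate()
```

In a spark-shell or notebook a session usually exists already, and `getOrCreate()` simply returns it, so there the master has to be chosen when the shell or kernel is launched (for example with `--master yarn`). Driver memory likewise has to be set at launch time (for example with `--driver-memory`), because the driver JVM is already running by the time this code executes in client mode.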