Spark Shell - Version

Hi, I am having an issue running some Spark code that works fine in newer versions. The current lab has Spark 2.1, and I suspect this version difference is what is causing my code to throw an error.

In my code, I'm trying to convert a string to a date and format it with date_format, but when I show the results they come out as null.
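For reference, the failing pattern looks roughly like this (a minimal sketch; the column name, input format, and output format are my assumptions, not the exact lab code):

import org.apache.spark.sql.functions.{col, to_date, date_format}

val df = Seq("06-03-2009", "07-24-2009").toDF("Date")

// In Spark 2.1, to_date only takes a column and expects yyyy-MM-dd strings,
// so non-ISO strings like these MM-dd-yyyy values come back as null.
df.select(date_format(to_date(col("Date")), "yyyy/MM/dd").as("formatted")).show()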

Hi Uday Kiran,

Could you share the code snippets you are trying to run, along with the commands you are using?

Hi Sandeep,

Thanks for the reply. Please find attached screenshots of the dataframe and the code executed locally and on CloudxLab:

Dataframe and local result: https://ibb.co/ftLghrh
CloudxLab result: https://ibb.co/qNSWzy5

If you check the to_date function in newer versions, it accepts two arguments: a column and a date format string.
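For reference, the relevant signatures in org.apache.spark.sql.functions are (the two-argument overload was added in Spark 2.2):

def to_date(e: Column): Column                 // casts to DateType, expects yyyy-MM-dd strings
def to_date(e: Column, fmt: String): Column    // Spark 2.2+, parses using the given format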


Thank you for bringing this up. It looks like there is a difference between Spark 2.3 and earlier versions.

You can try this using Spark 2.3, which you can launch with: /usr/spark2.3/bin/spark-shell

I tested this:

val df = Seq("06-03-2009", "07-24-2009").toDF("Date")

df.select(
    col("Date"),
    to_date(col("Date"), "MM-dd-yyyy").as("to_date")
  ).show()

It gave the following results:

+----------+----------+
|      Date|   to_date|
+----------+----------+
|06-03-2009|2009-06-03|
|07-24-2009|2009-07-24|
+----------+----------+
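To also get the formatted string from the original question, date_format can be applied on top of the parsed date. A small sketch (the yyyy/MM/dd output format is just an example):

import org.apache.spark.sql.functions.{col, to_date, date_format}

val df = Seq("06-03-2009", "07-24-2009").toDF("Date")

df.select(
    col("Date"),
    // parse first, then render the date in the desired format
    date_format(to_date(col("Date"), "MM-dd-yyyy"), "yyyy/MM/dd").as("formatted")
  ).show()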

Please note that the same to_date code did not work in Spark 2.1. Also, I had to import col and to_date explicitly:

import org.apache.spark.sql.functions.{col, to_date}

val df = Seq("06-03-2009", "07-24-2009").toDF("Date")

df.select(
    col("Date"),
    to_date(col("Date"), "MM-dd-yyyy").as("to_date")
  ).show()

It showed:

<console>:29: error: too many arguments for method to_date: (e: org.apache.spark.sql.Column)org.apache.spark.sql.Column
       to_date(col("Date"), "MM-dd-yyyy").as("to_date")
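If you need to stay on Spark 2.1, one common workaround (a sketch, not tested on the lab) is to parse with unix_timestamp, which has accepted a format string since Spark 1.5, and cast the result down to a date:

import org.apache.spark.sql.functions.{col, unix_timestamp}

val df = Seq("06-03-2009", "07-24-2009").toDF("Date")

df.select(
    col("Date"),
    // unix_timestamp parses with the given pattern into epoch seconds;
    // cast to timestamp, then to date
    unix_timestamp(col("Date"), "MM-dd-yyyy").cast("timestamp").cast("date").as("to_date")
  ).show()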