Creating DataFrame with MYSQL


#1

Hi,

I am not able to create Dataframe with MYSQL. Please guide me ASAP.

[maheshharuvaiah8134@ip-172-31-38-146 sreeram]$ export SPARK_CLASSPATH=/usr/hdp/2.3.4.0-3485/hive/lib/mysql-connector-java.jar
[maheshharuvaiah8134@ip-172-31-38-146 sreeram]$ spark-shell

scala> val sqlContext = new SQLContext(sc)

scala> val df_mysql = sqlContext.read.format(“jdbc”).
| option(“driver”,“com.mysql.jdbc.Driver”).
| option(“url”,“jdbc:mysql://ip-172-31-20-247:3306”).
| option(“dbtable”,“retail_db.ratings”).
| option(“user”,“sqoopuser”).
| option(“password”,“NHkkP876rp”).
| load()
java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)


#2

To find mysql driver you can use the following command:
find /usr/hdp/|grep mysql-connector

This should list the location of mysql connector. Now, pass the mysql-connector to spark-shell:
spark-shell --jars /usr/hdp/2.3.4.0-3485/hive/lib/mysql-connector-java.jar

Now, this will launch the spark-shell’s ‘scala>’ prompt. On the scala prompt please try the following:

var sqlContext = spark.sqlContext
val df_mysql = sqlContext.read.format("jdbc").option("driver","com.mysql.jdbc.Driver").  option("url","jdbc:mysql://ip-172-31-20-247:3306").option("dbtable","retail_db.ratings").option("user","sqoopuser").  option("password","NHkkP876rp").load()
df_mysql.show()

Please note that in spark 2 onwards, the main entry point is “spark”.
The above should give the following result:

scala> df_mysql.show()
+---+-------+--------+------+-------------------+
| id|user_id|movie_id|rating|           rated_at|
+---+-------+--------+------+-------------------+
|  1|    196|     242|     3|1997-12-04 07:55:49|
|  2|    186|     302|     3|1998-04-04 11:22:22|
|  3|     22|     377|     1|1997-11-06 23:18:36|
|  4|    244|      51|     2|1997-11-26 21:02:03|
|  5|    166|     346|     1|1998-02-01 21:33:16|
|  6|    298|     474|     4|1998-01-07 06:20:06|
|  7|    115|     265|     2|1997-12-03 09:51:28|
|  8|    253|     465|     5|1998-04-03 10:34:27|
|  9|    305|     451|     3|1998-02-01 01:20:17|
| 10|      6|      86|     3|1997-12-31 13:16:53|