Hi Team,
I have written a Java application that uses the Spark Streaming library to read data from a Kafka topic. When I tried to run it on the cluster, I got the error below:
[vardhan17126602@cxln4 ~]$ java -cp DataFlow.jar:/home/vardhan17126602/* com.vardhan.SparkStreamingApp
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
    at org.apache.spark.SparkConf.loadFromSystemProperties(SparkConf.scala:76)
    at org.apache.spark.SparkConf.<init>(SparkConf.scala:71)
    at org.apache.spark.SparkConf.<init>(SparkConf.scala:58)
    at com.vardhan.SparkStreamingApp.main(SparkStreamingApp.java:21)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 4 more
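The trace says the JVM could not find org.apache.hadoop.fs.FSDataInputStream on the classpath passed with -cp. A small standalone check like the one below might help confirm whether that class is visible from a given classpath. This is only a sketch: the class name ClasspathCheck and the method isOnClasspath are hypothetical and are not part of DataFlow.jar.

```java
// Hypothetical helper, not part of DataFlow.jar: reports whether a class
// can be located by the class loader that loaded this helper.
final class ClasspathCheck {

    // Returns true if the named class can be found, false otherwise.
    static boolean isOnClasspath(String className) {
        try {
            // 'false' means: locate the class without running its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the stack trace; another name can be passed as an argument.
        String name = args.length > 0 ? args[0] : "org.apache.hadoop.fs.FSDataInputStream";
        System.out.println(name + " on classpath: " + isOnClasspath(name));
    }
}
```

If this is compiled and run with the same -cp value as the command above (plus the directory holding ClasspathCheck.class), a "false" result would suggest that none of the jars in /home/vardhan17126602 contain the Hadoop class.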
I searched Stack Overflow for this error and found the link below.
It suggests changes to the Spark configuration files, which I do not have access to. Could you please take a look and let me know whether something can be done, or whether I am doing something wrong?
Code and related jars are available at: /home/vardhan17126602
DataFlow.jar
Regards
Vardhan Bhoumik