Sqoop import to Hive as a Parquet file is failing

sqoop import \
  --connect jdbc:mysql://ip-172-31-13-154/retail_db \
  --username sqoopuser \
  --password NHkkP876rp \
  --table mydata \
  --hive-import \
  --hive-table sshmparquet333 \
  --as-parquetfile \
  -m 1

While running the above command, the following error comes up and the jobs fail:
18/04/05 05:36:36 INFO mapreduce.Job: Task Id : attempt_1517296050843_11482_m_000000_0, Status : FAILED
Error: org.kitesdk.data.DatasetNotFoundException: Descriptor location does not exist: file:/tmp/default/.temp/job_1517296050843_11482/mr/job_1517296050843_11482/.metadata
at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.checkExists(FileSystemMetadataProvider.java:562)
at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.find(FileSystemMetadataProvider.java:605)
at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.load(FileSystemMetadataProvider.java:114)
at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.load(FileSystemDatasetRepository.java:192)
at org.kitesdk.data.spi.AbstractDatasetRepository.load(AbstractDatasetRepository.java:40)
at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.loadJobDataset(DatasetKeyOutputFormat.java:544)
at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.loadOrCreateTaskAttemptDataset(DatasetKeyOutputFormat.java:555)
at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.loadOrCreateTaskAttemptView(DatasetKeyOutputFormat.java:568)
at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.getRecordWriter(DatasetKeyOutputFormat.java:426)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:647)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:767)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

Please help me to resolve it; I cannot find any solution.

Looks like a Kite SDK issue. Please try changing the Kite SDK dependency version from 1.0.0 to 1.1.0.
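
For reference, on an HDP install the Kite jars are bundled under the Sqoop lib directory. Assuming a standard layout (the exact versioned path may differ on your cluster), something like this should show which version you currently have:

ls /usr/hdp/current/sqoop-client/lib | grep kite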

Hi, thanks for the reply.

Will you please explain how to change the dependency and where to do it?

ls /usr/hdp/2.3.4.0-3485/sqoop/lib

By running that command I found the location of the Kite jars, which are the following 1.0.0 versions:

kite-data-core-1.0.0.jar
kite-data-hive-1.0.0.jar
kite-data-mapreduce-1.0.0.jar
kite-hadoop-compatibility-1.0.0.jar

While trying to copy the new 1.1.0 jars I am getting permission denied, as follows.

[manasareddy24032216@ip-172-31-60-179 lib]$ unzip /home/manasareddy24032216/kitejarmr.zip
Archive: /home/manasareddy24032216/kitejarmr.zip
replace jackson-databind-2.3.1.jar? [y]es, [n]o, [A]ll, [N]one, [r]ename: y
error: cannot delete old jackson-databind-2.3.1.jar
Permission denied
replace commons-logging-1.1.1.jar? [y]es, [n]o, [A]ll, [N]one, [r]ename: A
error: cannot delete old commons-logging-1.1.1.jar
Permission denied
error: cannot create parquet-avro-1.6.0.jar
Permission denied

So please explain another way to include those jars, or please change the jars to version 1.1.0 if possible.

Can anyone please respond to this issue?

Hi Manasa,

This was occurring because the Kite SDK jars shipped with HDP were older. So we had to debug the issue and manually copy the newer Kite SDK jars into the Sqoop lib folder.
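
For anyone hitting the same problem, here is a rough sketch of that manual step, assuming the 1.1.0 artifacts are available on Maven Central and that you have root (sudo) access on the node; the lib path is from this cluster's HDP install:

# fetch the 1.1.0 versions of the four Kite jars that Sqoop ships with
cd /tmp
for jar in kite-data-core kite-data-hive kite-data-mapreduce kite-hadoop-compatibility; do
  wget https://repo1.maven.org/maven2/org/kitesdk/$jar/1.1.0/$jar-1.1.0.jar
done
# back up the old 1.0.0 jars and drop in the new ones
# (this needs root, which is why the unzip above failed with permission denied)
sudo mkdir -p /tmp/kite-1.0.0-backup
sudo mv /usr/hdp/2.3.4.0-3485/sqoop/lib/kite-*-1.0.0.jar /tmp/kite-1.0.0-backup/
sudo cp kite-*-1.1.0.jar /usr/hdp/2.3.4.0-3485/sqoop/lib/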

Now, you can try again. I have tested with the following sqoop command and it seems to work fine:

sqoop import --connect jdbc:mysql://ip-172-31-13-154/retail_db --username sqoopuser --password XXXXX --table mydata --hive-import --hive-table sg_123 --as-parquetfile --m 1 --create-hive-table

Note that you will need to replace XXXXX with the actual MySQL password as mentioned in CloudxLab.

All we basically did was download the new Kite SDK jars and place them in the Sqoop lib folder.
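
If you want to double-check on your end, listing the Kite jars in the Sqoop lib folder should now show the 1.1.0 versions (and the old 1.0.0 jars should be gone, otherwise Sqoop may pick up both):

ls /usr/hdp/2.3.4.0-3485/sqoop/lib | grep kite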