SQOOP import to S3 bucket fails


#1

Hi Team,
I tried running the Sqoop command below:
sqoop import -Dfs.s3a.access.key=<access key> -Dfs.s3a.secret.key=<secret key> --connect jdbc:mysql://cxln2.c.thelab-240901.internal:3306/sqoopex --username sqoopuser --password NHkkP876rp --table widgets --target-dir s3a://manoj1909/widget12345 --split-by id

It failed with the error below:

19/07/29 10:56:53 WARN s3a.S3AFileSystem: Client: Amazon S3 error 400: 400 Bad Request; Bad Request (retryable)
com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: 78B2349F88FDCC09), S3 Extended Request ID: xaJiz
TAo3PxH7RJKBqDdVtXwdjlE9S1rhoj2A2T8cOdA6PKdlW8bYMuI3cU2+NYEewOWHCm4FPc=
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1182)
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:770)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)

Upon searching, one of the suggested resolutions was to update Sqoop.

Thus, can you please update Sqoop to the latest version?


#2

Can you please let us know which version we should update to?


#3

Please update to Sqoop 1.4.7.


#4

Hi @abhinav: any tentative timeframe for this update?


#5

Are you trying to download data from S3 into HDFS? If so, what is "--connect jdbc:mysql://cxln2.c.thelab-240901.internal:3306/sqoopex" doing in the command?

Can you try following this: https://www.cloudera.com/documentation/enterprise/latest/topics/admin_sqoop_s3_import.html


#6

Hi @sgiri, @abhinav,
I am trying to copy data from MySQL to AWS S3. I was referring to the same link you posted (https://www.cloudera.com/documentation/enterprise/latest/topics/admin_sqoop_s3_import.html).
As per this article, the syntax is
sqoop import -Dfs.s3a.access.key=$ACCESS_KEY -Dfs.s3a.secret.key=$SECRET_KEY --connect $CONN --username $USER --password $PWD --table $TABLENAME --target-dir s3a://example-bucket/target-directory

Here, $CONN is jdbc:mysql://cxln2.c.thelab-240901.internal:3306/sqoopex, which is CloudXLab's MySQL connection string.

This command fails with the error message I posted in my original question.

The weird part is that the exact same command works on another cluster (at my workplace). The only difference between the CloudXLab cluster and the other cluster is the Sqoop version.

Thus, I think updating Sqoop to 1.4.7 would help.
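For what it's worth, the substitution described above can be sketched as a small shell script. The access key, secret key, and bucket name below are placeholders, not real values; only $CONN and the table name come from this thread. The script assembles the command and echoes it rather than running it, so nothing is executed against the cluster:

```shell
#!/bin/sh
# Sketch only: placeholder credentials and bucket name.
ACCESS_KEY="<access key>"      # placeholder
SECRET_KEY="<secret key>"      # placeholder
CONN="jdbc:mysql://cxln2.c.thelab-240901.internal:3306/sqoopex"
TABLENAME="widgets"
TARGET="s3a://example-bucket/target-directory"  # placeholder bucket

# Assemble the import command following the Cloudera article's syntax;
# echo it here instead of invoking sqoop directly.
CMD="sqoop import -Dfs.s3a.access.key=$ACCESS_KEY -Dfs.s3a.secret.key=$SECRET_KEY --connect $CONN --username sqoopuser --table $TABLENAME --target-dir $TARGET"
echo "$CMD"
```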


#7

Hi @sgiri @abhinav
Any update on the Sqoop version change?


#8

Found the resolution here:
https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/bk_cloud-data-access/content/s3-trouble-bad-request.html
The issue was that we need to include an additional parameter, -Dfs.s3a.endpoint=s3.us-east-1.amazonaws.com, in the Sqoop command:
sqoop import -Dfs.s3a.access.key=<accessKey> -Dfs.s3a.secret.key=<secretKey> -Dfs.s3a.endpoint=s3.us-east-1.amazonaws.com --connect jdbc:mysql://cxln2.c.thelab-240901.internal:3306/sqoopex --username <MySQLUserName> --password <MySQLPassword> --table widgets --target-dir s3a://<bucketName>/<folderName> --split-by <MySQLColumnName>

Also, make sure to use the correct endpoint; it varies based on the region where you created the bucket.
Refer to https://docs.aws.amazon.com/general/latest/gr/rande.html and go to the topic "Amazon Simple Storage Service (Amazon S3)" to find your endpoint.
My bucket was in US East (N. Virginia), so per the link above the endpoint is:
-Dfs.s3a.endpoint=s3.us-east-1.amazonaws.com
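As a quick sanity check, the endpoints for the standard AWS regions follow the pattern s3.<region>.amazonaws.com (per the AWS endpoints page linked above; some special partitions such as China regions deviate, so always confirm against that page). The region value below is just the one from my case; substitute your bucket's region:

```shell
# Assemble the -Dfs.s3a.endpoint flag for a standard AWS region.
# REGION is a placeholder: substitute the region where your bucket lives.
REGION="us-east-1"
ENDPOINT="s3.${REGION}.amazonaws.com"
echo "-Dfs.s3a.endpoint=${ENDPOINT}"
```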