Talend BigData Open Studio ETL connection with CloudX Hadoop cluster

Hi,

I’m trying to connect Talend Big Data Open Studio (ETL) to the CloudxLab Hadoop cluster.
I was able to fetch the cluster details through Ambari into the Talend Hadoop connection, BUT the next step fails because Talend cannot reach the nodes listed in the configuration. It seems the CloudxLab Hadoop cluster’s nodes are not accessible from outside. Could anyone please help me with this?

Hi,
You can connect using an SSH client such as PuTTY.
Refer to the link below.

https://cloudxlab.com/faq/28/how-do-i-connect-to-cloudxlab-from-my-local-machine
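For reference, a minimal sketch of the SSH approach. The hostname and username below are placeholders, not real CloudxLab values; substitute the host and login shown in your "My Lab" section (PuTTY users enter the same host in the session dialog). The script only prints the command rather than running it:

```shell
# Placeholder host and user -- substitute the values from "My Lab".
CLOUDXLAB_HOST="f.cloudxlab.com"
CLOUDXLAB_USER="your_login"

# Print the connection command; run it (or configure PuTTY equivalently) to log in.
echo "ssh ${CLOUDXLAB_USER}@${CLOUDXLAB_HOST}"
```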

It’s not the same thing.

@Cloudx Lab, could you please help me understand how to access the Hadoop cluster from a JDBC-driver-supported ETL tool like Talend?

Please use the public IP addresses mentioned in the ‘My Lab’ section.

Hi @Arif_Hussain,

Namenode ports are blocked from outside the CloudxLab environment, so unfortunately there is no way to connect to HDFS from outside the lab environment.

Thanks,
Abhinav

Thanks for the confirmation.

I have another question: is there any way to upload an external XML SerDe JAR for Hive to its lib location?

Hi @Arif_Hussain,

You can add a JAR in the Hive shell with the help of the ADD JAR command.
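For illustration, the session-scoped approach looks like this (the HDFS path is a placeholder; point it at your own SerDe JAR, local or on HDFS):

```sql
-- Placeholder path: replace with the location of your XML SerDe JAR.
ADD JAR hdfs:///user/your_login/jars/hivexmlserde.jar;

-- Verify the JAR was registered for this session:
LIST JARS;
```

Note that the JAR registered this way is visible only for the current session, which is the limitation discussed below.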

ADD JAR loads the JAR only for that particular CLI session. I have a scenario where I join two XML SerDe tables through the ETL tool, and the job cannot find the JAR even when I add it from HDFS with ADD JAR in the ETL, because the job launches MapReduce tasks in a session different from the Hive CLI. And even when I do add the JAR through the ETL, it is attached to a single node, not to the whole cluster. So I think the solution will be to place this JAR in the Hive lib directory and configure it in hive-site.xml. I would love to hear your input on this.
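If you do have permission to edit the cluster configuration (on a shared lab like CloudxLab you likely won’t), a common cluster-wide alternative is Hive’s `hive.aux.jars.path` property. A sketch, with a placeholder path:

```xml
<!-- hive-site.xml: auxiliary JARs made available to every Hive session
     and to the MapReduce tasks it launches -->
<property>
  <name>hive.aux.jars.path</name>
  <!-- Placeholder path: point at your XML SerDe JAR -->
  <value>file:///usr/local/hive/auxlib/hivexmlserde.jar</value>
</property>
```

Many distributions also pick up JARs dropped into `$HIVE_HOME/auxlib` automatically. Either way, the SerDe becomes available to the MapReduce side of the job, unlike a per-session ADD JAR.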