How do I import data from HDFS into Jupyter to work on machine learning?

Hi, I have a question. Could anyone help me, please? My question is: “How do I import my data from HDFS into Jupyter?”
I’m currently working on a machine learning project, so I want to load the data from HDFS into Jupyter.
I tried df = pd.read_csv("/user/myid/diabetes.csv"), but it is not working. Please help me.

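You could use hadoop fs -copyToLocal on the console to copy the data from HDFS to the local Linux filesystem (the same filesystem that is visible in Jupyter). pd.read_csv treats a plain path such as "/user/myid/diabetes.csv" as a local path, so it can only read the file once a local copy exists.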
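
If you would rather run the copy from a notebook cell than from the console, here is a minimal sketch. It assumes the hadoop command is on the PATH of the machine running Jupyter; the function names build_copy_command and copy_from_hdfs are made up for this illustration and are not part of any library.

```python
import subprocess


def build_copy_command(hdfs_path, local_path):
    """Return the argument list for `hadoop fs -copyToLocal`."""
    return ["hadoop", "fs", "-copyToLocal", hdfs_path, local_path]


def copy_from_hdfs(hdfs_path, local_path):
    """Copy a file from HDFS to the local filesystem and return the local path.

    Assumes the `hadoop` CLI is installed and configured for the cluster.
    check=True raises CalledProcessError if the command exits with an error.
    """
    subprocess.run(build_copy_command(hdfs_path, local_path), check=True)
    return local_path
```

After that, the local copy can be read the usual way, for example df = pd.read_csv(copy_from_hdfs("/user/myid/diabetes.csv", "diabetes.csv")).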

Thanks for your reply, sgiri. May I know if there is any other way to load the data, by just specifying the path?
I appreciate your response. Thanks.

Try this:

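from hdfs3 import HDFileSystem
# Connect to HDFS with the default settings
hdfs = HDFileSystem()
# List the directory to confirm that the file is there
hdfs.ls("/user/sandeepgiri9034")

import pandas as pd
# Open the file on HDFS and load its first 1000 rows into a DataFrame
with hdfs.open('/user/sandeepgiri9034/my_movie_ratings.csv') as f:
    df = pd.read_csv(f, nrows=1000)
df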

Hi, this doesn’t seem to be working on the lab:


ImportError: Can not find the shared library: libhdfs3.so
See installation instructions at http://hdfs3.readthedocs.io/en/latest/install.html

How can I fix this, please?
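
One possible workaround that does not need libhdfs3 at all: stream the file through hadoop fs -cat and hand the bytes to pandas. This is only a sketch, assuming the hadoop command is available where the notebook runs and that the file fits in memory; hdfs_cat_command and stream_command_output are hypothetical helper names, not library functions.

```python
import io
import subprocess


def hdfs_cat_command(hdfs_path):
    """Return the argument list for `hadoop fs -cat`, which prints a file to stdout."""
    return ["hadoop", "fs", "-cat", hdfs_path]


def stream_command_output(command):
    """Run a command and return everything it wrote to stdout as a BytesIO buffer.

    The whole output is held in memory, so this suits small to medium files only.
    check=True raises CalledProcessError if the command exits with an error.
    """
    result = subprocess.run(command, check=True, stdout=subprocess.PIPE)
    return io.BytesIO(result.stdout)
```

Usage would look like df = pd.read_csv(stream_command_output(hdfs_cat_command("/user/myid/diabetes.csv"))), since pd.read_csv accepts a file-like object as well as a path.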