Hi, I have a question. Could anyone help me, please? My question is: “How do I import my data from HDFS into Jupyter?”
Currently, I’m working on a machine learning project, so I want to import the data from HDFS into Jupyter.
I tried df = pd.read_csv("/user/myid/diabetes.csv"), but it is not working. Please help me.
You could run hadoop fs -copyToLocal on the console to copy the data from HDFS to the local Linux filesystem (the same files that are visible in Jupyter).
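If you would rather stay inside the notebook, the same copy step can be driven from Python. This is only a rough sketch: it assumes the hadoop command is on the PATH of the Jupyter kernel, and the helper names copy_to_local_cmd and load_csv_from_hdfs are made up for illustration.

```python
import subprocess
from pathlib import Path


def copy_to_local_cmd(hdfs_path, local_dir="."):
    """Build the `hadoop fs -copyToLocal` command as an argument list."""
    return ["hadoop", "fs", "-copyToLocal", hdfs_path, str(local_dir)]


def load_csv_from_hdfs(hdfs_path, local_dir="."):
    """Copy a CSV out of HDFS, then read the local copy with pandas."""
    import pandas as pd  # imported here; only the read step needs it

    # check=True raises CalledProcessError if the copy fails
    # (for example, when the HDFS path does not exist).
    subprocess.run(copy_to_local_cmd(hdfs_path, local_dir), check=True)

    # copyToLocal keeps the file name, so the local copy is local_dir/<name>.
    local_file = Path(local_dir) / Path(hdfs_path).name
    return pd.read_csv(local_file)


# Example call (path taken from the question above):
# df = load_csv_from_hdfs("/user/myid/diabetes.csv")
```

After the copy, plain pd.read_csv on the local file works, because pandas is then reading from the local filesystem rather than from HDFS.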
Thanks for your reply, sgiri. May I know if there is any other way, just by defining the path to load the data?
I appreciate your response. Thanks.
Try this:
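from hdfs3 import HDFileSystem
import pandas as pd

# Connect with default settings (no host or port given)
hdfs = HDFileSystem()
# List the directory to confirm the file is there
hdfs.ls("/user/sandeepgiri9034")
# Open the file on HDFS and read only the first 1000 rows
with hdfs.open('/user/sandeepgiri9034/my_movie_ratings.csv') as f:
    df = pd.read_csv(f, nrows=1000)
df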
Hi, this doesn’t seem to be working on the lab:
ImportError: Can not find the shared library: libhdfs3.so
See installation instructions at http://hdfs3.readthedocs.io/en/latest/install.html
How can I fix it, please?
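One possible workaround that avoids hdfs3 (and so libhdfs3.so) altogether is to stream the file through the hadoop command and hand the bytes to pandas. This is a sketch, not a fix for the library itself: it assumes the hadoop command is available to the notebook, that the file fits in memory, and the helper names cat_cmd and read_csv_via_cat are invented for this example.

```python
import io
import subprocess


def cat_cmd(hdfs_path):
    """Build the `hadoop fs -cat` command as an argument list."""
    return ["hadoop", "fs", "-cat", hdfs_path]


def read_csv_via_cat(hdfs_path, **read_csv_kwargs):
    """Read a CSV from HDFS by capturing the output of `hadoop fs -cat`."""
    import pandas as pd  # imported here; only the parse step needs it

    # Capture the whole file from stdout; check=True raises
    # CalledProcessError if the hadoop command exits with an error.
    result = subprocess.run(cat_cmd(hdfs_path), check=True,
                            stdout=subprocess.PIPE)

    # pandas can parse CSV data from an in-memory bytes buffer.
    return pd.read_csv(io.BytesIO(result.stdout), **read_csv_kwargs)


# Example call (path taken from the reply above):
# df = read_csv_via_cat("/user/sandeepgiri9034/my_movie_ratings.csv", nrows=1000)
```

Note that nrows only limits what pandas parses; the full file is still read from HDFS first, so for very large files copying to local disk may be the better choice.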