Hello,
I have created the following DataFrame:
df = spark.read.csv("file:///home/pratik58892973/olist_sellers_dataset.csv", header="True", sep="|")
While I’m able to fetch data:
df.show(5)
and display the schema:
df.printSchema()
root
|-- seller_id,seller_zip_code_prefix,seller_city,seller_state: string (nullable = true)
But selecting a single column from this DataFrame raises an error:
df.select("seller_city").show()
Error:
AnalysisException: "cannot resolve '`seller_city`' given input columns: [seller_id,seller_zip_code_prefix,seller_city,seller_state];;\n'Project ['seller_city]\n+- Relation[seller_id,seller_zip_code_prefix,seller_city,seller_state#10] csv\n"
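One thing I noticed: both the schema and the "input columns" list in the error show a single string column whose name is the whole header line. That would fit the file being comma-delimited while I read it with sep="|", but I have not confirmed this against the actual file. Below is a minimal sketch of that effect using only Python's standard csv module rather than Spark. The header is copied from the schema output; the data row and the helper name header_columns are made up for illustration.

```python
import csv
import io

# Hypothetical sample: the header is copied from the printSchema() output,
# the data row is invented purely for illustration.
sample = (
    "seller_id,seller_zip_code_prefix,seller_city,seller_state\n"
    "abc123,13023,campinas,SP\n"
)

def header_columns(text, delimiter):
    """Return the column names found on the first line for a given delimiter."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return next(reader)

# Splitting on "|" finds no delimiter, so the whole header is one column name.
pipe_columns = header_columns(sample, "|")
# Splitting on "," yields the four separate column names.
comma_columns = header_columns(sample, ",")

print(len(pipe_columns), pipe_columns)
print(len(comma_columns), comma_columns)
```

If that assumption holds, the matching change on the Spark side would presumably be to pass sep="," (or leave sep out, since a comma is the default) in spark.read.csv.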
Can anyone suggest a solution?