Unable to create database using spark.sql

spark.sql("create database boss_retail_db")

pyspark.sql.utils.AnalysisException: u'org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Unable to create database path file:/home/bublyboss6247/spark-warehouse/boss_retail_db.db, failed to create database boss_retail_db);'

Can you share the entire code?

Hi @sandeepgiri, I am getting the following error:

AnalysisException: 'org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.security.AccessControlException: Permission denied: user=isusmitabiswas3880, access=WRITE, inode="/":hdfs:hdfs:drwxr-xr-x
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:219)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1955)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1939)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPathAccess(FSDirectory.java:1913)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAccess(FSNamesystem.java:8750)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.checkAccess(NameNodeRpcServer.java:2089)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.checkAccess(ClientNamenodeProtocolServerSideTranslatorPB.java:1466)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2345)
);'

while trying to execute the code below:
import findspark
findspark.init("/usr/spark2.4.3")

import os
import sys

# Point the Spark-related environment variables at the local installation
os.environ["SPARK_HOME"] = "/usr/spark2.4.3"
os.environ["PYLIB"] = os.environ["SPARK_HOME"] + "/python/lib"
os.environ["PYSPARK_PYTHON"] = "/usr/local/anaconda/bin/python"
os.environ["PYSPARK_DRIVER_PYTHON"] = "/usr/local/anaconda/bin/python"

# Make the bundled Py4J and PySpark libraries importable
sys.path.insert(0, os.environ["PYLIB"] + "/py4j-0.10.7-src.zip")
sys.path.insert(0, os.environ["PYLIB"] + "/pyspark.zip")

from os.path import abspath
from pyspark.sql import SparkSession

# Use a local directory as the Hive warehouse location
warehouse_loc = abspath("spark-warehouse")

spark = SparkSession.builder \
    .appName("Spark SQL basic example") \
    .config("spark.sql.warehouse.dir", warehouse_loc) \
    .enableHiveSupport() \
    .getOrCreate()

spark.sql("create database if not exists employee_db")

Please reply as soon as possible.

Hey Susmita,

The error you are facing is a permission error: you do not have write permission on the root directory of HDFS. Also, with Hive support enabled, Spark SQL resolves the warehouse directory as an HDFS path rather than a local path. So you can modify the variable warehouse_loc as:

warehouse_loc = "/user/<<your_username>>/spark-warehouse"

Make sure to create the spark-warehouse directory in your HDFS home directory beforehand.
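
For reference, here is a minimal sketch of the complete fix, assuming your HDFS home directory is /user/<<your_username>> (replace the placeholder with your actual username):

# Create the warehouse directory in your HDFS home directory first,
# e.g. from a terminal: hdfs dfs -mkdir -p /user/<<your_username>>/spark-warehouse

from pyspark.sql import SparkSession

# Hive warehouse location on HDFS rather than the local filesystem
warehouse_loc = "/user/<<your_username>>/spark-warehouse"

spark = SparkSession.builder \
    .appName("Spark SQL basic example") \
    .config("spark.sql.warehouse.dir", warehouse_loc) \
    .enableHiveSupport() \
    .getOrCreate()

# This should now succeed, since the warehouse lives under a path you can write to
spark.sql("create database if not exists employee_db")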