Unable to access hive from pytest

Hi,

We are unable to access hive while running pytest scripts.
Here are a few details about the script:
from pyhive import hive
conn = hive.Connection(host="YOUR_HIVE_HOST", port=PORT, username="YOU")
Error info:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rdbi/.local/lib/python2.7/site-packages/pyhive/hive.py", line 15, in <module>
    from TCLIService import TCLIService
  File "/home/rdbi/.local/lib/python2.7/site-packages/TCLIService/TCLIService.py", line 9, in <module>
    from thrift.Thrift import TType, TMessageType, TFrozenDict, TException, TApplicationException
ImportError: cannot import name TFrozenDict

Also, we are unable to install two more libraries that are needed:
pip install sasl
pip install thrift-sasl

Please help me to resolve the issue.

Thanks

Hi,

May I know in which exercise or topic you are trying to run the pytest scripts?
conn = hive.Connection(host="YOUR_HIVE_HOST", port=PORT, username="YOU")
In this connection call you need to pass your actual hostname, port, and username, not just the placeholder strings.
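For example, something like the following (the hostname and username below are only illustrative placeholders; use the values for your own Hive server, and note that 10000 is only the usual HiveServer2 default port):

from pyhive import hive

# Placeholder values -- replace with your own Hive host, port and username
conn = hive.Connection(host="hive-server.example.com", port=10000, username="your_user")
cursor = conn.cursor()
cursor.execute("SHOW TABLES")
print(cursor.fetchall())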

All the best!

Hi,

We want to access Hive tables through Python, so we started Python in the web console and just typed

from pyhive import hive

but it gives this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rdbi/.local/lib/python2.7/site-packages/pyhive/hive.py", line 15, in <module>
    from TCLIService import TCLIService
  File "/home/rdbi/.local/lib/python2.7/site-packages/TCLIService/TCLIService.py", line 9, in <module>
    from thrift.Thrift import TType, TMessageType, TFrozenDict, TException, TApplicationException
ImportError: cannot import name TFrozenDict

While doing research we found that some more libraries need to be installed:
pip install sasl
pip install thrift-sasl

We are referring to this link.

And for the pytest code, we wrote it in PySpark to access Hive:

from pyspark.sql import SparkSession

def test_schema_validation():
    sparkSession = SparkSession \
        .builder \
        .master("local") \
        .appName("read data from hive table") \
        .enableHiveSupport() \
        .getOrCreate()

    schemaDF = sparkSession.sql("select * from dataops.us_covid")
    schemaResult = {}
    for name, dtype in schemaDF.dtypes:
        schemaResult[name] = dtype
    schema_expected = {'date': 'string', 'county': 'string', 'state': 'string', 'fips': 'string', 'cases': 'string', 'deaths': 'string'}
    assert schemaResult == schema_expected

But it gives an error like this:

======================================================================================= FAILURES =======================================================================================
________________________________________________________________________________ test_schema_validation ________________________________________________________________________________
def test_schema_validation():
sparkSession = SparkSession
.builder
.master("local")
.appName("read data from hive table")
.enableHiveSupport()
.getOrCreate()

  schemaDF = sparkSession.sql("select * from dataops.us_covid")

test_schma.py:11:


…/.local/lib/python2.7/site-packages/pyspark/sql/session.py:646: in sql
return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
…/.local/lib/python2.7/site-packages/py4j/java_gateway.py:1305: in __call__
answer, self.gateway_client, self.target_id, self.name)
…/.local/lib/python2.7/site-packages/pyspark/sql/utils.py:137: in deco
raise_from(converted)


e = AnalysisException()
def raise_from(e):

  raise e

E AnalysisException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwxr-xr-x;

Thanks

Hi,

We have already installed PyHive, but when we try to import it, it shows an error.
Please find the attachment.

Hi.

You need to upgrade the pip version from 20.0.2 to 20.1.1 with:
pip install -U pip
However, you do not have permissions to upgrade packages in this environment.
Kindly try it on your local machine.

All the best!

Hi,

We are using the Cloudx environment to demo to a customer, and it is important for us to execute pytest there. Please let us know if there is a way to do the upgrade in our subscribed Cloudx environment.

Please install the Python packages in a virtual environment.

This blog will help you with the same.
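As a rough sketch of that approach (the environment name is just an example, and since the environment runs Python 2.7 we assume the virtualenv tool rather than the built-in venv module):

pip install --user virtualenv        # install virtualenv into your user site-packages
virtualenv ~/hive_env                # create a virtual environment you own
source ~/hive_env/bin/activate       # activate it
pip install -U pip                   # pip can now be upgraded inside the environment
pip install sasl thrift-sasl pyhive  # install the Hive client libraries

Note that the sasl package compiles native code, so it may still need the system SASL development headers to be available.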