Connecting to Hive from Python 3.6 on Windows 10
Environment
Windows 10
Python 3.6
Hive 3.1.0
Hive is deployed on a virtual machine
Python dependencies
- thrift
- thriftpy
- thrift_sasl
- pure_sasl
- impyla
- bitarray
If any of these packages cannot be installed with pip install, download the matching .whl file from https://www.lfd.uci.edu/~gohlke/pythonlibs/ and install it with pip install XXX.whl
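After installing, it can help to confirm that every package is actually importable from the interpreter you plan to use. The sketch below assumes the usual import names (impyla is imported as impala, pure_sasl as puresasl); the helper name find_missing is made up for this example.

```python
from importlib.util import find_spec

# Import names assumed for the packages listed above
# (impyla is imported as "impala", pure_sasl as "puresasl").
REQUIRED_MODULES = ["thrift", "thriftpy", "thrift_sasl", "puresasl", "impala", "bitarray"]


def find_missing(modules):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    print("missing:", missing if missing else "none")
```

Any name printed as missing still needs to be installed, either with pip or from a downloaded wheel.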
Start the services
- Start the Hadoop cluster: start-all.sh
- Start the Hive metastore and HiveServer2 in the background
hive --service metastore &
hive --service hiveserver2 &
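HiveServer2 can take a little while to start listening after the command returns. A minimal sketch for checking from the Windows side that the port is reachable, using only the standard library; the function name port_open is hypothetical, and an open port only shows that something is listening, not that Hive is fully ready.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For the setup in this post the call would be port_open('192.168.33.101', 10000); adjust the host and port to your own virtual machine.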
Connection test
from impala.dbapi import connect
conn = connect(host='192.168.33.101', port=10000, database='***', user='***', password='', auth_mechanism='NOSASL')
cur = conn.cursor()
The auth_mechanism argument must correspond to the value of the hive.server2.authentication property in hive-site.xml
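The correspondence can also be looked up programmatically. The sketch below reads the property from the text of hive-site.xml and maps it to an impyla auth_mechanism value. The mapping table is an assumption based on impyla's documented options (NOSASL, PLAIN, LDAP, GSSAPI) and should be checked against your impyla version; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from hive.server2.authentication values to impyla's
# auth_mechanism argument; verify against your impyla version.
HIVE_TO_IMPYLA = {
    "NOSASL": "NOSASL",
    "NONE": "PLAIN",
    "LDAP": "LDAP",
    "KERBEROS": "GSSAPI",
}


def auth_mechanism_from_hive_site(xml_text):
    """Read hive.server2.authentication from hive-site.xml text and map it."""
    root = ET.fromstring(xml_text)
    value = "NONE"  # Hive's default when the property is not set
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "hive.server2.authentication":
            value = prop.findtext("value", "").strip().upper()
    try:
        return HIVE_TO_IMPYLA[value]
    except KeyError:
        raise ValueError("no known impyla mechanism for %r" % value)
```

The returned string can then be passed as auth_mechanism to connect. Values such as PAM or CUSTOM are deliberately left out of the table and raise an error, since their client-side counterpart depends on the deployment.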
Errors
1
ThriftParserError: ThriftPy does not support generating module with path in protocol 'd'
Edit the file site-packages\thriftpy\parser\parser.py
Before:
if url_scheme == '':
    with open(path) as fh:
        data = fh.read()
elif url_scheme in ('http', 'https'):
    data = urlopen(path).read()
After:
if url_scheme == '':
    with open(path) as fh:
        data = fh.read()
elif url_scheme in ('c', 'd', 'e', 'f'):
    # Windows drive letters are parsed as URL schemes; read them as local files
    with open(path) as fh:
        data = fh.read()
elif url_scheme in ('http', 'https'):
    data = urlopen(path).read()
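The reason for this error is that a Windows path such as D:\... looks like a URL with scheme d once it goes through URL parsing, which is presumably how the parser obtains url_scheme. The snippet below reproduces that behaviour with the standard library and sketches a more general check that accepts any drive letter rather than a fixed list. The path and the helper name is_local_path are made up for illustration; this is not thriftpy's own code.

```python
from urllib.parse import urlparse


def is_local_path(path):
    """Treat an empty or single-letter scheme as a local file path."""
    scheme = urlparse(path).scheme
    # A one-character "scheme" is almost certainly a Windows drive letter.
    return len(scheme) <= 1


# Hypothetical path, used only to show how the drive letter is parsed.
print(urlparse(r"D:\Coder\demo.thrift").scheme)          # d
print(is_local_path(r"D:\Coder\demo.thrift"))            # True
print(is_local_path("http://example.com/demo.thrift"))   # False
```

A check along these lines would avoid having to extend the tuple for every drive letter, at the cost of treating any one-letter scheme as a local file.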
2
Traceback (most recent call last):
  File "", line 1, in
  File "D:\Coder\anaconda\lib\site-packages\impala\hiveserver2.py", line 125, in cursor
    session = self.service.open_session(user, configuration)
  File "D:\Coder\anaconda\lib\site-packages\impala\hiveserver2.py", line 995, in open_session
    resp = self._rpc('OpenSession', req)
  File "D:\Coder\anaconda\lib\site-packages\impala\hiveserver2.py", line 925, in _rpc
    err_if_rpc_not_ok(response)
  File "D:\Coder\anaconda\lib\site-packages\impala\hiveserver2.py", line 704, in err_if_rpc_not_ok
    raise HiveServer2Error(resp.status.errorMessage)
impala.error.HiveServer2Error: Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: root is not allowed to impersonate JoJo
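Following an answer on Stack Overflow, setting hive.server2.enable.doAs to false in core-site.xml resolved the problem. With impersonation disabled, HiveServer2 runs queries as its own user (root here) instead of trying to act as the connecting user JoJo. Note that this property is normally defined in hive-site.xml, and HiveServer2 has to be restarted for the change to take effect.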