PyHive with Kerberos throws an authentication error after a few calls

w8rqjzmb  posted on 2021-06-27  in Hive

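I am trying to connect to Hive from Python (using the pyhive library) to read some data, which I then feed into a Flask app to show on a dashboard.
Everything works fine for the first few calls to Hive, but shortly afterwards I get the following error.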

Traceback (most recent call last):
  File "libs/hive.py", line 63, in <module>
    cur = h.connect().cursor()
  File "libs/hive.py", line 45, in connect
    kerberos_service_name='hive')
  File "/home1/igns/git/emsr/.venv/lib/python2.7/site-packages/pyhive/hive.py", line 94, in connect
    return Connection(*args,**kwargs)
  File "/home1/igns/git/emsr/.venv/lib/python2.7/site-packages/pyhive/hive.py", line 192, in __init__
    self._transport.open()
  File "/home1/igns/git/emsr/.venv/lib/python2.7/site-packages/thrift_sasl/__init__.py", line 79, in open
    message=("Could not start SASL: %s" % self.sasl.getError()))
thrift.transport.TTransport.TTransportException: Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (No Kerberos credentials available (default cache: FILE:/tmp/krb5cc_cdc995595290_51CD7j))

Here is my code:

from pyhive import hive
class Hive(object):
    def connect(self):
        return hive.connect(host='hive.hadoop-prod.abc.com',
                            port=10000,
                            database='temp',
                            username='gaurang.shah',
                            auth='KERBEROS',
                            kerberos_service_name='hive')

if __name__ == '__main__':

    h = Hive()
    cur = h.connect().cursor()
    cur.execute("select * from temp.migration limit 1")
    res = cur.fetchall()
    print(res)

The script that calls it:

source .venv/bin/activate
for i in {1..50}
do
    python get_hive_data.py
    sleep 300
done

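Observation: when it works, klist shows a ticket for the hive service principal. When I run klist after the error above, that ticket is no longer listed.
When it works: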

Ticket cache: FILE:/tmp/krb5cc_cdc995595290_XyMnhu
Default principal: gaurang.shah@ABC.COM

Valid starting       Expires              Service principal
12/04/2018 14:37:28  12/05/2018 00:37:28  krbtgt/ABC.COM@ABC.COM
    renew until 12/05/2018 14:37:24
12/04/2018 14:39:06  12/05/2018 00:37:28  hive/hive_server.ABC.COM@ABC.COM
    renew until 12/05/2018 14:37:24

When it does not work:

Ticket cache: FILE:/tmp/krb5cc_cdc995595290_XyMnhu
Default principal: gaurang.shah@ABC.COM

Valid starting       Expires              Service principal
12/04/2018 14:37:28  12/05/2018 00:37:28  krbtgt/ABC.COM@ABC.COM
    renew until 12/05/2018 14:37:24
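
The comparison between the two klist listings can be automated. The following is only a sketch: `service_principals` is a hypothetical helper, and the regular expression assumes the `MM/DD/YYYY HH:MM:SS` timestamp layout shown in the output above, which may differ on other systems or locales. A missing `hive/...` entry only shows that no service ticket is currently held in that cache; it does not by itself explain why.

```python
import re

# One timestamp as printed in the klist output above, e.g. "12/04/2018 14:37:28".
# Assumption: other klist builds or locales may print dates differently.
_TS = r"\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}"
_TICKET_LINE = re.compile(r"^%s\s+%s\s+(\S+)$" % (_TS, _TS))


def service_principals(klist_output):
    """Return the service principals listed in plain `klist` output."""
    principals = []
    for line in klist_output.splitlines():
        match = _TICKET_LINE.match(line.strip())
        if match:
            # Group 1 is the third column: the service principal.
            principals.append(match.group(1))
    return principals


def has_service_ticket(klist_output, service="hive"):
    """True if any listed principal starts with '<service>/'."""
    prefix = service + "/"
    return any(p.startswith(prefix) for p in service_principals(klist_output))
```

The text passed in could come from `subprocess.check_output(["klist"])`, decoded to a string.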

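Update:
I no longer think the failure happens after a certain number of calls. I think it happens after a certain amount of time (about an hour, I believe). I changed the sleep time to 3600 seconds and started getting the error right after the first call.
This is strange, because the ticket for hive/hive_server.ABC.COM@ABC.COM is valid for 7 days.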
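
One detail in the output above: the error names the default cache `FILE:/tmp/krb5cc_cdc995595290_51CD7j`, while klist reports `FILE:/tmp/krb5cc_cdc995595290_XyMnhu`. It may therefore be worth checking, from inside the same process that opens the connection, which cache is in use and whether it still holds a valid ticket. A minimal sketch, assuming MIT Kerberos `klist` is on the PATH (its `-s` flag prints nothing and exits with status 0 only when the cache is readable and unexpired). `has_valid_ticket` is a hypothetical helper name, not part of pyhive.

```python
import os
import subprocess


def has_valid_ticket(klist_cmd="klist"):
    """Return True if `klist -s` reports a readable, unexpired credential cache."""
    try:
        # `klist -s` is silent and reports the result through its exit status.
        return subprocess.call([klist_cmd, "-s"]) == 0
    except OSError:
        # klist is missing or not executable: treat this as "no usable ticket".
        return False


if __name__ == "__main__":
    # KRB5CCNAME, when set, overrides the default credential cache location.
    print("KRB5CCNAME = %s" % os.environ.get("KRB5CCNAME"))
    print("valid ticket: %s" % has_valid_ticket())
```

Calling this right before `hive.connect(...)` and logging the result would show whether the failing runs see an empty or different cache.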

gudnpqoy  #1

I know this is an old post, but establishing a new connection on every call should resolve the issue.

from pyhive import hive
class Hive(object):
    def connect(self):
        return hive.connect(host='hive.hadoop-prod.abc.com',
                            port=10000,
                            database='temp',
                            username='gaurang.shah',
                            auth='KERBEROS',
                            kerberos_service_name='hive')

if __name__ == '__main__':
    def newConnect(query):
        # Open a fresh connection for every query instead of reusing one.
        h = Hive()
        cur = h.connect().cursor()
        cur.execute(query)
        res = cur.fetchall()
        return res

    myConnectionAndResults = newConnect("select * from temp.migration limit 1")
    print(myConnectionAndResults)
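
The answer's code opens a new connection per call but never closes it. A possible variation, offered only as a sketch, closes both the cursor and the connection once the rows are fetched. `run_query` is a hypothetical helper; `connect` is assumed to be any zero-argument callable returning a DB-API style connection with `cursor()` and `close()`, such as the `Hive().connect` method above. Whether this alone avoids the Kerberos error is not established by this thread.

```python
from contextlib import closing


def run_query(query, connect):
    """Open a fresh connection, run one query, and close everything afterwards.

    `connect` is a zero-argument callable that returns a new connection,
    for example `Hive().connect` from the answer's code.
    """
    # closing() calls .close() on exit, even if execute() or fetchall() raises.
    with closing(connect()) as conn:
        with closing(conn.cursor()) as cur:
            cur.execute(query)
            return cur.fetchall()


# Example use with the Hive class defined in the answer (requires pyhive):
#   rows = run_query("select * from temp.migration limit 1", Hive().connect)
```

Passing the connection factory in as an argument keeps the helper independent of pyhive, so it can be exercised with a stand-in connection when no Hive server is available.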
