[ImpalaJDBCDriver](500150) Error setting/closing connection: Fetch Error

Asked by oxf4rvwz on 2021-06-26 · Impala

I have a Spark (v1.6.0) application that uses SQLContext to connect to Impala and create a DataFrame:

DataFrame dataframe = sqlContext.read().format("jdbc").options(ImpalaConnection.getDatabaseConnection(dbtable)).load();

The getDatabaseConnection method returns a Map with the following values (using the Impala JDBC4 driver, v2.5.35):

options.put("url", "jdbc:impala://xxxx:21050;AuthMech=1;KrbRealm=CLOUDERA;KrbHostFQDN=host;KrbServiceName=impala");
options.put("driver", "com.cloudera.impala.jdbc4.Driver");
options.put("dbtable", "db.table");
options.put("partitionColumn", "rowId");
options.put("numPartitions", "6");
options.put("upperBound", "10000000");
options.put("lowerBound", "1");
options.put("fetchSize", "50000");

return options;
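For context on what these four partitioning options do: Spark turns partitionColumn/lowerBound/upperBound/numPartitions into one WHERE clause per partition, so each Spark partition runs its own JDBC query against Impala (six concurrent queries here). A minimal standalone sketch of that clause-generation arithmetic, modeled on Spark 1.6's JDBCRelation.columnPartition (class and method names here are hypothetical, not Spark API):

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionClauses {
    // Sketch of how Spark 1.6 derives one WHERE clause per JDBC partition
    // from (partitionColumn, lowerBound, upperBound, numPartitions).
    static List<String> columnPartition(String column, long lower, long upper, int numPartitions) {
        // Integer division, as in Spark 1.6; the stride is the width of each range.
        long stride = upper / numPartitions - lower / numPartitions;
        List<String> clauses = new ArrayList<>();
        long current = lower;
        for (int i = 0; i < numPartitions; i++) {
            // First partition has no lower bound; last has no upper bound,
            // so rows outside [lowerBound, upperBound] are still read.
            String lBound = (i != 0) ? column + " >= " + current : null;
            current += stride;
            String uBound = (i != numPartitions - 1) ? column + " < " + current : null;
            String where;
            if (uBound == null) {
                where = lBound;
            } else if (lBound == null) {
                // NULL partition-column rows are routed to the first partition.
                where = uBound + " or " + column + " is null";
            } else {
                where = lBound + " AND " + uBound;
            }
            clauses.add(where);
        }
        return clauses;
    }

    public static void main(String[] args) {
        // With the values from the options map above: rowId, 1, 10000000, 6.
        for (String clause : columnPartition("rowId", 1L, 10_000_000L, 6)) {
            System.out.println(clause);
        }
    }
}
```

Each of the six clauses becomes a separate long-running Impala result set held open by one executor task, which is where a per-connection "Fetch Error" like the one below surfaces.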

When I perform an action on this DataFrame (dataframe -> JavaRDD -> map), I get the following error:

java.sql.SQLException: [Simba][ImpalaJDBCDriver](500150) Error setting/closing connection: Fetch Error.
at com.cloudera.impala.hivecommon.api.HS2Client.checkFetchErrors(HS2Client.java:176)
at com.cloudera.impala.hivecommon.dataengine.BackgroundFetcher.getNextBuffer(BackgroundFetcher.java:201)
at com.cloudera.impala.hivecommon.dataengine.HiveJDBCResultSet.moveToNextRow(HiveJDBCResultSet.java:391)
at com.cloudera.impala.jdbc.common.SForwardResultSet.next(SForwardResultSet.java:2910)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.getNext(JDBCRDD.scala:369)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.hasNext(JDBCRDD.scala:498)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$6.apply$mcV$sp(PairRDDFunctions.scala:1197)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$6.apply(PairRDDFunctions.scala:1197)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$6.apply(PairRDDFunctions.scala:1197)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1251)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1205)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1185)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

Any suggestions? Thanks.
