Parquet table with a bigint column in Hive

Posted by 3pmvbmvn on 2021-06-24 in Hive

When I try to read a Parquet table that has a bigint column using PySpark, I get an error. Any suggestions?

df = spark.table("db.table").filter("partitiondate='2020-10-22'")
df.count()   # this works
df.show(1)   # gives an error

[Stage 5:=============>(1+1)/4]
20/10/23 14:14:04 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 5.0 (TID 275, anp-r12wn03.c03.hadoop.td.com, executor 13): java.lang.UnsupportedOperationException: parquet.column.values.dictionary.PlainValuesDictionary$PlainBinaryDictionary
	at parquet.column.Dictionary.decodeToLong(Dictionary.java:52)
	at org.apache.spark.sql.execution.vectorized.OnHeapColumnVector.getLong(OnHeapColumnVector.java:295)
	at org.apache.spark.sql.execution.vectorized.ColumnarBatch$Row.getLong(ColumnarBatch.java:191)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply6$(Unknown Source)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:235)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:228)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:835)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:835)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:109)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:380)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[Stage 5:============================>(2+2)/4]20/1
