Writing a DataFrame to Phoenix

dohp0rv5 · published 2021-05-29 in Hadoop

I am trying to write a DataFrame to a Phoenix table, but I am hitting an exception.
Here is my code:

df.write
  .format("org.apache.phoenix.spark")
  .mode(SaveMode.Overwrite)
  .options(collection.immutable.Map(
    "zkUrl" -> "localhost:2181/hbase-unsecure",
    "table" -> "TEST"))
  .save()

The exception is:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 3.0 failed 4 times, most recent failure: 
Lost task 0.3 in stage 3.0 (TID 411, ip-xxxxx-xx-xxx.ap-southeast-1.compute.internal):
java.lang.RuntimeException: java.sql.SQLException: No suitable driver found for jdbc:phoenix:localhost:2181:/hbase-unsecure;
            at org.apache.phoenix.mapreduce.PhoenixOutputFormat.getRecordWriter(PhoenixOutputFormat.java:58)
            at org.apache.spark.rdd.PairRDDFunctions$anonfun$saveAsNewAPIHadoopDataset$1$anonfun$12.apply(PairRDDFunctions.scala:1030)
            at org.apache.spark.rdd.PairRDDFunctions$anonfun$saveAsNewAPIHadoopDataset$1$anonfun$12.apply(PairRDDFunctions.scala:1014)
            at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
            at org.apache.spark.scheduler.Task.run(Task.scala:88)
            at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

I have added the phoenix-spark and phoenix-core jars to my pom.xml.
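For reference, the Maven dependencies might look like this in pom.xml (the version number is an assumption; match it to the Phoenix and HBase versions on your cluster):

```xml
<!-- Phoenix Spark integration (version is an assumption; use your cluster's) -->
<dependency>
  <groupId>org.apache.phoenix</groupId>
  <artifactId>phoenix-spark</artifactId>
  <version>4.7.0-HBase-1.1</version>
</dependency>
<!-- Phoenix core, including the JDBC driver classes -->
<dependency>
  <groupId>org.apache.phoenix</groupId>
  <artifactId>phoenix-core</artifactId>
  <version>4.7.0-HBase-1.1</version>
</dependency>
```

Note that having these on the compile classpath is not the same as having them on the Spark executors' runtime classpath, which is what the "No suitable driver found" error points to.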


rlcwz9us1#

According to the Phoenix Spark plugin documentation, if you have not done so already, you probably want to set both spark.executor.extraClassPath and spark.driver.extraClassPath in SPARK_HOME/conf/spark-defaults.conf to include phoenix-<version>-client.jar.
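A minimal sketch of the relevant spark-defaults.conf entries (the jar path is an assumption; point it at the phoenix-<version>-client.jar that matches your install):

```
# Put the Phoenix client jar (which bundles the JDBC driver) on both classpaths
spark.driver.extraClassPath    /path/to/phoenix-client.jar
spark.executor.extraClassPath  /path/to/phoenix-client.jar
```

Equivalently, you can pass the same two settings per job with --conf on spark-submit instead of editing spark-defaults.conf. The key point is that the driver class must be visible to the executors at runtime, not just declared as a Maven dependency.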
