Hortonworks Oozie Spark action - NullPointerException

z18hc3ub posted on 2021-06-02 in Hadoop

I am running HDP 2.5.3 with Oozie 4.2.0. The Spark action is set up to run in client mode. The Spark job fetches data from a Hive table, processes it, and stores the result in HDFS. But when I try to submit the Spark application through the Spark action, I get a NullPointerException.
workflow.xml

<workflow-app xmlns="uri:oozie:workflow:0.5" name="Spark_Test">
   <global>
      <job-tracker>${job_tracker}</job-tracker>
      <name-node>${name_node}</name-node>
   </global>
   <credentials>
      <credential name="hiveCredentials" type="hive2">
         <property>
            <name>hive2.jdbc.url</name>
            <value>${hive_beeline_server}</value>
         </property>
         <property>
            <name>hive2.server.principal</name>
            <value>${hive_kerberos_principal}</value>
         </property>
      </credential>
   </credentials>
   <start to="SparkTest" />
   <action name="SparkTest" cred="hiveCredentials">
      <spark xmlns="uri:oozie:spark-action:0.1">
         <job-tracker>${job_tracker}</job-tracker>
         <name-node>${name_node}</name-node>
         <master>yarn-client</master>
         <name>Spark Hive Example</name>
         <class>com.fbr.genjson.exec.GenExecJson</class>
         <jar>${jarPath}/fedebomrpt_genjson.jar</jar>
         <spark-opts>--jars /usr/hdp/current/spark-client/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/current/spark-client/lib/datanucleus-rdbms-3.2.9.jar,/usr/hdp/current/spark-client/lib/datanucleus-core-3.2.10.jar --files /etc/hive/conf/hive-site.xml --conf spark.sql.hive.convertMetastoreOrc=false --driver-memory 2g --executor-memory 16g --executor-cores 4 --conf spark.ui.port=5051 --queue fbr</spark-opts>
         <arg>${arg1}</arg>
         <arg>${arg2}</arg>
      </spark>
      <ok to="end" />
      <error to="fail" />
   </action>
   <kill name="fail">
      <message>Spark Java PatentCitation failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
   </kill>
   <end name="end" />
</workflow-app>
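
For reference, a minimal sketch of what the driver class com.fbr.genjson.exec.GenExecJson does (assuming Spark 1.6 on HDP 2.5.3 with a HiveContext; the database, table, filter column and output handling below are placeholders, not the real code):

package com.fbr.genjson.exec

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object GenExecJson {
  def main(args: Array[String]): Unit = {
    // arg1 / arg2 come from the <arg> elements of the Oozie action
    val Array(arg1, arg2) = args

    val sc = new SparkContext(new SparkConf().setAppName("Spark Hive Example"))
    // HiveContext needs hive-site.xml and the datanucleus jars on the classpath,
    // which is what the --files / --jars entries in <spark-opts> provide
    val hc = new HiveContext(sc)

    // placeholder query; the real table and filter are not shown here
    val df = hc.sql(s"SELECT * FROM some_db.some_table WHERE load_id = '$arg1'")

    // write the result to HDFS as JSON; arg2 is assumed to be the output path
    df.toJSON.saveAsTextFile(arg2)

    sc.stop()
  }
}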

Exception:

SERVER[xxx.hpc.xx.com] USER[prxtcbrd] GROUP[-] TOKEN[] APP[Spark_Test] JOB[0004629-170625082345353-oozie-oozi-W] ACTION[0004629-170625082345353-oozie-oozi-W@SparkTest] Error starting action [SparkTest]. ErrorType [ERROR], ErrorCode [NullPointerException], Message [NullPointerException: null]
org.apache.oozie.action.ActionExecutorException: NullPointerException: null
    at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:446)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1202)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1373)
    at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:232)
    at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63)
    at org.apache.oozie.command.XCommand.call(XCommand.java:287)
    at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:331)
    at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:260)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:178)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.oozie.action.hadoop.SparkActionExecutor.setupActionConf(SparkActionExecutor.java:85)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1091)
    ... 11 more

I can't figure out where I'm going wrong. Do I need to add anything other than hive-site.xml?
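
For reference, the workflow is parameterized through a job.properties roughly along these lines (a sketch only; the host names, Kerberos principal and paths are placeholders, not the real values):

# placeholders for the parameters referenced in workflow.xml
job_tracker=<resourcemanager-host>:8050
name_node=hdfs://<namenode-host>:8020
hive_beeline_server=jdbc:hive2://<hiveserver2-host>:10000/default
hive_kerberos_principal=hive/_HOST@<REALM>
jarPath=${name_node}/user/prxtcbrd/apps/spark_test/lib
arg1=<first-application-argument>
arg2=<second-application-argument>
oozie.wf.application.path=${name_node}/user/prxtcbrd/apps/spark_test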

Answer 1, by lnlaulya:

In your example you are shipping the jars and the file (hive-site.xml) yourself. I don't think you need to ship these; Oozie already provides them. Could you check the Spark action below? I think it may solve your problem.

<action name="myfirstsparkjob" cred="hive_credentials">
    <spark xmlns="uri:oozie:spark-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <name>mapred.compress.map.output</name>
                <value>true</value>
            </property>
            <property>
                <name>mapred.job.queue.name</name>
                <value>${queueName}</value>
            </property>
        </configuration>
        <master>yarn</master>
        <mode>cluster</mode>
        <name>Spark Hive Example</name>
        <class>com.fbr.genjson.exec.GenExecJson</class>
        <jar>${jarPath}/fedebomrpt_genjson.jar</jar>
        <spark-opts>--queue queue_name --executor-memory 28G --num-executors 70 --executor-cores 5</spark-opts>
    </spark>
  <ok to="end" />
  <error to="fail" />
</action>

Also set the following Oozie properties in the job.properties used to submit the workflow:

oozie.use.system.libpath=true
oozie.libpath=${jarPath}

and make sure all user-created libs and files are placed under ${jarPath}.
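
A quick way to check the result, assuming the default Oozie server port (the host below is a placeholder): list the Spark share lib that the action will pull in, then resubmit the workflow with the updated job.properties:

oozie admin -oozie http://<oozie-host>:11000/oozie -shareliblist spark
oozie job -oozie http://<oozie-host>:11000/oozie -config job.properties -run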
