Spark application submission error in standalone mode

9rygscc1 · posted 2021-05-27 in Spark

Environment:

OS: CentOS 7
Java: 1.8  
Spark: 2.4.5  
Hadoop: 2.7.7
Scala: 2.12.11  
Hardware: 3 computers

I built a simple Scala application. My code is:

import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

object wordCount {
  def main(args: Array[String]): Unit = {
    val conf: SparkConf = new SparkConf().setAppName("WordCount")
    val context: SparkContext = new SparkContext(conf)
    // args(0) is the input path, args(1) the output path
    val lines: RDD[String] = context.textFile(args(0))
    val words: RDD[String] = lines.flatMap(_.split(" "))
    val tuples: RDD[(String, Int)] = words.map((_, 1))
    // sum the counts per word, then sort by count in descending order
    val summed: RDD[(String, Int)] = tuples.reduceByKey(_ + _)
    val sorted: RDD[(String, Int)] = summed.sortBy(_._2, false)
    sorted.saveAsTextFile(args(1))
    context.stop()
  }
}

The build.sbt file is:

name := "SparkScalaTest2"

version := "0.1"

scalaVersion := "2.12.11"

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.5"
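
As far as I understand, "%%" tells sbt to append the Scala binary version from scalaVersion to the artifact name, so with scalaVersion := "2.12.11" the dependency above resolves to spark-core_2.12, and the packaged jar therefore expects a Scala 2.12 runtime on the cluster. Spelled out with "%" it would be equivalent to:

// the same dependency with the Scala binary suffix written explicitly
libraryDependencies += "org.apache.spark" % "spark-core_2.12" % "2.4.5"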

My directory layout looks like this:

$ find .
.
./build.sbt
./src
./src/main
./src/main/scala
./src/main/scala/wordCount.scala

Then I packaged the jar with sbt package and submitted the application to Spark with the following command:

spark-submit \
--class wordCount \
--master spark://master:7077 \
--executor-memory 512M \
--total-executor-cores 3 \
/home/spark/IdeaProjects/SparkScalaTest2/target/scala-2.12/sparkscalatest2_2.12-0.1.jar \
hdfs://master:9000/data/text.txt \
hdfs://master:9000/result/wordCount

But I got an error in the bash shell:

20/07/20 10:11:18 INFO spark.SparkContext: Created broadcast 0 from textFile at wordCount.scala:39
Exception in thread "main" java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: scala/runtime/java8/JFunction2$mcIII$sp
        at wordCount$.main(wordCount.scala:45)
        at wordCount.main(wordCount.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.NoClassDefFoundError: scala/runtime/java8/JFunction2$mcIII$sp
        ... 14 more
Caused by: java.lang.ClassNotFoundException: scala.runtime.java8.JFunction2$mcIII$sp
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        ... 14 more

The stderr log page of the executor allocated to the application shows the following error:

20/07/20 10:11:48 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIG

I searched online and found that the Scala version might be causing this problem. But Spark's official website (http://spark.apache.org/docs/2.4.5/index.html) says:
"Spark runs on Java 8, Python 2.7+/3.4+ and R 3.1+. For the Scala API, Spark 2.4.5 uses Scala 2.12. You will need to use a compatible Scala version (2.12.x)."
I don't know why. Anyway, I tried installing scala-2.11.12 as spark-shell suggested and set the environment variables. After submitting the application again, no error showed up in the bash shell, but the stderr log page of the executor allocated to the application shows the following error:

20/07/21 14:13:12 INFO executor.Executor: Finished task 1.0 in stage 4.0 (TID 7). 1459 bytes result sent to driver
20/07/21 14:13:12 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown
20/07/21 14:13:12 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM

Despite the error, the application seems to run successfully, and this time I got the correct output. When I looked at the project structure, I found the directories ./project/target/scala-2.12 and ./target/scala-2.11. Why do these two directories suggest different Scala versions? Is this the cause of my problem? How can I fix it?
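
From what I can tell (though I am not sure this is the whole story), ./project/target is where sbt compiles the build definition itself, and sbt 1.x runs on Scala 2.12 regardless of the project's scalaVersion, while ./target/scala-2.11 holds the project's own classes compiled with whatever scalaVersion was set at the last build. What has to match is the project's scalaVersion and the Scala runtime bundled with the cluster's Spark: scala/runtime/java8/JFunction2$mcIII$sp only exists in the Scala 2.12 library (it backs specialized lambdas such as the _ + _ passed to reduceByKey), so the original error looks like a Scala 2.12-compiled jar running against a Scala 2.11 Spark build. A minimal build.sbt sketch pinned to Scala 2.11, assuming the cluster's Spark 2.4.5 distribution is one of the default Scala 2.11 builds:

// build.sbt: a sketch assuming the cluster's Spark 2.4.5 was built against Scala 2.11
name := "SparkScalaTest2"

version := "0.1"

// match the Scala binary version bundled with the cluster's Spark runtime
scalaVersion := "2.11.12"

// "%%" now resolves to spark-core_2.11
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.5"

Running println(scala.util.Properties.versionNumberString) inside spark-shell on the cluster prints the Scala version the installed Spark actually ships with, which should confirm which scalaVersion to pin.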

No answers yet!
