I am running my sample Spark + Kafka project with Docker on a Windows machine, and I am hitting this error:
Failed to find data source: kafka. Please deploy the application as per the deployment section of "Structured Streaming + Kafka Integration Guide".;
build.sbt
lazy val root = (project in file(".")).
  settings(
    inThisBuild(List(
      version := "0.1.0",
      scalaVersion := "2.12.2",
      assemblyJarName in assembly := "sparktest.jar"
    )),
    name := "sparktest",
    libraryDependencies ++= List(
      "org.apache.spark" %% "spark-core" % "2.4.0",
      "org.apache.spark" %% "spark-sql" % "2.4.0",
      "org.apache.spark" %% "spark-streaming" % "2.4.0",
      "org.apache.spark" %% "spark-sql-kafka-0-10" % "2.4.0" % "provided",
      "org.apache.kafka" %% "kafka" % "2.1.0",
      "org.scalatest" %% "scalatest" % "3.0.5",
      "com.typesafe.scala-logging" %% "scala-logging" % "3.9.0"
    ),
    dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-databind" % "2.9.8",
    dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-core" % "2.9.8",
    dependencyOverrides += "com.fasterxml.jackson.module" % "jackson-module-scala_2.12" % "2.9.8"
  )

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x                             => MergeStrategy.first
}
Code
val inputStreamDF = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "kafka:9092")
  .option("subscribe", "test1")
  .option("startingOffsets", "earliest")
  .load()
Has anyone run into a similar problem, and how did you solve it?
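One detail worth noting: in the build.sbt above, `spark-sql-kafka-0-10` is marked `"provided"`, so the assembled `sparktest.jar` will not contain the Kafka data source, which matches this error. As a sketch (not a verified fix for this exact setup), the connector can instead be supplied at submit time; `<your-main-class>` below is a placeholder for the application's actual entry point:

```shell
# Sketch: pull the Kafka connector at submit time with --packages.
# The artifact coordinates match the Scala (2.12) and Spark (2.4.0)
# versions declared in build.sbt above.
spark-submit \
  --packages org.apache.spark:spark-sql-kafka-0-10_2.12:2.4.0 \
  --class <your-main-class> \
  sparktest.jar
```

Alternatively, dropping the `% "provided"` qualifier should bundle the connector into the assembly jar itself.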