Providing an sbt library dependency while avoiding the download of a third-party library

wnvonmuf · published 2021-05-27 in Spark

I have the following Spark Scala code that references a third-party library:

package com.protegrity.spark

import org.apache.spark.sql.api.java.UDF2
import com.protegrity.spark.udf.ptyProtectStr
import com.protegrity.spark.udf.ptyProtectInt

class ptyProtectStr extends UDF2[String, String, String] {

  def call(input: String, dataElement: String): String = {
    return ptyProtectStr(input, dataElement);
  }
}

class ptyUnprotectStr extends UDF2[String, String, String] {

  def call(input: String, dataElement: String): String = {
    return ptyUnprotectStr(input, dataElement);
  }
}

class ptyProtectInt extends UDF2[Integer, String, Integer] {

  def call(input: Integer, dataElement: String): Integer = {
    return ptyProtectInt(input, dataElement);
  }
}

class ptyUnprotectInt extends UDF2[Integer, String, Integer] {

  def call(input: Integer, dataElement: String): Integer = {
    return ptyUnprotectInt(input, dataElement);
  }
}

I want to build a jar file with sbt. My build.sbt looks like this:

name := "Protegrity UDF"

version := "1.0"

scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
    "com.protegrity.spark" % "udf" % "2.3.2" % "provided",
    "org.apache.spark" %% "spark-core" % "2.3.2" % "provided",
    "org.apache.spark" %% "spark-sql" % "2.3.2" % "provided"
)

As you can see, I am trying to build a thin jar by marking these dependencies as "provided", since my Spark environment already includes those libraries.
Despite the "provided" scope, sbt still tries to download them from Maven and throws the error below:

[warn]  Note: Unresolved dependencies path:
[error] sbt.librarymanagement.ResolveException: Error downloading com.protegrity.spark:udf:2.3.2
[error]   Not found
[error]   Not found
[error]   not found: C:\Users\user1\.ivy2\local\com.protegrity.spark\udf\2.3.2\ivys\ivy.xml
[error]   not found: https://repo1.maven.org/maven2/com/protegrity/spark/udf/2.3.2/udf-2.3.2.pom
[error]         at lmcoursier.CoursierDependencyResolution.unresolvedWarningOrThrow(CoursierDependencyResolution.scala:249)
[error]         at lmcoursier.CoursierDependencyResolution.$anonfun$update$35(CoursierDependencyResolution.scala:218)
[error]         at scala.util.Either$LeftProjection.map(Either.scala:573)
[error]         at lmcoursier.CoursierDependencyResolution.update(CoursierDependencyResolution.scala:218)
[error]         at sbt.librarymanagement.DependencyResolution.update(DependencyResolution.scala:60)
[error]         at sbt.internal.LibraryManagement$.resolve$1(LibraryManagement.scala:52)
[error]         at sbt.internal.LibraryManagement$.$anonfun$cachedUpdate$12(LibraryManagement.scala:102)
[error]         at sbt.util.Tracked$.$anonfun$lastOutput$1(Tracked.scala:69)
[error]         at sbt.internal.LibraryManagement$.$anonfun$cachedUpdate$20(LibraryManagement.scala:115)
[error]         at scala.util.control.Exception$Catch.apply(Exception.scala:228)
[error]         at sbt.internal.LibraryManagement$.$anonfun$cachedUpdate$11(LibraryManagement.scala:115)
[error]         at sbt.internal.LibraryManagement$.$anonfun$cachedUpdate$11$adapted(LibraryManagement.scala:96)
[error]         at sbt.util.Tracked$.$anonfun$inputChanged$1(Tracked.scala:150)
[error]         at sbt.internal.LibraryManagement$.cachedUpdate(LibraryManagement.scala:129)
[error]         at sbt.Classpaths$.$anonfun$updateTask0$5(Defaults.scala:2950)
[error]         at scala.Function1.$anonfun$compose$1(Function1.scala:49)
[error]         at sbt.internal.util.$tilde$greater.$anonfun$$u2219$1(TypeFunctions.scala:62)
[error]         at sbt.std.Transform$$anon$4.work(Transform.scala:67)
[error]         at sbt.Execute.$anonfun$submit$2(Execute.scala:281)
[error]         at sbt.internal.util.ErrorHandling$.wideConvert(ErrorHandling.scala:19)
[error]         at sbt.Execute.work(Execute.scala:290)
[error]         at sbt.Execute.$anonfun$submit$1(Execute.scala:281)
[error]         at sbt.ConcurrentRestrictions$$anon$4.$anonfun$submitValid$1(ConcurrentRestrictions.scala:178)
[error]         at sbt.CompletionService$$anon$2.call(CompletionService.scala:37)
[error]         at java.util.concurrent.FutureTask.run(Unknown Source)
[error]         at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
[error]         at java.util.concurrent.FutureTask.run(Unknown Source)
[error]         at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
[error]         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
[error]         at java.lang.Thread.run(Unknown Source)
[error] (update) sbt.librarymanagement.ResolveException: Error downloading com.protegrity.spark:udf:2.3.2
[error]   Not found
[error]   Not found
[error]   not found: C:\Users\user1\.ivy2\local\com.protegrity.spark\udf\2.3.2\ivys\ivy.xml
[error]   not found: https://repo1.maven.org/maven2/com/protegrity/spark/udf/2.3.2/udf-2.3.2.pom

What change should I make in build.sbt to skip the Maven download for "com.protegrity.spark"? Interestingly, I don't face this issue for the same version of "org.apache.spark".

kd3sttzy1#

Assuming you have the jar file available wherever you compile the code (just not from Maven or another artifact repository), simply drop the jar into your project's `lib` directory (the default location; it can be changed with the `unmanagedBase` setting in build.sbt if you need to for some reason).
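
A minimal sketch of what the resulting build.sbt could look like under that approach: the `com.protegrity.spark` entry is removed from `libraryDependencies`, and the Protegrity jar is assumed to sit in the project's `lib` directory (the directory name `custom-lib` in the commented-out line is purely illustrative):

```scala
name := "Protegrity UDF"

version := "1.0"

scalaVersion := "2.11.8"

// The Protegrity jar lives in <project root>/lib and is picked up by sbt as
// an unmanaged dependency, so it needs no entry in libraryDependencies and
// is never resolved against Maven.
// If the jar must live elsewhere, repoint the unmanaged directory, e.g.:
// unmanagedBase := baseDirectory.value / "custom-lib"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.3.2" % "provided",
  "org.apache.spark" %% "spark-sql" % "2.3.2" % "provided"
)
```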
Note that this will cause the unmanaged jar to be included in your assembly jar. If you want to build a "slightly less fat" jar that excludes the unmanaged jar, you will have to filter it out. One way to achieve this is:

assemblyExcludedJars in assembly := {
  val cp = (fullClasspath in assembly).value
  cp.filter(_.data.getName == "name-of-unmanaged.jar")
}
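
For what it's worth, newer versions of sbt (1.x) and sbt-assembly express the same exclusion with slash syntax instead of the `in` syntax above; this is only a syntactic variant of the snippet shown, and "name-of-unmanaged.jar" remains a placeholder for your actual jar file name:

```scala
// Same exclusion in sbt 1.x slash syntax (sbt-assembly 1.x and later).
assembly / assemblyExcludedJars := {
  val cp = (assembly / fullClasspath).value
  cp.filter(_.data.getName == "name-of-unmanaged.jar")
}
```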

If you don't have the jar (or at least something very close to it) at hand, how exactly would you expect the compiler to type-check your calls into it?
