Hadoop security GroupMappingServiceProvider exception when running a Spark job through the Dataproc API

vjhs03f7 · published 2021-05-29 in Hadoop

I'm trying to run a Spark job on a Google Dataproc cluster, but I get the following error:

Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: class org.apache.hadoop.security.JniBasedUnixGroupsMapping not org.apache.hadoop.security.GroupMappingServiceProvider
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2330)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:108)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:102)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:450)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:310)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:277)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:833)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:803)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:676)
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2430)
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2430)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2430)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:295)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
    at com.my.package.spark.SparkModule.provideJavaSparkContext(SparkModule.java:59)
    at com.my.package.spark.SparkModule$$ModuleAdapter$ProvideJavaSparkContextProvidesAdapter.get(SparkModule$$ModuleAdapter.java:140)
    at com.my.package.spark.SparkModule$$ModuleAdapter$ProvideJavaSparkContextProvidesAdapter.get(SparkModule$$ModuleAdapter.java:101)
    at dagger.internal.Linker$SingletonBinding.get(Linker.java:364)
    at spark.Main$$InjectAdapter.get(Main$$InjectAdapter.java:65)
    at spark.Main$$InjectAdapter.get(Main$$InjectAdapter.java:23)
    at dagger.ObjectGraph$DaggerObjectGraph.get(ObjectGraph.java:272)
    at spark.Main.main(Main.java:45)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: class org.apache.hadoop.security.JniBasedUnixGroupsMapping not org.apache.hadoop.security.GroupMappingServiceProvider
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2324)
    ... 31 more

Dataproc versions: 1.1.51 and 1.2.15
Job configuration:
Region: global
Cluster: my-cluster
Job type: Spark
Jar file: gs://bucket/jars/spark-job.jar
Main class or jar: spark.Main
Arguments:
Properties:
spark.driver.extraClassPath: /path/to/google-api-client-1.20.0.jar
spark.driver.userClassPathFirst: true
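
For reference, a sketch of how the same configuration might be submitted through the gcloud CLI (the cluster, bucket, and paths are the placeholders from the question, not verified values):

# Sketch of the gcloud equivalent of the job configuration above;
# cluster/bucket/path names are placeholders taken from the question.
gcloud dataproc jobs submit spark \
    --region=global \
    --cluster=my-cluster \
    --class=spark.Main \
    --jars=gs://bucket/jars/spark-job.jar \
    --properties=spark.driver.extraClassPath=/path/to/google-api-client-1.20.0.jar,spark.driver.userClassPathFirst=true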
I can run it from the command line like this:

spark-submit --conf "spark.driver.extraClassPath=/path/to/google-api-client-1.20.0.jar" --conf "spark.driver.userClassPathFirst=true" --class spark.Main /path/to/spark-job.jar

But the UI/API doesn't let you pass both a class name and a jar, so the submission ends up looking like this:

spark-submit --conf spark.driver.extraClassPath=/path/to/google-api-client-1.20.0.jar --conf spark.driver.userClassPathFirst=true --class spark.Main --jars /tmp/1f4d5289-37af-4311-9ccc-5eee34acaf62/spark-job.jar /usr/lib/hadoop/hadoop-common.jar

I don't know whether providing the extraClassPath is itself the problem, or whether spark-job.jar and hadoop-common.jar are conflicting.

csga3l58 · #1

If you don't want to turn off spark.driver.userClassPathFirst=true, you can check that the "org.apache.spark" %% "spark-core" % SPARK_VERSION dependency is present and that its scope is defined correctly. When the spark-core jar is on the classpath, the exception is not thrown.
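
For illustration, a minimal build.sbt sketch of that dependency declaration, assuming an sbt build; sparkVersion and the "provided" scope are assumptions to adjust to your setup:

// build.sbt sketch; sparkVersion is a placeholder.
// The point above: spark-core must be present and its scope deliberate.
// With "provided" it is compiled against but not bundled into the fat jar;
// removing the qualifier bundles it, which puts spark-core itself on the
// user classpath when spark.driver.userClassPathFirst=true.
libraryDependencies += "org.apache.spark" %% "spark-core" % sparkVersion % "provided"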

8ljdwjyq · #2

I think this is caused by the combination of userClassPathFirst and /usr/lib/hadoop/hadoop-common.jar, the jar Dataproc passes to spark-submit. In some cases the GroupMappingServiceProvider instance from the user classloader is used, and in other cases the instance from the system classloader is used. Because a class loaded by one classloader is never equal to the same class loaded by a different classloader, this exception is thrown.
Instead of userClassPathFirst, would it make sense to relocate the conflicting classes with something like Maven Shade, as sketched below?
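
For illustration, a pom.xml sketch of such a relocation with the maven-shade-plugin; the com.google.api.client pattern is only a guess at the conflicting package, not something stated in the question:

<!-- pom.xml sketch: relocate a bundled copy of a conflicting package
     so it no longer collides with the cluster's classes.
     The pattern and shadedPattern below are illustrative. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.google.api.client</pattern>
            <shadedPattern>repackaged.com.google.api.client</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>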
