Tasks in a Spark process stuck at JavaPairRDD.repartition: spilling in-memory map of 20.6 GB to disk

pqwbnv8z posted on 2021-07-14 in Spark

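Some tasks in my Spark process (running on Amazon EMR on EC2) are stuck.
The Spark History Server shows:
a single active job
Stages: Succeeded/Total
Tasks (for all stages): 1867/10521 (13 running)
a single active stage with 13 running tasks
with the following stack trace: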

org.apache.spark.api.java.JavaPairRDD.repartition(JavaPairRDD.scala:121)
org.apache.beam.runners.spark.translation.GroupCombineFunctions.reshuffle(GroupCombineFunctions.java:198)
org.apache.beam.runners.spark.translation.TransformTranslator$10.evaluate(TransformTranslator.java:547)
org.apache.beam.runners.spark.translation.TransformTranslator$10.evaluate(TransformTranslator.java:529)
org.apache.beam.runners.spark.SparkRunner$Evaluator.doVisitTransform(SparkRunner.java:425)
org.apache.beam.runners.spark.SparkRunner$Evaluator.enterCompositeTransform(SparkRunner.java:370)
org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:653)
org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
org.apache.beam.sdk.runners.TransformHierarchy$Node.access$600(TransformHierarchy.java:317)
org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:251)
org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:458)
org.apache.beam.runners.spark.SparkRunner.lambda$run$1(SparkRunner.java:224)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)

Edit: In the executor's logs, I found where it is stuck:

21/03/18 13:08:35 INFO ExternalSorter: Thread 257 spilling in-memory map of 20.6 GB to disk (10 times so far)
21/03/18 13:11:16 INFO ExternalSorter: Thread 257 spilling in-memory map of 20.6 GB to disk (11 times so far)
21/03/18 13:12:51 INFO WriteFiles: Opening writer fc922f25-637e-43ac-ab4e-be7333300ac0 for window org.apache.beam.sdk.transforms.windowing.GlobalWindow@560c63b pane PaneInfo{isFirst=true, isLast=true, timing=ON_TIME, index=0, onTimeIndex=0} destination KV{GLB, FEATURE_PROPERTY_ENTRY}

This executor has shuffle remote reads = 54 GB, shuffle read = 950 MiB / 80000000 records, shuffle spill (memory) = 256 GB, shuffle spill (disk) = 53 GB.
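
To judge whether a task like this is truly hung or just spilling very slowly, it can help to measure the time between consecutive ExternalSorter spill messages for the same thread. The following is a minimal sketch, assuming executor log lines in the format quoted above. The helper names `parse_spills` and `spill_intervals` are hypothetical and are not part of Spark or Beam.

```python
import re
from datetime import datetime

# Matches ExternalSorter spill messages in the format quoted above.
SPILL_RE = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) INFO ExternalSorter: "
    r"Thread (?P<thread>\d+) spilling in-memory map of "
    r"(?P<size>[\d.]+) (?P<unit>\w+) to disk \((?P<count>\d+) times? so far\)"
)


def parse_spills(lines):
    """Return (timestamp, thread, size, unit, spill_count) per spill line.

    Lines that are not ExternalSorter spill messages are skipped.
    """
    spills = []
    for line in lines:
        m = SPILL_RE.match(line.strip())
        if m is None:
            continue
        # Assumption: timestamps use the yy/MM/dd HH:mm:ss layout seen above.
        ts = datetime.strptime(m.group("ts"), "%y/%m/%d %H:%M:%S")
        spills.append(
            (ts, int(m.group("thread")), float(m.group("size")),
             m.group("unit"), int(m.group("count")))
        )
    return spills


def spill_intervals(spills):
    """Return (thread, spill_count, seconds_since_previous_spill) tuples."""
    last_seen = {}
    intervals = []
    for ts, thread, _size, _unit, count in spills:
        if thread in last_seen:
            elapsed = (ts - last_seen[thread]).total_seconds()
            intervals.append((thread, count, elapsed))
        last_seen[thread] = ts
    return intervals


if __name__ == "__main__":
    log = [
        "21/03/18 13:08:35 INFO ExternalSorter: Thread 257 spilling "
        "in-memory map of 20.6 GB to disk (10 times so far)",
        "21/03/18 13:11:16 INFO ExternalSorter: Thread 257 spilling "
        "in-memory map of 20.6 GB to disk (11 times so far)",
    ]
    print(spill_intervals(parse_spills(log)))  # [(257, 11, 161.0)]
```

For the two lines quoted above, the gap between the 10th and 11th spill is 161 seconds, so the thread was still making progress at that point, only slowly.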

No answers yet.
