Error when running a Python script with Spark

Asked by c8ib6hqw on 2021-05-29, in Hadoop

I am running a Python script (which I cannot share for security reasons) and hit problems while running it. I am using Spark, and this error occurs when I use the `groupByKey().mapValues()` and `sortByKey()` functions.
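For context (the actual script is not shared), the `groupByKey().mapValues(...)` followed by `sortByKey()` pattern can be illustrated with a small plain-Python stand-in for the RDD operations; the keys and values below are invented, not from the real input:

```python
from collections import defaultdict

# Hypothetical (key, value) pairs standing in for the script's real data.
pairs = [("chr2", "read_a"), ("chr1", "read_b"), ("chr1", "read_c")]

def group_by_key(pairs):
    """Plain-Python stand-in for RDD.groupByKey()."""
    groups = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)
    return list(groups.items())

def map_values(grouped, f):
    """Stand-in for RDD.mapValues(f): apply f to each value, keep keys."""
    return [(k, f(v)) for k, v in grouped]

def sort_by_key(pairs):
    """Stand-in for RDD.sortByKey(): sort pairs by key."""
    return sorted(pairs, key=lambda kv: kv[0])

grouped = map_values(group_by_key(pairs), sorted)
result = sort_by_key(grouped)
print(result)  # → [('chr1', ['read_b', 'read_c']), ('chr2', ['read_a'])]
```

In Spark, both `groupByKey` and `sortByKey` are wide transformations, so each triggers a shuffle, which is why the failure below appears in the shuffle-write path.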
I googled the error and tried the suggestions from "Spark job fails: storage.DiskBlockObjectWriter: Uncaught exception while reverting partial writes to file" and other similar answers.
This is the full error:

2019-07-26 09:35:20,698 ERROR storage.DiskBlockObjectWriter: Uncaught exception while reverting partial writes to file /tmp/blockmgr-0b509bcc-b1b3-4edb-993f-208ca6107f06/12/temp_shuffle_6569085f-65d9-46c3-9466-a37d7fbc8caf
    java.io.FileNotFoundException: /tmp/blockmgr-0b509bcc-b1b3-4edb-993f-208ca6107f06/12/temp_shuffle_6569085f-65d9-46c3-9466-a37d7fbc8caf (Too many open files)
        at java.io.FileOutputStream.open0(Native Method)
        at java.io.FileOutputStream.open(FileOutputStream.java:270)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
        at org.apache.spark.storage.DiskBlockObjectWriter$$anonfun$revertPartialWritesAndClose$2.apply$mcV$sp(DiskBlockObjectWriter.scala:217)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1369)
        at org.apache.spark.storage.DiskBlockObjectWriter.revertPartialWritesAndClose(DiskBlockObjectWriter.scala:214)
        at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.stop(BypassMergeSortShuffleWriter.java:237)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:105)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

2019-07-26 09:35:20,724 ERROR sort.BypassMergeSortShuffleWriter: Error while deleting file /tmp/blockmgr-0b509bcc-b1b3-4edb-993f-208ca6107f06/12/temp_shuffle_6569085f-65d9-46c3-9466-a37d7fbc8caf
2019-07-26 09:35:21,822 ERROR executor.Executor: Exception in task 53.0 in stage 1.0 (TID 152)
    java.io.FileNotFoundException: /tmp/blockmgr-0b509bcc-b1b3-4edb-993f-208ca6107f06/3c/temp_shuffle_a4e5cecd-e130-4671-b0ba-36e98e2dc158 (Too many open files)
        at java.io.FileOutputStream.open0(Native Method)
        at java.io.FileOutputStream.open(FileOutputStream.java:270)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
        at org.apache.spark.storage.DiskBlockObjectWriter.initialize(DiskBlockObjectWriter.scala:103)
        at org.apache.spark.storage.DiskBlockObjectWriter.open(DiskBlockObjectWriter.scala:116)
        at org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:237)
        at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:151)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
2019-07-26 09:35:21,861 WARN scheduler.TaskSetManager: Lost task 53.0 in stage 1.0 (TID 152, localhost, executor driver): java.io.FileNotFoundException: /tmp/blockmgr-0b509bcc-b1b3-4edb-993f-208ca6107f06/3c/temp_shuffle_a4e5cecd-e130-4671-b0ba-36e98e2dc158 (Too many open files)
        at java.io.FileOutputStream.open0(Native Method)
        at java.io.FileOutputStream.open(FileOutputStream.java:270)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
        at org.apache.spark.storage.DiskBlockObjectWriter.initialize(DiskBlockObjectWriter.scala:103)
        at org.apache.spark.storage.DiskBlockObjectWriter.open(DiskBlockObjectWriter.scala:116)
        at org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:237)
        at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:151)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

2019-07-26 09:35:21,863 ERROR scheduler.TaskSetManager: Task 53 in stage 1.0 failed 1 times; aborting job
    Traceback (most recent call last):
      File "BowtieSpark.py", line 60, in <module>
        readsRDD = reads_tuple.sortByKey().values()
      File "/s1/snagaraj/spark/python/pyspark/rdd.py", line 667, in sortByKey
        rddSize = self.count()
      File "/s1/snagaraj/spark/python/pyspark/rdd.py", line 1055, in count
        return self.mapPartitions(lambda i: [sum(1 for _ in i)]).sum()
      File "/s1/snagaraj/spark/python/pyspark/rdd.py", line 1046, in sum
        return self.mapPartitions(lambda x: [sum(x)]).fold(0, operator.add)
      File "/s1/snagaraj/spark/python/pyspark/rdd.py", line 917, in fold
        vals = self.mapPartitions(func).collect()
      File "/s1/snagaraj/spark/python/pyspark/rdd.py", line 816, in collect
        sock_info = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
      File "/s1/snagaraj/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
      File "/s1/snagaraj/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
2019-07-26 09:35:21,887 WARN scheduler.TaskSetManager: Lost task 56.0 in stage 1.0 (TID 155, localhost, executor driver): TaskKilled (Stage cancelled)
    py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
    : org.apache.spark.SparkException: Job aborted due to stage failure: Task 53 in stage 1.0 failed 1 times, most recent failure: Lost task 53.0 in stage 1.0 (TID 152, localhost, executor driver): java.io.FileNotFoundException: /tmp/blockmgr-0b509bcc-b1b3-4edb-993f-208ca6107f06/3c/temp_shuffle_a4e5cecd-e130-4671-b0ba-36e98e2dc158 (Too many open files)
        at java.io.FileOutputStream.open0(Native Method)
        at java.io.FileOutputStream.open(FileOutputStream.java:270)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
        at org.apache.spark.storage.DiskBlockObjectWriter.initialize(DiskBlockObjectWriter.scala:103)
        at org.apache.spark.storage.DiskBlockObjectWriter.open(DiskBlockObjectWriter.scala:116)
        at org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:237)
        at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:151)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

The driver stacktrace then contains more error messages about too many open files and insufficient memory (ERROR storage.DiskBlockObjectWriter: Uncaught exception while reverting partial writes to file /tmp/blockmgr-0b509bcc-b1b3-4edb-993f-208ca6107f06/0d/temp_shuffle_f170174d-3de0-44be-89a2-5d2b7f6ac3bf).
It also contains this error (ERROR sort.BypassMergeSortShuffleWriter: Error while deleting file /tmp/blockmgr-0b509bcc-b1b3-4edb-993f-208ca6107f06/0e/temp_shuffle_9f1b47df-71d1-4941-9da1-2f8bee09d968).
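One likely reason so many `temp_shuffle_*` files are open at once: the stacktrace shows `BypassMergeSortShuffleWriter`, which Spark uses when the reduce-partition count is at or below `spark.shuffle.sort.bypassMergeThreshold` (default 200). That writer opens one temporary file per reduce partition for every concurrently running map task. A rough back-of-the-envelope sketch (the core and partition counts below are assumptions, not values taken from this question):

```python
# Estimate of concurrently open shuffle temp files under the bypass
# merge-sort shuffle path: each running map task opens one temp file
# per reduce partition. Illustrative numbers only.
executor_cores = 16       # map tasks running at the same time (assumed)
reduce_partitions = 200   # shuffle partitions on the reduce side (assumed)

concurrent_files = executor_cores * reduce_partitions
print(concurrent_files)   # → 3200, well past a typical default ulimit of 1024
```

This is why the failure tends to appear only on wider shuffles, and why either raising the open-file limit or reducing parallelism/partitions can make it go away.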

Answer from evrscar2:

According to the logs, the FileNotFoundException is caused by executor failures, and the executors fail because of `Too many open files`. Can you increase the open-file limit with the `ulimit -n 65636` command?
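To inspect the limit, and raise the soft limit up to the hard limit from within the Python process itself, the standard-library `resource` module (Unix only) can be used. This is a sketch; raising the hard limit still requires root, or a shell-level `ulimit -n` in the session that launches `spark-submit`:

```python
import resource

# Read the current per-process open-file limits (soft, hard).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft={soft} hard={hard}")

# Raise the soft limit toward 4096 (an illustrative target), never
# exceeding the hard limit; only root can raise the hard limit.
new_soft = max(soft, 4096)
if hard != resource.RLIM_INFINITY:
    new_soft = min(new_soft, hard)
resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
```

Note that `setrlimit` only affects the current process and its children, so it would need to run before the SparkContext is created; a shell `ulimit -n` before `spark-submit` is the simpler route.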
