Hadoop: Found 2 unexpected arguments

zc0qhyus, posted on 2021-07-13 in Hadoop

I am running Hadoop on Windows and trying to submit an mrjob job, but it fails with the error Found 2 unexpected arguments on the command line.

(cmtle) d:\>python norad_counts.py -r hadoop --hadoop-streaming-jar C:\hadoop-3.3.0\share\hadoop\tools\lib\hadoop-streaming-3.3.0.jar all_files.txt
No configs found; falling back on auto-configuration
No configs specified for hadoop runner
Looking for hadoop binary in C:\hadoop-3.3.0\bin\bin...
Looking for hadoop binary in $PATH...
Found hadoop binary: C:\hadoop-3.3.0\bin\hadoop.CMD
Using Hadoop version 3.3.0
Creating temp directory C:\Users\mille\AppData\Local\Temp\norad_counts.mille.20210318.083636.028559
uploading working dir files to hdfs:///user/mille/tmp/mrjob/norad_counts.mille.20210318.083636.028559/files/wd...
Copying other local files to hdfs:///user/mille/tmp/mrjob/norad_counts.mille.20210318.083636.028559/files/
Running step 1 of 1...
  Found 2 unexpected arguments on the command line [hdfs:///user/mille/tmp/mrjob/norad_counts.mille.20210318.083636.028559/files/wd/norad_counts.py#norad_counts.py, hdfs:///user/mille/tmp/mrjob/norad_counts.mille.20210318.083636.028559/files/wd/setup-wrapper.sh#setup-wrapper.sh]
  Try -help for more information
  Streaming Command Failed!
Attempting to fetch counters from logs...
Can't fetch history log; missing job ID
No counters found
Scanning logs for probable cause of failure...
Can't fetch history log; missing job ID
Can't fetch task logs; missing application ID
Step 1 of 1 failed: Command '['C:\\hadoop-3.3.0\\bin\\hadoop.CMD', 'jar', 'C:\\hadoop-3.3.0\\share\\hadoop\\tools\\lib\\hadoop-streaming-3.3.0.jar', '-files', 'hdfs:///user/mille/tmp/mrjob/norad_counts.mille.20210318.083636.028559/files/wd/mrjob.zip#mrjob.zip,hdfs:///user/mille/tmp/mrjob/norad_counts.mille.20210318.083636.028559/files/wd/norad_counts.py#norad_counts.py,hdfs:///user/mille/tmp/mrjob/norad_counts.mille.20210318.083636.028559/files/wd/setup-wrapper.sh#setup-wrapper.sh', '-input', 'hdfs:///user/mille/tmp/mrjob/norad_counts.mille.20210318.083636.028559/files/all_files.txt', '-output', 'hdfs:///user/mille/tmp/mrjob/norad_counts.mille.20210318.083636.028559/output', '-mapper', '/bin/sh -ex setup-wrapper.sh python3 norad_counts.py --step-num=0 --mapper', '-combiner', '/bin/sh -ex setup-wrapper.sh python3 norad_counts.py --step-num=0 --combiner', '-reducer', '/bin/sh -ex setup-wrapper.sh python3 norad_counts.py --step-num=0 --reducer']' returned non-zero exit status 1.

Here is norad_counts.py:

from mrjob.job import MRJob, JSONProtocol
import pandas as pd

class MRNoradCounts(MRJob):

    def mapper(self, _, file_path):
        try:
            df = pd.read_csv(file_path, compression='gzip', low_memory=False)
            df = df[(df.MEAN_MOTION > 11.25) & (df.ECCENTRICITY < 0.25)]
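        except Exception as exc:
            # Chain the original error so the real cause (missing file, bad gzip, missing column) stays visible
            raise Exception(f'Failed to open {file_path}') from exc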
        #print(f'File: {file_path}')
        for norad in df.NORAD_CAT_ID.to_list():
            yield norad, 1

    def combiner(self, norad, counts):
        yield norad, sum(counts)

    def reducer(self, norad, counts):
        yield norad, sum(counts)

if __name__ == "__main__":
    MRNoradCounts.run()

Answer #1 (alen0pnh)

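I solved this by reinstalling the Java JDK. I had originally installed it to C:\Program Files\Java but later moved it to C:\Java, following other instructions. I thought updating the environment variables would be enough, but apparently it was not. So I uninstalled Java and reinstalled it, this time directly into C:\Java, and that solved my problem.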
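
The answer does not say why the reinstall helped. If you want to check your own setup before reinstalling, something like the sketch below may be a starting point. It assumes that the relevant variable is JAVA_HOME and that a space in the path (as in C:\Program Files\Java) or a path left stale after moving the JDK are the things worth looking for. Both are guesses, not something confirmed in this thread. The function name check_java_home is made up for this example.

```python
import os
from pathlib import Path


def check_java_home(env=None):
    """Return a list of human-readable problems found with JAVA_HOME.

    Hypothetical helper: it only inspects the environment mapping and the
    filesystem, and it does not call Hadoop or Java.
    """
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return ["JAVA_HOME is not set"]

    problems = []
    # Assumption: a space in the path (e.g. "Program Files") is worth flagging.
    if " " in java_home:
        problems.append(f"JAVA_HOME contains a space: {java_home!r}")

    home = Path(java_home)
    if not home.is_dir():
        # Typical after moving the JDK folder without updating the variable.
        problems.append(
            f"JAVA_HOME does not point to an existing directory: {java_home!r}"
        )
    elif not any((home / "bin" / name).is_file() for name in ("java.exe", "java")):
        problems.append(f"no java executable found under {home / 'bin'}")
    return problems


if __name__ == "__main__":
    for line in check_java_home() or ["JAVA_HOME looks consistent"]:
        print(line)
```

Run it from the same prompt you use to submit the job, so it sees the same environment variables. An empty result only means these two checks passed. It does not prove that Hadoop itself is picking up the right JDK.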
