I have a MapReduce job that runs correctly, but I am confused about the log files it generates.
Command used to run the MapReduce job:
hadoop jar mapred-0.0.1-SNAPSHOT.jar tcs.hadoop.org.mapreduce.MaxTemperatureDriver /priya/sample.txt /output
14/02/20 17:35:10 INFO input.FileInputFormat: Total input paths to process : 1
14/02/20 17:35:10 WARN snappy.LoadSnappy: Snappy native library is available
14/02/20 17:35:10 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/02/20 17:35:10 INFO snappy.LoadSnappy: Snappy native library loaded
14/02/20 17:35:10 INFO mapred.JobClient: Running job: job_201402111203_0034
14/02/20 17:35:11 INFO mapred.JobClient: map 0% reduce 0%
14/02/20 17:35:22 INFO mapred.JobClient: map 100% reduce 0%
14/02/20 17:35:36 INFO mapred.JobClient: map 100% reduce 100%
14/02/20 17:35:39 INFO mapred.JobClient: Job complete: job_201402111203_0034
14/02/20 17:35:40 INFO mapred.JobClient: Counters: 26
14/02/20 17:35:40 INFO mapred.JobClient: Job Counters
14/02/20 17:35:40 INFO mapred.JobClient: Launched reduce tasks=2
14/02/20 17:35:40 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=11900
14/02/20 17:35:40 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
14/02/20 17:35:40 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
14/02/20 17:35:40 INFO mapred.JobClient: Launched map tasks=1
14/02/20 17:35:40 INFO mapred.JobClient: Data-local map tasks=1
14/02/20 17:35:40 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=23142
14/02/20 17:35:40 INFO mapred.JobClient: FileSystemCounters
14/02/20 17:35:40 INFO mapred.JobClient: FILE_BYTES_READ=34
14/02/20 17:35:40 INFO mapred.JobClient: HDFS_BYTES_READ=633
14/02/20 17:35:40 INFO mapred.JobClient: FILE_BYTES_WRITTEN=154973
14/02/20 17:35:40 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=17
14/02/20 17:35:40 INFO mapred.JobClient: Map-Reduce Framework
14/02/20 17:35:40 INFO mapred.JobClient: Map input records=5
14/02/20 17:35:40 INFO mapred.JobClient: Reduce shuffle bytes=34
14/02/20 17:35:40 INFO mapred.JobClient: Spilled Records=4
14/02/20 17:35:40 INFO mapred.JobClient: Map output bytes=45
14/02/20 17:35:40 INFO mapred.JobClient: CPU time spent (ms)=4420
14/02/20 17:35:40 INFO mapred.JobClient: Total committed heap usage (bytes)=172822528
14/02/20 17:35:40 INFO mapred.JobClient: Combine input records=5
14/02/20 17:35:40 INFO mapred.JobClient: SPLIT_RAW_BYTES=103
14/02/20 17:35:40 INFO mapred.JobClient: Reduce input records=2
14/02/20 17:35:40 INFO mapred.JobClient: Reduce input groups=2
14/02/20 17:35:40 INFO mapred.JobClient: Combine output records=2
14/02/20 17:35:40 INFO mapred.JobClient: Physical memory (bytes) snapshot=300945408
14/02/20 17:35:40 INFO mapred.JobClient: Virtual memory (bytes) snapshot=7375564800
14/02/20 17:35:40 INFO mapred.JobClient: Map output records=5
14/02/20 17:35:40 INFO mapred.JobClient: Reduce output records=2
From these counters I can see that one map task and two reduce tasks were launched.
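Those counts come straight from the `name=value` counter lines that `mapred.JobClient` prints. As a minimal sketch of how to pull them out programmatically (the helper name and regex are my own, and the sample lines are copied from the console output above):

```python
import re

# Sample JobClient counter lines, copied from the console output above.
log_lines = [
    "14/02/20 17:35:40 INFO mapred.JobClient:     Launched reduce tasks=2",
    "14/02/20 17:35:40 INFO mapred.JobClient:     Launched map tasks=1",
    "14/02/20 17:35:40 INFO mapred.JobClient:     Data-local map tasks=1",
]

def parse_counters(lines):
    """Extract 'name=value' counter pairs from mapred.JobClient log lines."""
    counters = {}
    for line in lines:
        # Match everything after 'JobClient:' up to a trailing '=<number>'.
        m = re.search(r"JobClient:\s+(.+?)=(\d+)\s*$", line)
        if m:
            counters[m.group(1)] = int(m.group(2))
    return counters

counters = parse_counters(log_lines)
print(counters["Launched map tasks"])     # 1
print(counters["Launched reduce tasks"])  # 2
```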
However, when I look at the job history logs in the $HADOOP_HOME/logs/history directory, I find that the JobTracker launched five tasks, as shown below (only the relevant log lines are included). I don't understand why there are five tasks rather than three.
MapAttempt TASK_TYPE="SETUP" TASKID="task_201402111203_0034_m_000002" TASK_ATTEMPT_ID="attempt_201402111203_0034_m_000002_0" START_TIME="1392897911096" TRACKER_NAME="tracker_IMBDBOX1:IMBDBOX1/157.227.44.207:40925" HTTP_PORT="50060" .
MapAttempt TASK_TYPE="MAP" TASKID="task_201402111203_0034_m_000000" TASK_ATTEMPT_ID="attempt_201402111203_0034_m_000000_0" TASK_STATUS="SUCCESS" FINISH_TIME="1392897989806" HOSTNAME="/defaul
ReduceAttempt TASK_TYPE="REDUCE" TASKID="task_201402111203_0034_r_000001" TASK_ATTEMPT_ID="attempt_201402111203_0034_r_000001_0" START_TIME="1392897947754" TRACKER_NAME="tracker_IMBDBOX3:localhost/127.0.0.1:34625" HTTP_PORT="50060"
ReduceAttempt TASK_TYPE="REDUCE" TASKID="task_201402111203_0034_r_000000" TASK_ATTEMPT_ID="attempt_201402111203_0034_r_000000_0" START_TIME="1392897992388" TRACKER_NAME="tracker_IMBDBOX4:localhost/127.0.0.1:59439" HTTP_PORT="50060" .
MapAttempt TASK_TYPE="CLEANUP" TASKID="task_201402111203_0034_m_000001" TASK_ATTEMPT_ID="attempt_201402111203_0034_m_000001_0" START_TIME="1392898004324" TRACKER_NAME="tracker_IMBDBOX4:localhost/127.0.0.1:59439" HTTP_PORT="50060"
Similarly, when I look at the user logs under $HADOOP_HOME/logs/userlogs, I can only see logs for a single map task attempt. Why were no logs generated for the other map and reduce tasks?
Any help would be appreciated. Thanks!
Contents of the user log directory:
total 8
-rw-r----- 1 hadoop hdusers 497 2014-02-20 17:35 job-acls.xml
lrwxrwxrwx 1 hadoop hdusers 96 2014-02-20 17:35 attempt_201402111203_0034_m_000002_0 -> /app/hadoop/tmp/mapred/local/userlogs/job_201402111203_0034/attempt_201402111203_0034_m_000002_0
1 Answer
Please note the following:
Hadoop runs a few auxiliary tasks as part of the MapReduce job lifecycle. Besides your one map task and two reduce tasks, the JobTracker launches a job-setup task and a job-cleanup task; these appear in the history log with TASK_TYPE="SETUP" and TASK_TYPE="CLEANUP". That accounts for the five tasks: 1 setup + 1 map + 2 reduce + 1 cleanup.
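You can confirm this by counting the TASK_TYPE attributes in the history log you posted. A quick sketch (sample lines abbreviated from that log; the regex is my own):

```python
import re
from collections import Counter

# Abbreviated task-attempt lines from the job history log in the question.
history_lines = [
    'MapAttempt TASK_TYPE="SETUP" TASKID="task_201402111203_0034_m_000002"',
    'MapAttempt TASK_TYPE="MAP" TASKID="task_201402111203_0034_m_000000"',
    'ReduceAttempt TASK_TYPE="REDUCE" TASKID="task_201402111203_0034_r_000001"',
    'ReduceAttempt TASK_TYPE="REDUCE" TASKID="task_201402111203_0034_r_000000"',
    'MapAttempt TASK_TYPE="CLEANUP" TASKID="task_201402111203_0034_m_000001"',
]

# Tally how many task attempts of each type the JobTracker launched.
task_types = Counter(
    m.group(1)
    for line in history_lines
    if (m := re.search(r'TASK_TYPE="(\w+)"', line))
)
print(task_types)  # 1 SETUP + 1 MAP + 2 REDUCE + 1 CLEANUP = 5 tasks
```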