通过单机的Hadoop伪分布式运行理解Hadoop运行过程(3)

下面对正式进入工作状态的各个进程执行的跟踪日志进行分析,进一步理解Hadoop运行的过程。

Hadoop进程已经处于就绪状态,马上就可以执行任务了。

复制待处理数据文件

我们使用如下命令:

$ bin/hadoop dfs -put ./input input

 

开始执行复制任务了。

首先,查看NameNode的日志文件,在hadoop-SHIYANJUN-namenode-cbbd2ce9428e48b.log文件中,根据时间线索信息,可以看到:

2008-09-21 18:39:45,281 INFO org.apache.hadoop.fs.FSNamesystem: Number of transactions: 3 Total time for transactions(ms): 0 Number of syncs: 0 SyncTimes(ms): 0
2008-09-21 18:39:45,468 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.allocateBlock: /user/SHIYANJUN/input/b.txt. blk_7287293315123774920
2008-09-21 18:39:45,765 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_7287293315123774920 size 10109
2008-09-21 18:39:46,265 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.allocateBlock: /user/SHIYANJUN/input/c.txt. blk_3670615963974276357
2008-09-21 18:39:46,390 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_3670615963974276357 size 1957
2008-09-21 18:39:46,437 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.allocateBlock: /user/SHIYANJUN/input/d.txt. blk_125370523583213471
2008-09-21 18:39:46,546 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_125370523583213471 size 1987
2008-09-21 18:39:46,593 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.allocateBlock: /user/SHIYANJUN/input/e.txt. blk_-8983105898459096464
2008-09-21 18:39:46,718 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_-8983105898459096464 size 1957
2008-09-21 18:39:46,796 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.allocateBlock: /user/SHIYANJUN/input/f.txt. blk_-6348337313643072566
2008-09-21 18:39:46,890 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_-6348337313643072566 size 1985
2008-09-21 18:39:46,953 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.allocateBlock: /user/SHIYANJUN/input/g.txt. blk_-140532538463136818
2008-09-21 18:39:47,062 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_-140532538463136818 size 1957
2008-09-21 18:39:47,109 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.allocateBlock: /user/SHIYANJUN/input/a.txt. blk_2961784518440227574
2008-09-21 18:39:47,203 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_2961784518440227574 size 1957
 

复制的文件共有7个,按照NameNode指派的复制顺序为:b.txt、c.txt、d.txt、e.txt、f.txt、g.txt、a.txt,在复制过程中,即使更新HDFS的改变状态信息,可以从上面的日志中看到详细的情况。

再看一下DataNode的日志信息,可以在hadoop-SHIYANJUN-datanode-cbbd2ce9428e48b.log文件中,根据时间线索信息,可以看到:

2008-09-21 18:39:20,000 INFO org.apache.hadoop.dfs.DataNode: BlockReport of 0 blocks got processed in 16 msecs
2008-09-21 18:39:45,562 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_7287293315123774920 src: /127.0.0.1:1854 dest: /127.0.0.1:50010
2008-09-21 18:39:45,609 INFO org.apache.hadoop.dfs.DataNode: Datanode 0 forwarding connect ack to upstream firstbadlink is
2008-09-21 18:39:45,750 INFO org.apache.hadoop.dfs.DataNode: Received block blk_7287293315123774920 of size 10109 from /127.0.0.1
2008-09-21 18:39:45,750 INFO org.apache.hadoop.dfs.DataNode: PacketResponder 0 for block blk_7287293315123774920 terminating
2008-09-21 18:39:46,281 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_3670615963974276357 src: /127.0.0.1:1856 dest: /127.0.0.1:50010
2008-09-21 18:39:46,281 INFO org.apache.hadoop.dfs.DataNode: Datanode 0 forwarding connect ack to upstream firstbadlink is
2008-09-21 18:39:46,390 INFO org.apache.hadoop.dfs.DataNode: Received block blk_3670615963974276357 of size 1957 from /127.0.0.1
2008-09-21 18:39:46,390 INFO org.apache.hadoop.dfs.DataNode: PacketResponder 0 for block blk_3670615963974276357 terminating
2008-09-21 18:39:46,437 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_125370523583213471 src: /127.0.0.1:1857 dest: /127.0.0.1:50010
2008-09-21 18:39:46,453 INFO org.apache.hadoop.dfs.DataNode: Datanode 0 forwarding connect ack to upstream firstbadlink is
2008-09-21 18:39:46,546 INFO org.apache.hadoop.dfs.DataNode: Received block blk_125370523583213471 of size 1987 from /127.0.0.1
2008-09-21 18:39:46,546 INFO org.apache.hadoop.dfs.DataNode: PacketResponder 0 for block blk_125370523583213471 terminating
2008-09-21 18:39:46,609 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_-8983105898459096464 src: /127.0.0.1:1858 dest: /127.0.0.1:50010
2008-09-21 18:39:46,609 INFO org.apache.hadoop.dfs.DataNode: Datanode 0 forwarding connect ack to upstream firstbadlink is
2008-09-21 18:39:46,718 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-8983105898459096464 of size 1957 from /127.0.0.1
2008-09-21 18:39:46,734 INFO org.apache.hadoop.dfs.DataNode: PacketResponder 0 for block blk_-8983105898459096464 terminating
2008-09-21 18:39:46,796 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_-6348337313643072566 src: /127.0.0.1:1859 dest: /127.0.0.1:50010
2008-09-21 18:39:46,796 INFO org.apache.hadoop.dfs.DataNode: Datanode 0 forwarding connect ack to upstream firstbadlink is
2008-09-21 18:39:46,890 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-6348337313643072566 of size 1985 from /127.0.0.1
2008-09-21 18:39:46,890 INFO org.apache.hadoop.dfs.DataNode: PacketResponder 0 for block blk_-6348337313643072566 terminating
2008-09-21 18:39:46,968 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_-140532538463136818 src: /127.0.0.1:1860 dest: /127.0.0.1:50010
2008-09-21 18:39:46,968 INFO org.apache.hadoop.dfs.DataNode: Datanode 0 forwarding connect ack to upstream firstbadlink is
2008-09-21 18:39:47,062 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-140532538463136818 of size 1957 from /127.0.0.1
2008-09-21 18:39:47,078 INFO org.apache.hadoop.dfs.DataNode: PacketResponder 0 for block blk_-140532538463136818 terminating
2008-09-21 18:39:47,125 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_2961784518440227574 src: /127.0.0.1:1861 dest: /127.0.0.1:50010
2008-09-21 18:39:47,125 INFO org.apache.hadoop.dfs.DataNode: Datanode 0 forwarding connect ack to upstream firstbadlink is
2008-09-21 18:39:47,218 INFO org.apache.hadoop.dfs.DataNode: Received block blk_2961784518440227574 of size 1957 from /127.0.0.1
2008-09-21 18:39:47,234 INFO org.apache.hadoop.dfs.DataNode: PacketResponder 0 for block blk_2961784518440227574 terminating
2008-09-21 18:40:50,859 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_7287293315123774920
2008-09-21 18:43:44,187 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_3670615963974276357
2008-09-21 18:45:01,328 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_-8983105898459096464
2008-09-21 18:47:08,421 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_-140532538463136818
2008-09-21 18:47:29,421 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_2961784518440227574
2008-09-21 18:47:56,421 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_125370523583213471
 

在复制文件的过程中,主要是NameNode和DataNode这两个进程在交互进行,完成待处理数据的复制,并更新HDFS的改变状态。

运行WordCount工具

使用如下命令开始运行WordCount工具,进行词频统计任务:

$ bin/hadoop jar hadoop-0.16.4-examples.jar wordcount input output

 

这次,除了SecondaryNameNode进程以外,其他的都已经处于运行状态了。

当然,还是从NameNode开始,来进行任务的部署:

2008-09-21 18:54:10,750 WARN org.apache.hadoop.dfs.StateChange: DIR* FSDirectory.unprotectedDelete: failed to remove /tmp/hadoop-SHIYANJUN/mapred/system/job_200809211811_0001 because it does not exist
2008-09-21 18:54:10,750 INFO org.apache.hadoop.fs.FSNamesystem: Number of transactions: 24 Total time for transactions(ms): 16 Number of syncs: 16 SyncTimes(ms): 237
2008-09-21 18:54:10,984 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.allocateBlock: /tmp/hadoop-SHIYANJUN/mapred/system/job_200809211811_0001/job.jar. blk_1451674918492452869
 

之后,JobTracer和TaskTracer也进入工作状态了。

NameNode要实时保持与各个进程进行通讯的,可以从它的日志文件看到:

2008-09-21 18:54:11,171 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_1451674918492452869 size 84034
2008-09-21 18:54:11,203 INFO org.apache.hadoop.fs.FSNamesystem: Increasing replication for file /tmp/hadoop-SHIYANJUN/mapred/system/job_200809211811_0001/job.jar. New replication is 10
2008-09-21 18:54:11,781 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.allocateBlock: /tmp/hadoop-SHIYANJUN/mapred/system/job_200809211811_0001/job.split. blk_-4098064083959376195
2008-09-21 18:54:11,921 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_-4098064083959376195 size 859
2008-09-21 18:54:12,375 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.allocateBlock: /tmp/hadoop-SHIYANJUN/mapred/system/job_200809211811_0001/job.xml. blk_-1411929805036254458
2008-09-21 18:54:12,484 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_-1411929805036254458 size 12003
2008-09-21 18:55:09,968 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.allocateBlock: /user/SHIYANJUN/output/_temporary/_task_200809211811_0001_r_000000_0/part-00000. blk_6547411606566553711
2008-09-21 18:55:10,250 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_6547411606566553711 size 64
2008-09-21 18:55:11,453 INFO org.apache.hadoop.fs.FSNamesystem: Number of transactions: 59 Total time for transactions(ms): 32 Number of syncs: 47 SyncTimes(ms): 1656
2008-09-21 18:55:11,500 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.delete: blk_1451674918492452869 is added to invalidSet of 127.0.0.1:50010
2008-09-21 18:55:11,500 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.delete: blk_-4098064083959376195 is added to invalidSet of 127.0.0.1:50010
2008-09-21 18:55:11,500 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.delete: blk_-1411929805036254458 is added to invalidSet of 127.0.0.1:50010
2008-09-21 18:55:11,515 WARN org.apache.hadoop.dfs.StateChange: DIR* FSDirectory.unprotectedDelete: failed to remove /tmp/hadoop-SHIYANJUN/mapred/system/job_200809211811_0001 because it does not exist
2008-09-21 18:55:13,625 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.blockToInvalidate: ask 127.0.0.1:50010 to delete blk_-4098064083959376195 blk_1451674918492452869 blk_-1411929805036254458
 

看一下DataNode的日志信息,可以在hadoop-SHIYANJUN-datanode-cbbd2ce9428e48b.log文件中,根据时间线索信息,可以看到:

2008-09-21 18:54:11,031 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_1451674918492452869 src: /127.0.0.1:2349 dest: /127.0.0.1:50010
2008-09-21 18:54:11,031 INFO org.apache.hadoop.dfs.DataNode: Datanode 0 forwarding connect ack to upstream firstbadlink is
2008-09-21 18:54:11,171 INFO org.apache.hadoop.dfs.DataNode: Received block blk_1451674918492452869 of size 84034 from /127.0.0.1
2008-09-21 18:54:11,171 INFO org.apache.hadoop.dfs.DataNode: PacketResponder 0 for block blk_1451674918492452869 terminating
2008-09-21 18:54:11,796 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_-4098064083959376195 src: /127.0.0.1:2350 dest: /127.0.0.1:50010
2008-09-21 18:54:11,796 INFO org.apache.hadoop.dfs.DataNode: Datanode 0 forwarding connect ack to upstream firstbadlink is
2008-09-21 18:54:11,921 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-4098064083959376195 of size 859 from /127.0.0.1
2008-09-21 18:54:11,937 INFO org.apache.hadoop.dfs.DataNode: PacketResponder 0 for block blk_-4098064083959376195 terminating
2008-09-21 18:54:12,390 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_-1411929805036254458 src: /127.0.0.1:2351 dest: /127.0.0.1:50010
2008-09-21 18:54:12,406 INFO org.apache.hadoop.dfs.DataNode: Datanode 0 forwarding connect ack to upstream firstbadlink is
2008-09-21 18:54:12,484 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-1411929805036254458 of size 12003 from /127.0.0.1
2008-09-21 18:54:12,500 INFO org.apache.hadoop.dfs.DataNode: PacketResponder 0 for block blk_-1411929805036254458 terminating
2008-09-21 18:54:12,734 INFO org.apache.hadoop.dfs.DataNode: 127.0.0.1:50010 Served block blk_-1411929805036254458 to /127.0.0.1
2008-09-21 18:54:13,015 INFO org.apache.hadoop.dfs.DataNode: 127.0.0.1:50010 Served block blk_1451674918492452869 to /127.0.0.1
2008-09-21 18:54:13,328 INFO org.apache.hadoop.dfs.DataNode: 127.0.0.1:50010 Served block blk_-4098064083959376195 to /127.0.0.1
2008-09-21 18:54:14,718 INFO org.apache.hadoop.dfs.DataNode: 127.0.0.1:50010 Served block blk_-1411929805036254458 to /127.0.0.1
2008-09-21 18:54:15,000 INFO org.apache.hadoop.dfs.DataNode: 127.0.0.1:50010 Served block blk_1451674918492452869 to /127.0.0.1
2008-09-21 18:54:26,687 INFO org.apache.hadoop.dfs.DataNode: 127.0.0.1:50010 Served block blk_7287293315123774920 to /127.0.0.1
2008-09-21 18:54:26,687 INFO org.apache.hadoop.dfs.DataNode: 127.0.0.1:50010 Served block blk_125370523583213471 to /127.0.0.1
2008-09-21 18:54:41,609 INFO org.apache.hadoop.dfs.DataNode: 127.0.0.1:50010 Served block blk_-6348337313643072566 to /127.0.0.1
2008-09-21 18:54:42,703 INFO org.apache.hadoop.dfs.DataNode: 127.0.0.1:50010 Served block blk_2961784518440227574 to /127.0.0.1
2008-09-21 18:54:53,953 INFO org.apache.hadoop.dfs.DataNode: 127.0.0.1:50010 Served block blk_3670615963974276357 to /127.0.0.1
2008-09-21 18:54:54,531 INFO org.apache.hadoop.dfs.DataNode: 127.0.0.1:50010 Served block blk_-8983105898459096464 to /127.0.0.1
2008-09-21 18:55:00,765 INFO org.apache.hadoop.dfs.DataNode: 127.0.0.1:50010 Served block blk_-140532538463136818 to /127.0.0.1
2008-09-21 18:55:10,000 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_6547411606566553711 src: /127.0.0.1:2433 dest: /127.0.0.1:50010
2008-09-21 18:55:10,000 INFO org.apache.hadoop.dfs.DataNode: Datanode 0 forwarding connect ack to upstream firstbadlink is
2008-09-21 18:55:10,250 INFO org.apache.hadoop.dfs.DataNode: Received block blk_6547411606566553711 of size 64 from /127.0.0.1
2008-09-21 18:55:10,265 INFO org.apache.hadoop.dfs.DataNode: PacketResponder 0 for block blk_6547411606566553711 terminating
2008-09-21 18:55:13,968 INFO org.apache.hadoop.dfs.DataNode: Deleting block blk_-4098064083959376195 file \tmp\hadoop-SYSTEM\dfs\data\current\blk_-4098064083959376195
2008-09-21 18:55:13,968 INFO org.apache.hadoop.dfs.DataNode: Deleting block blk_-1411929805036254458 file \tmp\hadoop-SYSTEM\dfs\data\current\blk_-1411929805036254458
2008-09-21 18:55:13,968 INFO org.apache.hadoop.dfs.DataNode: Deleting block blk_1451674918492452869 file \tmp\hadoop-SYSTEM\dfs\data\current\blk_1451674918492452869
 

这是DataNode结点上数据变化的详细情况。

JobTracer和TaskTracer才是最忙碌的,它们要做很多的工作。

如下列出了hadoop-SHIYANJUN-jobtracer-cbbd2ce9428e48b.log文件中,JobTracer执行任务的信息:

2008-09-21 18:54:13,500 INFO org.apache.hadoop.mapred.JobInProgress: Choosing cached task tip_200809211811_0001_m_000000
2008-09-21 18:54:13,546 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'task_200809211811_0001_m_000000_0' to tip tip_200809211811_0001_m_000000, for tracker 'tracker_cbbd2ce9428e48b:localhost/127.0.0.1:4881'
2008-09-21 18:54:17,406 INFO org.apache.hadoop.mapred.JobInProgress: Choosing cached task tip_200809211811_0001_m_000001
2008-09-21 18:54:17,406 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'task_200809211811_0001_m_000001_0' to tip tip_200809211811_0001_m_000001, for tracker 'tracker_cbbd2ce9428e48b:localhost/127.0.0.1:4881'
2008-09-21 18:54:22,953 INFO org.apache.hadoop.mapred.JobInProgress: Choosing normal task tip_200809211811_0001_r_000000
2008-09-21 18:54:22,968 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'task_200809211811_0001_r_000000_0' to tip tip_200809211811_0001_r_000000, for tracker 'tracker_cbbd2ce9428e48b:localhost/127.0.0.1:4881'
2008-09-21 18:54:33,359 INFO org.apache.hadoop.mapred.JobInProgress: Choosing cached task tip_200809211811_0001_m_000002
2008-09-21 18:54:33,359 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'task_200809211811_0001_m_000002_0' to tip tip_200809211811_0001_m_000002, for tracker 'tracker_cbbd2ce9428e48b:localhost/127.0.0.1:4881'
2008-09-21 18:54:33,843 INFO org.apache.hadoop.mapred.JobInProgress: Choosing cached task tip_200809211811_0001_m_000003
2008-09-21 18:54:33,843 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'task_200809211811_0001_m_000003_0' to tip tip_200809211811_0001_m_000003, for tracker 'tracker_cbbd2ce9428e48b:localhost/127.0.0.1:4881'
2008-09-21 18:54:33,953 INFO org.apache.hadoop.mapred.TaskRunner: Saved output of task 'task_200809211811_0001_m_000000_0' to hdfs://localhost:9000/user/SHIYANJUN/output
2008-09-21 18:54:33,953 INFO org.apache.hadoop.mapred.JobInProgress: Task 'task_200809211811_0001_m_000000_0' has completed tip_200809211811_0001_m_000000 successfully.
2008-09-21 18:54:34,187 INFO org.apache.hadoop.mapred.TaskRunner: Saved output of task 'task_200809211811_0001_m_000001_0' to hdfs://localhost:9000/user/SHIYANJUN/output
2008-09-21 18:54:34,187 INFO org.apache.hadoop.mapred.JobInProgress: Task 'task_200809211811_0001_m_000001_0' has completed tip_200809211811_0001_m_000001 successfully.
2008-09-21 18:54:45,953 INFO org.apache.hadoop.mapred.JobInProgress: Choosing cached task tip_200809211811_0001_m_000004
2008-09-21 18:54:45,953 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'task_200809211811_0001_m_000004_0' to tip tip_200809211811_0001_m_000004, for tracker 'tracker_cbbd2ce9428e48b:localhost/127.0.0.1:4881'
2008-09-21 18:54:46,109 INFO org.apache.hadoop.mapred.TaskRunner: Saved output of task 'task_200809211811_0001_m_000002_0' to hdfs://localhost:9000/user/SHIYANJUN/output
2008-09-21 18:54:46,109 INFO org.apache.hadoop.mapred.JobInProgress: Task 'task_200809211811_0001_m_000002_0' has completed tip_200809211811_0001_m_000002 successfully.
2008-09-21 18:54:46,234 INFO org.apache.hadoop.mapred.TaskRunner: Saved output of task 'task_200809211811_0001_m_000003_0' to hdfs://localhost:9000/user/SHIYANJUN/output
2008-09-21 18:54:46,234 INFO org.apache.hadoop.mapred.JobInProgress: Task 'task_200809211811_0001_m_000003_0' has completed tip_200809211811_0001_m_000003 successfully.
2008-09-21 18:54:47,093 INFO org.apache.hadoop.mapred.JobInProgress: Choosing cached task tip_200809211811_0001_m_000005
2008-09-21 18:54:47,156 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'task_200809211811_0001_m_000005_0' to tip tip_200809211811_0001_m_000005, for tracker 'tracker_cbbd2ce9428e48b:localhost/127.0.0.1:4881'
2008-09-21 18:54:57,093 INFO org.apache.hadoop.mapred.JobInProgress: Choosing cached task tip_200809211811_0001_m_000006
2008-09-21 18:54:57,093 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'task_200809211811_0001_m_000006_0' to tip tip_200809211811_0001_m_000006, for tracker 'tracker_cbbd2ce9428e48b:localhost/127.0.0.1:4881'
2008-09-21 18:54:57,140 INFO org.apache.hadoop.mapred.TaskRunner: Saved output of task 'task_200809211811_0001_m_000004_0' to hdfs://localhost:9000/user/SHIYANJUN/output
2008-09-21 18:54:57,140 INFO org.apache.hadoop.mapred.JobInProgress: Task 'task_200809211811_0001_m_000004_0' has completed tip_200809211811_0001_m_000004 successfully.
2008-09-21 18:54:57,218 INFO org.apache.hadoop.mapred.TaskRunner: Saved output of task 'task_200809211811_0001_m_000005_0' to hdfs://localhost:9000/user/SHIYANJUN/output
2008-09-21 18:54:57,218 INFO org.apache.hadoop.mapred.JobInProgress: Task 'task_200809211811_0001_m_000005_0' has completed tip_200809211811_0001_m_000005 successfully.
2008-09-21 18:55:02,484 INFO org.apache.hadoop.mapred.TaskRunner: Saved output of task 'task_200809211811_0001_m_000006_0' to hdfs://localhost:9000/user/SHIYANJUN/output
2008-09-21 18:55:02,484 INFO org.apache.hadoop.mapred.JobInProgress: Task 'task_200809211811_0001_m_000006_0' has completed tip_200809211811_0001_m_000006 successfully.
2008-09-21 18:55:11,484 INFO org.apache.hadoop.mapred.TaskRunner: Saved output of task 'task_200809211811_0001_r_000000_0' to hdfs://localhost:9000/user/SHIYANJUN/output
2008-09-21 18:55:11,484 INFO org.apache.hadoop.mapred.JobInProgress: Task 'task_200809211811_0001_r_000000_0' has completed tip_200809211811_0001_r_000000 successfully.
2008-09-21 18:55:11,562 INFO org.apache.hadoop.mapred.JobInProgress: Job job_200809211811_0001 has completed successfully.
2008-09-21 18:55:12,562 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'task_200809211811_0001_m_000000_0' from 'tracker_cbbd2ce9428e48b:localhost/127.0.0.1:4881'
2008-09-21 18:55:12,562 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'task_200809211811_0001_m_000001_0' from 'tracker_cbbd2ce9428e48b:localhost/127.0.0.1:4881'
2008-09-21 18:55:12,562 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'task_200809211811_0001_m_000002_0' from 'tracker_cbbd2ce9428e48b:localhost/127.0.0.1:4881'
2008-09-21 18:55:12,562 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'task_200809211811_0001_m_000003_0' from 'tracker_cbbd2ce9428e48b:localhost/127.0.0.1:4881'
2008-09-21 18:55:12,562 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'task_200809211811_0001_m_000004_0' from 'tracker_cbbd2ce9428e48b:localhost/127.0.0.1:4881'
2008-09-21 18:55:12,562 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'task_200809211811_0001_m_000005_0' from 'tracker_cbbd2ce9428e48b:localhost/127.0.0.1:4881'
2008-09-21 18:55:12,562 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'task_200809211811_0001_m_000006_0' from 'tracker_cbbd2ce9428e48b:localhost/127.0.0.1:4881'
2008-09-21 18:55:12,562 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'task_200809211811_0001_r_000000_0' from 'tracker_cbbd2ce9428e48b:localhost/127.0.0.1:4881'
 

TaskTracer执行,它的日志文件记录了在执行WordCount工具进行词频统计过程中详细信息,包括执行进度信息,可以在hadoop-SHIYANJUN-tasktracer-cbbd2ce9428e48b.log文件中,如下所示:

2008-09-21 18:54:13,640 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction: task_200809211811_0001_m_000000_0
2008-09-21 18:54:17,437 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction: task_200809211811_0001_m_000001_0
2008-09-21 18:54:23,140 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction: task_200809211811_0001_r_000000_0
2008-09-21 18:54:31,859 INFO org.apache.hadoop.mapred.TaskTracker: task_200809211811_0001_m_000000_0 1.0% hdfs://localhost:9000/user/SHIYANJUN/input/b.txt:0+10109
2008-09-21 18:54:32,015 INFO org.apache.hadoop.mapred.TaskTracker: task_200809211811_0001_m_000001_0 1.0% hdfs://localhost:9000/user/SHIYANJUN/input/d.txt:0+1987
2008-09-21 18:54:32,015 INFO org.apache.hadoop.mapred.TaskTracker: Task task_200809211811_0001_m_000001_0 is done.
2008-09-21 18:54:32,140 INFO org.apache.hadoop.mapred.TaskTracker: task_200809211811_0001_m_000000_0 1.0% hdfs://localhost:9000/user/SHIYANJUN/input/b.txt:0+10109
2008-09-21 18:54:32,140 INFO org.apache.hadoop.mapred.TaskTracker: Task task_200809211811_0001_m_000000_0 is done.
2008-09-21 18:54:32,218 INFO org.apache.hadoop.mapred.TaskTracker: task_200809211811_0001_m_000001_0 1.0% hdfs://localhost:9000/user/SHIYANJUN/input/d.txt:0+1987
2008-09-21 18:54:32,218 INFO org.apache.hadoop.mapred.TaskTracker: task_200809211811_0001_m_000001_0 Ignoring status-update since task is 'done'
2008-09-21 18:54:33,390 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction: task_200809211811_0001_m_000002_0
2008-09-21 18:54:33,968 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction: task_200809211811_0001_m_000003_0
2008-09-21 18:54:37,031 INFO org.apache.hadoop.mapred.TaskTracker: task_200809211811_0001_r_000000_0 0.0% reduce > copy >
2008-09-21 18:54:43,156 INFO org.apache.hadoop.mapred.TaskTracker: task_200809211811_0001_r_000000_0 0.047619052% reduce > copy (1 of 7 at 0.00 MB/s) >
2008-09-21 18:54:44,796 INFO org.apache.hadoop.mapred.TaskTracker: task_200809211811_0001_m_000002_0 1.0% hdfs://localhost:9000/user/SHIYANJUN/input/f.txt:0+1985
2008-09-21 18:54:44,796 INFO org.apache.hadoop.mapred.TaskTracker: Task task_200809211811_0001_m_000002_0 is done.
2008-09-21 18:54:45,328 INFO org.apache.hadoop.mapred.TaskTracker: task_200809211811_0001_m_000003_0 1.0% hdfs://localhost:9000/user/SHIYANJUN/input/a.txt:0+1957
2008-09-21 18:54:45,343 INFO org.apache.hadoop.mapred.TaskTracker: Task task_200809211811_0001_m_000003_0 is done.
2008-09-21 18:54:45,968 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction: task_200809211811_0001_m_000004_0
2008-09-21 18:54:47,156 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction: task_200809211811_0001_m_000005_0
2008-09-21 18:54:49,234 INFO org.apache.hadoop.mapred.TaskTracker: task_200809211811_0001_r_000000_0 0.095238104% reduce > copy (2 of 7 at 0.00 MB/s) >
2008-09-21 18:54:55,390 INFO org.apache.hadoop.mapred.TaskTracker: task_200809211811_0001_r_000000_0 0.19047621% reduce > copy (4 of 7 at 0.00 MB/s) >
2008-09-21 18:54:56,437 INFO org.apache.hadoop.mapred.TaskTracker: task_200809211811_0001_m_000004_0 1.0% hdfs://localhost:9000/user/SHIYANJUN/input/c.txt:0+1957
2008-09-21 18:54:56,453 INFO org.apache.hadoop.mapred.TaskTracker: Task task_200809211811_0001_m_000004_0 is done.
2008-09-21 18:54:56,609 INFO org.apache.hadoop.mapred.TaskTracker: task_200809211811_0001_m_000005_0 1.0% hdfs://localhost:9000/user/SHIYANJUN/input/e.txt:0+1957
2008-09-21 18:54:56,625 INFO org.apache.hadoop.mapred.TaskTracker: Task task_200809211811_0001_m_000005_0 is done.
2008-09-21 18:54:57,093 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction: task_200809211811_0001_m_000006_0
2008-09-21 18:55:01,437 INFO org.apache.hadoop.mapred.TaskTracker: task_200809211811_0001_r_000000_0 0.19047621% reduce > copy (4 of 7 at 0.00 MB/s) >
2008-09-21 18:55:01,765 INFO org.apache.hadoop.mapred.TaskTracker: task_200809211811_0001_m_000006_0 1.0% hdfs://localhost:9000/user/SHIYANJUN/input/g.txt:0+1957
2008-09-21 18:55:01,765 INFO org.apache.hadoop.mapred.TaskTracker: Task task_200809211811_0001_m_000006_0 is done.
2008-09-21 18:55:07,453 INFO org.apache.hadoop.mapred.TaskTracker: task_200809211811_0001_r_000000_0 0.28571433% reduce > copy (6 of 7 at 0.00 MB/s) >
2008-09-21 18:55:10,312 INFO org.apache.hadoop.mapred.TaskTracker: task_200809211811_0001_r_000000_0 0.81237113% reduce > reduce
2008-09-21 18:55:10,312 INFO org.apache.hadoop.mapred.TaskTracker: Task task_200809211811_0001_r_000000_0 is done.
2008-09-21 18:55:12,593 INFO org.apache.hadoop.mapred.TaskTracker: Received 'KillJobAction' for job: job_200809211811_0001
2008-09-21 18:55:12,593 INFO org.apache.hadoop.mapred.TaskRunner: task_200809211811_0001_m_000003_0 done; removing files.
2008-09-21 18:55:12,609 INFO org.apache.hadoop.mapred.TaskRunner: task_200809211811_0001_m_000000_0 done; removing files.
2008-09-21 18:55:12,609 INFO org.apache.hadoop.mapred.TaskRunner: task_200809211811_0001_m_000004_0 done; removing files.
2008-09-21 18:55:12,625 INFO org.apache.hadoop.mapred.TaskRunner: task_200809211811_0001_r_000000_0 done; removing files.
2008-09-21 18:55:12,625 INFO org.apache.hadoop.mapred.TaskRunner: task_200809211811_0001_m_000006_0 done; removing files.
2008-09-21 18:55:12,640 INFO org.apache.hadoop.mapred.TaskRunner: task_200809211811_0001_m_000002_0 done; removing files.
2008-09-21 18:55:12,656 INFO org.apache.hadoop.mapred.TaskRunner: task_200809211811_0001_m_000005_0 done; removing files.
2008-09-21 18:55:12,656 INFO org.apache.hadoop.mapred.TaskRunner: task_200809211811_0001_m_000001_0 done; removing files.
 

一个Job就是这样进行执行的。

内容版权声明:除非注明,否则皆为本站原创文章。

转载注明出处:http://www.heiqu.com/pxpfd.html