3. Putting a File into HDFS
To put a file into HDFS, run the command: bin/hadoop fs -put inputdata /data/input
After the upload completes, list the directory with bin/hadoop fs -ls /data/input, which shows:
-rw-r--r-- 3 hadoop supergroup 6119928 2010-11-21 00:47 /data/input/inputdata
At this point, the /data/hadoopdir/tmp folder on the DataNodes has changed:
Under /data/hadoopdir/tmp/dfs/data/current on datanode01, datanode02, and datanode03, the following block files have all appeared.
Evidently the block file has been replicated three times:
.:
dfs mapred
./dfs:
data
./dfs/data:
current detach in_use.lock storage tmp
./dfs/data/current:
blk_2672607439166801630 blk_2672607439166801630_1002.meta dncp_block_verification.log.curr VERSION
./dfs/data/detach:
./dfs/data/tmp:
./mapred:
local
./mapred/local:
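Alongside each block file blk_N, HDFS keeps a blk_N_&lt;genstamp&gt;.meta file holding per-chunk checksums of the block, one checksum per io.bytes.per.checksum bytes (512 by default). A rough Python sketch of that chunked checksumming, using CRC32 for illustration (the `chunk_checksums` helper is made up, not a Hadoop API):

```python
import zlib

BYTES_PER_CHECKSUM = 512  # io.bytes.per.checksum default

def chunk_checksums(data):
    """One CRC32 per 512-byte chunk -- roughly what a
    blk_..._<genstamp>.meta file stores alongside each block."""
    return [zlib.crc32(data[i:i + BYTES_PER_CHECKSUM])
            for i in range(0, len(data), BYTES_PER_CHECKSUM)]

print(len(chunk_checksums(b"a" * 1100)))   # 3 chunks: 512 + 512 + 76
```

The DataNode's block scanner later re-reads blocks and verifies them against these checksums, which is what the dncp_block_verification.log.curr file above records.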
While the file is being written, the logs show the following:
The NameNode's hadoop-namenode-namenode.log:
// create /data/input/inputdata
ugi=admin,sambashareip=/192.168.1.104 cmd=create src=/data/input/inputdata dst=null perm=hadoop:supergroup:rw-r--r--
// allocate a block
NameSystem.allocateBlock: /data/input/inputdata. blk_2672607439166801630_1002
NameSystem.addStoredBlock: blockMap updated: 192.168.1.107:50010 is added to blk_2672607439166801630_1002 size 6119928
NameSystem.addStoredBlock: blockMap updated: 192.168.1.105:50010 is added to blk_2672607439166801630_1002 size 6119928
NameSystem.addStoredBlock: blockMap updated: 192.168.1.106:50010 is added to blk_2672607439166801630_1002 size 6119928
datanode01's hadoop-datanode-datanode01.log:
// datanode01 receives a block from the client
Receiving block blk_2672607439166801630_1002 src: /192.168.1.104:41748 dest: /192.168.1.105:50010
src: /192.168.1.104:41748, dest: /192.168.1.105:50010, bytes: 6119928, op: HDFS_WRITE, cliID: DFSClient_-1541812792, srvID: DS-1042573498-192.168.1.105-50010-1290313555129, blockid: blk_2672607439166801630_1002
PacketResponder 2 for block blk_2672607439166801630_1002 terminating
datanode02's hadoop-datanode-datanode02.log:
// datanode02 receives a block from datanode01
Receiving block blk_2672607439166801630_1002 src: /192.168.1.105:60266 dest: /192.168.1.106:50010
src: /192.168.1.105:60266, dest: /192.168.1.106:50010, bytes: 6119928, op: HDFS_WRITE, cliID: DFSClient_-1541812792, srvID: DS-1366730865-192.168.1.106-50010-1290313543717, blockid: blk_2672607439166801630_1002
PacketResponder 1 for block blk_2672607439166801630_1002 terminating
datanode03's hadoop-datanode-datanode03.log:
// datanode03 receives a block from datanode02
Receiving block blk_2672607439166801630_1002 src: /192.168.1.106:58899 dest: /192.168.1.107:50010
src: /192.168.1.106:58899, dest: /192.168.1.107:50010, bytes: 6119928, op: HDFS_WRITE, cliID: DFSClient_-1541812792, srvID: DS-765014609-192.168.1.107-50010-1290313555841, blockid: blk_2672607439166801630_1002
PacketResponder 0 for block blk_2672607439166801630_1002 terminating
Verification succeeded for blk_2672607439166801630_1002
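Taken together, the three datanode logs trace the HDFS write pipeline: the client streams the block to datanode01, which forwards it to datanode02, which forwards it to datanode03, and acknowledgements flow back the opposite way. A toy Python model of that forwarding chain (the `write_block` helper is made up; the node names and block size are taken from the logs above):

```python
def write_block(block_id, data, pipeline):
    """Each node in the pipeline stores a replica and forwards the
    data to the next node; acks travel back from the tail."""
    replicas = {}
    for node in pipeline:            # client -> dn1 -> dn2 -> dn3
        replicas[node] = data        # node persists its copy, then forwards
    acks = list(reversed(pipeline))  # dn3 -> dn2 -> dn1 -> client
    return replicas, acks

replicas, acks = write_block(
    "blk_2672607439166801630_1002",
    b"\0" * 6119928,                 # block size reported in the logs
    ["datanode01", "datanode02", "datanode03"],
)
print(len(replicas), acks)
```

In real HDFS the forwarding is packet-by-packet while the stream is still arriving, which is why each datanode's log shows both a `src` (who sent to it) and a `dest` (itself).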
4. Running a Map-Reduce Program
To run a Map-Reduce job, use the command: bin/hadoop jar hadoop-0.19.2-examples.jar wordcount /data/input /data/output
To observe the Map-Reduce job step by step, we first attach a remote debugger to the JobTracker and set a breakpoint in the JobTracker.submitJob function.
As discussed in the previous article, before the DFSClient submits a job to the JobTracker, it puts three kinds of files into HDFS, where both the JobTracker and the TaskTrackers can read them:
The jar file to run: job.jar
The input split information the job needs: job.split
The job configuration: job.xml
When the Map-Reduce program stops inside JobTracker.submitJob, HDFS shows the following change:
bin/hadoop fs -ls /data/hadoopdir/tmp/mapred/system
A new folder, job_201011202025_0001, has appeared; it is named after the ID of the running job and contains three files:
bin/hadoop fs -ls /data/hadoopdir/tmp/mapred/system/job_201011202025_0001
Found 3 items
-rw-r--r-- /data/hadoopdir/tmp/mapred/system/job_201011202025_0001/job.jar
-rw-r--r-- /data/hadoopdir/tmp/mapred/system/job_201011202025_0001/job.split
-rw-r--r-- /data/hadoopdir/tmp/mapred/system/job_201011202025_0001/job.xml
Now we can detach the remote debugger from the JobTracker.
In JobTracker.submitJob, these files uploaded to HDFS are read, and the job is split into Map Tasks and Reduce Tasks.
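One map task is created per input split recorded in job.split, while the number of reduce tasks comes from job.xml. A toy sketch of that fan-out, using made-up split descriptors and Hadoop-style task ids:

```python
def make_map_task_ids(job_id, splits):
    """One map task per input split, named like Hadoop's
    task_<jobid>_m_NNNNNN identifiers."""
    return ["task_%s_m_%06d" % (job_id, i) for i in range(len(splits))]

splits = ["split0", "split1", "split2", "split3"]   # this run had 4 splits
tasks = make_map_task_ids("201011202025_0001", splits)
print(len(tasks), tasks[0])
```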
When a TaskTracker asks the JobTracker for a Map Task or a Reduce Task via heartbeat, with our configuration datanode01 will clearly be handed Map Tasks to run, while datanode03 will be handed Reduce Tasks.
Let us first look at how the Map Tasks run on datanode01:
When a TaskTracker is given a Task, it calls TaskTracker.localizeJob to copy the three job files from HDFS to a local folder, then calls TaskInProgress.localizeTask to create the Task's local working directory.
We remote-debug the TaskTracker on datanode01 with breakpoints in the localizeJob and localizeTask functions. When the program stops after localizeTask has finished, a new folder, job_201011202025_0001, appears under /data/hadoopdir/tmp/mapred/local/taskTracker/jobcache on datanode01, with the following structure:
datanode01:/data/hadoopdir/tmp/mapred/local/taskTracker/jobcache/job_201011202025_0001$ ls -R
.:
attempt_201011202025_0001_m_000000_0 attempt_201011202025_0001_m_000003_0 jars job.xml work
./attempt_201011202025_0001_m_000000_0:
job.xml split.dta work
./attempt_201011202025_0001_m_000000_0/work:
./attempt_201011202025_0001_m_000003_0:
pid work
./attempt_201011202025_0001_m_000003_0/work:
tmp
./attempt_201011202025_0001_m_000003_0/work/tmp:
./jars:
job.jar META-INF org
./work:
Here job.xml, job.jar, and split.dta are the configuration file, the job jar, and the input split; the jars folder holds the unpacked contents of job.jar.
Next, datanode01 creates a Child JVM to run the Task. Running ps aux | grep java on datanode01 at this point shows that a new JVM has been created:
/bin/java
……
-Xmx200m -agentlib:jdwp=transport=dt_socket,address=8883,server=y,suspend=y
……
org.apache.hadoop.mapred.Child
127.0.0.1 58297
attempt_201011202025_0001_m_000003_0 2093922206
The JVM arguments show this is a map task. From the files above we can also see that this TaskTracker has in fact run two map tasks in the same Child JVM: attempt_201011202025_0001_m_000003_0, which has no input split and turns out to be a job setup task, and attempt_201011202025_0001_m_000000_0, a real map task that processes data. Naturally, if there is enough data to process, multiple data-processing map tasks will run.
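The data-processing map task here runs the wordcount example's mapper, which tokenizes each input line and emits a (word, 1) pair per token. Its logic, sketched in Python rather than the actual Java Mapper class:

```python
def wordcount_map(line):
    """Emit a (word, 1) pair for every whitespace-separated token,
    mirroring the wordcount example's map function."""
    return [(word, 1) for word in line.split()]

print(wordcount_map("the quick brown fox the"))
```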
We can remote-debug the Child JVM with a breakpoint in MapTask.run. As we learned in the previous article, map output is first kept in a memory buffer; once enough data accumulates, it is spilled to disk as spill files. Looking at the attempt_201011202025_0001_m_000000_0 folder before the map task finishes, we can see that many spill files have already been generated:
datanode01:/data/hadoopdir/tmp/mapred/local/taskTracker/jobcache/job_201011202025_0001/attempt_201011202025_0001_m_000000_0$ ls -R
.:
job.xml output split.dta work
./output:
spill0.out spill16.out spill22.out spill29.out spill35.out spill41.out spill48.out spill54.out spill60.out spill67.out spill73.out spill7.out
spill10.out spill17.out spill23.out spill2.out spill36.out spill42.out spill49.out spill55.out spill61.out spill68.out spill74.out spill80.out
spill11.out spill18.out spill24.out spill30.out spill37.out spill43.out spill4.out spill56.out spill62.out spill69.out spill75.out spill81.out
spill12.out spill19.out spill25.out spill31.out spill38.out spill44.out spill50.out spill57.out spill63.out spill6.out spill76.out spill82.out
spill13.out spill1.out spill26.out spill32.out spill39.out spill45.out spill51.out spill58.out spill64.out spill70.out spill77.out spill83.out
spill14.out spill20.out spill27.out spill33.out spill3.out spill46.out spill52.out spill59.out spill65.out spill71.out spill78.out spill8.out
spill15.out spill21.out spill28.out spill34.out spill40.out spill47.out spill53.out spill5.out spill66.out spill72.out spill79.out spill9.out
./work:
tmp
./work/tmp:
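The spilling seen above can be modeled as a bounded in-memory buffer: records accumulate until a threshold is hit, then the buffer is sorted by key and written out as the next spillN.out file. A simplified in-memory sketch (the real buffer is byte-based, sized by io.sort.mb, and spills based on io.sort.spill.percent):

```python
def run_map_with_spills(records, buffer_limit):
    """Collect (key, value) records; whenever the buffer reaches
    buffer_limit, sort it by key and emit it as one spill run."""
    spills, buffer = [], []
    for rec in records:
        buffer.append(rec)
        if len(buffer) >= buffer_limit:
            spills.append(sorted(buffer))   # one spillN.out, sorted by key
            buffer = []
    if buffer:                              # final flush at end of map
        spills.append(sorted(buffer))
    return spills

spills = run_map_with_spills([("b", 1), ("a", 1), ("c", 1), ("a", 1), ("d", 1)], 2)
print(len(spills))   # 3 spill "files"
```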
When the whole map task finishes, all the spill files are merged into a single file. Looking at the attempt_201011202025_0001_m_000000_0 folder again:
datanode01:/data/hadoopdir/tmp/mapred/local/taskTracker/jobcache/job_201011202025_0001/attempt_201011202025_0001_m_000000_0$ ls -R
.:
job.xml output split.dta work
./output:
file.out file.out.index
./work:
tmp
./work/tmp:
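Since every spill file is already sorted, producing the single file.out amounts to a k-way merge of sorted runs (file.out.index then records where each reduce partition begins). In Python terms, assuming each spill is a sorted list of (key, value) pairs:

```python
import heapq

def merge_spills(spills):
    """k-way merge of sorted spill runs into one sorted output --
    the in-memory analogue of producing file.out."""
    return list(heapq.merge(*spills))

merged = merge_spills([[("a", 1), ("c", 1)], [("b", 1), ("d", 1)]])
print(merged)
```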
Of course, if multiple map tasks process data, multiple file.out files are produced. In this example only two map tasks process data, so the final result is:
datanode01:/data/hadoopdir/tmp/mapred/local/taskTracker/jobcache/job_201011202025_0001$ ls -R attempt_201011202025_0001_m_00000*
attempt_201011202025_0001_m_000000_0:
job.xml output split.dta work
attempt_201011202025_0001_m_000000_0/output:
file.out file.out.index
attempt_201011202025_0001_m_000000_0/work:
tmp
attempt_201011202025_0001_m_000000_0/work/tmp:
attempt_201011202025_0001_m_000001_0:
job.xml output split.dta work
attempt_201011202025_0001_m_000001_0/output:
file.out file.out.index
attempt_201011202025_0001_m_000001_0/work:
tmp
attempt_201011202025_0001_m_000001_0/work/tmp:
attempt_201011202025_0001_m_000003_0:
pid work
attempt_201011202025_0001_m_000003_0/work:
tmp
attempt_201011202025_0001_m_000003_0/work/tmp:
Now let us look at how the reduce task runs on datanode03:
We likewise remote-debug the TaskTracker on datanode03, with breakpoints in the localizeJob and localizeTask functions. When the program stops after localizeTask has finished, a new folder, job_201011202025_0001, also appears under /data/hadoopdir/tmp/mapred/local/taskTracker/jobcache on datanode03, with the following structure:
datanode03:/data/hadoopdir/tmp/mapred/local/taskTracker/jobcache/job_201011202025_0001$ ls -R attempt_201011202025_0001_r_00000*
attempt_201011202025_0001_r_000000_0:
job.xml work
attempt_201011202025_0001_r_000000_0/work:
tmp
attempt_201011202025_0001_r_000000_0/work/tmp:
attempt_201011202025_0001_r_000002_0:
pid work
attempt_201011202025_0001_r_000002_0/work:
tmp
attempt_201011202025_0001_r_000002_0/work/tmp:
Of the two Reduce Tasks above, attempt_201011202025_0001_r_000002_0 is a job setup task; the one that actually processes data is attempt_201011202025_0001_r_000000_0.
Next, datanode03 creates a Child JVM to run the Task. Running ps aux | grep java on datanode03 at this point shows that a new JVM has been created:
/bin/java
……
-Xmx200m -agentlib:jdwp=transport=dt_socket,address=8883,server=y,suspend=y -
……
org.apache.hadoop.mapred.Child
127.0.0.1 37546
attempt_201011202025_0001_r_000002_0 516504201
The JVM arguments show this is a reduce task.
From the previous article, we know a Reduce Task goes through three phases: copy, sort, and reduce.
The copy phase copies all the map outputs to the reduce task's local disk:
datanode03:/data/hadoopdir/tmp/mapred/local/taskTracker/jobcache/job_201011202025_0001/attempt_201011202025_0001_r_000000_0$ ls -R
.:
job.xml output pid work
./output:
map_0.out map_1.out map_2.out map_3.out
./work:
tmp
./work/tmp:
As listed above, since there are 4 map tasks in total, 4 map_*.out files are copied to the local disk.
While the copying is still in progress, a background thread pre-merges the map outputs that have already arrived locally into .merged files. The merge width follows io.sort.factor, which with our configuration means merging two at a time. Below, map_2.out and map_3.out have been merged into map_3.out.merged; the copy phase ended before the other two files were merged, so the background merge thread ended with it.
datanode03:/data/hadoopdir/tmp/mapred/local/taskTracker/jobcache/job_201011202025_0001/attempt_201011202025_0001_r_000000_0$ ls -R
.:
job.xml output pid work
./output:
map_0.out map_1.out map_3.out.merged
./work:
tmp
./work/tmp:
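The background merge observed above can be modeled as one step that takes io.sort.factor fully copied files and merges them into a single .merged file. A sketch with io.sort.factor = 2, matching the merge of map_2.out and map_3.out into map_3.out.merged (the `merge_step` helper is made up; files are modeled as sorted lists of (key, value) pairs):

```python
import heapq

def merge_step(files, sort_factor=2):
    """files: dict name -> sorted list of (key, value) pairs.
    Merge the first sort_factor files into one '.merged' file,
    as the background merger thread does during the copy phase."""
    victims = sorted(files)[:sort_factor]
    merged = list(heapq.merge(*(files[n] for n in victims)))
    out = {n: v for n, v in files.items() if n not in victims}
    out[victims[-1] + ".merged"] = merged
    return out

files = {
    "map_2.out": [("a", 1), ("c", 1)],
    "map_3.out": [("b", 1), ("d", 1)],
}
print(sorted(merge_step(files)))   # ['map_3.out.merged']
```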
The sort phase merges and sorts the copied map outputs, again io.sort.factor files at a time, i.e. two at a time here. Below, map_0.out and map_1.out have been merged into intermediate.1; together with map_3.out.merged, the number of remaining files is now below io.sort.factor, so no further merging takes place.
datanode03:/data/hadoopdir/tmp/mapred/local/attempt_201011202025_0001_r_000000_0$ ls -r
intermediate.1
datanode03:/data/hadoopdir/tmp/mapred/local/taskTracker/jobcache/job_201011202025_0001/attempt_201011202025_0001_r_000000_0$ ls -R
.:
job.xml output pid work
./output:
map_3.out.merged
./work:
tmp
./work/tmp:
The reduce phase repeatedly calls the reducer's reduce function and writes the results to HDFS:
namenode:/data/hadoop-0.19.2$ bin/hadoop fs -ls /data/output
Found 2 items
/data/output/_logs
/data/output/part-00000
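For wordcount, that reduce function simply sums the counts for each word; since reduce input arrives sorted and grouped by key, one pass over the groups suffices. A Python sketch of the loop that produces the contents of part-00000:

```python
from itertools import groupby
from operator import itemgetter

def wordcount_reduce(sorted_pairs):
    """Group the sorted (word, count) pairs by word and sum the
    counts, like the wordcount example's reduce function."""
    return [(word, sum(c for _, c in group))
            for word, group in groupby(sorted_pairs, key=itemgetter(0))]

print(wordcount_reduce([("a", 1), ("a", 1), ("b", 1)]))
```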
Of course, the logs also let us follow the Map-Reduce job as it runs.
The command-line output is as follows:
namenode:/data/hadoop-0.19.2$ bin/hadoop jar hadoop-0.19.2-examples.jar wordcount /data/input /data/output
10/11/22 07:38:44 INFO mapred.FileInputFormat: Total input paths to process : 4
10/11/22 07:38:45 INFO mapred.JobClient: Running job: job_201011202025_0001
10/11/22 07:38:46 INFO mapred.JobClient: map 0% reduce 0%
10/11/22 07:39:14 INFO mapred.JobClient: map 25% reduce 0%
10/11/22 07:39:23 INFO mapred.JobClient: map 50% reduce 0%
10/11/22 07:39:27 INFO mapred.JobClient: map 75% reduce 0%
10/11/22 07:39:30 INFO mapred.JobClient: map 100% reduce 0%
10/11/22 07:39:31 INFO mapred.JobClient: map 100% reduce 8%
10/11/22 07:39:36 INFO mapred.JobClient: map 100% reduce 25%
10/11/22 07:39:40 INFO mapred.JobClient: map 100% reduce 100%
10/11/22 07:39:41 INFO mapred.JobClient: Job complete: job_201011202025_0001
10/11/22 07:39:41 INFO mapred.JobClient: Counters: 16
10/11/22 07:39:41 INFO mapred.JobClient: File Systems
10/11/22 07:39:41 INFO mapred.JobClient: HDFS bytes read=61199280
10/11/22 07:39:41 INFO mapred.JobClient: HDFS bytes written=534335
10/11/22 07:39:41 INFO mapred.JobClient: Local bytes read=74505214
10/11/22 07:39:41 INFO mapred.JobClient: Local bytes written=81308914
10/11/22 07:39:41 INFO mapred.JobClient: Job Counters
// four map tasks, one reduce task
10/11/22 07:39:41 INFO mapred.JobClient: Launched reduce tasks=1
10/11/22 07:39:41 INFO mapred.JobClient: Launched map tasks=4
10/11/22 07:39:41 INFO mapred.JobClient: Data-local map tasks=4
10/11/22 07:39:41 INFO mapred.JobClient: Map-Reduce Framework
10/11/22 07:39:41 INFO mapred.JobClient: Reduce input groups=37475
10/11/22 07:39:41 INFO mapred.JobClient: Combine output records=351108
10/11/22 07:39:41 INFO mapred.JobClient: Map input records=133440
10/11/22 07:39:41 INFO mapred.JobClient: Reduce output records=37475
10/11/22 07:39:41 INFO mapred.JobClient: Map output bytes=31671148
10/11/22 07:39:41 INFO mapred.JobClient: Map input bytes=24479712
10/11/22 07:39:41 INFO mapred.JobClient: Combine input records=2001312
10/11/22 07:39:41 INFO mapred.JobClient: Map output records=1800104
10/11/22 07:39:41 INFO mapred.JobClient: Reduce input records=149900
In hadoop-jobtracker-namenode.log on the namenode, we can see what the JobTracker did:
// create a Job and split it into four map tasks
JobInProgress: Input size for job job_201011220735_0001 = 24479712
JobInProgress: Split info for job:job_201011220735_0001
JobInProgress: tip:task_201011220735_0001_m_000000 has split on node:/default-rack/datanode02
JobInProgress: tip:task_201011220735_0001_m_000000 has split on node:/default-rack/datanode01
JobInProgress: tip:task_201011220735_0001_m_000000 has split on node:/default-rack/datanode03
JobInProgress: tip:task_201011220735_0001_m_000001 has split on node:/default-rack/datanode03
JobInProgress: tip:task_201011220735_0001_m_000001 has split on node:/default-rack/datanode01
JobInProgress: tip:task_201011220735_0001_m_000001 has split on node:/default-rack/datanode02
JobInProgress: tip:task_201011220735_0001_m_000002 has split on node:/default-rack/datanode02
JobInProgress: tip:task_201011220735_0001_m_000002 has split on node:/default-rack/datanode01
JobInProgress: tip:task_201011220735_0001_m_000002 has split on node:/default-rack/datanode03
JobInProgress: tip:task_201011220735_0001_m_000003 has split on node:/default-rack/datanode01
JobInProgress: tip:task_201011220735_0001_m_000003 has split on node:/default-rack/datanode02
JobInProgress: tip:task_201011220735_0001_m_000003 has split on node:/default-rack/datanode03
// datanode01 asks the JobTracker, via heartbeat, to run a job setup task
JobTracker: Adding task 'attempt_201011220735_0001_m_000005_0' to tip task_201011220735_0001_m_000005, for tracker 'tracker_datanode01:localhost/127.0.0.1:48339'
JobTracker: tracker_datanode01:localhost/127.0.0.1:48339 -> LaunchTask: attempt_201011220735_0001_m_000005_0
JobInProgress: Task 'attempt_201011220735_0001_m_000005_0' has completed task_201011220735_0001_m_000005 successfully.
// datanode01 asks the JobTracker to run the first map task
JobTracker: Adding task 'attempt_201011220735_0001_m_000000_0' to tip task_201011220735_0001_m_000000, for tracker 'tracker_datanode01:localhost/127.0.0.1:48339'
JobInProgress: Choosing data-local task task_201011220735_0001_m_000000
JobTracker: tracker_datanode01:localhost/127.0.0.1:48339 -> LaunchTask: attempt_201011220735_0001_m_000000_0
JobInProgress: Task 'attempt_201011220735_0001_m_000000_0' has completed task_201011220735_0001_m_000000 successfully.
// datanode01 asks the JobTracker to run the second map task
JobTracker: Adding task 'attempt_201011220735_0001_m_000001_0' to tip task_201011220735_0001_m_000001, for tracker 'tracker_datanode01:localhost/127.0.0.1:48339'
JobInProgress: Choosing data-local task task_201011220735_0001_m_000001
JobTracker: tracker_datanode01:localhost/127.0.0.1:48339 -> LaunchTask: attempt_201011220735_0001_m_000001_0
JobInProgress: Task 'attempt_201011220735_0001_m_000001_0' has completed task_201011220735_0001_m_000001 successfully.
// datanode01 asks the JobTracker to run the third map task
JobTracker: Adding task 'attempt_201011220735_0001_m_000002_0' to tip task_201011220735_0001_m_000002, for tracker 'tracker_datanode01:localhost/127.0.0.1:48339'
JobInProgress: Choosing data-local task task_201011220735_0001_m_000002
JobTracker: tracker_datanode01:localhost/127.0.0.1:48339 -> LaunchTask: attempt_201011220735_0001_m_000002_0
JobInProgress: Task 'attempt_201011220735_0001_m_000002_0' has completed task_201011220735_0001_m_000002 successfully.
// datanode01 asks the JobTracker to run the fourth map task
JobTracker: Adding task 'attempt_201011220735_0001_m_000003_0' to tip task_201011220735_0001_m_000003, for tracker 'tracker_datanode01:localhost/127.0.0.1:48339'
JobInProgress: Choosing data-local task task_201011220735_0001_m_000003
JobTracker: tracker_datanode01:localhost/127.0.0.1:48339 -> LaunchTask: attempt_201011220735_0001_m_000003_0
JobTracker: Got heartbeat from: tracker_datanode01:localhost/127.0.0.1:48339 (initialContact: false acceptNewTasks: true) with responseId: 39
JobInProgress: Task 'attempt_201011220735_0001_m_000003_0' has completed task_201011220735_0001_m_000003 successfully.
// datanode03 asks the JobTracker to run the reduce task (note the CommitTaskAction that commits its output)
JobTracker: Adding task 'attempt_201011220735_0001_r_000000_0' to tip task_201011220735_0001_r_000000, for tracker 'tracker_datanode03:localhost/127.0.0.1:44118'
JobTracker: tracker_datanode03:localhost/127.0.0.1:44118 -> LaunchTask: attempt_201011220735_0001_r_000000_0
JobTracker: tracker_datanode03:localhost/127.0.0.1:44118 -> CommitTaskAction: attempt_201011220735_0001_r_000000_0
JobInProgress: Task 'attempt_201011220735_0001_r_000000_0' has completed task_201011220735_0001_r_000000 successfully.
// datanode03 asks the JobTracker to run the job cleanup task
JobTracker: Adding task 'attempt_201011220735_0001_r_000001_0' to tip task_201011220735_0001_r_000001, for tracker 'tracker_datanode03:localhost/127.0.0.1:44118'
JobTracker: tracker_datanode03:localhost/127.0.0.1:44118 -> LaunchTask: attempt_201011220735_0001_r_000001_0
JobInProgress: Task 'attempt_201011220735_0001_r_000001_0' has completed task_201011220735_0001_r_000001 successfully.
JobInProgress: Job job_201011220735_0001 has completed successfully.
Likewise, hadoop-tasktracker-datanode01.log on datanode01 shows what the TaskTracker did.
Under logs/userlogs on datanode01 we can see the logs printed by the Child JVM created to run the map tasks, one folder per map task attempt. In this example, since the map tasks shared a single JVM, only one full set of log files was written:
datanode01:/data/hadoop-0.19.2/logs/userlogs$ ls -R
.:
attempt_201011220735_0001_m_000000_0 attempt_201011220735_0001_m_000002_0 attempt_201011220735_0001_m_000005_0
attempt_201011220735_0001_m_000001_0 attempt_201011220735_0001_m_000003_0
./attempt_201011220735_0001_m_000000_0:
log.index
./attempt_201011220735_0001_m_000001_0:
log.index
./attempt_201011220735_0001_m_000002_0:
log.index
./attempt_201011220735_0001_m_000003_0:
log.index
./attempt_201011220735_0001_m_000005_0:
log.index stderr stdout syslog
Likewise, hadoop-tasktracker-datanode03.log on datanode03 shows what the TaskTracker did.
Under logs/userlogs on datanode03 there is also a set of folders, one per reduce task attempt; the reduce tasks likewise shared a single JVM:
datanode03:/data/hadoop-0.19.2/logs/userlogs$ ls -R
.:
attempt_201011220735_0001_r_000000_0 attempt_201011220735_0001_r_000001_0
./attempt_201011220735_0001_r_000000_0:
log.index stderr stdout syslog
./attempt_201011220735_0001_r_000001_0: