Tracing How Hadoop Processes Data (2)

To format HDFS, run the command: bin/hadoop namenode -format

It prints the following log:

10/11/20 19:52:21 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:  host = namenode/192.168.1.104
STARTUP_MSG:  args = [-format]
STARTUP_MSG:  version = 0.19.2
STARTUP_MSG:  build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.19 -r 789657; compiled by 'root' on Tue Jun 30 12:40:50 EDT 2009
************************************************************/
10/11/20 19:52:21 INFO namenode.FSNamesystem: fsOwner=admin,sambashare
10/11/20 19:52:21 INFO namenode.FSNamesystem: supergroup=supergroup
10/11/20 19:52:21 INFO namenode.FSNamesystem: isPermissionEnabled=true
10/11/20 19:52:21 INFO common.Storage: Image file of size 97 saved in 0 seconds.
10/11/20 19:52:21 INFO common.Storage: Storage directory /data/hadoopdir/tmp/dfs/name has been successfully formatted.
10/11/20 19:52:21 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at namenode/192.168.1.104
************************************************************/

 

At this point, the following directory tree appears under /data/hadoopdir/tmp on the NameNode:

+- dfs
      +- name
              +--- current
                        +---- edits
                        +---- fsimage
                        +---- fstime
                        +---- VERSION
              +--- image
                        +---- fsimage
 

At this point, /data/hadoopdir/tmp on each DataNode is still empty.
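The layout above can be checked with a few lines of Python. This is only a minimal sketch: the file list is copied from the tree shown for Hadoop 0.19.2, the helper name missing_name_dir_files is our own, and other Hadoop versions may lay out the directory differently.

```python
import os

# Files this walkthrough shows under dfs/name right after formatting.
EXPECTED_NAME_DIR_FILES = [
    os.path.join("current", "edits"),
    os.path.join("current", "fsimage"),
    os.path.join("current", "fstime"),
    os.path.join("current", "VERSION"),
    os.path.join("image", "fsimage"),
]


def missing_name_dir_files(name_dir):
    """Return the expected files that are absent from a NameNode storage directory."""
    return [
        rel
        for rel in EXPECTED_NAME_DIR_FILES
        if not os.path.isfile(os.path.join(name_dir, rel))
    ]


if __name__ == "__main__":
    # Path taken from the format log above; adjust it for your own cluster.
    print(missing_name_dir_files("/data/hadoopdir/tmp/dfs/name"))
```

An empty list means every file from the tree is present; on a directory that has not been formatted, all five entries are reported.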

2. Starting Hadoop

To start Hadoop, run bin/start-all.sh. It prints the following log:

starting namenode, logging to logs/hadoop-namenode-namenode.out

192.168.1.106: starting datanode, logging to logs/hadoop-datanode-datanode02.out

192.168.1.105: starting datanode, logging to logs/hadoop-datanode-datanode01.out

192.168.1.107: starting datanode, logging to logs/hadoop-datanode-datanode03.out

192.168.1.104: starting secondarynamenode, logging to logs/hadoop-secondarynamenode-namenode.out

starting jobtracker, logging to logs/hadoop-jobtracker-namenode.out

192.168.1.106: starting tasktracker, logging to logs/hadoop-tasktracker-datanode02.out

192.168.1.105: starting tasktracker, logging to logs/hadoop-tasktracker-datanode01.out

192.168.1.107: starting tasktracker, logging to logs/hadoop-tasktracker-datanode03.out

 

As the log shows, the script launched the NameNode, three DataNodes, the SecondaryNameNode, the JobTracker, and three TaskTrackers.
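To make that count explicit, the start-up output can be tallied per daemon. The sketch below assumes the line format seen in the log above ("[host: ]starting <daemon>, logging to <file>"); the regular expression and the function name count_started_daemons are our own.

```python
import re
from collections import Counter

# Matches both "starting namenode, logging to ..." and
# "192.168.1.106: starting datanode, logging to ...".
START_LINE = re.compile(
    r"^(?:(?P<host>\S+): )?starting (?P<daemon>\w+), logging to (?P<log>\S+)$"
)

SAMPLE_OUTPUT = """\
starting namenode, logging to logs/hadoop-namenode-namenode.out
192.168.1.106: starting datanode, logging to logs/hadoop-datanode-datanode02.out
192.168.1.105: starting datanode, logging to logs/hadoop-datanode-datanode01.out
192.168.1.107: starting datanode, logging to logs/hadoop-datanode-datanode03.out
192.168.1.104: starting secondarynamenode, logging to logs/hadoop-secondarynamenode-namenode.out
starting jobtracker, logging to logs/hadoop-jobtracker-namenode.out
192.168.1.106: starting tasktracker, logging to logs/hadoop-tasktracker-datanode02.out
192.168.1.105: starting tasktracker, logging to logs/hadoop-tasktracker-datanode01.out
192.168.1.107: starting tasktracker, logging to logs/hadoop-tasktracker-datanode03.out
"""


def count_started_daemons(output):
    """Count how many times each daemon type appears in the start-up output."""
    counts = Counter()
    for line in output.splitlines():
        match = START_LINE.match(line.strip())
        if match:
            counts[match.group("daemon")] += 1
    return counts


if __name__ == "__main__":
    for daemon, count in sorted(count_started_daemons(SAMPLE_OUTPUT).items()):
        print(daemon, count)
```

Note that this only counts launch attempts. Whether each daemon actually stays up has to be confirmed separately, for example with jps.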

Next, we run jps -l on the NameNode and on each of the three DataNodes to see which Java processes are actually running:

On the NameNode:

22214 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode

22107 org.apache.hadoop.hdfs.server.namenode.NameNode

22271 org.apache.hadoop.mapred.JobTracker

 

On datanode01:

12580 org.apache.hadoop.mapred.TaskTracker

12531 org.apache.hadoop.hdfs.server.datanode.DataNode

 

On datanode02:

10548 org.apache.hadoop.hdfs.server.datanode.DataNode

 

On datanode03:

12593 org.apache.hadoop.hdfs.server.datanode.DataNode

12644 org.apache.hadoop.mapred.TaskTracker

 

This matches our configuration above exactly: datanode02 runs a DataNode but no TaskTracker.
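The jps -l listings are easy to turn into a per-host summary. The following sketch assumes each line has the form "<pid> <fully qualified class name>", as in the output above; the function name running_daemons is our own.

```python
def running_daemons(jps_output):
    """Map the short class name of each Hadoop daemon to its PID."""
    daemons = {}
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        pid, class_name = parts
        # Keep only Hadoop classes; this drops entries such as jps itself.
        if class_name.startswith("org.apache.hadoop."):
            daemons[class_name.rsplit(".", 1)[-1]] = int(pid)
    return daemons


NAMENODE_JPS = """\
22214 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode
22107 org.apache.hadoop.hdfs.server.namenode.NameNode
22271 org.apache.hadoop.mapred.JobTracker
"""

DATANODE02_JPS = """\
10548 org.apache.hadoop.hdfs.server.datanode.DataNode
"""

if __name__ == "__main__":
    print(sorted(running_daemons(NAMENODE_JPS)))
    print(sorted(running_daemons(DATANODE02_JPS)))
```

Comparing the key sets between hosts makes differences such as the one on datanode02 easy to spot.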

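After Hadoop starts, the /data/hadoopdir/tmp directory also changes, as ls -R shows.

On the NameNode:

The name folder now contains an in_use.lock file, which indicates that the NameNode is running.

A namesecondary folder has also appeared; it stores the SecondaryNameNode's data.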

.:

dfs

./dfs:

name  namesecondary

./dfs/name:

current  image  in_use.lock

./dfs/name/current:

edits  fsimage  fstime  VERSION

./dfs/name/image:

fsimage

./dfs/namesecondary:

current  image  in_use.lock

./dfs/namesecondary/current:

edits  fsimage  fstime  VERSION

./dfs/namesecondary/image:

fsimage

 

On each DataNode:

Two new folders, dfs and mapred, have appeared.

The dfs folder stores HDFS block data.

The mapred folder stores the data that Map-Reduce tasks need while they run.

.:

dfs  mapred

./dfs:

data

./dfs/data:

current  detach  in_use.lock  storage  tmp

./dfs/data/current:

dncp_block_verification.log.curr  VERSION

./dfs/data/detach:

./dfs/data/tmp:

./mapred:

local

./mapred/local:

 

Of course, starting Hadoop also produces many log files under the logs folder:

On the NameNode, the logs are:

NameNode logs:

hadoop-namenode-namenode.log: log4j output

hadoop-namenode-namenode.out: stdout and stderr output

SecondaryNameNode logs:

hadoop-secondarynamenode-namenode.log: log4j output

hadoop-secondarynamenode-namenode.out: stdout and stderr output

JobTracker logs:

hadoop-jobtracker-namenode.log: log4j output

hadoop-jobtracker-namenode.out: stdout and stderr output

On a DataNode, the logs are (taking datanode01 as the example):

DataNode logs:

hadoop-datanode-datanode01.log: log4j output

hadoop-datanode-datanode01.out: stdout and stderr output

TaskTracker logs:

hadoop-tasktracker-datanode01.log: log4j output

hadoop-tasktracker-datanode01.out: stdout and stderr output
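The naming pattern in this list, hadoop-<daemon>-<hostname> followed by .log or .out, can be decoded mechanically. The sketch below follows the file names exactly as they appear here; it is an assumption that your cluster uses the same pattern, since many installations also embed a user name in the file name. The function name describe_log_file is our own.

```python
import re

# Pattern derived from the file names listed above.
LOG_NAME = re.compile(
    r"^hadoop-(?P<daemon>[a-z]+)-(?P<host>[\w.-]+)\.(?P<ext>log|out)$"
)

# What each extension contains, as described in this walkthrough.
KIND = {"log": "log4j output", "out": "stdout and stderr output"}


def describe_log_file(filename):
    """Return (daemon, host, kind) for a Hadoop log file name, or None if it does not match."""
    match = LOG_NAME.match(filename)
    if match is None:
        return None
    return match.group("daemon"), match.group("host"), KIND[match.group("ext")]


if __name__ == "__main__":
    for name in ("hadoop-namenode-namenode.log", "hadoop-tasktracker-datanode01.out"):
        print(name, "->", describe_log_file(name))
```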

Next, we look in detail at the significant information in these logs:

In hadoop-namenode-namenode.log we can see the NameNode startup process:

Namenode up at: namenode/192.168.1.104:9000

//Number of files

Number of files = 0

Number of files under construction = 0

//Load the fsimage and edits files to build the FSNamesystem

Image file of size 97 loaded in 0 seconds.

Edits file /data/hadoopdir/tmp/dfs/name/current/edits of size 4 edits # 0 loaded in 0 seconds.

Image file of size 97 saved in 0 seconds.

Finished loading FSImage in 12812 msecs

//Count the blocks and report their state

Total number of blocks = 0

Number of invalid blocks = 0

Number of under-replicated blocks = 0

Number of  over-replicated blocks = 0

//Leave safe mode

Leaving safe mode after 12 secs.

//Register the DataNodes

Adding a new node: /default-rack/192.168.1.106:50010

Adding a new node: /default-rack/192.168.1.105:50010

Adding a new node: /default-rack/192.168.1.107:50010
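The registration lines can be pulled out of the NameNode log to confirm that all three DataNodes joined. This is a rough sketch that assumes the message format shown above ("Adding a new node: /<rack>/<address>"); the function name registered_nodes is our own.

```python
import re

# Captures the rack path and the node address from a registration line.
ADD_NODE = re.compile(r"Adding a new node: (?P<rack>/[^/\s]+)/(?P<addr>\S+)")


def registered_nodes(log_text):
    """Return (rack, address) pairs in the order the nodes were registered."""
    return [(m.group("rack"), m.group("addr")) for m in ADD_NODE.finditer(log_text)]


SAMPLE_LOG = """\
Leaving safe mode after 12 secs.
Adding a new node: /default-rack/192.168.1.106:50010
Adding a new node: /default-rack/192.168.1.105:50010
Adding a new node: /default-rack/192.168.1.107:50010
"""

if __name__ == "__main__":
    for rack, addr in registered_nodes(SAMPLE_LOG):
        print(rack, addr)
```

The same pattern also matches the JobTracker's registration lines, where the address is a host name such as datanode01 rather than an IP and port.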

 

In hadoop-secondarynamenode-namenode.log we can see the SecondaryNameNode startup process:

Secondary Web-server up at: 0.0.0.0:50090

//How often a checkpoint is performed

Checkpoint Period  :3600 secs (60 min)

Log Size Trigger    :67108864 bytes (65536 KB)

//Perform a checkpoint: download fsimage and edits from the NameNode

Downloaded file fsimage size 97 bytes.

Downloaded file edits size 370 bytes.

//Load the edits file, merge it, and save the merged fsimage; note that fsimage has grown

Edits file /data/hadoopdir/tmp/dfs/namesecondary/current/edits of size 370 edits # 6 loaded in 0 seconds.

Image file of size 540 saved in 0 seconds.

//This checkpoint is complete

Checkpoint done. New Image Size: 540
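The two settings printed at the top of this log, the checkpoint period and the log size trigger, suggest a simple rule: start a checkpoint when either limit is reached. The sketch below illustrates that rule under this assumption. It is not the SecondaryNameNode source code, and the function name should_checkpoint is our own.

```python
CHECKPOINT_PERIOD_SECS = 3600      # "Checkpoint Period" from the log (60 min)
LOG_SIZE_TRIGGER_BYTES = 67108864  # "Log Size Trigger" from the log (65536 KB)


def should_checkpoint(secs_since_last, edits_size_bytes,
                      period=CHECKPOINT_PERIOD_SECS,
                      size_trigger=LOG_SIZE_TRIGGER_BYTES):
    """Return True when either the time limit or the edits size limit is reached."""
    return secs_since_last >= period or edits_size_bytes >= size_trigger


if __name__ == "__main__":
    # A 370-byte edits file ten minutes after the last checkpoint: no trigger.
    print(should_checkpoint(600, 370))
    # A full hour has passed: the period triggers a checkpoint.
    print(should_checkpoint(3600, 370))
    # The edits file has reached the size trigger: checkpoint regardless of time.
    print(should_checkpoint(600, 67108864))
```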

 

In hadoop-jobtracker-namenode.log we can see the JobTracker startup process:

JobTracker up at: 9001

JobTracker webserver: 50030

//Clean up the /data/hadoopdir/tmp/mapred/system folder in HDFS, which holds data while Map-Reduce jobs run

Cleaning up the system directory

//Heartbeats keep arriving from the TaskTrackers; the first one registers the TaskTracker

Got heartbeat from: tracker_datanode01:localhost/127.0.0.1:58297

Adding a new node: /default-rack/datanode01

Got heartbeat from: tracker_datanode03:localhost/127.0.0.1:37546

Adding a new node: /default-rack/datanode03

 

In hadoop-datanode-datanode01.log we can see the DataNode startup process:

//Format the folder where the DataNode stores blocks

Storage directory /data/hadoopdir/tmp/dfs/data is not formatted.

Formatting ...

//Start the DataNode

Opened info server at 50010

Balancing bandwith is 1048576 bytes/s

Initializing JVM Metrics with processName=DataNode, sessionId=null

//Register this DataNode with the NameNode

dnRegistration = DatanodeRegistration(datanode01:50010, storageID=, infoPort=50075, ipcPort=50020)

New storage id DS-1042573498-192.168.1.105-50010-1290313555129 is assigned to data-node 192.168.1.105:5001

DatanodeRegistration(192.168.1.105:50010, storageID=DS-1042573498-192.168.1.105-50010-1290313555129, infoPort=50075, ipcPort=50020)In DataNode.run, data = FSDataset{dirpath='/data/hadoopdir/tmp/dfs/data/current'}

//Start the block scanner

Starting Periodic block scanner.

 

In hadoop-tasktracker-datanode01.log we can see the TaskTracker startup process:

//Start the TaskTracker

Initializing JVM Metrics with processName=TaskTracker, sessionId=

TaskTracker up at: localhost/127.0.0.1:58297

Starting tracker tracker_datanode01:localhost/127.0.0.1:58297

//Send a heartbeat to the JobTracker

Got heartbeatResponse from JobTracker with responseId: 0 and 0 actions

 

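One log file is a special case: hadoop-tasktracker-datanode02.log. Because we set both the maximum number of Map tasks and the maximum number of Reduce tasks to 0, it reports an exception, Can not start task tracker because java.lang.IllegalArgumentException, so the TaskTracker on datanode02 never starts.

Once Hadoop is up, it has also created a folder in HDFS, /data/hadoopdir/tmp/mapred/system, which holds the resources shared by Map-Reduce jobs at run time.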
