Hadoop Cluster HA (High Availability) Setup and Troubleshooting (3)

4) Copy ch01's metadata over to ch02; on ch02, run:
hdfs namenode -bootstrapStandby
5) Start the NameNode on ch02:
hadoop-daemon.sh start namenode
Then browse port 50070 (dfshealth.jsp) on ch02 to check its state.
At this point the web UI shows both ch01 and ch02 in standby state.
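The state can also be checked from the command line with hdfs haadmin. The service IDs nn1 and nn2 below are assumptions; substitute whatever dfs.ha.namenodes.<nameservice> defines in your hdfs-site.xml:
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
Both should print standby at this stage.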

6) Start all the DataNodes; on ch01, run:
hadoop-daemons.sh start datanode
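To confirm the daemons actually came up, run jps (which lists running JVM processes) on each machine; nothing cluster-specific is assumed here:
jps
ch01 and ch02 should each list a NameNode process, and every DataNode machine should list a DataNode process.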

7) Start YARN; on ch01, run:
start-yarn.sh
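Once YARN is up, you can verify that the NodeManagers have registered with the ResourceManager (yarn node -list is the stock YARN CLI; nothing here is specific to this cluster):
yarn node -list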
8) Start the ZKFailoverController; run the following command on ch01 and then on ch02. After both are up, browse port 50070 again: ch01 should now be in active state, while ch02 remains standby.

hadoop-daemon.sh start zkfc
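As an optional check that automatic failover really works (not part of the original steps; nn2 is again an assumed service ID), stop the active NameNode on ch01 and watch ch02 take over:
hadoop-daemon.sh stop namenode
hdfs haadmin -getServiceState nn2
hadoop-daemon.sh start namenode
The first command runs on ch01; the second should then report active for ch02; the third brings ch01 back, where it rejoins as standby.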

10) Test that HDFS is usable:
/home/hadoop/hadoop-2.2.0/bin/hdfs dfs -ls /
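For a slightly fuller smoke test, write a file and read it back; the /tmp/hatest path is an arbitrary example:
hdfs dfs -mkdir -p /tmp/hatest
hdfs dfs -put /etc/hosts /tmp/hatest/
hdfs dfs -cat /tmp/hatest/hosts
hdfs dfs -rm -r /tmp/hatest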

Problems

JournalNode fails to start

[root@ch01 hadoop]# hadoop-daemon.sh start journalnode
starting journalnode, logging to /opt/hadoop/hadoop-2.6.0-cdh5.6.0/logs/hadoop-root-journalnode-ch01.out
Exception in thread "main" java.lang.IllegalArgumentException: Journal dir 'file:/opt/hadoop/hadoop-2.6.0-cdh5.6.0/tmp/dfs/journalnode' should be an absolute path
at org.apache.hadoop.hdfs.qjournal.server.JournalNode.validateAndCreateJournalDir(JournalNode.java:120)
at org.apache.hadoop.hdfs.qjournal.server.JournalNode.start(JournalNode.java:144)
at org.apache.hadoop.hdfs.qjournal.server.JournalNode.run(JournalNode.java:134)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.hdfs.qjournal.server.JournalNode.main(JournalNode.java:307)

Fix: in hdfs-site.xml, set the JournalNode directory property (dfs.journalnode.edits.dir) to a plain absolute path; the file: scheme prefix must not be used.
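Using the directory from the log above, the entry would look like this (adjust the path to your own installation):
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/opt/hadoop/hadoop-2.6.0-cdh5.6.0/tmp/dfs/journalnode</value>
</property>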

DataNode fails to start
java.io.IOException: All specified directories are failed to load.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:477) 
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1394) 
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1355) 
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:317) 
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:228) 
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:829) 
at java.lang.Thread.run(Thread.java:745) 
2017-02-20 10:25:39,363 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool (Datanode Uuid unassigned) service to ch02/192.168.128.122:8020. Exiting. 
java.io.IOException: All specified directories are failed to load. 
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:477) 
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1394) 
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1355) 
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:317) 
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:228) 
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:829) 
at java.lang.Thread.run(Thread.java:745)

Fix: the cause is a clusterID mismatch. The NameNode's clusterID (CID-5a00c610-f0e3-4ecd-b298-129cc5544e7d) no longer matches the clusterID stored in the DataNode's data directory, which typically happens after the NameNode has been reformatted. Make the two IDs match, for example by editing the clusterID line in the DataNode's VERSION file to the NameNode's value, then restart the DataNode.
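A sketch of the check and fix; the storage paths below assume the tmp directory layout used earlier in this article, so substitute your own dfs.namenode.name.dir / dfs.datanode.data.dir values:
cat /opt/hadoop/hadoop-2.6.0-cdh5.6.0/tmp/dfs/name/current/VERSION
cat /opt/hadoop/hadoop-2.6.0-cdh5.6.0/tmp/dfs/data/current/VERSION
sed -i 's/^clusterID=.*/clusterID=CID-5a00c610-f0e3-4ecd-b298-129cc5544e7d/' /opt/hadoop/hadoop-2.6.0-cdh5.6.0/tmp/dfs/data/current/VERSION
hadoop-daemon.sh start datanode
The first two commands print the clusterID on the NameNode and DataNode respectively; the sed line overwrites the DataNode's clusterID with the NameNode's value before the restart.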
