The following problem came up while starting Hadoop:
2015-08-02 19:43:20,771 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = slave1/192.168.198.21
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.2.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG: Java = 1.7.0_79
************************************************************/
2015-08-02 19:43:20,902 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2015-08-02 19:43:20,910 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2015-08-02 19:43:20,911 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2015-08-02 19:43:20,911 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2015-08-02 19:43:21,033 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2015-08-02 19:43:21,036 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2015-08-02 19:43:30,237 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.198.20:9000. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-08-02 19:43:31,239 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.198.20:9000. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-08-02 19:43:31,247 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call to master/192.168.198.20:9000 failed on local exception: java.net.NoRouteToHostException: 没有到主机的路由
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1150)
at org.apache.hadoop.ipc.Client.call(Client.java:1118)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at com.sun.proxy.$Proxy3.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.checkVersion(RPC.java:422)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:414)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:392)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:374)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:453)
at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:335)
at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:300)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:385)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:321)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1712)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1651)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1669)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1795)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1812)
Caused by: java.net.NoRouteToHostException: 没有到主机的路由
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
Analysis:
This kind of "no route to host" problem is quite common (the 没有到主机的路由 in the log is simply the Chinese-locale rendering of NoRouteToHostException: No route to host). One possibility is that the namenode and datanode hosts cannot even ping each other by hostname, but that is unlikely: everyone knows the master and slave nodes must be able to communicate, so this is usually the first thing checked. The most likely cause is that the firewall was never turned off, or that its status could not be checked and it was mistakenly assumed to be off.
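Before walking through the steps below, a quick way to narrow this down is to probe the NameNode's RPC port directly from the failing datanode. This is only a sketch: the hostname master and the address 192.168.198.20:9000 are taken from the log above, so substitute your own values. A plain ping only tests ICMP, so the TCP probe of the port is the more telling check:
# run on the datanode (slave1)
ping -c 3 master                     # does the hostname resolve and answer at all?
telnet 192.168.198.20 9000           # can we open a TCP connection to the NameNode RPC port?
# without telnet, "nc -zv 192.168.198.20 9000" performs the same probe
If the probe reports "Connection refused", the NameNode process is not listening on that port. If it reports "No route to host" even though ping works, the usual culprit is an iptables REJECT rule (CentOS typically ships one with --reject-with icmp-host-prohibited), which is exactly the firewall case described above.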
Solution:
(1) From the namenode host, ping the other slave nodes by hostname (the hostname, not the IP). If the ping fails, the likely cause is that the namenode's /etc/hosts is missing the hostname-to-IP mappings; add the missing entries (see the /etc/hosts sketch at the end of this section).
(2) From each datanode host, ping the master node by hostname (again, the hostname). If the ping fails, the likely cause is that the datanode's /etc/hosts is missing the hostname-to-IP mappings; add the missing entries there as well.
(3) Check whether the firewall is turned off on every node (alternatively, leave the firewall on and open only the required ports, as sketched after the commands below; simply turning it off is the easiest option):
The following commands check the firewall status and turn it off on different Linux distributions:
---------------------------------------------------------------
Ubuntu (ubuntu-12.04-desktop-amd64)
Check firewall status: ufw status
Disable the firewall: ufw disable
---------------------------------------------------------------
CentOS 6.0
Check firewall status: service iptables status
Stop the firewall now: service iptables stop
Keep it from starting at boot: chkconfig iptables off
--------------------------------------------------------------
CentOS 7.0 (uses firewalld by default; if you have not switched it back to iptables, use the following commands to check and turn off the firewall)
Check firewall status: firewall-cmd --state
Stop the firewall now: systemctl stop firewalld.service
Keep it from starting at boot: systemctl disable firewalld.service
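For steps (1) and (2), a minimal /etc/hosts sketch covering the two hosts that appear in the log above (the names and addresses come from the log; every node in the cluster should carry the same mappings, extended with any additional slaves):
127.0.0.1        localhost
192.168.198.20   master
192.168.198.21   slave1
Also make sure the node's own hostname is not mapped only to 127.0.0.1, otherwise the NameNode may bind its RPC port to the loopback interface and remain unreachable from the slaves.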
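For step (3), if you would rather keep the firewall running and just open the ports Hadoop needs, a rough sketch follows. Port 9000 is the NameNode RPC port from the log; 50010, 50020, 50070 and 50075 are the Hadoop 1.x defaults for DataNode data transfer, DataNode IPC and the web UIs, so check them against your own configuration:
CentOS 6 (iptables), on the master:
iptables -I INPUT -p tcp --dport 9000 -j ACCEPT
service iptables save
CentOS 7 (firewalld), on the master:
firewall-cmd --permanent --add-port=9000/tcp
firewall-cmd --reload
Ubuntu (ufw), on the master:
ufw allow 9000/tcp
Open the DataNode ports on the slaves in the same way, so that clients and the other DataNodes can reach them.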