Hadoop NameNode单点问题解决方案之一 AvatarNode(2)

大家说让我比较一下AvatarNode HA实现和Hadoop with DRBD and Linux HA.这两种方法都是需要主节点写事务日志到一个共享的硬件。不同点是,Standby AvatarNode是一个热备而DRBD-LinuxHA是一种冷备份。AvatarNode对于5亿文件的failover时间是1分钟,但是DRBD-LinuxHA可能需要一个小时。AvatarNode可以热备的原因就是它封装了一个NameNode的实例并且这个实例会从DataNode接受信息,这使得metadata的状态是最新的。


代码参见HDFS-976. (前提条件HDFS-966)

注:
1. Blog地址:

2. 最新代码参见:
https://github.com/facebook/hadoop-20-warehouse/tree/master/src/contrib/highavailability

3. Dhruba回答的部分问题
Dhruba关于NameNode启动过程所需要时间的一个参考:
our clyster has 2000 nodes. The fsimage is about 12 GB(70 million files and directories). The cluster has a total of around 90 milliosn blocks. It takes about 6 minutes to read and process the fsimage file. Then it takes another 35 minutes to process block reports from all datanodes.

AvatarNode和ZooKeeper整合(解决VIP问题)
We are in the process of open-sourcing the AvatarNode integration with zookeeper. We will post this patch as part of HDFS-976 very soon.

有关NFS延迟
I agree that NFS is not the fastest way to read/access that transaction log. However, it is not really NFS but rather the NFS implementation that could be an issue. We use a NetApp NFS Filer and the filer's uptime and latencies are hard to beat! Also, we depend on NFS close-to-open cache coherency semantics:

内容版权声明:除非注明,否则皆为本站原创文章。

转载注明出处:http://www.heiqu.com/c6354c82436643b839d30d9a2bf801e4.html