Hadoop Official Documentation Translation (5)

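To minimize global bandwidth consumption and read latency, HDFS tries to satisfy a read request from the replica closest to the reader. If a replica exists on the same rack as the reader node, that replica is preferred. If the HDFS cluster spans multiple data centers, a replica in the local data center is preferred over any remote replica.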
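The preference order above can be sketched as a small ranking function. This is a minimal illustration, not HDFS's actual topology code: the `Location` type, the `distance` ranks, and the "same node" tier are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Location:
    """Hypothetical placement of a reader or replica in the cluster."""
    datacenter: str
    rack: str
    node: str


def distance(reader: Location, replica: Location) -> int:
    """Rank a replica by closeness: a smaller value means closer.

    0 = same node (assumed tier), 1 = same rack,
    2 = same data center, 3 = remote data center.
    """
    if reader.datacenter != replica.datacenter:
        return 3
    if reader.rack != replica.rack:
        return 2
    if reader.node != replica.node:
        return 1
    return 0


def pick_replica(reader: Location, replicas: Iterable[Location]) -> Location:
    """Return the replica with the smallest distance to the reader."""
    return min(replicas, key=lambda r: distance(reader, r))


if __name__ == "__main__":
    reader = Location("dc1", "rack1", "node1")
    candidates = [
        Location("dc2", "rack9", "node9"),  # remote data center
        Location("dc1", "rack2", "node5"),  # same data center, other rack
        Location("dc1", "rack1", "node3"),  # same rack
    ]
    print(pick_replica(reader, candidates))
```

With these candidates the same-rack replica is chosen, and the remote data center is used only when nothing closer exists.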

Safemode

On startup, the NameNode enters a special state called Safemode. Replication of data blocks does not occur when the NameNode is in the Safemode state. The NameNode receives Heartbeat and Blockreport messages from the DataNodes. A Blockreport contains the list of data blocks that a DataNode is hosting. Each block has a specified minimum number of replicas. A block is considered safely replicated when the minimum number of replicas of that data block has checked in with the NameNode. After a configurable percentage of safely replicated data blocks checks in with the NameNode (plus an additional 30 seconds), the NameNode exits the Safemode state. It then determines the list of data blocks (if any) that still have fewer than the specified number of replicas. The NameNode then replicates these blocks to other DataNodes.

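In translation: at startup, the NameNode enters a special state called Safemode. While the NameNode is in Safemode, no data blocks are replicated. The NameNode receives Heartbeat and Blockreport messages from the DataNodes; a Blockreport lists the data blocks that a DataNode is hosting. Each block has a specified minimum number of replicas, and a block counts as safely replicated once that minimum number of its replicas has checked in with the NameNode. Once a configurable percentage of blocks is safely replicated, and a further 30 seconds have passed, the NameNode leaves Safemode. It then determines which blocks, if any, still have fewer than the specified number of replicas and replicates them to other DataNodes.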
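The exit condition can be made concrete with a short sketch. It assumes the NameNode holds a map from block ID to the number of replicas reported so far; the function names and the parameters `threshold` and `extension_seconds` are hypothetical stand-ins for the configurable percentage and the extra 30 seconds, not Hadoop's own API.

```python
from typing import List, Mapping


def safely_replicated_fraction(reported: Mapping[str, int], min_replicas: int) -> float:
    """Fraction of known blocks whose reported replica count meets the minimum."""
    if not reported:
        return 1.0  # assumption: an empty namespace counts as fully safe
    safe = sum(1 for count in reported.values() if count >= min_replicas)
    return safe / len(reported)


def may_leave_safemode(
    reported: Mapping[str, int],
    min_replicas: int,
    threshold: float,
    seconds_since_threshold: float,
    extension_seconds: float = 30.0,
) -> bool:
    """True once the threshold is met and the extra waiting period has elapsed."""
    reached = safely_replicated_fraction(reported, min_replicas) >= threshold
    return reached and seconds_since_threshold >= extension_seconds


def under_replicated(reported: Mapping[str, int], target_replicas: int) -> List[str]:
    """Blocks that still need more replicas after Safemode ends."""
    return sorted(b for b, count in reported.items() if count < target_replicas)


if __name__ == "__main__":
    reported = {"b1": 3, "b2": 1, "b3": 0, "b4": 2}
    print(safely_replicated_fraction(reported, min_replicas=1))
    print(may_leave_safemode(reported, 1, threshold=0.75, seconds_since_threshold=30.0))
    print(under_replicated(reported, target_replicas=3))
```

The sketch keeps the two replica counts separate: the minimum count decides when Safemode ends, while the target replication factor decides which blocks are re-replicated afterwards.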

The Persistence of File System Metadata

The HDFS namespace is stored by the NameNode. The NameNode uses a transaction log called the EditLog to persistently record every change that occurs to file system metadata. For example, creating a new file in HDFS causes the NameNode to insert a record into the EditLog indicating this. Similarly, changing the replication factor of a file causes a new record to be inserted into the EditLog. The NameNode uses a file in its local host OS file system to store the EditLog. The entire file system namespace, including the mapping of blocks to files and file system properties, is stored in a file called the FsImage. The FsImage is stored as a file in the NameNode’s local file system too.

The NameNode keeps an image of the entire file system namespace and file Blockmap in memory. This key metadata item is designed to be compact, such that a NameNode with 4 GB of RAM is plenty to support a huge number of files and directories. When the NameNode starts up, it reads the FsImage and EditLog from disk, applies all the transactions from the EditLog to the in-memory representation of the FsImage, and flushes out this new version into a new FsImage on disk. It can then truncate the old EditLog because its transactions have been applied to the persistent FsImage. This process is called a checkpoint. In the current implementation, a checkpoint only occurs when the NameNode starts up. Work is in progress to support periodic checkpointing in the near future.

The DataNode stores HDFS data in files in its local file system. The DataNode has no knowledge about HDFS files. It stores each block of HDFS data in a separate file in its local file system. The DataNode does not create all files in the same directory. Instead, it uses a heuristic to determine the optimal number of files per directory and creates subdirectories appropriately. It is not optimal to create all local files in the same directory because the local file system might not be able to efficiently support a huge number of files in a single directory. When a DataNode starts up, it scans through its local file system, generates a list of all HDFS data blocks that correspond to each of these local files and sends this report to the NameNode: this is the Blockreport.
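As a rough illustration of how a Blockreport could be assembled, the sketch below walks a data directory, including its subdirectories, and collects one entry per block file. The `blk_` file-name prefix and the function name `build_block_report` are assumptions made for this example; it does not reproduce the DataNode's real on-disk layout or report format.

```python
import os
import tempfile
from typing import List


def build_block_report(data_dir: str, prefix: str = "blk_") -> List[str]:
    """Return the sorted IDs of all block files found under data_dir.

    A file counts as a block file if its name starts with `prefix`
    (an assumed naming convention); everything else is ignored.
    """
    block_ids = []
    for _root, _dirs, files in os.walk(data_dir):
        for name in files:
            if name.startswith(prefix):
                block_ids.append(name[len(prefix):])
    return sorted(block_ids)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as data_dir:
        # Blocks are spread over subdirectories rather than one flat directory.
        os.makedirs(os.path.join(data_dir, "subdir0"))
        for rel in ("blk_1001", os.path.join("subdir0", "blk_1002"), "VERSION"):
            open(os.path.join(data_dir, rel), "w").close()
        print(build_block_report(data_dir))
```

Because the scan is recursive, the report does not depend on how the DataNode distributes block files across subdirectories.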

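In translation: the NameNode stores the HDFS namespace. It uses a transaction log called the EditLog to persistently record every change to the file system metadata. For example, creating a new file in HDFS causes the NameNode to insert a record into the EditLog, and changing a file's replication factor likewise adds a new record. The NameNode keeps the EditLog in a file in its local host OS file system. The entire file system namespace, including the mapping of blocks to files and the file system properties, is stored in a file called the FsImage, which also resides in the NameNode's local file system.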
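The idea of an append-only transaction log can be sketched in a few lines. The JSON-lines format, the operation names, and the functions `log_edit` and `read_edits` are assumptions chosen for readability; the real EditLog uses its own binary format.

```python
import json
import os
import tempfile
from typing import Any, Dict, List


def log_edit(editlog_path: str, op: str, **details: Any) -> None:
    """Append one metadata change to the log and force it to disk."""
    record = {"op": op, **details}
    with open(editlog_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")
        f.flush()
        os.fsync(f.fileno())  # the record must survive a crash


def read_edits(editlog_path: str) -> List[Dict[str, Any]]:
    """Return all logged records in the order they were written."""
    with open(editlog_path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "edits.log")
        log_edit(path, "create", path="/data/a.txt", replication=3)
        log_edit(path, "set_replication", path="/data/a.txt", replication=2)
        for edit in read_edits(path):
            print(edit)
```

Each change is only ever appended, so the log preserves the order of operations, which is what makes it possible to replay them later.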

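In translation: the NameNode keeps an image of the entire file system namespace and the file Blockmap in memory. This key metadata is compact, so a NameNode with 4 GB of RAM can support a very large number of files and directories. When the NameNode starts up, it reads the FsImage and EditLog from disk, applies all EditLog transactions to the in-memory representation of the FsImage, and flushes the result to disk as a new FsImage. It can then truncate the old EditLog, because its transactions have been applied to the persistent FsImage. This process is called a checkpoint. In the implementation described here, a checkpoint occurs only when the NameNode starts up; support for periodic checkpointing is planned.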
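The checkpoint sequence (load the FsImage, replay the EditLog, write a new FsImage, truncate the log) can be sketched as follows. The sketch assumes a JSON FsImage and a JSON-lines EditLog with three hypothetical operations; none of this mirrors the real file formats, and the write-then-rename step is a design choice of the example, not something the source describes.

```python
import json
import os
import tempfile
from typing import Any, Dict


def apply_edit(namespace: Dict[str, Any], edit: Dict[str, Any]) -> None:
    """Apply one logged transaction to the in-memory namespace."""
    op = edit["op"]
    if op == "create":
        namespace[edit["path"]] = {"replication": edit["replication"]}
    elif op == "set_replication":
        namespace[edit["path"]]["replication"] = edit["replication"]
    elif op == "delete":
        namespace.pop(edit["path"], None)
    else:
        raise ValueError(f"unknown operation: {op}")


def checkpoint(fsimage_path: str, editlog_path: str) -> Dict[str, Any]:
    """Merge the EditLog into the FsImage and return the resulting namespace."""
    namespace: Dict[str, Any] = {}
    if os.path.exists(fsimage_path):
        with open(fsimage_path, encoding="utf-8") as f:
            namespace = json.load(f)
    if os.path.exists(editlog_path):
        with open(editlog_path, encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    apply_edit(namespace, json.loads(line))
    # Write the new image to a temporary file, then swap it into place.
    tmp_path = fsimage_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(namespace, f, sort_keys=True)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, fsimage_path)
    # Only now is it safe to truncate the log: its edits are in the new image.
    open(editlog_path, "w", encoding="utf-8").close()
    return namespace


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        image, log = os.path.join(d, "fsimage"), os.path.join(d, "edits.log")
        with open(log, "w", encoding="utf-8") as f:
            f.write(json.dumps({"op": "create", "path": "/a", "replication": 3}) + "\n")
        print(checkpoint(image, log))
```

The ordering matters: the log is truncated only after the new image is safely on disk, so a crash in between still leaves a state that can be recovered by replaying the log again.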
