The HDFS architecture is compatible with data rebalancing schemes. A scheme might automatically move data from one DataNode to another if the free space on a DataNode falls below a certain threshold. In the event of a sudden high demand for a particular file, a scheme might dynamically create additional replicas and rebalance other data in the cluster. These types of data rebalancing schemes are not yet implemented.
Data Integrity

It is possible that a block of data fetched from a DataNode arrives corrupted. This corruption can occur because of faults in a storage device, network faults, or buggy software. The HDFS client software implements checksum checking on the contents of HDFS files. When a client creates an HDFS file, it computes a checksum of each block of the file and stores these checksums in a separate hidden file in the same HDFS namespace. When a client retrieves file contents, it verifies that the data it received from each DataNode matches the checksum stored in the associated checksum file. If not, the client can opt to retrieve that block from another DataNode that has a replica of that block.
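The write-side checksumming and read-side verification described above can be sketched in Python. This is a simplified model, not the real HDFS implementation: actual HDFS stores CRC checksums for small chunks of each block in a hidden sidecar file, and the block size here is shrunk so the example stays small.

```python
import zlib

# Illustrative only -- HDFS's default block size is 128 MB; we use a tiny
# block size below so the example data stays small.
DEFAULT_BLOCK_SIZE = 128 * 1024 * 1024


def block_checksums(data: bytes, block_size: int) -> list:
    """On write: compute one CRC32 checksum per block of the file."""
    return [zlib.crc32(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


def verify(data: bytes, stored: list, block_size: int) -> bool:
    """On read: re-checksum each block and compare to the stored values."""
    return block_checksums(data, block_size) == stored


# Usage: a client "writes" a file, storing the checksums alongside it,
# then verifies them on read; a single flipped byte is detected.
payload = b"hello hdfs" * 1000
sums = block_checksums(payload, block_size=4096)

ok = verify(payload, sums, block_size=4096)        # intact data passes
corrupted = payload[:100] + b"X" + payload[101:]   # simulate a bad byte
bad = verify(corrupted, sums, block_size=4096)     # mismatch detected
```

On a mismatch, a real client would fall back to another DataNode holding a replica of the failed block, as the text describes.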
Metadata Disk Failure

The FsImage and the EditLog are central data structures of HDFS. A corruption of these files can cause the HDFS instance to be non-functional. For this reason, the NameNode can be configured to support maintaining multiple copies of the FsImage and EditLog. Any update to either the FsImage or EditLog causes each of the FsImages and EditLogs to get updated synchronously. This synchronous updating of multiple copies of the FsImage and EditLog may degrade the rate of namespace transactions per second that a NameNode can support. However, this degradation is acceptable because even though HDFS applications are very data intensive in nature, they are not metadata intensive. When a NameNode restarts, it selects the latest consistent FsImage and EditLog to use.
Another option to increase resilience against failures is to enable High Availability using multiple NameNodes, either with shared storage on NFS or with a distributed edit log (called the Journal). The latter is the recommended approach.
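The multiple-copies behavior is driven by NameNode configuration. A minimal hdfs-site.xml sketch (the paths are illustrative, not defaults): listing more than one directory for `dfs.namenode.name.dir` makes the NameNode replicate the FsImage and EditLog to every listed directory on each update.

```xml
<property>
  <name>dfs.namenode.name.dir</name>
  <!-- comma-separated list: NameNode metadata is written
       synchronously to every directory, for redundancy -->
  <value>/data/1/dfs/nn,/data/2/dfs/nn</value>
</property>
```

Putting the directories on separate physical disks (or an NFS mount) is what turns this into protection against a metadata disk failure.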
Snapshots

Snapshots support storing a copy of data at a particular instant of time. One usage of the snapshot feature may be to roll back a corrupted HDFS instance to a previously known good point in time.
Data Organization

Data Blocks

HDFS is designed to support very large files. Applications that are compatible with HDFS are those that deal with large data sets. These applications write their data only once but they read it one or more times and require these reads to be satisfied at streaming speeds. HDFS supports write-once-read-many semantics on files. A typical block size used by HDFS is 128 MB. Thus, an HDFS file is chopped up into 128 MB chunks, and if possible, each chunk will reside on a different DataNode.
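The chunking arithmetic above is plain ceiling division; a short Python sketch using the 128 MB block size from the text:

```python
BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the typical HDFS block size


def num_blocks(file_size_bytes: int) -> int:
    """How many chunks an HDFS file is split into (ceiling division)."""
    return (file_size_bytes + BLOCK_SIZE - 1) // BLOCK_SIZE


# A 1 GB file splits into 8 full 128 MB blocks.
blocks = num_blocks(1024 * 1024 * 1024)

# A 300 MB file splits into 3 blocks: two full 128 MB blocks
# plus one final 44 MB block (blocks need not be full).
tail = num_blocks(300 * 1024 * 1024)
```

Each of those chunks is then placed on a DataNode, ideally a different one per chunk, which is what lets reads of a large file stream from many machines in parallel.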