Hadoop Official Documentation Translation (8)

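HDFS is designed to support very large files. Applications that are compatible with HDFS are those that deal with large data sets. These applications write their data only once but read it one or more times, and they require these reads to be satisfied at streaming speeds. HDFS supports write-once-read-many semantics on files. A typical block size used by HDFS is 128 MB. Thus, an HDFS file is chopped up into 128 MB chunks and, if possible, each chunk will reside on a different DataNode.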
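The block arithmetic above can be made concrete with a small sketch. It is only an illustration: `split_into_blocks` and `assign_round_robin` are hypothetical helpers, and the round-robin placement is an assumption made for the example, not the placement policy the NameNode actually applies.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the typical block size cited above


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return a list of (offset, length) pairs, one per block of the file."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    blocks = []
    offset = 0
    while offset < file_size:
        # Every block is full-sized except possibly the last one.
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


def assign_round_robin(blocks, datanodes):
    """Hypothetical placement: hand blocks to DataNodes in turn (an assumption)."""
    return [datanodes[i % len(datanodes)] for i in range(len(blocks))]


if __name__ == "__main__":
    mb = 1024 * 1024
    blocks = split_into_blocks(300 * mb)
    for (offset, length), node in zip(blocks, assign_round_robin(blocks, ["dn1", "dn2", "dn3"])):
        print(f"offset={offset // mb} MB length={length // mb} MB -> {node}")
```

Under these assumptions a 300 MB file yields two full 128 MB blocks and a final 44 MB block, each assigned to a different DataNode when at least three are available.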

Staging

A client request to create a file does not reach the NameNode immediately. In fact, initially the HDFS client caches the file data into a local buffer. Application writes are transparently redirected to this local buffer. When the local file accumulates data worth over one chunk size, the client contacts the NameNode. The NameNode inserts the file name into the file system hierarchy and allocates a data block for it. The NameNode responds to the client request with the identity of the DataNode and the destination data block. Then the client flushes the chunk of data from the local buffer to the specified DataNode. When a file is closed, the remaining un-flushed data in the local buffer is transferred to the DataNode. The client then tells the NameNode that the file is closed. At this point, the NameNode commits the file creation operation into a persistent store. If the NameNode dies before the file is closed, the file is lost.

The above approach has been adopted after careful consideration of target applications that run on HDFS. These applications need streaming writes to files. If a client writes to a remote file directly without any client side buffering, the network speed and the congestion in the network impacts throughput considerably. This approach is not without precedent. Earlier distributed file systems, e.g. AFS, have used client side caching to improve performance. A POSIX requirement has been relaxed to achieve higher performance of data uploads.

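Translation: A client's request to create a file does not reach the NameNode right away. The HDFS client first caches the file data in a local buffer, and application writes are transparently redirected to that buffer. Once the buffer holds more than one block (128 MB) of data, the client contacts the NameNode. The NameNode inserts the file name into the file system hierarchy and allocates a data block for it, then replies to the client with the identity of the DataNode and the destination data block. The client then flushes the buffered data to the specified DataNode. When the file is closed, any data still unflushed in the local buffer is transferred to the DataNode, and the client tells the NameNode that the file is closed. Only at this point does the NameNode commit the file creation operation to a persistent store. If the NameNode dies before the file is closed, the file is lost.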

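Translation: This approach was adopted after careful consideration of the target applications that run on HDFS. These applications need streaming writes to files. If a client wrote directly to a remote file without any client-side buffering, network speed and network congestion would considerably reduce throughput. The approach is not without precedent: earlier distributed file systems such as AFS used client-side caching to improve performance. A POSIX requirement has been relaxed to achieve higher-performance data uploads.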
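The staging flow described in this section can be sketched as a toy simulation. This is not the HDFS client: `ToyNameNode`, `StagingClient`, the in-memory `storage` dictionary, and the round-robin DataNode choice are all hypothetical names and assumptions made for illustration, and the chunk size is shrunk to a few bytes so the behaviour is easy to follow.

```python
class ToyNameNode:
    """Hypothetical stand-in for the NameNode: tracks names, blocks, commits."""

    def __init__(self, datanodes):
        self.datanodes = datanodes
        self.namespace = {}      # file name -> list of (block_id, datanode)
        self.committed = set()   # files whose creation has been committed
        self._next_block = 0

    def allocate_block(self, name):
        # Insert the name into the hierarchy (if new) and allocate a block.
        block_id = self._next_block
        self._next_block += 1
        datanode = self.datanodes[block_id % len(self.datanodes)]  # assumption
        self.namespace.setdefault(name, []).append((block_id, datanode))
        return block_id, datanode

    def close_file(self, name):
        # The creation is committed only when the client reports the close.
        self.namespace.setdefault(name, [])
        self.committed.add(name)


class StagingClient:
    """Hypothetical client that buffers writes locally before flushing."""

    def __init__(self, namenode, storage, chunk_size):
        self.namenode = namenode
        self.storage = storage        # datanode -> {block_id: bytes}
        self.chunk_size = chunk_size
        self.buffer = bytearray()

    def write(self, name, data):
        # Application writes land in the local buffer first.
        self.buffer.extend(data)
        while len(self.buffer) >= self.chunk_size:
            self._flush(name, self.chunk_size)

    def _flush(self, name, n):
        # Contact the NameNode only when a chunk is ready to be sent.
        block_id, datanode = self.namenode.allocate_block(name)
        self.storage.setdefault(datanode, {})[block_id] = bytes(self.buffer[:n])
        del self.buffer[:n]

    def close(self, name):
        # Send whatever is left, then tell the NameNode the file is closed.
        if self.buffer:
            self._flush(name, len(self.buffer))
        self.namenode.close_file(name)


if __name__ == "__main__":
    namenode = ToyNameNode(["dn1", "dn2"])
    storage = {}
    client = StagingClient(namenode, storage, chunk_size=4)
    client.write("/demo.txt", b"abcdefghij")
    print("before close:", namenode.namespace, namenode.committed)
    client.close("/demo.txt")
    print("after close: ", namenode.namespace, namenode.committed)
```

In this sketch the NameNode hears nothing until a full chunk has accumulated, and the file counts as committed only after `close`, which mirrors why a NameNode failure before the close loses the file.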

Replication Pipelining

When a client is writing data to an HDFS file, its data is first written to a local buffer as explained in the previous section. Suppose the HDFS file has a replication factor of three. When the local buffer accumulates a chunk of user data, the client retrieves a list of DataNodes from the NameNode. This list contains the DataNodes that will host a replica of that block. The client then flushes the data chunk to the first DataNode. The first DataNode starts receiving the data in small portions, writes each portion to its local repository and transfers that portion to the second DataNode in the list. The second DataNode, in turn starts receiving each portion of the data block, writes that portion to its repository and then flushes that portion to the third DataNode. Finally, the third DataNode writes the data to its local repository. Thus, a DataNode can be receiving data from the previous one in the pipeline and at the same time forwarding data to the next one in the pipeline. Thus, the data is pipelined from one DataNode to the next.

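Translation: When a client writes data to an HDFS file, the data first goes to its local buffer, as explained in the previous section. Suppose the HDFS file has a replication factor of three. When the local buffer has accumulated a chunk of user data, the client retrieves a list of DataNodes from the NameNode. This list contains the DataNodes that will host a replica of that block. The client then flushes the data chunk to the first DataNode. The first DataNode receives the data in small portions, writes each portion to its local repository, and passes that portion on to the second DataNode in the list. The second DataNode likewise receives each portion, writes it to its own repository, and forwards it to the third DataNode. Finally, the third DataNode writes the data to its local repository. A DataNode can therefore be receiving data from the previous node in the pipeline while forwarding data to the next one, so the data is pipelined from one DataNode to the next.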
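The forwarding chain described above can be sketched as follows. This is a simplified, single-threaded illustration: `ToyDataNode`, `pipeline_write`, and the four-byte portion size are hypothetical, and where real DataNodes receive and forward concurrently, this sketch only reproduces the order in which each portion travels down the list.

```python
class ToyDataNode:
    """Hypothetical DataNode that stores a portion, then forwards it."""

    def __init__(self, name, events):
        self.name = name
        self.repository = bytearray()  # stands in for the local repository
        self.events = events           # shared log of (node, portion) pairs

    def receive(self, portion, downstream):
        # Write the portion locally first.
        self.repository.extend(portion)
        self.events.append((self.name, bytes(portion)))
        # Then pass the same portion to the next DataNode in the list, if any.
        if downstream:
            downstream[0].receive(portion, downstream[1:])


def pipeline_write(block, datanodes, portion_size=4):
    """Send a block to the first DataNode in small portions."""
    for start in range(0, len(block), portion_size):
        portion = block[start:start + portion_size]
        datanodes[0].receive(portion, datanodes[1:])


if __name__ == "__main__":
    events = []
    # Replication factor of three: the list has three DataNodes.
    nodes = [ToyDataNode(n, events) for n in ("dn1", "dn2", "dn3")]
    pipeline_write(b"0123456789", nodes)
    for node, portion in events:
        print(node, portion)
```

The client only ever talks to the first DataNode; each replica is produced by the node before it in the list, so every repository ends up holding the full block.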

Accessibility

When reposting, please credit the source: https://www.heiqu.com/d3408e714afa302e1adbd1938d50e09d.html