
<Sequential Access><Random Access><HDFS>


Overview

  • If you have read about HDFS, you have at least seen this sentence:
    • HDFS is a filesystem designed for storing very large files with streaming or sequential data access patterns.
    • That's to say, "Hadoop is good for sequential data access."
  • So the first question is bound to be: sequential access vs. random access?

Sequential Access vs. Random Access

  • Sequentially accessed data is contiguous on disk. The drive head sweeps the platters in order without frequent seeking, which makes it fast, since seek time is the main factor limiting disk read/write speed. Purely sequential workloads are rare in everyday applications; the continuous backup of a large file is one example, and dd is the classic sequential reader/writer.

  • Random access means the head moves around constantly because the data is not contiguous on disk, which is a consequence of how data ends up on disk over time. Random access is much slower than sequential access, again because of the frequent seeking and positioning: head movement eats up most of the time. Most applications' disk reads and writes are random.

    • In practice, take Linux as an example: when a file is written, the OS allocates roughly eight blocks ahead, so early on it manages to keep the data contiguous on disk, but it cannot keep that up globally. Suppose the disk is brand new and you write a 300 KB file: it is laid out contiguously. Other files are then written, each contiguous as well. After a while many files have been written, and of course files get modified. When you modify a file, you find that the blocks next to it have already been taken by other files, so the head has no choice but to write the changed blocks somewhere else on the disk. Over time, most files on the disk are no longer contiguous but scattered across the platter. When your program then reads a file, the head seeks constantly, gathering data from what looks like completely arbitrary positions (where the head moves is determined by the inode). At that point the program's disk access is random (the filefrag sketch below shows this directly).
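
    On Linux you can observe this fragmentation directly with filefrag (from e2fsprogs). The path below is an illustrative assumption; any long-lived, frequently modified file will do:

      # -v lists every extent (contiguous run of blocks) the file occupies;
      # many extents means many seeks for a full sequential read of the file.
      filefrag -v /var/log/syslog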

  • Sequential Access pattern is when you read your data in sequence (often from start to finish).

    • Consider a book example. When reading a novel, you use sequential order: you start with page 1, then move to page 2 and so on.

  • The other common pattern is called Random Access. This is when you jump from one place to another, and possibly even backwards, when reading data.
    • For a book example, consider a dictionary. You don't read it like you read a novel. Instead, you search for your word somewhere in the middle. And when you're done looking up that word, you may go look for another word located hundreds of pages away from where your book is open at the moment. That search for where you should start reading is called a "seek".
    • When you access sequentially, you only need to seek once and then read until you're done with that data. When doing random access, you need to seek every time you want to switch to a different place in your file. This can be quite a performance hit on hard drives, because seeking is really expensive on magnetic drives; the dd sketch below gives a rough feel for the difference.
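
    A minimal way to feel the difference from a shell is dd, already mentioned above as the classic sequential reader. Everything in this sketch (scratch path, sizes, offsets) is an illustrative assumption, not from the original measurements:

      # Create a 1 GiB scratch file, then read it back sequentially:
      # one seek, then a single long streaming read.
      dd if=/dev/zero of=/tmp/bigfile bs=1M count=1024
      dd if=/tmp/bigfile of=/dev/null bs=1M iflag=direct

      # "Random" reads: jump to scattered 1 MiB offsets; on a magnetic
      # disk every skip costs the head another seek.
      for off in 900 17 512 233 64; do
          dd if=/tmp/bigfile of=/dev/null bs=1M count=1 skip=$off iflag=direct
      done

    On a spinning disk the scattered reads take disproportionately long for the bytes they return, which is exactly the seek penalty described above.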

HDFS

  • Hadoop uses blocks to store a file or parts of a file. A Hadoop block is a file on the underlying filesystem. Since the underlying filesystem stores files as blocks, one Hadoop block may consist of many blocks in the underlying file system. Blocks are large. They default to 64 megabytes each and most systems run with block sizes of 128 megabytes or larger.

    Hadoop is designed for streaming or sequential data access rather than random access. Sequential data access means fewer seeks, since Hadoop only seeks to the beginning of each block and begins reading sequentially from there.
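
    To check these numbers on a live cluster, the stock HDFS CLI can print the configured block size and show how a given file is split into blocks (the path here is an illustrative assumption):

      # Configured block size in bytes (134217728 = 128 MB)
      hdfs getconf -confKey dfs.blocksize

      # Per-file block layout: one line per block, with its length and replication
      hdfs fsck /test/test.csv -files -blocks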

  • HDFS is append-only. To modify any portion of a file that has already been written, one must rewrite the entire file and replace the old one.
  • Reading one very large file end to end therefore costs only about fileSize / blockSize seeks; within each block, access is sequential.
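
    A quick sanity check of that formula with made-up numbers: scanning a 10 GB file stored in 128 MB blocks costs about 80 seeks.

      # ceil(fileSize / blockSize), working in MB: 10 GB = 10240 MB
      echo $(( (10240 + 127) / 128 ))   # prints 80
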
  • HDFS is not POSIX compliant.
  • Hadoop works best with very large files. The larger the file, the less time Hadoop spends seeking to the next data location on disk and the more time it runs at the bandwidth limit of your disks.
  • HDFS blocks offer the following advantages:
    1. They are fixed in size, which makes it easy to calculate how many fit on a disk (see the quick calculation after this list).

    2. Because a file is made up of blocks that can be spread over multiple nodes, it can be larger than any single disk in the cluster.

    3. HDFS blocks also don't waste space. If a file is not an even multiple of the block size, the block containing the remainder occupies only the space the remainder actually needs, not a full block. [Question: how is the appendToFile operation optimized in that case???]
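
    A quick illustration of points 1 and 3 with made-up numbers (the calculation promised in point 1):

      # Point 1: a 1 TB disk with 128 MB blocks holds a predictable number of blocks
      echo $(( 1024 * 1024 / 128 ))   # prints 8192

      # Point 3: a 300 KB file still occupies ~300 KB on disk; fsck reports each
      # block's len as the bytes actually stored, which is why the short last block
      # in the experiment below shows len well under 134217728.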

  • About appendToFile: given the discussion above, HDFS is optimized for sequential access, yet a small file (smaller than the block size) physically occupies only its actual size rather than a full block. In that case, how is an append optimized? (Does the small file have to stay contiguous within its block?) I googled for a long time without finding an answer; the closest thing was a design document for the append operation, but it only discusses consistency and fault tolerance. In the end I fell back on an experiment, as follows:
    • Before appending, let's look at the file's blocks:
    • hadoop fsck /test -blocks
      ---------------------------------
      /test/test1.txt:  Under replicated BP-1610905963-10.3.242.99-1494403766821:blk_1073742405_1583. Target Replicas is 2 but found 1 replica(s).
      .
      /test/test2.txt:  Under replicated BP-1610905963-10.3.242.99-1494403766821:blk_1073742406_1584. Target Replicas is 2 but found 1 replica(s).
    • Perform the appendToFile operation:
    • hadoop fs -appendToFile /home/hhh/log.txt /test/test1.txt
    • Check the blocks again:
    • hadoop fsck /test -blocks
      ------------------------------------
      /test/test1.txt:  Under replicated BP-1610905963-10.3.242.99-1494403766821:blk_1073742405_1894. Target Replicas is 2 but found 1 replica(s).
      .
      /test/test2.txt:  Under replicated BP-1610905963-10.3.242.99-1494403766821:blk_1073742406_1584. Target Replicas is 2 but found 1 replica(s).
    • Notice that the block for test1.txt has changed: the block ID itself (1073742405) is unchanged, but its generation stamp was bumped from 1583 to 1894.
    • To double-check, I ran the same test on a file with three blocks. Blocks before appendToFile (note the generation stamps ending in 96, 97, 98):
      • /test/test.csv 314015127 bytes, 3 block(s):  Under replicated BP-1610905963-10.3.242.99-1494403766821:blk_1073742709_1896. Target Replicas is 2 but found 1 replica(s).
         Under replicated BP-1610905963-10.3.242.99-1494403766821:blk_1073742710_1897. Target Replicas is 2 but found 1 replica(s).
         Under replicated BP-1610905963-10.3.242.99-1494403766821:blk_1073742711_1898. Target Replicas is 2 but found 1 replica(s).
        0. BP-1610905963-10.3.242.99-1494403766821:blk_1073742709_1896 len=134217728 repl=1
        1. BP-1610905963-10.3.242.99-1494403766821:blk_1073742710_1897 len=134217728 repl=1
        2. BP-1610905963-10.3.242.99-1494403766821:blk_1073742711_1898 len=45579671 repl=1
        
      • After appendToFile the generation stamps end in 96, 97, 99; in other words, only the last block changed:
        /test/test.csv 314015150 bytes, 3 block(s):  Under replicated BP-1610905963-10.3.242.99-1494403766821:blk_1073742709_1896. Target Replicas is 2 but found 1 replica(s).
         Under replicated BP-1610905963-10.3.242.99-1494403766821:blk_1073742710_1897. Target Replicas is 2 but found 1 replica(s).
         Under replicated BP-1610905963-10.3.242.99-1494403766821:blk_1073742711_1899. Target Replicas is 2 but found 1 replica(s).
        0. BP-1610905963-10.3.242.99-1494403766821:blk_1073742709_1896 len=134217728 repl=1
        1. BP-1610905963-10.3.242.99-1494403766821:blk_1073742710_1897 len=134217728 repl=1
        2. BP-1610905963-10.3.242.99-1494403766821:blk_1073742711_1899 len=45579694 repl=1
    • In short, appendToFile updates the last block of the file in place: the file grew by 23 bytes (314015127 → 314015150), and only the last block's length changed accordingly (45579671 → 45579694).
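
    For reference, the whole experiment condenses to three commands (the paths are the ones used above; adjust them to your cluster):

      hadoop fsck /test/test.csv -blocks                        # record block IDs and generation stamps
      hadoop fs -appendToFile /home/hhh/log.txt /test/test.csv  # append a small local file
      hadoop fsck /test/test.csv -blocks                        # only the last block's stamp and len change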
