
Introduction to Hadoop SequenceFile

A SequenceFile is a flat storage file made up of a byte stream of binary-serialized key/value pairs.

Which SequenceFile writer is used depends on the compression type, defined by the CompressionType enum:

public static enum CompressionType {
    /** No compression. */
    NONE,
    /** Compress values only. */
    RECORD,
    /** Compress the keys/values of many records together as one block. */
    BLOCK
}
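To make the three modes concrete, here is a minimal standalone sketch of how a compression type could map onto the two boolean flags that the file header records (`isCompressed()` and `isBlockCompressed()`, which appear in the `writeFileHeader()` source later in this article). The class `CompressionFlagsSketch`, its local copy of the enum, and the helper `flagBytes` are hypothetical names made up for illustration. The mapping (NONE sets neither flag, RECORD sets only the first, BLOCK sets both) is an assumption about how the writers behave, not code taken from Hadoop.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class CompressionFlagsSketch {
    /** Local mirror of the enum shown above; not Hadoop's own class. */
    enum CompressionType { NONE, RECORD, BLOCK }

    /** Assumed mapping: any mode other than NONE counts as compressed. */
    static boolean isCompressed(CompressionType type) {
        return type != CompressionType.NONE;
    }

    /** Assumed mapping: only BLOCK groups many records into one compressed block. */
    static boolean isBlockCompressed(CompressionType type) {
        return type == CompressionType.BLOCK;
    }

    /** Serializes the two flags the same way writeBoolean does: one byte each, 1 or 0. */
    static byte[] flagBytes(CompressionType type) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeBoolean(isCompressed(type));
            out.writeBoolean(isBlockCompressed(type));
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        for (CompressionType type : CompressionType.values()) {
            byte[] flags = flagBytes(type);
            System.out.println(type + " -> " + flags[0] + ", " + flags[1]);
        }
    }
}
```

Running `main` prints one line per mode with its two flag bytes. A reader of the file would presumably use these two bytes to decide how to decode the records that follow the header.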

There are three SequenceFile writers, based on the CompressionType used to compress key/value pairs:

1. Writer: uncompressed records.

SequenceFile contains the implementation of Writer. Its source code is as follows:

public static class Writer implements java.io.Closeable, Syncable
/** Write and flush the file header. */
private void writeFileHeader()
  throws IOException {
  out.write(VERSION);
  Text.writeString(out, keyClass.getName());
  Text.writeString(out, valClass.getName());

  out.writeBoolean(this.isCompressed());
  out.writeBoolean(this.isBlockCompressed());

  if (this.isCompressed()) {