
HBase Source Code Study ---- Flush (4)


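As the previous three articles showed, an HBase flush has three main phases: snapshot, flush, and commit. This article digs into the HBase MemStore and walks through the snapshot flow.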

volatile Section activeSection;
volatile Section snapshotSection;

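The MemStore consists mainly of these two Sections. Incoming Cells are stored in activeSection, and the volatile keyword guarantees that a change to either reference is visible to other threads. Next comes the snapshot method: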

@Override
public MemStoreSnapshot snapshot() {
  if (!snapshotSection.getCellSkipListSet().isEmpty()) {
    LOG.warn("Snapshot called again without clearing previous. "
        + "Doing nothing. Another ongoing flush or did we fail last attempt?");
  } else {
    this.snapshotId = EnvironmentEdgeManager.currentTime();
    if (!activeSection.getCellSkipListSet().isEmpty()) {
      snapshotSection = activeSection;
      activeSection = Section.newActiveSection(comparator, conf);
      snapshotSection.getHeapSize().addAndGet(-DEEP_OVERHEAD);
      timeOfOldestEdit = Long.MAX_VALUE;
    }
  }
  MemStoreSnapshot memStoreSnapshot = new MemStoreSnapshot(
      this.snapshotId,
      snapshotSection.getCellsCount().get(),
      snapshotSection.getHeapSize().get(),
      snapshotSection.getTimeRangeTracker(),
      new CollectionBackedScanner(snapshotSection.getCellSkipListSet(), this.comparator),
      this.tagsPresent);
  this.tagsPresent = false;
  return memStoreSnapshot;
}

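The snapshot() method simply assigns the activeSection reference to snapshotSection and then constructs a new activeSection. No cells are copied. The old snapshotSection is reclaimed by the garbage collector once nothing references it. If the previous snapshot has not been cleared yet, the method only logs a warning and returns the existing snapshot.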
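The reference swap can be illustrated with a stripped-down sketch. Everything here is hypothetical and is not HBase code: `SnapshotSwapSketch` stands in for the memstore, its nested `Section` holds plain strings in a JDK `ConcurrentSkipListSet` instead of Cells, and `snapshot()` is marked `synchronized` only as an assumption of this sketch, since the article does not show how the real callers are serialized.

```java
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified stand-in for the memstore; not HBase code.
class SnapshotSwapSketch {

    // Simplified stand-in for Section: a sorted set plus a size counter.
    static final class Section {
        final ConcurrentSkipListSet<String> cells = new ConcurrentSkipListSet<>();
        final AtomicLong heapSize = new AtomicLong();
    }

    // volatile: once a new reference is assigned, other threads see it.
    private volatile Section active = new Section();
    private volatile Section snapshot = new Section();

    // Writers always go to whatever section is currently active.
    void add(String cell) {
        Section s = active; // read the volatile reference once
        if (s.cells.add(cell)) {
            s.heapSize.addAndGet(cell.length());
        }
    }

    // Swaps references instead of copying cells; returns the snapshot's cell count.
    // synchronized is an assumption of this sketch.
    synchronized int snapshot() {
        if (!snapshot.cells.isEmpty()) {
            // Previous snapshot was never cleared: do nothing, as in the original.
        } else if (!active.cells.isEmpty()) {
            snapshot = active;      // the old active section becomes the snapshot
            active = new Section(); // new writes land in a fresh section
        }
        return snapshot.cells.size();
    }

    // Called after a successful flush in this sketch; drops the flushed section.
    synchronized void clearSnapshot() {
        snapshot = new Section();
    }

    int activeCount() {
        return active.cells.size();
    }
}
```

Calling `snapshot()` twice without `clearSnapshot()` in between leaves the first snapshot untouched, which mirrors the warning branch of the real method.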
Section is a static nested class of DefaultMemStore. Its constructor is as follows:

private Section(final KeyValue.KVComparator c,
            final Configuration conf, long initHeapSize) {
      this.cellSet = new CellSkipListSet(c);
      this.heapSize = new AtomicLong(initHeapSize);
      this.cellCount = new AtomicInteger(0);
      if (conf != null && conf.getBoolean(USEMSLAB_KEY, USEMSLAB_DEFAULT)) {
        String className = conf.get(MSLAB_CLASS_NAME, HeapMemStoreLAB.class.getName());
        this.allocator = ReflectionUtils.instantiateWithCustomCtor(className,
                new Class[]{Configuration.class}, new Object[]{conf});
      } else {
        this.allocator = null;
      }
    }
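The constructor picks its allocator class from configuration and builds it through a one-argument constructor. The same pattern can be sketched with plain JDK reflection. This is not how ReflectionUtils is implemented; the class names, the `Allocator` interface, and the property keys `sketch.lab.enabled`, `sketch.lab.class`, and `sketch.chunksize` are all hypothetical, and `java.util.Properties` stands in for the Hadoop `Configuration`.

```java
import java.lang.reflect.Constructor;
import java.util.Properties;

// Hypothetical sketch of a config-driven, reflectively constructed allocator.
class ReflectiveAllocatorSketch {

    // Hypothetical allocator interface; stands in for the LAB type.
    interface Allocator {
        int chunkSize();
    }

    // Hypothetical default implementation with a (Properties) constructor.
    public static final class HeapAllocator implements Allocator {
        private final int chunkSize;

        public HeapAllocator(Properties conf) {
            this.chunkSize = Integer.parseInt(
                conf.getProperty("sketch.chunksize", String.valueOf(2048 * 1024)));
        }

        @Override
        public int chunkSize() {
            return chunkSize;
        }
    }

    // Returns null when the feature is disabled, like the else branch above.
    static Allocator newAllocator(Properties conf) {
        if (!Boolean.parseBoolean(conf.getProperty("sketch.lab.enabled", "true"))) {
            return null;
        }
        String className =
            conf.getProperty("sketch.lab.class", HeapAllocator.class.getName());
        try {
            // Look up the constructor that takes the configuration object.
            Constructor<?> ctor =
                Class.forName(className).getDeclaredConstructor(Properties.class);
            return (Allocator) ctor.newInstance(conf);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }
}
```

The benefit of this design is that an alternative allocator can be swapped in through configuration alone, provided it exposes a constructor with the expected parameter type.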

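Section stores its cells in a CellSkipListSet. Memory allocation involves the class HeapMemStoreLAB, where LAB stands for local allocation buffer. The LAB ensures that all cells in a memstore live inside large chunks, so that after a flush memory is released one whole chunk at a time, which reduces heap fragmentation in the RegionServer. Four configuration parameters are involved: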

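| Parameter | Meaning | Default |
| --- | --- | --- |
| hbase.hregion.memstore.mslab.enabled | Whether to use the LAB | true |
| hbase.regionserver.mslab.class | Concrete LAB implementation class | HeapMemStoreLAB |
| hbase.hregion.memstore.mslab.chunksize | Size of one chunk | 2048 * 1024 B (2 MB) |
| hbase.hregion.memstore.mslab.max.allocation | Maximum number of bytes the LAB hands out in a single allocation | 256 * 1024 B (256 KB) |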
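To make the roles of the chunk size and the maximum allocation concrete, here is a minimal single-threaded sketch of chunk-based allocation. It is not the HeapMemStoreLAB implementation: `ChunkAllocatorSketch`, `Allocation`, and `copyInto` are hypothetical names, there is no thread safety, and returning null for oversized requests (leaving the caller to keep the data on the regular heap) is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, single-threaded sketch of chunk-based allocation; not HBase code.
class ChunkAllocatorSketch {
    // Defaults taken from the table above.
    static final int CHUNK_SIZE_DEFAULT = 2048 * 1024;
    static final int MAX_ALLOC_DEFAULT = 256 * 1024;

    // Where a copied cell lives: which chunk, at what offset, and how long.
    static final class Allocation {
        final byte[] chunk;
        final int offset;
        final int length;

        Allocation(byte[] chunk, int offset, int length) {
            this.chunk = chunk;
            this.offset = offset;
            this.length = length;
        }
    }

    private final int chunkSize;
    private final int maxAlloc;
    private final List<byte[]> chunks = new ArrayList<>();
    private byte[] current; // chunk currently being filled
    private int nextFree;   // bump pointer into the current chunk

    ChunkAllocatorSketch(int chunkSize, int maxAlloc) {
        if (maxAlloc > chunkSize) {
            throw new IllegalArgumentException("maxAlloc must not exceed chunkSize");
        }
        this.chunkSize = chunkSize;
        this.maxAlloc = maxAlloc;
    }

    // Copies the bytes into a chunk; returns null if the request is too large.
    Allocation copyInto(byte[] cell) {
        if (cell.length > maxAlloc) {
            return null; // assumption: the caller keeps large cells on the heap
        }
        if (current == null || nextFree + cell.length > chunkSize) {
            current = new byte[chunkSize]; // start a new chunk when this one is full
            chunks.add(current);
            nextFree = 0;
        }
        System.arraycopy(cell, 0, current, nextFree, cell.length);
        Allocation a = new Allocation(current, nextFree, cell.length);
        nextFree += cell.length;
        return a;
    }

    int chunkCount() {
        return chunks.size();
    }
}
```

Because every small cell is copied into one of a few large arrays, dropping the section after a flush frees memory in chunk-sized pieces rather than as many scattered small objects.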