HBase Source Code Analysis: How the Region Location Is Found

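From the analysis of the client source code we saw that, before each connection is set up, the client must first find the regionserver hosting the region that the rowkey belongs to. This post walks through the whole process of finding that regionserver.
Starting from connection.getRegionLocator(tableName) and tracing the calls down, the code that is eventually invoked is the following: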

// ConnectionManager.java
    public HRegionLocation getRegionLocation(final TableName tableName,
        final byte [] row, boolean reload)
    throws IOException {
      return reload? relocateRegion(tableName, row): locateRegion(tableName, row);
    }

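The reload flag decides where the location comes from: when it is true, relocateRegion fetches the location remotely again; otherwise locateRegion may answer from the local cache.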

// ConnectionManager.java
    public RegionLocations locateRegion(final TableName tableName,
        final byte [] row, boolean useCache, boolean retry, int replicaId)
    throws IOException {
      if (this.closed) throw new IOException(toString() + " closed");
      if (tableName== null || tableName.getName().length == 0) {
        throw new IllegalArgumentException(
            "table name cannot be null or zero length");
      }
      if (tableName.equals(TableName.META_TABLE_NAME)) {
        return locateMeta(tableName, useCache, replicaId);
      } else {
        // Region not in the cache - have to go to the meta RS
        return locateRegionInMeta(tableName, row, useCache, retry, replicaId);
      }
    }

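Tracing further, the code branches on the kind of table: for the meta table it looks up the location of the meta table itself, and for any other table it looks up the target region's location in the meta table.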

// ConnectionManager.java
private RegionLocations locateMeta(final TableName tableName,
        boolean useCache, int replicaId) throws IOException {
      // HBASE-10785: We cache the location of the META itself, so that we are not overloading
      // zookeeper with one request for every region lookup. We cache the META with empty row
      // key in MetaCache.
      byte[] metaCacheKey = HConstants.EMPTY_START_ROW; // use byte[0] as the row for meta
      RegionLocations locations = null;
      if (useCache) {
        locations = getCachedLocation(tableName, metaCacheKey);
        if (locations != null && locations.getRegionLocation(replicaId) != null) {
          return locations;
        }
      }

      // only one thread should do the lookup.
      synchronized (metaRegionLock) {
        // Check the cache again for a hit in case some other thread made the
        // same query while we were waiting on the lock.
        if (useCache) {
          locations = getCachedLocation(tableName, metaCacheKey);
          if (locations != null && locations.getRegionLocation(replicaId) != null) {
            return locations;
          }
        }

        // Look up from zookeeper
        locations = this.registry.getMetaRegionLocation();
        if (locations != null) {
          cacheLocation(tableName, locations);
        }
      }
      return locations;
    }

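The steps for obtaining the location of the meta table are:

  1. Try the local cache first; on a miss, go to step 2.
  2. Read the address of the regionserver hosting the meta table from ${zookeeper.znode.parent}/meta-region-server in ZooKeeper, and cache the result.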
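
The two steps above amount to a double-checked cache lookup: check the cache, take the lock, check the cache again, and only then go to the registry. The following is a minimal sketch of that pattern, not HBase's implementation. The class name MetaLocationCacheSketch, the String-based location, and the Supplier standing in for the ZooKeeper read are all assumptions made for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-in for the meta location cache; not an HBase class.
class MetaLocationCacheSketch {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    private final Object metaRegionLock = new Object();
    // Assumption: the ZooKeeper read is modelled as a plain Supplier.
    private final Supplier<String> registry;

    MetaLocationCacheSketch(Supplier<String> registry) {
        this.registry = registry;
    }

    String locateMeta(boolean useCache) {
        final String metaCacheKey = ""; // empty row key, as in the excerpt above
        if (useCache) {
            String cached = cache.get(metaCacheKey);
            if (cached != null) {
                return cached;
            }
        }
        // Only one thread at a time performs the expensive lookup.
        synchronized (metaRegionLock) {
            // Re-check: another thread may have filled the cache while we waited.
            if (useCache) {
                String cached = cache.get(metaCacheKey);
                if (cached != null) {
                    return cached;
                }
            }
            String location = registry.get();
            if (location != null) {
                cache.put(metaCacheKey, location);
            }
            return location;
        }
    }

    public static void main(String[] args) {
        AtomicInteger registryReads = new AtomicInteger();
        MetaLocationCacheSketch locator = new MetaLocationCacheSketch(() -> {
            registryReads.incrementAndGet();
            return "rs1.example.com"; // made-up server name
        });
        locator.locateMeta(true);
        locator.locateMeta(true);
        System.out.println(registryReads.get()); // prints 1: second call hit the cache
        locator.locateMeta(false);
        System.out.println(registryReads.get()); // prints 2: useCache=false bypasses the cache
    }
}
```

The second cache check inside the lock is what keeps a burst of concurrent lookups from each issuing its own registry read.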

Next, let's look at how the location is obtained for a user table.

//ConnectionManager.java
    /*
      * Search the hbase:meta table for the HRegionLocation
      * info that contains the table and row we're seeking.
      */
    private RegionLocations locateRegionInMeta(TableName tableName, byte[] row,
                   boolean useCache, boolean retry, int replicaId) throws IOException {

      // If we are supposed to be using the cache, look in the cache to see if
      // we already have the region.
      if (useCache) {
        RegionLocations locations = getCachedLocation(tableName, row);
        if (locations != null && locations.getRegionLocation(replicaId) != null) {
          return locations;
        }
      }

      // build the key of the meta region we should be looking for.
      // the extra 9's on the end are necessary to allow "exact" matches
      // without knowing the precise region names.
      byte[] metaKey = HRegionInfo.createRegionName(tableName, row, HConstants.NINES, false);

      Scan s = new Scan();
      s.setReversed(true);
      s.setStartRow(metaKey);
      s.setSmall(true);
      s.setCaching(1);
      if (this.useMetaReplicas) {
        s.setConsistency(Consistency.TIMELINE);
      }

      int localNumRetries = (retry ? numTries : 1);

      for (int tries = 0; true; tries++) {
        if (tries >= localNumRetries) {
          throw new NoServerForRegionException("Unable to find region for "
              + Bytes.toStringBinary(row) + " in " + tableName +
              " after " + localNumRetries + " tries.");
        }
        if (useCache) {
          RegionLocations locations = getCachedLocation(tableName, row);
          if (locations != null && locations.getRegionLocation(replicaId) != null) {
            return locations;
          }
        } else {
          // If we are not supposed to be using the cache, delete any existing cached location
          // so it won't interfere.
          metaCache.clearCache(tableName, row);
        }

        // Query the meta region
        try {
          Result regionInfoRow = null;
          ReversedClientScanner rcs = null;
          try {
            rcs = new ClientSmallReversedScanner(conf, s, TableName.META_TABLE_NAME, this,
              rpcCallerFactory, rpcControllerFactory, getMetaLookupPool(), 0);
            regionInfoRow = rcs.next();
          } finally {
            if (rcs != null) {
              rcs.close();
            }
          }

          if (regionInfoRow == null) {
            throw new TableNotFoundException(tableName);
          }

          // convert the row result into the HRegionLocation we need!
          RegionLocations locations = MetaTableAccessor.getRegionLocations(regionInfoRow);
          if (locations == null || locations.getRegionLocation(replicaId) == null) {
            throw new IOException("HRegionInfo was null in " +
              tableName + ", row=" + regionInfoRow);
          }
          HRegionInfo regionInfo = locations.getRegionLocation(replicaId).getRegionInfo();
          if (regionInfo == null) {
            throw new IOException("HRegionInfo was null or empty in " +
              TableName.META_TABLE_NAME + ", row=" + regionInfoRow);
          }

          // possible we got a region of a different table...
          if (!regionInfo.getTable().equals(tableName)) {
            throw new TableNotFoundException(
                  "Table '" + tableName + "' was not found, got: " +
                  regionInfo.getTable() + ".");
          }
          if (regionInfo.isSplit()) {
            throw new RegionOfflineException("the only available region for" +
              " the required row is a split parent," +
              " the daughters should be online soon: " +
              regionInfo.getRegionNameAsString());
          }
          if (regionInfo.isOffline()) {
            throw new RegionOfflineException("the region is offline, could" +
              " be caused by a disable table call: " +
              regionInfo.getRegionNameAsString());
          }

          ServerName serverName = locations.getRegionLocation(replicaId).getServerName();
          if (serverName == null) {
            throw new NoServerForRegionException("No server address listed " +
              "in " + TableName.META_TABLE_NAME + " for region " +
              regionInfo.getRegionNameAsString() + " containing row " +
              Bytes.toStringBinary(row));
          }

          if (isDeadServer(serverName)){
            throw new RegionServerStoppedException("hbase:meta says the region "+
                regionInfo.getRegionNameAsString()+" is managed by the server " + serverName +
                ", but it is dead.");
          }
          // Instantiate the location
          cacheLocation(tableName, locations);
          return locations;
        } catch (TableNotFoundException e) {
          // if we got this error, probably means the table just plain doesn't
          // exist. rethrow the error immediately. this should always be coming
          // from the HTable constructor.
          throw e;
        } catch (IOException e) {
          ExceptionUtil.rethrowIfInterrupt(e);

          if (e instanceof RemoteException) {
            e = ((RemoteException)e).unwrapRemoteException();
          }
          if (tries < localNumRetries - 1) {
            if (LOG.isDebugEnabled()) {
              LOG.debug("locateRegionInMeta parentTable=" +
                  TableName.META_TABLE_NAME + ", metaLocation=" +
                ", attempt=" + tries + " of " +
                localNumRetries + " failed; retrying after sleep of " +
                ConnectionUtils.getPauseTime(this.pause, tries) + " because: " + e.getMessage());
            }
          } else {
            throw e;
          }
          // Only relocate the parent region if necessary
          if(!(e instanceof RegionOfflineException ||
              e instanceof NoServerForRegionException)) {
            relocateRegion(TableName.META_TABLE_NAME, metaKey, replicaId);
          }
        }
        try{
          Thread.sleep(ConnectionUtils.getPauseTime(this.pause, tries));
        } catch (InterruptedException e) {
          throw new InterruptedIOException("Giving up trying to location region in " +
            "meta: thread is interrupted.");
        }
      }
    }

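The concrete steps are:

  1. If the cache is in use, try the local cache first.
  2. On a miss, build a small scan request and generate a meta table rowkey from the table name and the row.
    Note the rowkey format tableName,row,99999999999999, and that the scan is both small and reversed.
  3. Take the first record returned by the scan, then run a series of checks to make sure the returned location is valid.
    The lookup is also retried; the number of attempts is set by the hbase.client.retries.number parameter, whose default value is 31.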
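
To see why the 99999999999999 suffix combined with a reversed scan lands on the right region, the lookup can be imitated with a sorted map. This is a rough sketch under simplifying assumptions: meta row keys are reduced to tableName,startKey,regionId strings, plain string ordering replaces the meta table's own comparator, and the class MetaLookupSketch, the sample regions, and the server names are invented for illustration.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical model of the reversed small scan against hbase:meta.
class MetaLookupSketch {
    // Same 14-digit suffix as HConstants.NINES in the excerpt above.
    static final String NINES = "99999999999999";

    // Toy meta table: "table,startKey,regionId" -> server name (all values made up).
    static TreeMap<String, String> sampleMeta() {
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("t1,,1500000000000", "rs1");  // region covering rows before "m"
        meta.put("t1,m,1500000000001", "rs2"); // region covering rows from "m" on
        return meta;
    }

    static String metaKey(String table, String row) {
        return table + "," + row + "," + NINES;
    }

    // floorEntry plays the role of "reversed scan, start row = metaKey, take 1 row".
    static String locate(NavigableMap<String, String> meta, String table, String row) {
        Map.Entry<String, String> hit = meta.floorEntry(metaKey(table, row));
        // No row at all, or a row belonging to another table: table not found.
        if (hit == null || !hit.getKey().startsWith(table + ",")) {
            return null;
        }
        return hit.getValue();
    }

    public static void main(String[] args) {
        TreeMap<String, String> meta = sampleMeta();
        System.out.println(locate(meta, "t1", "g")); // prints rs1
        System.out.println(locate(meta, "t1", "m")); // prints rs2
        System.out.println(locate(meta, "t1", "z")); // prints rs2
        System.out.println(locate(meta, "t2", "a")); // prints null
    }
}
```

Because the suffix sorts after any region id in this model, the search key for row "m" falls just after the entry whose start key is "m", so the nearest entry at or before it is the region that contains the row. The lookup for t2 returns an entry of t1, which mirrors the "possible we got a region of a different table" check in locateRegionInMeta.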