FirstKeyOnlyFilter: Usage and Example
阿新 • Published: 2019-01-01
FirstKeyOnlyFilter: the API documentation describes it as follows:
A filter that will only return the first KV from each row.
This filter can be used to more efficiently perform row count operations.
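That is clear enough: the filter returns only the first KV of each row. Since every row still yields exactly one result, it can be used to count the rows in a table, and it is fast because the remaining KVs of each row are never sent to the client.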
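To make the counting argument concrete, here is a minimal sketch in plain Java. It is a toy in-memory model, not HBase code: the class `FirstKvModel`, its methods, and the sample data are all hypothetical, and a "row" is simply an ordered map of column name to value. It only illustrates that keeping the first KV of each row leaves the row count unchanged while reducing the number of KVs returned.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: this does not use or imitate the HBase client API.
class FirstKvModel {

    /** Keeps only the first entry (the "first KV") of each non-empty row. */
    static List<Map<String, String>> firstKvOnly(List<Map<String, String>> rows) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> row : rows) {
            Map<String, String> trimmed = new LinkedHashMap<>();
            for (Map.Entry<String, String> kv : row.entrySet()) {
                trimmed.put(kv.getKey(), kv.getValue());
                break; // stop after the first KV of this row
            }
            out.add(trimmed);
        }
        return out;
    }

    /** Total number of KVs across all rows. */
    static int totalKvs(List<Map<String, String>> rows) {
        int n = 0;
        for (Map<String, String> row : rows) {
            n += row.size();
        }
        return n;
    }

    /** Hypothetical sample data: three rows holding 3, 2 and 1 KVs. */
    static List<Map<String, String>> sampleRows() {
        List<Map<String, String>> rows = new ArrayList<>();
        Map<String, String> r1 = new LinkedHashMap<>();
        r1.put("cf:a", "1");
        r1.put("cf:b", "2");
        r1.put("cf:c", "3");
        Map<String, String> r2 = new LinkedHashMap<>();
        r2.put("cf:a", "4");
        r2.put("cf:b", "5");
        Map<String, String> r3 = new LinkedHashMap<>();
        r3.put("cf:a", "6");
        rows.add(r1);
        rows.add(r2);
        rows.add(r3);
        return rows;
    }

    public static void main(String[] args) {
        List<Map<String, String>> rows = sampleRows();
        List<Map<String, String>> filtered = firstKvOnly(rows);
        System.out.println("rows: " + rows.size() + " -> " + filtered.size());
        System.out.println("KVs:  " + totalKvs(rows) + " -> " + totalKvs(filtered));
    }
}
```

With this sample data the row count stays at 3 while the KV count drops from 6 to 3. The wider the rows in a table, the larger the saving.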
The code is below; corrections are welcome.
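    import java.io.IOException;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;

    public int getCount() {
        long bef = System.currentTimeMillis();
        int i = 0;
        HTable tableKeyword = null;
        ResultScanner rs = null;
        try {
            // Open the table inside the try block: the constructor can throw IOException.
            tableKeyword = new HTable(conf, "tableName");
            tableKeyword.setScannerCaching(500);
            Scan s = new Scan();
            s.setCaching(500);                      // rows fetched per scanner RPC
            s.setCacheBlocks(false);                // do not fill the block cache during a full scan
            s.setFilter(new FirstKeyOnlyFilter());  // return only the first KV of each row
            rs = tableKeyword.getScanner(s);
            // Iterate inside the try block so a failed getScanner() cannot leave rs null here.
            for (org.apache.hadoop.hbase.client.Result r : rs) {
                i++;
            }
        } catch (IOException e) {
            log.warn("row count scan failed", e);
        } finally {
            if (rs != null) {
                rs.close();
            }
            if (tableKeyword != null) {
                try {
                    tableKeyword.close();
                } catch (IOException e) {
                    log.warn("failed to close table", e);
                }
            }
        }
        long now = System.currentTimeMillis();
        log.warn("Total rows in keyword table: " + i + ", elapsed seconds: " + (now - bef) / 1000.0);
        return i;
    }

It is best to set all three of these parameters: tableKeyword.setScannerCaching(500), s.setCaching(500), and s.setCacheBlocks(false). Without them the scan is much slower.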
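As a rough illustration of why the caching value matters, the sketch below estimates how many round trips a scan needs for a given number of rows. It is a back-of-the-envelope model, not HBase code: the class `ScanRpcEstimate`, the method `estimateRpcs`, and the table size of one million rows are hypothetical, and the estimate ignores scanner open and close calls, region boundaries, and any size-based limits on a batch.

```java
// Back-of-the-envelope model only: this does not call the HBase client.
class ScanRpcEstimate {

    /** Rough number of fetch round trips: ceil(rows / caching). */
    static long estimateRpcs(long rows, int caching) {
        if (rows < 0) {
            throw new IllegalArgumentException("rows must not be negative");
        }
        if (caching <= 0) {
            throw new IllegalArgumentException("caching must be positive");
        }
        return (rows + caching - 1) / caching; // integer ceiling division
    }

    public static void main(String[] args) {
        long rows = 1_000_000L; // hypothetical table size
        System.out.println("caching=1:   " + estimateRpcs(rows, 1) + " round trips");
        System.out.println("caching=500: " + estimateRpcs(rows, 500) + " round trips");
    }
}
```

Under these assumptions, one million rows need about 1,000,000 round trips when one row is fetched at a time, but only about 2,000 with a caching value of 500. A larger value trades client memory for fewer round trips, which is cheap here because FirstKeyOnlyFilter keeps each returned row small.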
Overall, this approach saves a lot of time.