FirstKeyOnlyFilter: Usage and Example

FirstKeyOnlyFilter: the API documentation describes it as follows:

 A filter that will only return the first KV from each row.

This filter can be used to more efficiently perform row count operations. 

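That says it plainly: the filter returns only the first KV of each row, so it can be used to count the rows in a table, and it does so quickly.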
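To make the "first KV per row" idea concrete, here is a toy in-memory model in plain Java. It does not use HBase at all: the class `FirstKvModel`, its methods, and the sample rows are all made up for illustration, and a row is simply a list of cell strings. It only sketches why a row count needs one cell per row rather than every cell.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these names are part of the HBase API.
class FirstKvModel {

    // Keep just the first cell of every non-empty row.
    static Map<String, List<String>> firstKeyOnly(Map<String, List<String>> table) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> row : table.entrySet()) {
            List<String> cells = row.getValue();
            if (!cells.isEmpty()) {
                out.put(row.getKey(), new ArrayList<>(cells.subList(0, 1)));
            }
        }
        return out;
    }

    // Total number of cells across all rows (a stand-in for data transferred).
    static int cellCount(Map<String, List<String>> table) {
        int total = 0;
        for (List<String> cells : table.values()) {
            total += cells.size();
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, List<String>> table = new LinkedHashMap<>();
        table.put("row1", Arrays.asList("cf:a=1", "cf:b=2", "cf:c=3"));
        table.put("row2", Arrays.asList("cf:a=4"));
        table.put("row3", Arrays.asList("cf:a=5", "cf:b=6"));

        Map<String, List<String>> filtered = firstKeyOnly(table);
        // The row count is unchanged, but fewer cells have to be returned.
        System.out.println("rows: " + filtered.size());
        System.out.println("cells before: " + cellCount(table)
                + ", after: " + cellCount(filtered));
    }
}
```

In this model the row count is the same with or without the filter, while the number of cells handed back shrinks to one per row. That is the saving the API description refers to.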

The code is as follows (corrections are welcome):

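// Requires: java.io.IOException,
// org.apache.hadoop.hbase.client.HTable, Result, ResultScanner, Scan,
// org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter.
// `conf` and `log` are fields of the enclosing class.
public int getCount() {
	long bef = System.currentTimeMillis();
	int i = 0;
	ResultScanner rs = null;
	try {
		// The HTable constructor can throw IOException, so it belongs inside the try.
		HTable tableKeyword = new HTable(conf, "tableName");
		tableKeyword.setScannerCaching(500);
		Scan s = new Scan();
		s.setCaching(500);        // rows fetched per round trip
		s.setCacheBlocks(false);  // do not fill the block cache with a one-off full scan
		s.setFilter(new FirstKeyOnlyFilter());  // return only the first KV of each row
		rs = tableKeyword.getScanner(s);
		for (org.apache.hadoop.hbase.client.Result r : rs) {
			i++;
		}
	} catch (IOException e) {
		log.warn("Row count scan failed", e);
	} finally {
		if (rs != null) {  // rs is still null if getScanner failed
			rs.close();
		}
	}
	long now = System.currentTimeMillis();
	log.warn("Total rows in keyword table: " + i + ", elapsed seconds: " + (now - bef) / 1000.0);
	return i;
}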
It is best to set all three of these parameters:

tableKeyword.setScannerCaching(500);
s.setCaching(500);
s.setCacheBlocks(false);

Without them the scan slows down considerably.
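A rough way to see why the caching value matters is to count client-server round trips. The sketch below is a simplified model, not HBase code: it assumes each round trip returns up to `caching` rows and ignores result-size limits, region boundaries, and scanner open/close calls. The class `ScanRoundTrips`, the one-million-row table, and the caching value of 1 (picked only as a worst case) are hypothetical.

```java
// Simplified model of scanner round trips; not part of the HBase API.
class ScanRoundTrips {

    // Round trips needed to read `rows` rows when each trip returns up to `caching` rows.
    static long roundTrips(long rows, int caching) {
        if (rows < 0 || caching <= 0) {
            throw new IllegalArgumentException("rows must be >= 0 and caching must be > 0");
        }
        return (rows + caching - 1) / caching;  // ceiling division
    }

    public static void main(String[] args) {
        long rows = 1_000_000L;  // hypothetical table size
        System.out.println("caching=1:   " + roundTrips(rows, 1) + " round trips");
        System.out.println("caching=500: " + roundTrips(rows, 500) + " round trips");
    }
}
```

Under these assumptions a caching value of 500 cuts the round trips from 1,000,000 to 2,000. Actual timings depend on the cluster, the network, and the HBase version.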

Overall, counting rows this way saves a lot of time.