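HBase Data Queries and Filters in Detail
Source: https://blog.csdn.net/m0_37739193/article/details/73615016
This article covers simple data queries in HBase and the use of filters (fairly comprehensively). I have tested all of the code and commands myself, and I compiled them here so they are easy to look up later.
Create the table and insert data: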
- hbase(main):179:0> create 'scores','grade','course'
- hbase(main):180:0> put 'scores','zhangsan01','course:art','90'
- hbase(main):181:0> scan 'scores'
- ROW COLUMN+CELL
- zhangsan01 column=course:art, timestamp=1498003561726, value=90
- 1 row(s) in 0.0150 seconds
- hbase(main):182:0> put 'scores','zhangsan01','course:math','99',1498003561726
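- (When you set a timestamp manually, it must not be later than your current system time; otherwise the data cannot be deleted. I set the timestamp manually here for the DependentColumnFilter experiment further down. You can check the timestamp of the first cell you inserted and then insert the second cell with that same timestamp.)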
- hbase(main):183:0> put 'scores','zhangsan01','grade:','101'
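- Problem: after deleting this cell, running put 'scores','zhangsan01','grade:','101',1498003561726 succeeds, but scan 'scores' does not show the cell, whereas running put 'scores','zhangsan01','grade:','101' and then scan 'scores' does show it. The likely cause is that the delete marker written at deletion time keeps masking every version whose timestamp is not newer than the marker. So if you want to set the timestamp manually for this cell, do it on the first insert or after a truncate.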
- hbase(main):184:0> put 'scores','zhangsan02','course:art','90'
- hbase(main):185:0> get 'scores','zhangsan02','course:art'
- COLUMN CELL
- course:art timestamp=1498003601365, value=90
- 1 row(s) in 0.0080 seconds
- hbase(main):186:0> put 'scores','zhangsan02','grade:','102',1498003601365
- hbase(main):187:0> put 'scores','zhangsan02','course:math','66',1498003561726
- hbase(main):188:0> put 'scores','lisi01','course:math','89',1498003561726
- hbase(main):189:0> put 'scores','lisi01','course:art','89'
- hbase(main):190:0> put 'scores','lisi01','grade:','201',1498003561726
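The two timestamp notes above (a future timestamp cannot be deleted, and a put with an old timestamp stays invisible after a delete) can be illustrated with a small model. This is a plain-Java sketch, not the HBase client API: the class `TombstoneSketch`, its methods, and the "current time" values are hypothetical, and it assumes a delete masks every version whose timestamp is at or before the delete marker.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Simplified model of one row/column with versions and a delete marker (not the HBase API). */
final class TombstoneSketch {
    // timestamp -> value for a single row/column
    private final TreeMap<Long, String> versions = new TreeMap<>();
    // timestamp of the newest delete marker; Long.MIN_VALUE means "no delete yet"
    private long tombstone = Long.MIN_VALUE;

    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    /** Assumption: a delete masks every version written at or before its timestamp. */
    void delete(long timestamp) {
        tombstone = Math.max(tombstone, timestamp);
    }

    /** Returns the newest version that is not masked by the delete marker. */
    Optional<String> get() {
        Map.Entry<Long, String> newest = versions.lastEntry();
        if (newest == null || newest.getKey() <= tombstone) {
            return Optional.empty();
        }
        return Optional.of(newest.getValue());
    }

    public static void main(String[] args) {
        TombstoneSketch cell = new TombstoneSketch();
        cell.put(1498003593575L, "101");   // original put (timestamp from the scan output)
        cell.delete(1498003700000L);       // hypothetical "current time" of the delete

        cell.put(1498003561726L, "101");   // manual timestamp older than the delete
        System.out.println(cell.get().isPresent()); // false: still masked

        cell.put(1498003800000L, "101");   // hypothetical later "current time"
        System.out.println(cell.get().isPresent()); // true: newer than the marker
    }
}
```

In the same model, a cell written with a timestamp later than the delete marker stays visible after the delete, which matches the warning about not using timestamps in the future.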
Querying data:
Query by rowkey:
- hbase(main):187:0> get 'scores','zhangsan01'
- COLUMN CELL
- course:art timestamp=1498003561726, value=90
- course:math timestamp=1498003561726, value=99
- grade: timestamp=1498003593575, value=101
- 3 row(s) in 0.0160 seconds
Query by column name:
- hbase(main):188:0> scan 'scores',{COLUMNS=>'course:art'}
- ROW COLUMN+CELL
- lisi01 column=course:art, timestamp=1498003655021, value=89
- zhangsan01 column=course:art, timestamp=1498003561726, value=90
- zhangsan02 column=course:art, timestamp=1498003601365, value=90
- 3 row(s) in 0.0120 seconds
Query the rows between two rowkeys (STARTROW is inclusive, STOPROW is exclusive):
- hbase(main):205:0> scan 'scores',{STARTROW=>'zhangsan01',STOPROW=>'zhangsan02'}
- ROW COLUMN+CELL
- zhangsan01 column=course:art, timestamp=1498003561726, value=90
- zhangsan01 column=course:math, timestamp=1498003561726, value=99
- zhangsan01 column=grade:, timestamp=1498003593575, value=101
- 1 row(s) in 0.0140 seconds
Query between two rowkeys, restricted to one column:
- hbase(main):206:0> scan 'scores',{COLUMNS=>'course:art',STARTROW=>'zhangsan01',STOPROW=>'zhangsan02'}
- ROW COLUMN+CELL
- zhangsan01 column=course:art, timestamp=1498003561726, value=90
- 1 row(s) in 0.0110 seconds
Query one column from a given rowkey to the end (here STOPROW 'zhangsan09' sorts after every existing zhangsan rowkey):
- hbase(main):207:0> scan 'scores',{COLUMNS=>'course:art',STARTROW=>'zhangsan01',STOPROW=>'zhangsan09'}
- ROW COLUMN+CELL
- zhangsan01 column=course:art, timestamp=1498003561726, value=90
- zhangsan02 column=course:art, timestamp=1498003601365, value=90
- 2 row(s) in 0.0310 seconds
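The range scans above return rows in sorted rowkey order, with the start row included and the stop row excluded. The following plain-Java sketch reproduces that behaviour with a `TreeSet`; it does not use the HBase client, the class `RowRangeSketch` is a hypothetical name, and it assumes ASCII rowkeys so that Java string ordering agrees with HBase's byte ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

/** Models STARTROW/STOPROW selection over sorted rowkeys (not the HBase API). */
final class RowRangeSketch {
    /** Returns the rowkeys in [startRow, stopRow): start inclusive, stop exclusive. */
    static List<String> scan(TreeSet<String> rowKeys, String startRow, String stopRow) {
        return new ArrayList<>(rowKeys.subSet(startRow, true, stopRow, false));
    }

    public static void main(String[] args) {
        // The three rowkeys inserted into the 'scores' table earlier in the article.
        TreeSet<String> rows = new TreeSet<>(Arrays.asList("zhangsan01", "zhangsan02", "lisi01"));

        System.out.println(scan(rows, "zhangsan01", "zhangsan02")); // [zhangsan01]
        System.out.println(scan(rows, "zhangsan01", "zhangsan09")); // [zhangsan01, zhangsan02]
    }
}
```

This is why the scan with STOPROW=>'zhangsan02' returns only zhangsan01, while the scan with STOPROW=>'zhangsan09' also returns zhangsan02.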
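Using filters:
Introduction -- parameter basics
Two parameter classes appear in many of the filters, so they are introduced together here:
(1) Comparison operator: CompareOp
The comparison operator defines the comparison relation. The following values are available:
EQUAL equal to
GREATER greater than
GREATER_OR_EQUAL greater than or equal to
LESS less than
LESS_OR_EQUAL less than or equal to
NOT_EQUAL not equal to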
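To make the operators concrete, here is a plain-Java sketch of how each one could be applied to the result of a byte-wise comparison between a stored value and the operand. It is a model only: `CompareOpSketch`, its `Op` enum, and `matches` are hypothetical names, not the HBase `CompareOp` class. It assumes values are compared as unsigned bytes in lexicographic order, which is how a binary comparison of string-encoded numbers would behave.

```java
import java.nio.charset.StandardCharsets;

/** Models the six comparison operators over byte-wise comparison (not the HBase API). */
final class CompareOpSketch {
    enum Op { EQUAL, GREATER, GREATER_OR_EQUAL, LESS, LESS_OR_EQUAL, NOT_EQUAL }

    /** Lexicographic comparison of two byte arrays, treating bytes as unsigned. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** True when "stored OP operand" holds under byte-wise ordering. */
    static boolean matches(Op op, String stored, String operand) {
        int c = compareBytes(stored.getBytes(StandardCharsets.UTF_8),
                             operand.getBytes(StandardCharsets.UTF_8));
        switch (op) {
            case EQUAL:            return c == 0;
            case GREATER:          return c > 0;
            case GREATER_OR_EQUAL: return c >= 0;
            case LESS:             return c < 0;
            case LESS_OR_EQUAL:    return c <= 0;
            case NOT_EQUAL:        return c != 0;
            default: throw new IllegalArgumentException("unknown operator: " + op);
        }
    }

    public static void main(String[] args) {
        System.out.println(matches(Op.GREATER, "99", "90"));  // true
        System.out.println(matches(Op.GREATER, "101", "90")); // false: '1' sorts before '9'
        System.out.println(matches(Op.EQUAL, "90", "90"));    // true
    }
}
```

The second call is worth noting for the sample data in this article: values stored as strings are ordered byte by byte, so "101" does not count as greater than "90".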
(2) Comparator: ByteArrayComparable
Comparators make it possible to match targets in different ways. The following subclasses are available:
BinaryComparator matches the complete byte array
BinaryPrefixComparator matches a prefix of the byte array
BitComparator performs a bitwise comparison, using a BitwiseOp with the AND, OR, and XOR operators
NullComparator does not compare against an actual value, only checks whether a given value is null or not null
RegexStringComparator matches against a regular expression
SubstringComparator matches a substring
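The difference between the most commonly used comparators can be sketched in plain Java. This is an illustration of the matching rules only, not the HBase classes: `ComparatorSketch` and its methods are hypothetical names, each method answers only "would this count as EQUAL?", and the case-insensitive substring match and the "find anywhere" regex match reflect my understanding of SubstringComparator and RegexStringComparator rather than anything stated in this article.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Models the match rule of four comparators on string values (not the HBase API). */
final class ComparatorSketch {
    /** BinaryComparator-style: the whole value must be identical. */
    static boolean binary(String stored, String expected) {
        return stored.equals(expected);
    }

    /** BinaryPrefixComparator-style: the value must start with the prefix. */
    static boolean binaryPrefix(String stored, String prefix) {
        return stored.startsWith(prefix);
    }

    /** SubstringComparator-style: assumed to ignore case. */
    static boolean substring(String stored, String part) {
        return stored.toLowerCase(Locale.ROOT).contains(part.toLowerCase(Locale.ROOT));
    }

    /** RegexStringComparator-style: assumed to match if the pattern is found anywhere. */
    static boolean regex(String stored, String pattern) {
        return Pattern.compile(pattern).matcher(stored).find();
    }

    public static void main(String[] args) {
        String rowKey = "zhangsan01"; // a rowkey from the 'scores' table
        System.out.println(binary(rowKey, "zhang"));         // false
        System.out.println(binaryPrefix(rowKey, "zhang"));   // true
        System.out.println(substring(rowKey, "SAN"));        // true
        System.out.println(regex(rowKey, "^zhang.*1$"));     // true
    }
}
```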
- import java.io.IOException;
- import java.util.ArrayList;
- import java.util.Arrays;
- import java.util.List;
- import org.apache.hadoop.conf.Configuration;
- import org.apache.hadoop.hbase.Cell;
- import org.apache.hadoop.hbase.CellUtil;
- import org.apache.hadoop.hbase.HBaseConfiguration;
- import org.apache.hadoop.hbase.TableName;
- import org.apache.hadoop.hbase.client.Admin;
- import org.apache.hadoop.hbase.client.Connection;
- import org.apache.hadoop.hbase.client.ConnectionFactory;
- import org.apache.hadoop.hbase.client.Get;
- import org.apache.hadoop.hbase.client.Result;
- import org.apache.hadoop.hbase.client.ResultScanner;
- import org.apache.hadoop.hbase.client.Scan;
- import org.apache.hadoop.hbase.client.Table;
- import org.apache.hadoop.hbase.filter.BinaryComparator;
- import org.apache.hadoop.hbase.filter.ColumnCountGetFilter;
- import org.apache.hadoop.hbase.filter.ColumnPaginationFilter;
- import org.apache.hadoop.hbase.filter.ColumnPrefixFilter;
- import org.apache.hadoop.hbase.filter.ColumnRangeFilter;
- import org.apache.hadoop.hbase.filter.DependentColumnFilter;
- import org.apache.hadoop.hbase.filter.FamilyFilter;
- import org.apache.hadoop.hbase.filter.Filter;
- import org.apache.hadoop.hbase.filter.FilterList;
- import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
- import org.apache.hadoop.hbase.filter.FuzzyRowFilter;
- import org.apache.hadoop.hbase.filter.InclusiveStopFilter;
- import org.apache.hadoop.hbase.filter.KeyOnlyFilter;
- import org.apache.hadoop.hbase.filter.MultipleColumnPrefixFilter;
- import org.apache.hadoop.hbase.filter.PageFilter;
- import org.apache.hadoop.hbase.filter.PrefixFilter;
- import org.apache.hadoop.hbase.filter.QualifierFilter;
- import org.apache.hadoop.hbase.filter.RandomRowFilter;
- import org.apache.hadoop.hbase.filter.RegexStringComparator;
- import org.apache.hadoop.hbase.filter.RowFilter;
- import org.apache.hadoop.hbase.filter.SingleColumnValueExcludeFilter;
- import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
- import org.apache.hadoop.hbase.filter.SkipFilter;
- import org.apache.hadoop.hbase.filter.SubstringComparator;
- import org.apache.hadoop.hbase.filter.TimestampsFilter;
- import org.apache.hadoop.hbase.filter.ValueFilter;
- import org.apache.hadoop.hbase.filter.WhileMatchFilter;
- import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
- import org.apache.hadoop.hbase.util.Bytes;
- import org.apache.hadoop.hbase.util.Pair;
- public class HbaseUtils {
- public static Admin admin = null;