Common HBase Operations
阿新 • Published: 2018-12-31
1. Start the shell command line
[[email protected] hbase-1.2.0]# bin/hbase-daemon.sh start master
starting master, logging to /opt/cdh5.14.2/hbase-1.2.0/bin/../logs/hbase-root-master-master.cdh.com.out
[[email protected] hbase-1.2.0]# bin/hbase shell
2018-08-25 18:17:32,998 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2018-08-25 18:17:37,750 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.2.0-cdh5.14.2, rUnknown, Tue Mar 27 13:31:54 PDT 2018
hbase(main):001:0>
2. Create a namespace
create_namespace 'namespace'
List the namespaces:
hbase(main):005:0> list_namespace
NAMESPACE
default
hbase
namespace
3 row(s) in 0.0660 seconds
3. Drop a namespace
drop_namespace 'namespace'
Check that the namespace has been dropped:
hbase(main):007:0> list_namespace
NAMESPACE
default
hbase
2 row(s) in 0.0190 seconds
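4. Create a table:
Example from the official help: create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}
a. ns1 is the namespace
b. t1 is the table name
c. f1 is the column family
d. VERSIONS is the maximum number of versions kept for each cell
e. In the HBase shell, => is Ruby hash syntax: it assigns the value on the right to the key on the left
f. Multiple column families can be specified; each pair of braces describes exactly one column family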
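Because the HBase shell is built on JRuby, the braces in the create command are ordinary Ruby hash literals. The plain-Ruby sketch below illustrates that idea and runs without HBase. NAME and VERSIONS are defined by hand here as stand-ins for constants the shell predefines, and split_table_name is a hypothetical helper introduced only for illustration.

```ruby
# Stand-ins for constants that the HBase shell predefines (assumption:
# defined by hand so the sketch runs in plain Ruby, outside the shell).
NAME = 'NAME'
VERSIONS = 'VERSIONS'

# One hash literal describes one column family; `=>` pairs a key with a value.
f1 = { NAME => 'f1', VERSIONS => 5 }
f2 = { NAME => 'f2' }

# Hypothetical helper: split 'ns1:t1' into namespace and table name.
# A name without a namespace prefix belongs to the 'default' namespace.
def split_table_name(full_name)
  namespace, table = full_name.split(':', 2)
  table.nil? ? ['default', namespace] : [namespace, table]
end

families = [f1, f2]
puts split_table_name('ns1:t1').inspect
puts families.map { |family| family[NAME] }.inspect
```

Passing several hashes, one per column family, corresponds to writing several pairs of braces in a single create command.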
There is also a more concise form: create 't1', 'f1', 'f2', 'f3'
hbase(main):011:0> create 't1', 'f1', 'f2', 'f3'
0 row(s) in 1.4360 seconds
=> Hbase::Table - t1
hbase(main):012:0> list
TABLE
t1
1 row(s) in 0.0080 seconds
=> ["t1"]
hbase(main):013:0> desc 't1'
Table t1 is ENABLED
t1
COLUMN FAMILIES DESCRIPTION
{NAME => 'f1', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
{NAME => 'f2', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
{NAME => 'f3', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
3 row(s) in 0.1590 seconds
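The desc output shows VERSIONS => '1' for families created with the concise form. As a rough illustration of what that setting means, the following plain-Ruby model keeps only the newest N values of a cell by timestamp. VersionedCell is a hypothetical class, not part of HBase, and it prunes old versions immediately, whereas real HBase cleans them up on its own schedule.

```ruby
# Hypothetical model of one cell that keeps at most `max_versions` values,
# newest timestamp first. This is a simplification, not HBase code.
class VersionedCell
  def initialize(max_versions)
    @max_versions = max_versions
    @entries = [] # array of [timestamp, value] pairs
  end

  def put(timestamp, value)
    # A write with an existing timestamp replaces that version.
    @entries.reject! { |ts, _| ts == timestamp }
    @entries << [timestamp, value]
    # Sort newest first, then keep only the allowed number of versions.
    @entries = @entries.sort_by { |ts, _| -ts }.first(@max_versions)
  end

  def latest
    @entries.first&.last
  end

  def values
    @entries.map(&:last)
  end
end

cell = VersionedCell.new(1)
cell.put(100, 'old')
cell.put(200, 'new')
puts cell.values.inspect
```

With VERSIONS set to 5, as in the official example, the same model would retain up to five values per cell.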
5. Drop a table: a table in HBase is either enabled or disabled, and it must be disabled before it can be dropped
hbase(main):006:0> drop 'ns_test:t1'
ERROR: Table ns_test:t1 is enabled. Disable it first.
Drop the named table. Table must first be disabled:
hbase> drop 't1'
hbase> drop 'ns1:t1'
Disable the table first, then drop it:
hbase(main):007:0> disable 'ns_test:t1'
0 row(s) in 2.3470 seconds
hbase(main):008:0> drop 'ns_test:t1'
0 row(s) in 1.3010 seconds
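The disable-then-drop rule can be pictured as a small state check. The sketch below is a plain-Ruby model under that assumption; TableState is a hypothetical class and not an HBase API, and the error text simply echoes the shell message shown above.

```ruby
# Hypothetical in-memory model of the enabled/disabled rule; not an HBase API.
class TableState
  attr_reader :name

  def initialize(name)
    @name = name
    @enabled = true # a newly created table starts out enabled
    @dropped = false
  end

  def enabled?
    @enabled
  end

  def dropped?
    @dropped
  end

  def disable
    @enabled = false
  end

  def drop
    # Refuse to drop while the table is still enabled.
    raise "Table #{@name} is enabled. Disable it first." if @enabled
    @dropped = true
  end
end

table = TableState.new('ns_test:t1')
begin
  table.drop
rescue RuntimeError => e
  puts "ERROR: #{e.message}"
end
table.disable
table.drop
puts table.dropped?
```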
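6. put: put 'ns1:t1', 'r1', 'c1', 'value'
a. r1 is the row key
b. c1 is the column, written as column family plus column qualifier (for example info:name)
c. value is the value to insert
A concrete example: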
hbase(main):015:0> create 'ns_test:tb1','info'
0 row(s) in 1.2920 seconds
=> Hbase::Table - ns_test:tb1
hbase(main):016:0> put 'ns_test:tb1','20180830','info:name','Lucy'
0 row(s) in 0.1200 seconds
hbase(main):017:0> put 'ns_test:tb1','20180830','info:age','30'
0 row(s) in 0.0080 seconds
hbase(main):018:0> put 'ns_test:tb1','20180830','info:sex','woman'
0 row(s) in 0.0120 seconds
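One way to picture what these three put commands build is a nested map: row key, then 'family:qualifier', then value. The following plain-Ruby sketch models only that shape. The put method here is a hypothetical stand-in, not the shell command, and it ignores timestamps and versions.

```ruby
# Hypothetical in-memory model: row key -> 'family:qualifier' -> value.
# Not an HBase API; it only mirrors the shape of the put arguments.
def put(table, row_key, column, value)
  table[row_key] ||= {}
  table[row_key][column] = value
  table
end

# Split a column name into its family and qualifier parts.
def family_and_qualifier(column)
  column.split(':', 2)
end

tb1 = {}
put(tb1, '20180830', 'info:name', 'Lucy')
put(tb1, '20180830', 'info:age', '30')
put(tb1, '20180830', 'info:sex', 'woman')

puts tb1['20180830']['info:age']
puts family_and_qualifier('info:name').inspect
```

All three cells share the row key 20180830, so they form a single row with three columns in the info family. Note that the age is written as the string '30', exactly as it was typed in the shell.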
7. Query data: get
hbase(main):023:0> get 'ns_test:tb1','20180830'
COLUMN CELL
info:age timestamp=1535205653726, value=30
info:name timestamp=1535205633199, value=Lucy
info:sex timestamp=1535205702203, value=woman
3 row(s) in 0.0380 seconds
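The timestamp values in this output count milliseconds since the Unix epoch (the puts above did not pass an explicit timestamp, so the write time was used). A short plain-Ruby sketch for turning one of them into a readable UTC time:

```ruby
# Cell timestamp copied from the get output above (milliseconds since epoch).
timestamp_ms = 1535205653726

# Split into whole seconds and leftover milliseconds to avoid float rounding.
seconds, millis = timestamp_ms.divmod(1000)
time = Time.at(seconds, millis * 1000).utc # second argument is microseconds

puts time.strftime('%Y-%m-%d %H:%M:%S.%L UTC')
```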
Query the value of a specific column:
hbase(main):028:0> get 'ns_test:tb1','20180830',{COLUMN => 'info:age'}
COLUMN CELL
info:age timestamp=1535205653726, value=30
1 row(s) in 0.0150 seconds
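The two get calls above differ only in whether a column is named. As a rough plain-Ruby model of that behaviour (the get method below is a hypothetical stand-in operating on a nested hash, not the shell command), note also that cells come back sorted by column name rather than in insertion order, matching the age, name, sex order in the output.

```ruby
# Hypothetical model of `get`: return the cells of one row, optionally
# limited to the requested columns, sorted by column name.
def get(table, row_key, columns = nil)
  row = table.fetch(row_key, {})
  wanted = columns.nil? ? row.keys : Array(columns)
  row.select { |column, _| wanted.include?(column) }.sort.to_h
end

# Same data as the example row, in the order it was inserted.
tb1 = {
  '20180830' => { 'info:name' => 'Lucy', 'info:age' => '30', 'info:sex' => 'woman' }
}

puts get(tb1, '20180830').keys.inspect
puts get(tb1, '20180830', 'info:age').inspect
```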
More ways to use get are listed in its help:
hbase(main):024:0> help 'get'
Get row or cell contents; pass table name, row, and optionally
a dictionary of column(s), timestamp, timerange and versions. Examples:
hbase> get 'ns1:t1', 'r1'
hbase> get 't1', 'r1'
hbase> get 't1', 'r1', {TIMERANGE => [ts1, ts2]}
hbase> get 't1', 'r1', {COLUMN => 'c1'}
hbase> get 't1', 'r1', {COLUMN => ['c1', 'c2', 'c3']}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
hbase> get 't1', 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
hbase> get 't1', 'r1', 'c1'
hbase> get 't1', 'r1', 'c1', 'c2'
hbase> get 't1', 'r1', ['c1', 'c2']
hbase> get 't1', 'r1', {COLUMN => 'c1', ATTRIBUTES => {'mykey'=>'myvalue'}}
hbase> get 't1', 'r1', {COLUMN => 'c1', AUTHORIZATIONS => ['PRIVATE','SECRET']}
hbase> get 't1', 'r1', {CONSISTENCY => 'TIMELINE'}
hbase> get 't1', 'r1', {CONSISTENCY => 'TIMELINE', REGION_REPLICA_ID => 1}
Besides the default 'toStringBinary' format, 'get' also supports custom formatting by
column. A user can define a FORMATTER by adding it to the column name in the get
specification. The FORMATTER can be stipulated:
1. either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, toString)
2. or as a custom class followed by method name: e.g. 'c(MyFormatterClass).format'.
Example formatting cf:qualifier1 and cf:qualifier2 both as Integers:
hbase> get 't1', 'r1' {COLUMN => ['cf:qualifier1:toInt',
'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] }
Note that you can specify a FORMATTER by column only (cf:qualifier). You cannot specify
a FORMATTER for all columns of a column family.
The same commands also can be run on a reference to a table (obtained via get_table or
create_table). Suppose you had a reference t to table 't1', the corresponding commands
would be:
hbase> t.get 'r1'
hbase> t.get 'r1', {TIMERANGE => [ts1, ts2]}
hbase> t.get 'r1', {COLUMN => 'c1'}
hbase> t.get 'r1', {COLUMN => ['c1', 'c2', 'c3']}
hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
hbase> t.get 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
hbase> t.get 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
hbase> t.get 'r1', 'c1'
hbase> t.get 'r1', 'c1', 'c2'
hbase> t.get 'r1', ['c1', 'c2']
hbase> t.get 'r1', {CONSISTENCY => 'TIMELINE'}
hbase> t.get 'r1', {CONSISTENCY => 'TIMELINE', REGION_REPLICA_ID => 1}
8. Full table scan: scan. It is used less often, because a scan without restrictions reads every row in the table
hbase(main):029:0> scan 'ns_test:tb1'
ROW COLUMN+CELL
20180830 column=info:age, timestamp=1535205653726, value=30
20180830 column=info:name, timestamp=1535205633199, value=Lucy
20180830 column=info:sex, timestamp=1535205702203, value=woman
1 row(s) in 0.0630 seconds
Using STARTROW and ENDROW (STARTROW is inclusive, ENDROW is exclusive):
hbase(main):039:0> scan 'ns_test:tb1'
ROW COLUMN+CELL
20180830 column=info:age, timestamp=1535205653726, value=30
20180830 column=info:name, timestamp=1535205633199, value=Lucy
20180830 column=info:sex, timestamp=1535205702203, value=woman
20180831 column=info:name, timestamp=1535208181737, value=Tom
20180901 column=info:name, timestamp=1535208173653, value=Peppa
3 row(s) in 0.0260 seconds
hbase(main):040:0> scan 'ns_test:tb1',{STARTROW => '20180831'}
ROW COLUMN+CELL
20180831 column=info:name, timestamp=1535208181737, value=Tom
20180901 column=info:name, timestamp=1535208173653, value=Peppa
2 row(s) in 0.0210 seconds
hbase(main):042:0> scan 'ns_test:tb1',{ENDROW => '20180901'}
ROW COLUMN+CELL
20180830 column=info:age, timestamp=1535205653726, value=30
20180830 column=info:name, timestamp=1535205633199, value=Lucy
20180830 column=info:sex, timestamp=1535205702203, value=woman
20180831 column=info:name, timestamp=1535208181737, value=Tom
2 row(s) in 0.0170 seconds
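The two range scans above behave like a half-open interval over sorted row keys: the start row is included and the end row is excluded. A minimal plain-Ruby sketch of that rule, where scan_rows is a hypothetical helper and not an HBase API:

```ruby
# Hypothetical model of a range scan; not an HBase API.
# Row keys are compared as strings in byte order; the start is inclusive
# and the end is exclusive.
def scan_rows(row_keys, startrow: nil, endrow: nil)
  row_keys.sort.select do |key|
    (startrow.nil? || key >= startrow) && (endrow.nil? || key < endrow)
  end
end

rows = %w[20180901 20180830 20180831]

puts scan_rows(rows).inspect
puts scan_rows(rows, startrow: '20180831').inspect
puts scan_rows(rows, endrow: '20180901').inspect
```

Because keys are compared as strings, date-style keys such as 20180830 sort chronologically only as long as they all have the same length.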
9. Using delete:
a. Delete a column
hbase(main):003:0> scan 'ns_test:tb1'
ROW COLUMN+CELL
20180830 column=info:age, timestamp=1535205653726, value=30
20180830 column=info:name, timestamp=1535205633199, value=Lucy
20180830 column=info:sex, timestamp=1535205702203, value=woman
20180831 column=info:name, timestamp=1535208181737, value=Tom
20180901 column=info:name, timestamp=1535208173653, value=Peppa
3 row(s) in 0.4990 seconds
hbase(main):004:0> delete 'ns_test:tb1','20180830','info:age'
0 row(s) in 0.0690 seconds
hbase(main):005:0> scan 'ns_test:tb1'
ROW COLUMN+CELL
20180830 column=info:name, timestamp=1535205633199, value=Lucy
20180830 column=info:sex, timestamp=1535205702203, value=woman
20180831 column=info:name, timestamp=1535208181737, value=Tom
20180901 column=info:name, timestamp=1535208173653, value=Peppa
3 row(s) in 0.0370 seconds
b. Delete all data for a row key (use deleteall; delete requires a column argument)
hbase(main):006:0> deleteall 'ns_test:tb1','20180830'
0 row(s) in 0.0080 seconds
hbase(main):007:0> scan 'ns_test:tb1'
ROW COLUMN+CELL
20180831 column=info:name, timestamp=1535208181737, value=Tom
20180901 column=info:name, timestamp=1535208173653, value=Peppa
2 row(s) in 0.0200 seconds
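The difference between removing one column and removing a whole row can be sketched with the same nested-hash picture of a table. This is a plain-Ruby model only: delete_cell and delete_row are hypothetical helpers, and the model removes data immediately without reproducing how HBase records deletions internally.

```ruby
# Hypothetical in-memory model; not an HBase API.
# Remove one cell; forget the row once it has no cells left,
# since a row without cells does not appear in a scan.
def delete_cell(table, row_key, column)
  row = table[row_key]
  return table if row.nil?
  row.delete(column)
  table.delete(row_key) if row.empty?
  table
end

# Remove every cell of a row at once.
def delete_row(table, row_key)
  table.delete(row_key)
  table
end

tb1 = {
  '20180830' => { 'info:age' => '30', 'info:name' => 'Lucy', 'info:sex' => 'woman' },
  '20180831' => { 'info:name' => 'Tom' },
  '20180901' => { 'info:name' => 'Peppa' }
}

delete_cell(tb1, '20180830', 'info:age')
puts tb1['20180830'].keys.inspect
delete_row(tb1, '20180830')
puts tb1.keys.inspect
```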