HBase Shell and Java API Operations
I. Shell Operations
Run the following command to enter the HBase shell client; type quit or exit to leave.
$ hbase shell
List all HBase shell commands
hbase> help
If you forget how a command works, run help 'command' to see its documentation, for example:
hbase(main):048:0> help 'list'
List all tables in hbase. Optional regular expression parameter could
be used to filter the output. Examples:
hbase> list
hbase> list 'abc.*'
hbase> list 'ns:abc.*'
hbase> list 'ns:.*'
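Common commands
1. General operations
Purpose | Command |
---|---|
Check server status | status |
Show the HBase version | version |
Show the current user | whoami |
Show help for table-reference commands | table_help |
1). Check server status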
hbase(main):002:0> status
1 active master, 1 backup masters, 3 servers, 0 dead, 1.0000 average load
2). Show the HBase version
hbase(main):003:0> version
1.2.6, rUnknown, Mon May 29 02:25:32 CDT 2017
3). Show the current user
hbase(main):004:0> whoami
hadoop (auth:SIMPLE)
groups: hadoop, wheel
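4). Show help for table-reference commands
2. DDL operations (Data Definition Language)
Purpose | Command |
---|---|
Create a table | create 'table', 'family1', 'family2', 'familyN' |
Describe a table | desc 'table' or describe 'table' |
Check whether a table exists | exists 'table' |
Check whether a table is enabled or disabled | is_enabled 'table'; is_disabled 'table' |
Disable a table | disable 'table' |
Enable a table | enable 'table' |
List all tables | list |
Delete a column family | alter 'table', 'delete' => 'family' |
Add a column family | alter 'table', NAME => 'family' |
Drop a single table | Disable it first, then drop it: disable 'table', then drop 'table' |
Drop tables in bulk | drop_all 'regex' |
1). Create a table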
hbase(main):008:0> create 'students','info','address'
0 row(s) in 10.5040 seconds
=> Hbase::Table - students
2). Describe a table
hbase(main):029:0> desc 'students'
Table students is ENABLED
students
COLUMN FAMILIES DESCRIPTION
{NAME => 'address', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'info', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.2450 seconds
hbase(main):030:0> describe 'students'
Table students is ENABLED
students
COLUMN FAMILIES DESCRIPTION
{NAME => 'address', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'info', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.2470 seconds
3). Check whether a table exists (exists)
hbase(main):011:0> exists 'students'
Table students does exist
0 row(s) in 0.0830 seconds
4). Check whether a table is enabled or disabled (is_enabled, is_disabled)
is_enabled reports whether the table is enabled; is_disabled reports whether it is disabled
hbase(main):012:0> is_enabled 'students'
true
0 row(s) in 0.0690 seconds
hbase(main):013:0> is_disabled 'students'
false
0 row(s) in 0.0860 seconds
5). Disable a table (disable)
hbase(main):016:0> disable 'students'
0 row(s) in 2.6340 seconds
hbase(main):017:0> is_disabled 'students'
true
0 row(s) in 0.0520 seconds
6). Enable a table (enable)
hbase(main):018:0> enable 'students'
0 row(s) in 2.5390 seconds
hbase(main):019:0> is_enabled 'students'
true
0 row(s) in 0.0860 seconds
7). List all tables (list)
hbase(main):020:0> list
TABLE
students
user
2 row(s) in 0.0400 seconds
=> ["students", "user"]
8). Delete a column family (alter)
Delete the column family address from the students table
hbase(main):024:0> alter 'students','delete'=>'address'
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
0 row(s) in 4.0260 seconds
9). Add a column family (alter)
Add the column family address to the students table
hbase(main):027:0> alter 'students',NAME=>'address'
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
0 row(s) in 3.6260 seconds
10). Drop tables (drop, drop_all)
Note: a table must be disabled before it can be dropped
Use drop to drop a single table; here we drop the students table
hbase(main):031:0> disable 'students'
0 row(s) in 2.3640 seconds
hbase(main):032:0> drop 'students'
0 row(s) in 2.5820 seconds
hbase(main):033:0> exists 'students'
Table students does not exist
0 row(s) in 0.0850 seconds
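Use drop_all with a regular expression to drop tables in bulk. The tables must be disabled first (disable_all accepts the same regular expression), after which drop_all 'stu.*' drops them. For example, given the tables below, every table whose name starts with stu is disabled in preparation for dropping: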
hbase(main):038:0> list
TABLE
stu
students1
students2
user
3 row(s) in 0.0570 seconds
=> ["stu", "students1", "students2"]
hbase(main):041:0> disable_all 'stu.*'
stu
students1
students2
Disable the above 3 tables (y/n)?
y
3 tables successfully disabled
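Before running a bulk command such as disable_all or drop_all, it helps to check which table names a regular expression will actually select. The sketch below uses only java.util.regex and assumes the pattern must match the whole table name; the class and method names (TableNameMatch, matching) are hypothetical and not part of HBase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class TableNameMatch {
    // Returns the table names that the regular expression matches in full
    // (assumption: whole-name matching, as with Matcher.matches()).
    public static List<String> matching(String regex, List<String> tableNames) {
        Pattern pattern = Pattern.compile(regex);
        List<String> matched = new ArrayList<>();
        for (String name : tableNames) {
            if (pattern.matcher(name).matches()) {
                matched.add(name);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> tables = Arrays.asList("stu", "students1", "students2", "user");
        // prints [stu, students1, students2]
        System.out.println(matching("stu.*", tables));
    }
}
```

The user table is left out because its name does not start with stu, which agrees with the three tables listed in the confirmation prompt above.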
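3. DML operations (Data Manipulation Language)
Purpose | Command |
---|---|
Insert data | put 'table', 'rowkey', 'family:column', 'value' |
Get a column family | get 'table', 'rowkey', 'family' |
Get one column of a column family | get 'table', 'rowkey', 'family:column' |
Full table scan | scan 'table' |
View a table's history | scan 'table', {RAW => true, VERSIONS => 10} |
Delete a cell | delete 'table', 'rowkey', 'family:column' |
Delete a whole row | deleteall 'table', 'rowkey' |
Truncate a table | truncate 'table' |
Count the rows in a table | count 'table' |
1). Insert data (put)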
hbase(main):003:0> put 'students','1001','info:name','zhangsan'
0 row(s) in 0.9800 seconds
hbase(main):006:0> put 'students','1001','info:sex','0'
0 row(s) in 0.0520 seconds
hbase(main):006:0> put 'students','1001','address:province','Henan'
0 row(s) in 0.1540 seconds
hbase(main):005:0> put 'students','1001','address:city','BeiJing'
0 row(s) in 0.3730 seconds
hbase(main):018:0> put 'students','1002','info:name','wangwu'
0 row(s) in 0.0690 seconds
hbase(main):019:0> put 'students','1003','info:sex','1'
0 row(s) in 0.0640 seconds
2). Update data (put)
- Set the name of the student with row key 1001 (column family info, column name) to lisi
hbase(main):009:0> put 'students','1001','info:name','lisi'
0 row(s) in 0.1040 seconds
- Set the province of the student with row key 1001 (column family address, column province) to Hebei
hbase(main):011:0> put 'students','1001','address:province','Hebei'
0 row(s) in 0.0650 seconds
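The updates above work because a put to an existing row key and column writes a new version with a newer timestamp, and a read returns the newest version. A minimal, simplified model of a single cell is sketched below, assuming a version limit like the VERSIONS => '1' shown in the table description. The class CellVersions and the timestamps 1 and 2 are hypothetical, and the sketch prunes old versions eagerly for simplicity, which a real HBase store does not necessarily do.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

public class CellVersions {
    // Versions of one cell, ordered newest timestamp first.
    private final TreeMap<Long, String> versions = new TreeMap<>(Collections.reverseOrder());
    private final int maxVersions;

    public CellVersions(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    // Writes a value at the given timestamp and drops the oldest
    // versions once the limit is exceeded.
    public void put(long timestamp, String value) {
        versions.put(timestamp, value);
        while (versions.size() > maxVersions) {
            versions.pollLastEntry(); // last entry = smallest timestamp
        }
    }

    // Returns the newest value, or null if the cell is empty.
    public String get() {
        Map.Entry<Long, String> newest = versions.firstEntry();
        return newest == null ? null : newest.getValue();
    }

    public int size() {
        return versions.size();
    }

    public static void main(String[] args) {
        CellVersions name = new CellVersions(1);
        name.put(1L, "zhangsan"); // initial insert
        name.put(2L, "lisi");     // "update": same cell, newer timestamp
        System.out.println(name.get()); // prints lisi
    }
}
```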
3). Query data (get, scan)
Fetch by row key: get
Full table scan: scan
- Get the student with row key 1001
hbase(main):025:0> get 'students','1001'
COLUMN CELL
address:city timestamp=1502172494982, value=BeiJing
address:province timestamp=1502172919511, value=Hebei
info:name timestamp=1502172821032, value=lisi
info:sex timestamp=1502171941941, value=0
4 row(s) in 0.3110 seconds
- Get column family address of the student with row key 1001
hbase(main):026:0> get 'students','1001','address'
COLUMN CELL
address:city timestamp=1502172494982, value=BeiJing
address:province timestamp=1502172919511, value=Hebei
2 row(s) in 0.0380 seconds
- Get column city of column family address for the student with row key 1001
hbase(main):027:0> get 'students','1001','address:city'
COLUMN CELL
address:city timestamp=1502172494982, value=BeiJing
1 row(s) in 0.1150 seconds
- Get all students
hbase(main):028:0> scan 'students'
ROW COLUMN+CELL
1001 column=address:city, timestamp=1502172494982, value=BeiJing
1001 column=address:province, timestamp=1502172919511, value=Hebei
1001 column=info:name, timestamp=1502172821032, value=lisi
1001 column=info:sex, timestamp=1502171941941, value=0
1002 column=info:name, timestamp=1502173540238, value=wangwu
1003 column=info:sex, timestamp=1502173566515, value=1
3 row(s) in 0.1370 seconds
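The scan above returns rows in row-key order. HBase compares row keys as raw bytes in lexicographic order, not as numbers, so a key such as 2 sorts after 1003. The sketch below illustrates that ordering with plain Java; RowKeyOrder and its methods are hypothetical names, and the sample keys other than 1001 to 1003 are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowKeyOrder {
    // Lexicographic comparison of two byte arrays, treating bytes as unsigned.
    public static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // A key that is a prefix of another sorts first.
        return a.length - b.length;
    }

    // Returns a copy of the keys sorted the way a scan would visit them.
    public static List<String> sortRowKeys(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort((l, r) -> compareBytes(
                l.getBytes(StandardCharsets.UTF_8),
                r.getBytes(StandardCharsets.UTF_8)));
        return sorted;
    }

    public static void main(String[] args) {
        // prints [10, 1001, 1002, 1003, 2]
        System.out.println(sortRowKeys(Arrays.asList("1003", "2", "1001", "10", "1002")));
    }
}
```

This is why numeric row keys are often zero-padded to a fixed width when numeric order matters.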
4). Delete data
- Delete one column of a column family
Delete the data in row 1001 of the students table, column family address, column city
hbase(main):036:0> delete 'students','1001','address:city'
0 row(s) in 0.1750 seconds
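After the deletion, the city column (value BeiJing) is gone from the address column family of row 1001.
To delete an entire column family, see example 8 in the DDL section.
- Delete an entire row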
hbase(main):049:0> deleteall 'students','1002'
0 row(s) in 0.0510 seconds
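After the deletion, the row with row key 1002 is gone.
A raw scan shows the history of the students table, including delete markers and earlier versions of modified cells, for as long as HBase still retains them:
scan 'students',{RAW => true,VERSIONS => 10}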
- Remove all data from the table
hbase(main):010:0> truncate 'students'
Truncating 'students' table (it may take a while):
- Disabling table...
- Truncating table...
0 row(s) in 5.5080 seconds
hbase(main):012:0> scan 'students'
ROW COLUMN+CELL
0 row(s) in 0.2060 seconds
5). Count the rows in a table (count)
hbase(main):001:0> count 'students'
2 row(s) in 1.7030 seconds
=> 2
hbase(main):002:0> scan 'students'
ROW COLUMN+CELL
1001 column=address:province, timestamp=1502172919511, value=Hebei
1001 column=info:name, timestamp=1502172821032, value=lisi
1001 column=info:sex, timestamp=1502171941941, value=0
1003 column=info:sex, timestamp=1502173566515, value=1
2 row(s) in 0.3620 seconds
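II. Java API Operations
HBase provides a Java API, which is what we usually use to work with HBase in real development, much as we work with an RDBMS through a Java API. Here is a brief summary of the commonly used HBase Java API classes:
Java API | Purpose |
---|---|
HBaseAdmin | Administrative client, used to manage tables and column families |
Configuration | Configuration object |
Connection | Connection object |
TableName | The name of an HBase table |
HTableDescriptor | Descriptor of an HBase table |
HColumnDescriptor | Descriptor of an HBase column family |
Table | HBase table object |
Put | Inserts data |
Get | Fetches a single row |
Delete | Deletes data |
Scan | Scan object, used to read a range of rows or the whole table |
ResultScanner | Result set returned by a scan |
Result | A single row returned by a query |
Cell | One cell in HBase (row key, column family, column, timestamp, value) |
SingleColumnValueFilter | Column value filter (equal, not equal, range comparisons on a column value) |
ColumnPrefixFilter | Column name prefix filter (keeps columns whose names start with a given prefix) |
MultipleColumnPrefixFilter | Multiple column name prefix filter (keeps columns whose names start with any of several prefixes) |
RowFilter | Row key filter (for example, matching row keys against a regular expression) |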
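To get a feel for how a regular-expression row key filter selects rows, the matching can be tried out with plain java.util.regex before touching a cluster. The sketch below assumes substring-search semantics (Matcher.find()), which is how I understand RegexStringComparator to behave and why the anchors ^ and $ matter. The class RowKeyRegex and the sample row keys are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class RowKeyRegex {
    // Keeps the row keys in which the pattern is found anywhere
    // (assumption: find() semantics rather than a whole-key match).
    public static List<String> filter(String regex, List<String> rowKeys) {
        Pattern pattern = Pattern.compile(regex);
        List<String> kept = new ArrayList<>();
        for (String key : rowKeys) {
            if (pattern.matcher(key).find()) {
                kept.add(key);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("1001", "1002", "1003", "1004", "2001");
        System.out.println(filter("^100", keys)); // prints [1001, 1002, 1003, 1004]
        System.out.println(filter("2$", keys));   // prints [1002]
    }
}
```

Without the ^ anchor, the pattern 100 would also select a key such as 2100, so anchoring is usually what you want for prefix and suffix matches.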
I wrote a demo that exercises the commonly used Java API classes listed above; the code follows:
package com.bigdata.study.hbase;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.filter.*;
import org.apache.hadoop.hbase.util.Bytes;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import java.util.ArrayList;
import java.util.List;
/**
* HBase Java API operations
* In practice the Java API is mostly used for data operations (DML); DDL operations are less common
*/
public class HBaseTest {
static Configuration conf = null;
private Connection conn = null;
private HBaseAdmin admin = null;
private TableName tableName = null;
private Table table = null;
// Initialize the configuration
@Before
public void init() throws Exception {
conf = HBaseConfiguration.create();
// If you do not set the ZooKeeper address here, copy hbase-site.xml into the resources directory instead
conf.set("hbase.zookeeper.quorum","node3,node4,node5");// ZooKeeper quorum addresses
// conf.set("hbase.zookeeper.property.clientPort","2181");// ZooKeeper client port; defaults to 2181, so it can be left unset
conn = ConnectionFactory.createConnection(conf);// create the connection
// admin = new HBaseAdmin(conf); // deprecated, not recommended
admin = (HBaseAdmin) conn.getAdmin(); // HBase table administration class
tableName = TableName.valueOf("students"); // table name
table = conn.getTable(tableName);// table object
}
// --------------------DDL operations: start------------------
// Create a table: HTableDescriptor, HColumnDescriptor, addFamily(), createTable()
@Test
public void createTable() throws Exception {
// Create the table descriptor
HTableDescriptor desc = new HTableDescriptor(tableName);
// Add column family info
HColumnDescriptor family_info = new HColumnDescriptor("info");
desc.addFamily(family_info);
// Add column family address
HColumnDescriptor family_address = new HColumnDescriptor("address");
desc.addFamily(family_address);
// Create the table
admin.createTable(desc);
}
// Drop a table: disable it with disableTable(tableName), then drop it with deleteTable(tableName)
@Test
public void deleteTable() throws Exception {
admin.disableTable(tableName);
admin.deleteTable(tableName);
}
// Add a column family: addColumn(tableName, columnFamily)
@Test
public void addFamily() throws Exception {
admin.addColumn(tableName, new HColumnDescriptor("hobbies"));
}
// Delete a column family: deleteColumn(tableName, columnFamily)
@Test
public void deleteFamily() throws Exception {
admin.deleteColumn(tableName, Bytes.toBytes("hobbies"));
}
// --------------------DDL operations: end---------------------
// ----------------------DML operations: start-----------------
// Insert data: Put.addColumn(family, column, value). HBase has no separate update; a put to the same row key and column overwrites the existing value
@Test
public void insertData() throws Exception {
// Insert a single record
// Put put = new Put(Bytes.toBytes("1001"));
// put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("San-Qiang Zhang"));
// put.addColumn(Bytes.toBytes("address"), Bytes.toBytes("province"), Bytes.toBytes("Hebei"));
// put.addColumn(Bytes.toBytes("address"), Bytes.toBytes("city"), Bytes.toBytes("Shijiazhuang"));
// table.put(put);
// Insert several records (batch insert)
List<Put> putList = new ArrayList<Put>();
Put put1 = new Put(Bytes.toBytes("1002"));
put1.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Lisi"));
put1.addColumn(Bytes.toBytes("info"), Bytes.toBytes("sex"), Bytes.toBytes("1"));
put1.addColumn(Bytes.toBytes("address"), Bytes.toBytes("city"), Bytes.toBytes("Shanghai"));
Put put2 = new Put(Bytes.toBytes("1003"));
put2.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Lili"));
put2.addColumn(Bytes.toBytes("info"), Bytes.toBytes("sex"), Bytes.toBytes("0"));
put2.addColumn(Bytes.toBytes("address"), Bytes.toBytes("city"), Bytes.toBytes("Beijing"));
Put put3 = new Put(Bytes.toBytes("1004"));
put3.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name_a"), Bytes.toBytes("Zhaosi"));
Put put4 = new Put(Bytes.toBytes("1004"));
put4.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name_b"), Bytes.toBytes("Wangwu"));
putList.add(put1);
putList.add(put2);
putList.add(put3);
putList.add(put4);
table.put(putList);
}
// Delete data: Delete
@Test
public void deleteData() throws Exception {
// Delete one row (row key 1002)
// Delete delete = new Delete(Bytes.toBytes("1002"));
// table.delete(delete);
// Delete column family info of the row with row key 1003
// Delete delete = new Delete(Bytes.toBytes("1003"));
// delete.addFamily(Bytes.toBytes("info"));
// table.delete(delete);
// Delete column city of column family address in the row with row key 1001
Delete delete = new Delete(Bytes.toBytes("1001"));
delete.addColumn(Bytes.toBytes("address"), Bytes.toBytes("city")); // removes only the latest version; addColumns() removes all versions
table.delete(delete);
}
// Fetch a single row: Get
@Test
public void getData() throws Exception {
Get get = new Get(Bytes.toBytes("1001"));
// get.addFamily(Bytes.toBytes("info")); // fetch only one column family
// get.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name")); // fetch only one column of a column family
Result result = table.get(get);
System.out.println("Row key: " + Bytes.toString(result.getRow()));
byte[] name = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
byte[] sex = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("sex"));
byte[] city = result.getValue(Bytes.toBytes("address"), Bytes.toBytes("city"));
byte[] province = result.getValue(Bytes.toBytes("address"), Bytes.toBytes("province"));
if (name != null) System.out.println("Name: " + Bytes.toString(name));
if (sex != null) System.out.println("Sex: " + Bytes.toString(sex));
if (province != null) System.out.println("Province: " + Bytes.toString(province));
if (city != null) System.out.println("City: " + Bytes.toString(city));
}
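// Full table scan: Scan
@Test
public void scanData() throws Exception {
Scan scan = new Scan(); // scan object; with no range or filter it reads the whole table
// Row keys are sorted lexicographically; scan.setStartRow() and scan.setStopRow() restrict the scan to a row key range
// scan.addFamily(Bytes.toBytes("info")); // read only column family info
// scan.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name")); // read only column name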
ResultScanner scanner = table.getScanner(scan);
printResult1(scanner);
}
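// Full table scan with a column value filter (equal, not equal, range comparisons on a column value): SingleColumnValueFilter
@Test
public void singleColumnValueFilter() throws Exception {
/**
* CompareOp is an enum with the following values
* LESS less than
* LESS_OR_EQUAL less than or equal to
* EQUAL equal to
* NOT_EQUAL not equal to
* GREATER_OR_EQUAL greater than or equal to
* GREATER greater than
* NO_OP no operation
*/
// Find rows whose info:name value equals San-Qiang Zhang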
SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter(
Bytes.toBytes("info"), Bytes.toBytes("name"),
CompareFilter.CompareOp.EQUAL, Bytes.toBytes("San-Qiang Zhang"));
Scan scan = new Scan();
scan.setFilter(singleColumnValueFilter);
ResultScanner scanner = table.getScanner(scan);
printResult1(scanner);
}
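// Full table scan with a column name prefix filter (keeps columns whose names start with a given prefix): ColumnPrefixFilter
@Test
public void columnPrefixFilter() throws Exception {
// Find columns whose names start with name_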
ColumnPrefixFilter columnPrefixFilter = new ColumnPrefixFilter(Bytes.toBytes("name_"));
Scan scan = new Scan();
scan.setFilter(columnPrefixFilter);
ResultScanner scanner = table.getScanner(scan);
printResult1(scanner);
}
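// Full table scan with a multiple column name prefix filter (keeps columns whose names start with any of several prefixes): MultipleColumnPrefixFilter
@Test
public void multipleColumnPrefixFilter() throws Exception {
// Find columns whose names start with name_ or c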
byte[][] bytes = new byte[][]{Bytes.toBytes("name_"), Bytes.toBytes("c")};
MultipleColumnPrefixFilter multipleColumnPrefixFilter = new MultipleColumnPrefixFilter(bytes);
Scan scan = new Scan();
scan.setFilter(multipleColumnPrefixFilter);
ResultScanner scanner = table.getScanner(scan);
printResult1(scanner);
}
// Row key filter (matches row keys against a regular expression): RowFilter
@Test
public void rowFilter() throws Exception {
// Match row keys that start with 100
// Filter filter = new RowFilter(CompareFilter.CompareOp.EQUAL, new RegexStringComparator("^100"));
// Match row keys that end with 2
RowFilter filter = new RowFilter(CompareFilter.CompareOp.EQUAL, new RegexStringComparator("2$"));
Scan scan = new Scan();
scan.setFilter(filter);
ResultScanner scanner = table.getScanner(scan);
printResult1(scanner);
}
// Combine several filters
@Test
public void multiFilterTest() throws Exception {
/**
* Operator is an enum with two values: MUST_PASS_ALL means AND, MUST_PASS_ONE means OR
*/
FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL);
// Find rows whose sex is 0 (female) and whose row key starts with 10
SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter(
Bytes.toBytes("info"), Bytes.toBytes("sex"),
CompareFilter.CompareOp.EQUAL, Bytes.toBytes("0"));
RowFilter rowFilter = new RowFilter(CompareFilter.CompareOp.EQUAL, new RegexStringComparator("^10"));
filterList.addFilter(singleColumnValueFilter);
filterList.addFilter(rowFilter);
Scan scan = new Scan();
scan.setFilter(filterList); // apply both filters (AND)
ResultScanner scanner = table.getScanner(scan);
// printResult1(scanner);
printResult2(scanner);
}
// --------------------DML operations: end-------------------
/** Print query results: approach 1 */
public void printResult1(ResultScanner scanner) throws Exception {
for (Result result: scanner) {
System.out.println("Row key: " + Bytes.toString(result.getRow()));
byte[] name = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
byte[] sex = result.getValue(Bytes.toBytes("info"), Byt