Integrating HBase with Hive
阿新 • Published: 2020-07-27
1. To create a new HBase table that is managed by Hive, use CREATE TABLE:
CREATE TABLE hbase_table_1(key int, value string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val") TBLPROPERTIES ("hbase.table.name" = "xyz", "hbase.mapred.output.outputtable" = "xyz");
- The hbase.columns.mapping property is required and will be explained in the next section.
- The hbase.table.name property is optional;
  - it controls the name of the table as known by HBase, and allows the Hive table to have a different name.
  - In this example, the table is known as hbase_table_1 within Hive, and as xyz within HBase.
  - If not specified, then the Hive and HBase table names will be identical.
- The hbase.mapred.output.outputtable property is optional;
  - it's needed if you plan to insert data to the table (the property is used by hbase.mapreduce.TableOutputFormat).
2. Column mapping
There are two SERDEPROPERTIES that control the mapping of HBase columns to Hive:
- hbase.columns.mapping
- hbase.table.default.storage.type: can have a value of either string (the default) or binary. This option is only available as of Hive 0.9; in earlier versions the string behavior is the only one available.
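As an illustration of the second property, the following is a minimal sketch of a table that stores values in binary form by default. The table name hbase_table_bin, the column family cf and the column names are hypothetical and are not taken from the example in this article. The #s suffix on a mapping entry is the per-column override that keeps that one column in string form.

```sql
-- Hypothetical table: default storage type is binary, except cf:val (#s = string)
CREATE TABLE hbase_table_bin(key int, value string, foobar double)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,cf:val#s,cf:foo",
  "hbase.table.default.storage.type" = "binary"
);
```

With this setting, key and foobar would be written as binary-encoded bytes rather than as readable text, so they will not display as plain numbers in an HBase shell scan.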
Multiple columns and multiple column families
- Create the table in Hive
CREATE TABLE hbase_table_1(key int, value1 string, value2 int, value3 int) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ( "hbase.columns.mapping" = ":key,a:b,a:c,d:e" );
- Insert data
hive> insert into table hbase_table_1 values(100,'val_100',101,102);
hive> insert into table hbase_table_1 values(98,'val_98',99,100);
- View the table structure in HBase
- View the data in HBase
hbase(main):004:0> scan "hbase_table_1"
ROW COLUMN+CELL
100 column=a:b, timestamp=1595817016732, value=val_100
100 column=a:c, timestamp=1595817016732, value=101
100 column=d:e, timestamp=1595817016732, value=102
98 column=a:b, timestamp=1595817050488, value=val_98
98 column=a:c, timestamp=1595817050488, value=99
98 column=d:e, timestamp=1595817050488, value=100
2 row(s) in 0.0410 seconds
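- Summary
(1) The key column of the Hive table is the HBase rowkey.
(2) In "hbase.columns.mapping" = ":key,a:b,a:c,d:e", :key is the rowkey; the remaining entries map to the Hive columns value1, value2 and value3, in order.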
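The pairing in the summary is positional: the n-th entry of hbase.columns.mapping belongs to the n-th column of the Hive table. As a sketch, the comments below spell that out for the table created above, followed by a query that reads one of the inserted rows back through Hive. The output is not shown because it was not captured in this article.

```sql
-- Hive column      HBase location
-- key    int    -> :key   (rowkey)
-- value1 string -> a:b    (family a, qualifier b)
-- value2 int    -> a:c    (family a, qualifier c)
-- value3 int    -> d:e    (family d, qualifier e)
SELECT key, value1, value2, value3
FROM hbase_table_1
WHERE key = 100;
```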
3. Column mapping
- Insert data in HBase
hbase(main):006:0> put "hbase_table_1",102,"a:b","val_102"
hbase(main):008:0> put "hbase_table_1",102,"a:c","101"
hbase(main):009:0> put "hbase_table_1",102,"d:e","102"
- Scan the data
- View the data in Hive
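A minimal sketch of the Hive-side check, assuming the three put commands above succeeded. Because the table uses the default string storage type, the cells written from the HBase shell as text ("101", "102") should be readable through the int columns value2 and value3. The output is omitted because it was not captured in this article.

```sql
-- Read back the row that was written directly through the HBase shell
SELECT key, value1, value2, value3
FROM hbase_table_1
WHERE key = 102;
```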