Integrating Phoenix with HBase
阿新 • Published: 2022-05-22
1、Phoenix overview
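Phoenix is an open source SQL skin for HBase. Instead of the HBase client API, you can use the standard JDBC API to create tables, insert data, and query HBase data. Phoenix features:
- Easy to integrate: works with Spark, Hive, Pig, Flume, and others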
- Simple to operate: DML/DDL support and standardized SQL operations
- Supports creating HBase secondary indexes
Phoenix architecture
2、Quick start
2.1、Installation
[hui@hadoop201 software]$ tar -zxvf apache-phoenix-5.0.0-HBase-2.0-bin.tar.gz -C /opt/module/
[hui@hadoop201 module]$ mv apache-phoenix-5.0.0-HBase-2.0-bin/ phoenix-5.0.0
Environment variables
[hui@hadoop201 phoenix-5.0.0]$ sudo vim /etc/profile.d/my_env.sh
#PHOENIX
export PHOENIX_HOME=/opt/module/phoenix-5.0.0
export PHOENIX_CLASSPATH=$PHOENIX_HOME
export PATH=$PATH:$PHOENIX_HOME/bin
[hui@hadoop201 phoenix-5.0.0]$ source /etc/profile
Copy phoenix-5.0.0-HBase-2.0-server.jar into the HBase lib directory
[hui@hadoop201 phoenix-5.0.0]$ cp phoenix-5.0.0-HBase-2.0-server.jar /opt/module/hbase-2.0.5/lib/
[hui@hadoop201 phoenix-5.0.0]$ ll /opt/module/hbase-2.0.5/lib/phoenix-5.0.0-HBase-2.0-server.jar
-rw-r--r--. 1 hui hui 41800313 May 9 19:15 /opt/module/hbase-2.0.5/lib/phoenix-5.0.0-HBase-2.0-server.jar
[hui@hadoop201 hbase-2.0.5]$ cd lib/
[hui@hadoop201 lib]$ ll phoenix*
-rw-r--r--. 1 hui hui 41800313 May 9 19:15 phoenix-5.0.0-HBase-2.0-server.jar
[hui@hadoop201 lib]$ sxync.sh phoenix-5.0.0-HBase-2.0-server.jar
Start HBase
[hui@hadoop201 hbase-2.0.5]$ bin/start-hbase.sh
Connect to Phoenix
[hui@hadoop201 phoenix-5.0.0]$ bin/sqlline.py
0: jdbc:phoenix:> !! !quit !done !exit !connect !open !describe !indexes
!primarykeys !exportedkeys !manual !importedkeys !procedures !tables
!typeinfo !columns !reconnect !dropall !history !metadata !nativesql
!dbinfo !rehash !verbose !run !batch !list !all !go !# !script !record
!brief !close !closeall !isolation !outputformat !autocommit !commit
!properties !rollback !help !? !set !save !scan !sql !call
# List all tables
0: jdbc:phoenix:> !tables
2.2、Phoenix Shell operations
1、Schema operations
By default, a schema cannot be created in Phoenix. Add the following property to hbase-site.xml and copy the file into the Phoenix bin directory:
<property>
  <name>phoenix.schema.isNamespaceMappingEnabled</name>
  <value>true</value>
</property>
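Before distributing the file, it can be handy to confirm that the property really is present and set to true. The following is a minimal sketch in Python using only the standard library. The sample XML string and the helper names `read_property` and `namespace_mapping_enabled` are assumptions made for illustration; they are not part of Phoenix or HBase.

```python
import xml.etree.ElementTree as ET

# Sample content standing in for hbase-site.xml (hypothetical; a real file
# normally holds many more properties inside <configuration>).
SAMPLE_SITE_XML = """
<configuration>
  <property>
    <name>phoenix.schema.isNamespaceMappingEnabled</name>
    <value>true</value>
  </property>
</configuration>
"""


def read_property(site_xml: str, key: str):
    """Return the value of `key` from Hadoop-style site XML, or None if absent."""
    root = ET.fromstring(site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == key:
            return prop.findtext("value", default="").strip()
    return None


def namespace_mapping_enabled(site_xml: str) -> bool:
    """True only when the namespace-mapping property exists and equals 'true'."""
    value = read_property(site_xml, "phoenix.schema.isNamespaceMappingEnabled")
    return value is not None and value.lower() == "true"


if __name__ == "__main__":
    print(namespace_mapping_enabled(SAMPLE_SITE_XML))
```

To check a real installation, the same function could be fed the text of the hbase-site.xml under the HBase conf directory and of the copy under the Phoenix bin directory, since both need the setting.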
Distribute the configuration
[hui@hadoop201 hbase-2.0.5]$ sxync.sh conf/hbase-site.xml
[hui@hadoop201 bin]$ pwd
/opt/module/phoenix-5.0.0/bin
[hui@hadoop201 bin]$ less hbase-site.xml
<property>
  <name>hbase.regionserver.wal.codec</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>
<property>
  <name>phoenix.schema.isNamespaceMappingEnabled</name>
  <value>true</value>
</property>
Create a schema
0: jdbc:phoenix:> create schema if not exists mydb ;
No rows affected (0.28 seconds)
Check it in the HBase shell
hbase(main):001:0> list_namespace
NAMESPACE
MYDB
SYSTEM
default
hbase
Note that schema and table names created through Phoenix are converted to uppercase. To keep a lowercase name, wrap it in double quotes.
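The naming rule just described can be modeled in a few lines. This is a toy sketch of the observed behavior (unquoted names fold to uppercase, double-quoted names keep their case), not Phoenix's actual parser; the function name `normalize_identifier` is made up for illustration.

```python
def normalize_identifier(name: str) -> str:
    """Toy model: unquoted names fold to uppercase, double-quoted names keep their case."""
    name = name.strip()
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        # Quoted: drop the quotes and preserve the case exactly as written.
        return name[1:-1]
    # Unquoted: folded to uppercase, which is why `mydb` shows up as MYDB.
    return name.upper()


if __name__ == "__main__":
    for raw in ['mydb', 'student', '"emp"']:
        print(raw, "->", normalize_identifier(raw))
```

This is why `create schema mydb` appears as MYDB in `list_namespace`, while a name written as "emp" stays lowercase.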
Drop the schema
drop schema mydb;
2、Table operations
A primary key must be specified when creating a table.
create table student(id VARCHAR primary key, name VARCHAR, age VARCHAR);
Insert / update data
upsert into student(id,name,age) values('1001','linghc','26');
upsert into student(id,name,age) values('1002','yilin','18');
Query data
select id,name,age from student;
select * from student;
Query with a condition
select * from student where id='1002';
Delete data
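delete from student where id='1002';
3、Table mapping
By default, a table created directly in HBase is not visible through Phoenix. To work with such a table from Phoenix, it has to be mapped in Phoenix. There are two kinds of mapping: view mapping and table mapping.
1、No table exists in HBase and the table is created directly in Phoenix. This case has already been tested: it is the basic Phoenix usage shown earlier.
2、The table is created in HBase first, and a view is then created in Phoenix. A view can only be used to query data.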
Create the table in the HBase shell
create 'emp','info'
Insert data
put 'emp','1001','info:name','linghc'
put 'emp','1002','info:name','yilin'
Create the view in Phoenix
create view "emp" (id varchar primary key, "info"."name" varchar);
Query it. Note that a lowercase view name must be wrapped in double quotes.
0: jdbc:phoenix:> select * from "emp";
Drop the view
drop view "emp";
3、The table is created in HBase first, and Phoenix then maps it as a table
create table "emp" ( id varchar primary key, "info"."name" varchar) COLUMN_ENCODED_BYTES= none;
With this mapping, the table data can also be modified from Phoenix.
2.3、Phoenix secondary indexes
Edit the configuration file hbase-site.xml
<property>
  <name>hbase.regionserver.wal.codec</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>
Distribute the file, then restart HBase
[hui@hadoop201 conf]$ sxync.sh hbase-site.xml
==================== hadoop201 ====================
sending incremental file list
sent 63 bytes received 12 bytes 50.00 bytes/sec
total size is 1,778 speedup is 23.71
==================== hadoop202 ====================
sending incremental file list
hbase-site.xml
sent 496 bytes received 53 bytes 366.00 bytes/sec
total size is 1,778 speedup is 3.24
==================== hadoop203 ====================
sending incremental file list
hbase-site.xml
sent 496 bytes received 53 bytes 1,098.00 bytes/sec
total size is 1,778 speedup is 3.24
[hui@hadoop201 conf]$ ../bin/start-hbase.sh
Query plans before the secondary index is created
explain select id from student;                     -- FULL SCAN
explain select id from student where id='1002';     -- POINT LOOKUP
explain select id from student where name='yilin';  -- FULL SCAN
Create the secondary index
create index idx_student_name on student(name);
Check the query plans again
explain select id from student;                     -- FULL SCAN
explain select id from student where id='1002';     -- POINT LOOKUP
explain select id from student where name='yilin';  -- RANGE SCAN OVER IDX_STUDENT_NAME
Under the hood, the index is maintained as a separate table named IDX_STUDENT_NAME, whose primary key is the combination of the name and id columns of student.
0: jdbc:phoenix:> !tables
+------------+--------------+-------------------+---------------+----------+------------+----------------------------+-----------------+--------------+-----------------+
| TABLE_CAT  | TABLE_SCHEM  | TABLE_NAME        | TABLE_TYPE    | REMARKS  | TYPE_NAME  | SELF_REFERENCING_COL_NAME  | REF_GENERATION  | INDEX_STATE  | IMMUTABLE_ROWS  |
+------------+--------------+-------------------+---------------+----------+------------+----------------------------+-----------------+--------------+-----------------+
|            |              | IDX_STUDENT_NAME  | INDEX         |          |            |                            |                 | ACTIVE       | false           |
|            | SYSTEM       | CATALOG           | SYSTEM TABLE  |          |            |                            |                 |              | false           |
|            | SYSTEM       | FUNCTION          | SYSTEM TABLE  |          |            |                            |                 |              | false           |
|            | SYSTEM       | LOG               | SYSTEM TABLE  |          |            |                            |                 |              | true            |
|            | SYSTEM       | SEQUENCE          | SYSTEM TABLE  |          |            |                            |                 |              | false           |
|            | SYSTEM       | STATS             | SYSTEM TABLE  |          |            |                            |                 |              | false           |
|            |              | STUDENT           | TABLE         |          |            |                            |                 |              | false           |
|            |              | emp               | TABLE         |          |            |                            |                 |              | false           |
+------------+--------------+-------------------+---------------+----------+------------+----------------------------+-----------------+--------------+-----------------+
0: jdbc:phoenix:> select * from IDX_STUDENT_NAME;
+---------+-------+
| 0:NAME  | :ID   |
+---------+-------+
| linghc  | 1001  |
| renyy   | 1003  |
| yilin   | 1002  |
+---------+-------+
Summary: a global index creates a separate index table. In that table, the indexed column is combined with the rowkey of the original table to form the rowkey of the index table.
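To see why this layout turns a full scan into a range scan, the idea can be sketched with a sorted list of composite keys. This is a toy model in Python, not Phoenix's storage code: the separator byte, the function names, and the in-memory list are all assumptions made for illustration. The sample rows are the three shown in the IDX_STUDENT_NAME output.

```python
import bisect

# Assumed separator between the indexed value and the original rowkey
# (illustrative only; it keeps "yi" from matching keys that start with "yilin").
SEP = "\x00"


def index_rowkey(name: str, row_id: str) -> str:
    """Compose an index rowkey: indexed column first, original rowkey second."""
    return name + SEP + row_id


def build_index(data_rows: dict) -> list:
    """data_rows maps id -> name; returns index rowkeys in sorted order, as HBase stores rows."""
    return sorted(index_rowkey(name, row_id) for row_id, name in data_rows.items())


def lookup_ids_by_name(index_keys: list, name: str) -> list:
    """Range scan: jump to the first key with the given name prefix, stop when the prefix ends."""
    prefix = name + SEP
    start = bisect.bisect_left(index_keys, prefix)
    ids = []
    for key in index_keys[start:]:
        if not key.startswith(prefix):
            break
        ids.append(key[len(prefix):])
    return ids


if __name__ == "__main__":
    student = {"1001": "linghc", "1002": "yilin", "1003": "renyy"}
    index_keys = build_index(student)
    print(lookup_ids_by_name(index_keys, "yilin"))
```

Because the indexed value leads the key, all rows for one name sit next to each other, so a lookup by name only touches that slice instead of every row in student.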