Apache HAWQ -- Optimization Notes
- Use numeric columns as table partition keys wherever possible, e.g. convert a Date to the absolute number of days since 1970-01-01.
- SQL syntax: avoid BETWEEN ... AND where possible (inspection of the query plans shows the impact is small), and with multiple subqueries prefer CTE queries (WITH v AS ...).
- Choose the data distribution policy (random or hash distribution) and the bucketnum setting according to each table's data volume and the most common query types.
- Store hot and cold data in identically structured tables in different schemas, and keep the partition count of the hot-data table as small as possible (by dynamically adding and dropping partitions).
- Set the parameters hawq_rm_stmt_nvseg and hawq_rm_stmt_vseg_memory according to the resources each SQL statement needs.
- Run "vacuum table_name; analyze table_name;" on a daily schedule to collect statistics for every table, so that the planner can generate optimal query plans.
- Run "vacuum pg_class; reindex table pg_class;" against the catalog table pg_class to reduce the number of metadata records.
- Run and analyze the query plan to find each SQL statement's performance bottleneck, then optimize it specifically.
- Take measures to maximize the data-locality ratio of queries.
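The first point above can be sketched in SQL: subtracting two dates yields an integer day count, which can then serve directly as a range-partition key (the table and column names below are hypothetical, not from the original notes):

```sql
-- A date's absolute day number since 1970-01-01:
SELECT DATE '2017-10-17' - DATE '1970-01-01' AS day_number;  -- 17456

-- A hypothetical log table range-partitioned on that integer,
-- one partition per day:
CREATE TABLE t_access_log (
    day_number int,
    device_id  varchar(64)
)
DISTRIBUTED RANDOMLY
PARTITION BY RANGE (day_number)
    (START (17000) END (17500) EVERY (1));
```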
Hands-on testing
Tables use random distribution: bucketnum=9
1. Setting the number of virtual segments used by a query
Statement level:
SET hawq_rm_stmt_nvseg=10;
SET hawq_rm_stmt_vseg_memory='256mb';
- Disabling the statement level:
SET hawq_rm_stmt_nvseg=0;
SET hawq_rm_nvseg_perquery_perseg_limit=10; SET hawq_rm_nvseg_perquery_limit=512;
The parameters hawq_rm_nvseg_perquery_limit and hawq_rm_nvseg_perquery_perseg_limit adjust the number of virtual segments a query uses at execution time.
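Putting the statement-level settings together, a session might look like this (the table name is a hypothetical stand-in):

```sql
-- Run one query with 10 virtual segments of 256 MB each, then
-- restore the default resource-queue-driven allocation.
SET hawq_rm_stmt_nvseg = 10;
SET hawq_rm_stmt_vseg_memory = '256mb';
SELECT count(*) FROM t_access_log;
SET hawq_rm_stmt_nvseg = 0;  -- 0 disables the statement-level override
```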
2. Parameters relevant to hash-distributed tables:
default_hash_table_bucket_number
hawq_rm_nvseg_perquery_limit
hawq_rm_nvseg_perquery_perseg_limit
3. The pg_partitions view exposes information about a partition design. For example, to view the partition design of the ins_wifi_dates table:
SELECT partitionboundary, partitiontablename, partitionname, partitionlevel, partitionrank FROM pg_partitions WHERE tablename='ins_wifi_dates';
The following tables and views show information about partitioned tables.
- pg_partition - tracks partitioned tables and their inheritance relationships.
- pg_partition_templates - shows subpartitions created with a subpartition template.
- pg_partition_columns - shows the partition key columns used in a partition design.
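For example, pg_partition_columns can be queried to list the partition key columns of the same table:

```sql
-- Partition key columns at each level of the partition design.
SELECT partitionlevel, position_in_partition_key, columnname
FROM pg_partition_columns
WHERE tablename = 'ins_wifi_dates'
ORDER BY partitionlevel, position_in_partition_key;
```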
4. Viewing a table's segment file distribution
SELECT gpr.tablespace_oid,
gpr.database_oid,
gpf.relfilenode_oid,
gpf.segment_file_num,
'/hawq_data/'||gpr.tablespace_oid||'/'||gpr.database_oid||'/'||gpf.relfilenode_oid||'/'||gpf.segment_file_num AS path,
pg_class.relname,
gpr.persistent_state,
gpf.persistent_state
FROM gp_persistent_relfile_node gpf, pg_class, gp_persistent_relation_node gpr
WHERE gpf.relfilenode_oid = pg_class.relfilenode
AND gpr.relfilenode_oid = pg_class.relfilenode
AND pg_class.relname='person';
SELECT DISTINCT gpr.tablespace_oid,
gpr.database_oid,
gpf.relfilenode_oid,
pg_class.relname,
gpr.persistent_state,
gpf.persistent_state
FROM gp_persistent_relfile_node gpf, pg_class, gp_persistent_relation_node gpr
WHERE gpf.relfilenode_oid = pg_class.relfilenode
AND gpr.relfilenode_oid = pg_class.relfilenode
AND pg_class.relname LIKE 'person_%' ORDER BY pg_class.relname;
# with the schema name
SELECT gpr.tablespace_oid,
gpr.database_oid,
gpf.relfilenode_oid,
gpf.segment_file_num,
'/hawq_data/'||gpr.tablespace_oid||'/'||gpr.database_oid||'/'||gpf.relfilenode_oid||'/'||gpf.segment_file_num AS path,
pgn.nspname AS schemaname,
pg_class.relname AS tablename,
gpr.persistent_state,
gpf.persistent_state
FROM gp_persistent_relfile_node gpf, pg_class, gp_persistent_relation_node gpr, pg_namespace pgn
WHERE gpf.relfilenode_oid = pg_class.relfilenode
AND gpr.relfilenode_oid = pg_class.relfilenode
AND pgn.oid = pg_class.relnamespace
AND pg_class.relname='t_wifi_terminal_chrs_1_prt_1';
Findings from testing:
The data is stored in HDFS at: tablespace/database/table/segfile
The partitioned parent table A's directory contains one segfile per default hash bucket, but all of them are 0 bytes; each child table's directory (e.g. a1) also contains one segfile per default hash bucket, and those hold the actual data.
Checking a table's size:
select sotdsize from hawq_toolkit.hawq_size_of_table_disk where sotdtablename='t_net_access_log';
5. When inspecting query plans with EXPLAIN or EXPLAIN ANALYZE, setting
set gp_log_dynamic_partition_pruning=on;
makes the names of the scanned partitions visible.
Unlike EXPLAIN, EXPLAIN ANALYZE actually executes the query and gathers statistics from the execution, so its output is very helpful for understanding how a query really ran and for diagnosing the causes of performance problems.
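A minimal sketch combining the setting above with EXPLAIN ANALYZE (the predicate column day_number is hypothetical, not from the table's actual schema):

```sql
-- With dynamic-partition-pruning logging on, the plan output
-- shows which partitions were actually scanned.
SET gp_log_dynamic_partition_pruning = on;
EXPLAIN ANALYZE
SELECT count(*)
FROM t_wifi_terminal_chrs
WHERE day_number BETWEEN 17400 AND 17410;
```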
SELECT * FROM pg_stats WHERE tablename = 'inventory';
# Query session information
select * from pg_stat_activity;
select application_name, datname, procpid, sess_id, usename, waiting, client_addr, client_port, waiting_resource, query_start, backend_start, xact_start from pg_stat_activity;
select application_name, datname, procpid, sess_id, usename, waiting, client_addr, client_port, waiting_resource, current_query, query_start, backend_start, xact_start from pg_stat_activity;
select application_name, datname, procpid, sess_id, usename, waiting, client_addr, client_port, waiting_resource, query_start, backend_start, xact_start from pg_stat_activity where application_name='psql' and current_query<>'<IDLE>';
datname is the database name
procpid is the PID of the backend running the current SQL
query_start is the time the SQL started executing
current_query is the SQL statement currently being executed
waiting indicates whether the backend is waiting (e.g. blocked on a lock or resource): t means waiting, f means not waiting
client_addr is the client's IP address
There are two ways to kill a session. The first is:
SELECT pg_cancel_backend(PID);
This only cancels SELECT queries; it does not take effect on UPDATE, DELETE, or other DML.
The second is:
SELECT pg_terminate_backend(PID);
This can kill any kind of operation (SELECT, UPDATE, DELETE, DROP, etc.).
After pg_cancel_backend(), the session remains and the transaction is rolled back;
after pg_terminate_backend(), the session disappears and the transaction is rolled back.
If pg_terminate_backend() sometimes fails to kill a session, you can run kill -9 pid directly at the OS level.
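The lookup and the kill can be chained: find the backend's procpid in pg_stat_activity, then pass it to one of the functions above (the PID 12345 is just a placeholder):

```sql
-- List non-idle backends with their PIDs, oldest query first.
SELECT procpid, usename, query_start, current_query
FROM pg_stat_activity
WHERE current_query <> '<IDLE>'
ORDER BY query_start;

-- Cancel the running query of backend 12345 ...
SELECT pg_cancel_backend(12345);
-- ... or terminate the whole session if cancelling does not help.
SELECT pg_terminate_backend(12345);
```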
select * from pg_resqueue_status;
--Resource queues
SELECT * FROM dump_resource_manager_status(2);
--Segments
SELECT * FROM dump_resource_manager_status(3);
SELECT relname, relkind, reltuples, relpages FROM pg_class WHERE relname = 'ins_wifi_dates';
SELECT * FROM pg_stats WHERE tablename = 'ins_wifi_dates';
SELECT gp_segment_id, COUNT(*)
FROM ins_wifi_dates
GROUP BY gp_segment_id
ORDER BY gp_segment_id;
set gp_select_invisible=true;
select count(*) from pg_class;
set gp_select_invisible=false;
select count(*) from pg_class;
vacuum pg_class;
reindex table pg_class;
Logs:
set
Table redistribution:
ALTER TABLE sales SET WITH (REORGANIZE=TRUE);
Checking for tables that have not been analyzed:
select * from hawq_toolkit.hawq_stats_missing;
http://hawq.incubator.apache.org/docs/userguide/2.2.0.0-incubating/reference/toolkit/hawq_toolkit.html#topic46
Viewing table size in HAWQ: // does not include partitioned child tables
SELECT relname AS name, sotdsize AS size, sotdtoastsize AS
toast, sotdadditionalsize AS other
FROM hawq_toolkit.hawq_size_of_table_disk AS sotd, pg_catalog.pg_class
WHERE sotd.sotdoid=pg_class.oid and pg_class.relname='t_wifi_terminal_chrs'
ORDER BY relname;
hawq_size_of_partition_and_indexes_disk
select relname AS name, sopaidpartitionoid, sopaidpartitiontablename, sopaidpartitiontablesize as size, sotailtablesizeuncompressed as uncompressed from hawq_toolkit.hawq_size_of_partition_and_indexes_disk sopi, pg_catalog.pg_class WHERE sopi.sopaidparentoid=pg_class.oid and pg_class.relname='t_wifi_terminal_chrs'
ORDER BY sopaidpartitionoid;
select relname AS name, sum(sopaidpartitiontablesize) as size from hawq_toolkit.hawq_size_of_partition_and_indexes_disk sopi, pg_catalog.pg_class WHERE sopi.sopaidparentoid=pg_class.oid and pg_class.relname='t_wifi_terminal_chrs'
group by relname ;
Memory/vcore ratio
[root@master2 pg_log]# cat hawq-2017-10-17_224829.csv
2017-10-17 18:21:57.319620 CST,,,p237647,th217192736,,,,0,con4,,seg-10000,,,,,"LOG","00000","Resource manager chooses ratio 5120 MB per core as cluster level memory to core ratio, there are 2304 MB memory 6 CORE resource unable to be utilized.",,,,,,,0,,"resourcepool.c",4641,
2017-10-17 18:21:57.319668 CST,,,p237647,th217192736,,,,0,con4,,seg-10000,,,,,"LOG","00000","Resource manager adjusts segment hd4.bigdata original global resource manager resource capacity from (154368 MB, 32 CORE) to (153600 MB, 30 CORE)",,,,,,,0,,"resourcepool.c",4787,
2017-10-17 18:21:57.319716 CST,,,p237647,th217192736,,,,0,con4,,seg-10000,,,,,"LOG","00000","Resource manager adjusts segment hd1.bigdata original global resource manager resource capacity from (154368 MB, 32 CORE) to (153600 MB, 30 CORE)",,,,,,,0,,"resourcepool.c",4787,
2017-10-17 18:21:57.319762 CST,,,p237647,th217192736,,,,0,con4,,seg-10000,,,,,"LOG","00000","Resource manager adjusts segment hd2.bigdata original global resource manager resource capacity from (154368 MB, 32 CORE) to (153600 MB, 30 CORE)",,,,,,,0,,"resourcepool.c",4787,