Optimizing a Complex Query: Function Volatility
阿新 • Published: 2022-03-17
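Background
This complex SQL query has already been tuned at the syntax level and is embedded in the application. The goal is a large speedup without changing the application code.
The query calls a user-defined function and goes through several nested views. Its logic is complex and it runs for far too long.
The approach: use the query plan to locate the nodes that consume the most time, then improve performance by changing the objects the query calls rather than the query itself.
The query, and its plans before and after optimization
The SQL statement is as follows: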
analyse;
explain (analyse, buffers, verbose, costs, timing)
with t as (
  select d.*, nvl(getfinanceamount(d.keyid), 0) useMoney
  from (
    select t.realId as keyId, t.bg_type, t.bg_year, t.bg_deptname, t.bg_deptId,
           t.bg_functiongname, t.bg_functiongcode, t.bg_projectname, t.bg_projectcode,
           t.bg_enconame, t.bg_encocode,
           sum(t.bg_budgetmoney) as bgBudgetMoney,
           sum(t.bg_budgetdeptmoney) as bgBudgetDeptMoney, t.bg_budgetdeptpp,
           sum(t.bg_detailmoney) as bgDetailMoney, t.bg_detailpp,
           t.bg_source, t.bg_bid, t.bg_memo, t.budgetsourcetype, t.paytype
    from (
      select d.*, nvl(s.paytype, '其他') as paytype, d.keyid as realId
      from budget_t_distinfo d
      left join busi_t_budgetdetail s on s.keyid = d.bg_bid
      where 1 = 1 and d.bg_detailmoney > 0 and d.bg_source in ('1', '3')
      union all
      select d.*, nvl(s.paytype, '其他') as paytype, nvl(a.keyid, d.keyid) as realId
      from budget_t_distinfo d
      left join busi_t_budgetdetail s on s.keyid = d.bg_bid
      left join budget_t_distinfo a
        on a.bg_year = d.bg_year and a.bg_type = d.bg_type and a.bg_deptid = d.bg_deptid
       and a.bg_functiongcode = d.bg_functiongcode and a.bg_projectcode = d.bg_projectcode
       and a.bg_encocode = d.bg_encocode and a.bg_source in ('1', '3')
      where 1 = 1 and d.bg_detailmoney > 0 and d.bg_source in ('2', '6')
    ) t
    group by t.realId, t.bg_type, t.bg_year, t.bg_deptname, t.bg_deptId,
             t.bg_functiongname, t.bg_functiongcode, t.bg_projectname, t.bg_projectcode,
             t.bg_enconame, t.bg_encocode, t.bg_budgetdeptpp, t.bg_detailpp,
             t.bg_source, t.bg_bid, t.bg_memo, t.budgetsourcetype, t.paytype
  ) d
),
b as (
  select v.f1, v.f2, v.f3, v.f7, v.btype, v.bmname, sum(v.debitamount) as usedMoney
  from view_bd_acc v
  where 1 = 1 and v.unitsid = 825 and v.year = 2022
  group by v.f1, v.f2, v.f3, v.f7, v.btype, v.bmname
)
select t.*, nvl(b.usedMoney, 0) as usedMoney
from t
left join b
  on b.f1 = t.bg_functiongname
 and b.f2 = t.bg_enconame
 and nvl(b.f3, 0) = nvl(decode(t.bg_projectname, '請選擇', '', t.bg_projectname), 0)
 and b.f7 = decode(t.bg_source, 1, '本年預算', 2, '本年預算', 3, '結轉資金')
 and b.btype = decode(t.bg_type, 1, '基本支出', '專案支出')
 and b.bmname = t.bg_deptname
where 1 = 1 and t.bg_year = 2022;
Query plan before optimization (about 57 seconds):
Nested Loop Left Join (cost=40738.64..40743.89 rows=1 width=2284) (actual time=764.038..57797.678 rows=73 loops=1)
  ...
  Buffers: shared hit=10454324
  ...
Planning Time: 2.417 ms
Execution Time: 57797.965 ms
Query plan after optimization (about 0.16 seconds):
Hash Right Join (cost=7888.69..7890.66 rows=7 width=377) (actual time=53.626..156.118 rows=73 loops=1)
  ...
  Buffers: shared hit=23449
  ...
Planning Time: 2.390 ms
Execution Time: 156.318 ms
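Optimization process
Subquery flattening
Subquery flattening means the optimizer merges a subquery into its parent query.
- Analyze the query plan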
CTE t
  -> Subquery Scan on d (cost=1086.55..1142.05 rows=200 width=344) (actual time=49.561..57423.904 rows=1287 loops=1)
...
-> CTE Scan on t (cost=0.00..5.00 rows=1 width=2252) (actual time=396.500..57429.582 rows=73 loops=1)
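The plan shows that CTE t produces 1287 rows, of which only 73 survive the final filter. Not only are about 1200 rows (1287 - 73 = 1214) discarded, but the user function inside the CTE is also executed about 1200 times for nothing. This is the main performance problem.
The cause is that the subquery was not flattened, so the outer filter on bg_year could not be applied before the function was called. What blocks flattening here is the volatility attribute of the user function: a catalog lookup shows that the function is volatile.
- Change the volatility of the user function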
select proname,
       CASE WHEN p.provolatile = 'i' THEN 'immutable'
            WHEN p.provolatile = 's' THEN 'stable'
            WHEN p.provolatile = 'v' THEN 'volatile'
       END as Volatility
from zgf.pg_catalog.pg_proc p
where proname = 'getfinanceamount';

     proname      | Volatility
------------------+------------
 getfinanceamount | volatile
(1 row)
alter function getfinanceamount stable;
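The effect of this change can be reproduced in isolation. The sketch below is not from the original system: the table demo_items and the function demo_amount are hypothetical, and it assumes PostgreSQL 12 or later behavior (which the plans in this article suggest KingbaseES shares), where a CTE that is referenced once is merged into the parent query only if it contains no volatile function.

```sql
-- Hypothetical demo objects; none of these names exist in the original system.
create table demo_items (id int primary key, yr int);
insert into demo_items
select g, 2020 + g % 3 from generate_series(1, 999) as g;

-- A function with no declared volatility is VOLATILE by default.
create function demo_amount(p_id int) returns numeric
language plpgsql as $$
begin
  return p_id * 1.5;
end
$$;

-- Volatile function in the select list: the plan is expected to contain a
-- "CTE t" node, with the yr filter applied afterwards on "CTE Scan on t".
explain
with t as (select d.*, demo_amount(d.id) as amount from demo_items d)
select * from t where t.yr = 2022;

-- Declare that the function does not modify data and returns the same
-- result for the same arguments within one statement.
alter function demo_amount(int) stable;

-- Same query again: the "CTE t" node should be gone and the yr filter
-- should appear directly on the scan of demo_items.
explain
with t as (select d.*, demo_amount(d.id) as amount from demo_items d)
select * from t where t.yr = 2022;
```

Marking a function stable is a promise to the planner, not something it verifies. It is only safe when the function does not modify the database and gives consistent results for the same arguments within a single statement. A function that is mislabeled can return wrong results once the planner starts rearranging or skipping calls.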
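- Plan after optimization
In the new plan, CTE t is no longer built as a separate node; it has been merged into the parent query.
The user function now runs 73 times. Time saved ≈ (57423.904 - 49.561) / 1287 * (1287 - 73) ≈ 54120 ms.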
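The estimate can be checked directly in SQL. This is only a restatement of the arithmetic, using the start and end times and row counts of the "Subquery Scan on d" node from the original plan; the column aliases are made up for readability.

```sql
-- Average cost per row of the subquery (dominated by the function call),
-- then that cost multiplied by the number of rows that are later discarded.
select (57423.904 - 49.561) / 1287                      as ms_per_row,
       round((57423.904 - 49.561) / 1287 * (1287 - 73)) as saved_ms;
```

The per-row figure comes out at roughly 44.6 ms, which is why avoiding about 1200 unnecessary calls accounts for nearly all of the original 57 seconds.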
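Indexes the query needs
Without a suitable index, the query has to read the whole table.
- A view is a black box for cumulative time: in the plan, the elapsed time suddenly jumps at the final node.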
Hash Right Join (cost=40002.45..40004.42 rows=7 width=377) (actual time=427.929..3568.261 rows=73 loops=1)
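This time is spent inside the view. Running the view's code on its own shows a Seq Scan on a large table, so an index matching the filter conditions is created.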
create index busi_t_reimburse_subject_i1 on busi_t_reimburse_subject (f9, economicsubjectname, nvl(projectsubjectname, 0));
Query plan after optimization, about 3.1 seconds faster:
Hash Right Join (cost=40002.45..40004.42 rows=7 width=377) (actual time=370.715..470.194 rows=73 loops=1)
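Running the view on its own, as described above, might look like the following. This sketch assumes view_bd_acc is the view in question and simply reuses the filter and grouping of CTE b from the main query, so that the nodes hidden behind the view show up with their own timings.

```sql
-- CTE b from the main query, executed standalone. Look for Seq Scan nodes
-- with a large "Rows Removed by Filter" count or a high actual time.
explain (analyse, buffers)
select v.f1, v.f2, v.f3, v.f7, v.btype, v.bmname,
       sum(v.debitamount) as usedMoney
from view_bd_acc v
where v.unitsid = 825
  and v.year = 2022
group by v.f1, v.f2, v.f3, v.f7, v.btype, v.bmname;
```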
- A default B-tree index does not help with string pattern matching
The plan still contains a sizable Seq Scan:
Seq Scan on zgf.accounting_journal j (cost=0.00..27531.84 rows=4415 width=137) (actual time=0.022..104.008 rows=70505 loops=1)
  Filter: ((split_part((j.subject)::text, ' '::text, 1) ~~ '71010101%'::text) OR (split_part((j.subject)::text, ' '::text, 1) ~~ '71010102%'::text))
  Rows Removed by Filter: 372087
The filter uses string pattern matching (LIKE), so a GIN index with trigram support is created:
create extension sys_trgm;
create index accounting_journal_subjectpre on accounting_journal USING gin (split_part((subject), ' ', 1) gin_trgm_ops);
Query plan after optimization, about 100 ms faster:
BitmapOr (cost=806.92..806.92 rows=71038 width=0) (actual time=23.473..23.474 rows=0 loops=1)
  -> Bitmap Index Scan on accounting_journal_subjectpre (cost=0.00..329.41 rows=29522 width=0) (actual time=11.421..11.421 rows=70525 loops=1)
       Index Cond: (split_part((j.subject)::text, ' '::text, 1) ~~ '71010101%'::text)
  -> Bitmap Index Scan on accounting_journal_subjectpre (cost=0.00..443.37 rows=41516 width=0) (actual time=12.051..12.051 rows=41362 loops=1)
       Index Cond: (split_part((j.subject)::text, ' '::text, 1) ~~ '71010102%'::text)
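Because both patterns here are anchored prefixes ('71010101%' and '71010102%'), a B-tree expression index with the text_pattern_ops operator class may be worth trying as an alternative. This is not what the original author did. The index name below is hypothetical, and the sketch assumes that KingbaseES supports text_pattern_ops the way PostgreSQL does. Unlike the trigram index, it would not help with patterns that start with a wildcard.

```sql
-- Hypothetical alternative to the GIN trigram index, for prefix patterns only.
create index accounting_journal_subjectpre_bt
    on accounting_journal (split_part(subject, ' ', 1) text_pattern_ops);

-- Check whether the planner picks the new index for the same filter.
explain
select count(*)
from accounting_journal j
where split_part(j.subject, ' ', 1) like '71010101%'
   or split_part(j.subject, ' ', 1) like '71010102%';
```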
- Add an index
The plan contains another Seq Scan that discards most of the rows it reads:
Seq Scan on zgf.sys_t_department p (cost=0.00..122.62 rows=58 width=21) (actual time=0.481..0.521 rows=58 loops=1)
  Filter: ((p.unitsid = '825'::numeric) AND ((p.useflag)::text = '1'::text))
  Rows Removed by Filter: 4717
Create a B-tree index:
create index sys_t_department_unitsid on sys_t_department (unitsid);
After optimization, this saves about 0.1 ms:
Bitmap Index Scan on sys_t_department_unitsid (cost=0.00..4.72 rows=58 width=0) (actual time=0.011..0.011 rows=58 loops=1)
  Index Cond: (p.unitsid = '825'::numeric)
Slow nodes
A few other nodes in the plan still take a long time. One of them is worth trying to optimize:
Nested Loop (cost=21761.45..52337.05 rows=1 width=1793) (actual time=161.838..293.567 rows=47 loops=1)
  joins: accounting_journalassist, accounting_journal, accounting_voucher, etc.
Analysis shows that the filter conditions in the view are written in a needlessly verbose way, which wastes CPU time.
-- Original view code
CREATE FORCE VIEW view_reportquery AS
SELECT ... FROM ... WHERE ...
  AND (DECODE((to_char((v.withto)::text))::character varying, NULL::character varying, "numeric"(0), text_numeric(to_char((v.withto)::text))) = (0)::numeric)
  AND (DECODE(v.sn, text_numeric(NULL::character varying), "numeric"(0), v.sn) > (0)::numeric)
  AND ((v.status)::text <> '4'::text))
GROUP BY ...;

-- New view code
CREATE OR REPLACE VIEW view_reportquery AS
SELECT ... FROM ... WHERE ...
  AND nvl(v.withto, 0) = 0
  AND v.sn > 0
  AND ...
GROUP BY ... ;
After optimization, this saves about 150 ms:
Nested Loop (cost=0.84..7334.49 rows=1 width=1793) (actual time=32.992..40.572 rows=47 loops=1)
  joins: accounting_journalassist, accounting_journal, accounting_voucher, etc.
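One reason the short predicate v.sn > 0 can stand in for the DECODE form is the way a WHERE clause treats NULL. The sketch below uses hypothetical sample values in place of v.sn, and uses the standard coalesce(sn, 0) in place of the original DECODE(v.sn, NULL, 0, v.sn), on the assumption that the two behave the same for this column.

```sql
-- Compare the verbose and the simplified predicate on a positive value,
-- zero, and NULL.
select sn,
       coalesce(sn, 0) > 0 as old_predicate,
       sn > 0              as new_predicate
from (values (5::numeric), (0), (null)) as v(sn);
```

For the NULL row the old predicate is false and the new one is NULL. A WHERE clause keeps only rows for which the condition is true, so both forms reject that row and the filtered result is the same.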
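Optimizing the remaining nodes has little effect on performance.
Summary
- An execution plan does not show every detail. Watch for abnormal jumps in elapsed time and in the number of data blocks read, and treat them as the place to investigate.
- Simple expressions give the best performance.
- Query optimization never really ends; the goal is to balance query performance against compatibility.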