
Optimizing a Complex Query: Function Volatility

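Background

This complex SQL query has already been tuned at the syntax level and is embedded in the application. The goal is to make it run much faster without changing the application code.

The query calls a user-defined function and goes through several levels of nested views. The logic is complex and the execution time is far too long.

Approach: use the query plan to locate the most expensive nodes, then improve performance by changing the database objects the query calls (the function, the indexes, the views) rather than the query itself.

The query, and the plans before and after optimization

The SQL statement is as follows: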

analyse;
explain (analyse, buffers, verbose, costs, timing)
with t as (
    select d.*, nvl(getfinanceamount(d.keyid), 0) useMoney
    from (select t.realId as keyId, t.bg_type, t.bg_year, t.bg_deptname, t.bg_deptId,
                 t.bg_functiongname, t.bg_functiongcode, t.bg_projectname, t.bg_projectcode,
                 t.bg_enconame, t.bg_encocode,
                 sum(t.bg_budgetmoney) as bgBudgetMoney,
                 sum(t.bg_budgetdeptmoney) as bgBudgetDeptMoney,
                 t.bg_budgetdeptpp,
                 sum(t.bg_detailmoney) as bgDetailMoney,
                 t.bg_detailpp, t.bg_source, t.bg_bid, t.bg_memo, t.budgetsourcetype, t.paytype
          from (select d.*, nvl(s.paytype, '其他') as paytype, d.keyid as realId
                from budget_t_distinfo d
                     left join busi_t_budgetdetail s on s.keyid = d.bg_bid
                where 1 = 1
                  and d.bg_detailmoney > 0
                  and d.bg_source in ('1', '3')
                union all
                select d.*, nvl(s.paytype, '其他') as paytype, nvl(a.keyid, d.keyid) as realId
                from budget_t_distinfo d
                     left join busi_t_budgetdetail s on s.keyid = d.bg_bid
                     left join budget_t_distinfo a
                               on a.bg_year = d.bg_year
                              and a.bg_type = d.bg_type
                              and a.bg_deptid = d.bg_deptid
                              and a.bg_functiongcode = d.bg_functiongcode
                              and a.bg_projectcode = d.bg_projectcode
                              and a.bg_encocode = d.bg_encocode
                              and a.bg_source in ('1', '3')
                where 1 = 1
                  and d.bg_detailmoney > 0
                  and d.bg_source in ('2', '6')) t
          group by t.realId, t.bg_type, t.bg_year, t.bg_deptname, t.bg_deptId,
                   t.bg_functiongname, t.bg_functiongcode, t.bg_projectname, t.bg_projectcode,
                   t.bg_enconame, t.bg_encocode, t.bg_budgetdeptpp, t.bg_detailpp, t.bg_source,
                   t.bg_bid, t.bg_memo, t.budgetsourcetype, t.paytype) d
),
b as (
    select v.f1, v.f2, v.f3, v.f7, v.btype, v.bmname, sum(v.debitamount) as usedMoney
    from view_bd_acc v
    where 1 = 1
      and v.unitsid = 825
      and v.year = 2022
    group by v.f1, v.f2, v.f3, v.f7, v.btype, v.bmname
)
select t.*, nvl(b.usedMoney, 0) as usedMoney
from t
     left join b
               on b.f1 = t.bg_functiongname
              and b.f2 = t.bg_enconame
              and nvl(b.f3, 0) = nvl(decode(t.bg_projectname, '請選擇', '', t.bg_projectname), 0)
              and b.f7 = decode(t.bg_source, 1, '本年預算', 2, '本年預算', 3, '結轉資金')
              and b.btype = decode(t.bg_type, 1, '基本支出', '專案支出')
              and b.bmname = t.bg_deptname
where 1 = 1
  and t.bg_year = 2022;

Query plan before optimization: about 57 seconds

Nested Loop Left Join  (cost=40738.64..40743.89 rows=1 width=2284) (actual time=764.038..57797.678 rows=73 loops=1)
...
    Buffers: shared hit=10454324
...
...
...
Planning Time: 2.417 ms
Execution Time: 57797.965 ms

Query plan after optimization: about 0.15 seconds

Hash Right Join  (cost=7888.69..7890.66 rows=7 width=377) (actual time=53.626..156.118 rows=73 loops=1)
...
    Buffers: shared hit=23449
...
...
...
Planning Time: 2.390 ms
Execution Time: 156.318 ms

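Optimization process

Subquery flattening

Subquery flattening (also called subquery pull-up) means that the optimizer merges a subquery into its parent query, so that filters from the parent can be applied before the subquery's expressions are evaluated.

  • Analyze the query plan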
CTE t
    ->  Subquery Scan on d  (cost=1086.55..1142.05 rows=200 width=344) (actual time=49.561..57423.904 rows=1287 loops=1)
...
->  CTE Scan on t  (cost=0.00..5.00 rows=1 width=2252) (actual time=396.500..57429.582 rows=73 loops=1)

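The plan shows that CTE t produces 1287 rows, of which only 73 survive the final filter. About 1200 rows (1287 - 73 = 1214) are therefore thrown away, and the user function inside the CTE is executed for each of them for nothing. This is the main performance problem.
The cause is that the subquery was not flattened. What blocks flattening is the volatility attribute of the user function: the optimizer will not merge a subquery or CTE whose select list calls a volatile function, because doing so could change how many times the function runs. A catalog lookup shows that this function is declared volatile.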

  • Change the volatility attribute of the user function
select proname,
    CASE
        WHEN p.provolatile = 'i' THEN 'immutable'
        WHEN p.provolatile = 's' THEN 'stable'
        WHEN p.provolatile = 'v' THEN 'volatile'
        END as Volatility
from zgf.pg_catalog.pg_proc p
where proname = 'getfinanceamount';
     proname      | Volatility 
------------------+------------
 getfinanceamount | volatile
(1 row)
alter function getfinanceamount stable;
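  • Plan after the change
    The new plan no longer builds CTE t; it has been merged into the parent query.
    The user function is now executed 73 times. Estimated time saved = (57423.904 - 49.561) / 1287 * (1287 - 73) ≈ 54120 ms.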
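
The effect of the volatility attribute can be reproduced on a small scale. The following is a minimal sketch, assuming PostgreSQL 12 or later (or a compatible engine that can inline CTEs). demo_items and demo_amount are hypothetical names that stand in for the real tables and for getfinanceamount, whose body is not shown in this article.

```sql
-- Hypothetical demo objects; none of these names exist in the original system.
create table demo_items (keyid int primary key, grp int);
insert into demo_items
select g, g % 100 from generate_series(1, 1000) as g;
analyze demo_items;

-- plpgsql is used so that the planner cannot inline the function body.
-- No volatility is declared, so the function defaults to VOLATILE.
create function demo_amount(p_keyid int) returns numeric
language plpgsql
as $$
begin
    return p_keyid * 1.5;
end;
$$;

-- Run 1: with a volatile function the CTE is expected to stay a separate
-- "CTE t" node, so demo_amount is evaluated for all 1000 rows before the
-- grp = 7 filter is applied.
explain (analyze)
with t as (select d.*, demo_amount(d.keyid) as use_money from demo_items d)
select * from t where grp = 7;

-- Assumption for this demo only: demo_amount writes nothing and depends
-- only on its argument, so declaring it STABLE is safe.
alter function demo_amount(int) stable;

-- Run 2: the same statement. The CTE should now be merged into the outer
-- query, the filter pushed down to the scan, and the function evaluated
-- only for the rows that survive the filter.
explain (analyze)
with t as (select d.*, demo_amount(d.keyid) as use_money from demo_items d)
select * from t where grp = 7;

-- Clean up the demo objects.
drop function demo_amount(int);
drop table demo_items;
```

Declaring a function stable is a promise to the optimizer: the function must not modify the database and must return the same result for the same arguments within a single statement. If getfinanceamount did not meet those conditions, changing its attribute could change query results, so the function body should be reviewed before the alter function is applied.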

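Indexes the query needs

Without a suitable index, the query has to read the whole table.

  • A view is a black box for cumulative timing: in the plan, the total time jumps suddenly at the node that consumes the view.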
Hash Right Join  (cost=40002.45..40004.42 rows=7 width=377) (actual time=427.929..3568.261 rows=73 loops=1)

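The jump is caused by the time spent inside the view. Running the view's code on its own shows a Seq Scan on a large table, so an index that matches the filter conditions is created.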

create index busi_t_reimburse_subject_i1 on busi_t_reimburse_subject (f9, economicsubjectname, nvl(projectsubjectname, 0));

After this change the query runs about 3.1 seconds faster (3568 ms down to 470 ms)

Hash Right Join  (cost=40002.45..40004.42 rows=7 width=377) (actual time=370.715..470.194 rows=73 loops=1)
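
To see where a view spends its time, the view can be explained on its own with the same filters the outer query applies. This is a sketch under two assumptions: that the view in question is view_bd_acc, filtered as in CTE b of the main query, and that the statistics view pg_stat_user_tables from standard PostgreSQL is available (a derived product may expose it under a different name).

```sql
-- Explain the view alone, with the filters the main query uses in CTE b.
explain (analyze, buffers)
select v.f1, v.f2, v.f3, v.f7, v.btype, v.bmname, v.debitamount
from view_bd_acc v
where v.unitsid = 825
  and v.year = 2022;

-- List the tables that are read most heavily by sequential scans.
-- A large seq_tup_read next to a small idx_scan suggests a missing index.
select relname, seq_scan, seq_tup_read, idx_scan, n_live_tup
from pg_stat_user_tables
where seq_scan > 0
order by seq_tup_read desc
limit 20;
```

The counters in pg_stat_user_tables are cumulative since the last statistics reset, so they point at candidate tables rather than prove which query caused the scans.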
  • A B-tree index with the default operator class generally cannot serve pattern (LIKE) filters

The plan still contains a large Seq Scan

Seq Scan on zgf.accounting_journal j  (cost=0.00..27531.84 rows=4415 width=137) (actual time=0.022..104.008 rows=70505 loops=1)
    Filter: ((split_part((j.subject)::text, ' '::text, 1) ~~ '71010101%'::text) OR (split_part((j.subject)::text, ' '::text, 1) ~~ '71010102%'::text))
Rows Removed by Filter: 372087

The filter uses string pattern matching, so a GIN index with trigram operators is created

create extension sys_trgm ;
create index accounting_journal_subjectpre on accounting_journal USING gin (split_part((subject), ' ', 1) gin_trgm_ops);

After this change the query runs about 100 ms faster

 BitmapOr  (cost=806.92..806.92 rows=71038 width=0) (actual time=23.473..23.474 rows=0 loops=1)
   ->  Bitmap Index Scan on accounting_journal_subjectpre  (cost=0.00..329.41 rows=29522 width=0) (actual time=11.421..11.421 rows=70525 loops=1)
          Index Cond: (split_part((j.subject)::text, ' '::text, 1) ~~ '71010101%'::text)
   ->  Bitmap Index Scan on accounting_journal_subjectpre  (cost=0.00..443.37 rows=41516 width=0) (actual time=12.051..12.051 rows=41362 loops=1)
          Index Cond: (split_part((j.subject)::text, ' '::text, 1) ~~ '71010102%'::text)
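
Because both patterns here are left-anchored ('71010101%' and '71010102%'), a B-tree expression index with the text_pattern_ops operator class is a possible alternative to the trigram index. This is a sketch, not what the original tuning used: the index name is hypothetical, it assumes subject is a character type, and whether the planner actually prefers it should be checked with EXPLAIN.

```sql
-- Hypothetical alternative index; text_pattern_ops lets a B-tree serve
-- left-anchored LIKE patterns regardless of the database collation.
create index accounting_journal_subjectpre_bt
    on accounting_journal (split_part(subject, ' ', 1) text_pattern_ops);

-- Check which index the planner picks for the two prefix patterns.
explain (analyze, buffers)
select count(*)
from accounting_journal j
where split_part(j.subject, ' ', 1) like '71010101%'
   or split_part(j.subject, ' ', 1) like '71010102%';
```

A B-tree of this kind only helps when the pattern starts with a fixed prefix. For patterns with a leading wildcard such as '%0101%', the trigram GIN index remains the appropriate choice.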
  • Add an index

The plan contains another sizeable Seq Scan

Seq Scan on zgf.sys_t_department p  (cost=0.00..122.62 rows=58 width=21) (actual time=0.481..0.521 rows=58 loops=1)
    Filter: ((p.unitsid = '825'::numeric) AND ((p.useflag)::text = '1'::text))
    Rows Removed by Filter: 4717

Create a B-tree index

create index sys_t_department_unitsid on sys_t_department (unitsid);

After this change the saving is about 0.1 ms

Bitmap Index Scan on sys_t_department_unitsid  (cost=0.00..4.72 rows=58 width=0) (actual time=0.011..0.011 rows=58 loops=1)
   Index Cond: (p.unitsid = '825'::numeric)

Nodes that take a long time

Several other nodes in the plan still take a long time. One of them is worth trying to optimize:

Nested Loop  (cost=21761.45..52337.05 rows=1 width=1793) (actual time=161.838..293.567 rows=47 loops=1)
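   join: accounting_journalassist, accounting_journal, accounting_voucher, etc.

Analysis shows that the view's filter conditions are written in a needlessly convoluted way, which wastes CPU time.

-- original view code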
CREATE FORCE VIEW view_reportquery AS
 SELECT ...
   FROM ...
  WHERE ...
   AND (DECODE((to_char((v.withto)::text))::character varying, NULL::character varying, "numeric"(0), text_numeric(to_char((v.withto)::text))) = (0)::numeric) 
   AND (DECODE(v.sn, text_numeric(NULL::character varying), "numeric"(0), v.sn) > (0)::numeric) AND ((v.status)::text <> '4'::text))
  GROUP BY ...;
-- new view code
  CREATE OR REPLACE VIEW view_reportquery AS
SELECT ...
FROM ...
WHERE ...
  AND nvl(v.withto, 0) = 0
  AND v.sn > 0
  AND ...
GROUP BY ...
;

After this change the saving is about 150 ms

Nested Loop  (cost=0.84..7334.49 rows=1 width=1793) (actual time=32.992..40.572 rows=47 loops=1)
    join: accounting_journalassist, accounting_journal, accounting_voucher, etc.
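
The simplified sn predicate relies on SQL's three-valued logic: comparing NULL with a number yields NULL, and WHERE keeps only rows for which the condition is true, so mapping NULL to 0 first is redundant. The following is a small sketch with made-up values. It uses a standard CASE expression in place of DECODE so that it also runs on plain PostgreSQL.

```sql
-- Made-up sample values; v and sn are illustrative names only.
select v.sn,
       (case when v.sn is null then 0 else v.sn end) > 0 as old_style,
       v.sn > 0 as new_style
from (values (null::numeric), (0), (-1), (5)) as v(sn);
```

For the NULL row old_style is false and new_style is NULL; for every other row the two columns agree. A WHERE clause rejects both false and NULL, so either form keeps only the row with sn = 5. The withto rewrite additionally depends on the column's data type and on implicit conversion rules, so it is worth comparing row counts from the old and new view definitions before replacing the view.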

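Optimizing the remaining nodes has little effect on performance.

Summary

  • An execution plan does not reveal every detail. Watch for abnormal jumps in elapsed time and in the number of buffers (data blocks) read, and treat them as the symptoms to investigate.
  • Simple expressions give the best performance.
  • Query optimization never really ends; the goal is to balance query performance against compatibility.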