
Optimizing a Complex Query: A Walkthrough


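Background

This complex SQL query has already been tuned at the syntax level and is embedded in the application. The goal is a large speedup without changing the application code.

The statement calls a user-defined function and goes through several nested views. Its logic is complex and it takes far too long to run.

Approach: use the query plan to locate the most expensive nodes, then improve performance by changing the objects the query calls (function, indexes, views) rather than the statement itself.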

The query and its plans before and after optimization

The SQL statement is as follows:

analyse;
explain (analyse ,buffers ,verbose ,costs ,timing )
with t as
         (select d.*, nvl(getfinanceamount(d.keyid), 0) useMoney
          from (select t.realId                  as keyId,
                       t.bg_type,
                       t.bg_year,
                       t.bg_deptname,
                       t.bg_deptId,
                       t.bg_functiongname,
                       t.bg_functiongcode,
                       t.bg_projectname,
                       t.bg_projectcode,
                       t.bg_enconame,
                       t.bg_encocode,
                       sum(t.bg_budgetmoney)     as bgBudgetMoney,
                       sum(t.bg_budgetdeptmoney) as bgBudgetDeptMoney,
                       t.bg_budgetdeptpp,
                       sum(t.bg_detailmoney)     as bgDetailMoney,
                       t.bg_detailpp,
                       t.bg_source,
                       t.bg_bid,
                       t.bg_memo,
                       t.budgetsourcetype,
                       t.paytype
                from (select d.*, nvl(s.paytype, '其他') as paytype, d.keyid as realId
                      from budget_t_distinfo d
                               left join busi_t_budgetdetail s
                                         on s.keyid = d.bg_bid
                      where 1 = 1
                        and d.bg_detailmoney > 0
                        and d.bg_source in ('1', '3')
                      union all
                      select d.*,
                             nvl(s.paytype, '其他')  as paytype,
                             nvl(a.keyid, d.keyid) as realId
                      from budget_t_distinfo d
                               left join busi_t_budgetdetail s
                                         on s.keyid = d.bg_bid
                               left join budget_t_distinfo a
                                         on a.bg_year = d.bg_year
                                             and a.bg_type = d.bg_type
                                             and a.bg_deptid = d.bg_deptid
                                             and a.bg_functiongcode = d.bg_functiongcode
                                             and a.bg_projectcode = d.bg_projectcode
                                             and a.bg_encocode = d.bg_encocode
                                             and a.bg_source in ('1', '3')
                      where 1 = 1
                        and d.bg_detailmoney > 0
                        and d.bg_source in ('2', '6')
                     ) t
                group by t.realId, t.bg_type, t.bg_year, t.bg_deptname, t.bg_deptId, t.bg_functiongname,
                         t.bg_functiongcode,
                         t.bg_projectname, t.bg_projectcode, t.bg_enconame, t.bg_encocode, t.bg_budgetdeptpp,
                         t.bg_detailpp, t.bg_source, t.bg_bid, t.bg_memo, t.budgetsourcetype, t.paytype) d
         ),
     b as (select v.f1, v.f2, v.f3, v.f7, v.btype, v.bmname, sum(v.debitamount) as usedMoney
           from view_bd_acc v
           where 1 = 1
             and v.unitsid = 825
             and v.year = 2022
           group by v.f1, v.f2, v.f3, v.f7, v.btype, v.bmname)
select t.*, nvl(b.usedMoney, 0) as usedMoney
from t
         left join b on b.f1 = t.bg_functiongname
    and b.f2 = t.bg_enconame
    and nvl(b.f3, 0) = nvl(decode(t.bg_projectname, '請選擇', '', t.bg_projectname), 0)
    and b.f7 = decode(t.bg_source, 1, '本年預算', 2, '本年預算', 3, '結轉資金')
    and b.btype = decode(t.bg_type, 1, '基本支出', '專案支出')
    and b.bmname = t.bg_deptname
where 1 = 1
  and t.bg_year = 2022
;

Query plan before optimization: about 57.8 seconds

Nested Loop Left Join  (cost=40738.64..40743.89 rows=1 width=2284) (actual time=764.038..57797.678 rows=73 loops=1)
...
	Buffers: shared hit=10454324
...
...
...
Planning Time: 2.417 ms
Execution Time: 57797.965 ms

Query plan after optimization: about 0.16 seconds

Hash Right Join  (cost=7888.69..7890.66 rows=7 width=377) (actual time=53.626..156.118 rows=73 loops=1)
...
	Buffers: shared hit=23449
...
...
...
Planning Time: 2.390 ms
Execution Time: 156.318 ms

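Optimization process

Subquery flattening

Subquery flattening means that the optimizer merges a subquery into its parent query.

  • Analyze the query plan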
CTE t
    ->  Subquery Scan on d  (cost=1086.55..1142.05 rows=200 width=344) (actual time=49.561..57423.904 rows=1287 loops=1)
...
->  CTE Scan on t  (cost=0.00..5.00 rows=1 width=2252) (actual time=396.500..57429.582 rows=73 loops=1)

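The plan shows that CTE t produces 1287 rows, of which only 73 survive the final filter. The other 1214 rows are wasted work, and the user-defined function inside the CTE is executed for each of them to no purpose. This is the main performance problem.
The cause is that the subquery is not flattened. What blocks flattening is the volatility attribute of the user function: a catalog lookup shows that the function is declared volatile.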

  • Change the volatility of the user function
select proname,
	CASE
		WHEN p.provolatile = 'i' THEN 'immutable'
		WHEN p.provolatile = 's' THEN 'stable'
		WHEN p.provolatile = 'v' THEN 'volatile'
		END as Volatility
from zgf.pg_catalog.pg_proc p
where proname = 'getfinanceamount';
	 proname      | Volatility 
------------------+------------
 getfinanceamount | volatile
(1 row)
alter function getfinanceamount stable;
  • Plan after the change
    The plan no longer builds CTE t; it has been merged into the outer query.
    The user function now runs 73 times. Time saved = (57423.904-49.561)/1287*(1287-73) ≈ 54120 ms.
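
The effect of the volatility attribute on CTE merging can be reproduced on a toy schema. The following is a minimal sketch, assuming PostgreSQL 12 or later behaviour (which KingbaseES is assumed here to share). The names demo_items and demo_amount are hypothetical and are not objects from the system described above.

```sql
-- Hypothetical toy objects, for illustration only.
create table demo_items (id int primary key, yr int);
insert into demo_items
select g, 2020 + g % 3 from generate_series(1, 1000) as g;

-- Functions are VOLATILE unless declared otherwise; it is spelled out here.
create function demo_amount(p_id int) returns numeric
    language sql volatile
as $$ select p_id * 1.0 $$;

-- With a volatile function inside the CTE, the planner keeps the CTE as a
-- separate node, computes every row, and applies the yr filter afterwards.
explain (costs off)
with t as (select i.*, demo_amount(i.id) as amt from demo_items i)
select * from t where yr = 2022;

-- Declaring the function STABLE allows the CTE to be merged into the outer
-- query, so the yr filter is applied before the function is evaluated.
alter function demo_amount(int) stable;

explain (costs off)
with t as (select i.*, demo_amount(i.id) as amt from demo_items i)
select * from t where yr = 2022;
```

STABLE is only a safe declaration when the function does not modify the database and returns the same result for the same arguments within a single statement. Whether getfinanceamount meets that condition has to be checked against its body.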

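Indexes the query needs

Without a suitable index, the query reads the whole table.

  • Views hide accumulated time like a black box
    In the plan, the elapsed time jumps abruptly at the top node.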
Hash Right Join  (cost=40002.45..40004.42 rows=7 width=377) (actual time=427.929..3568.261 rows=73 loops=1)

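This time is spent inside the view. Running the view's query on its own shows a Seq Scan on a large table, so an index matching the filter conditions is created.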

create index busi_t_reimburse_subject_i1 on busi_t_reimburse_subject (f9, economicsubjectname, nvl(projectsubjectname, 0));
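
To check where the time inside the view goes, the part of the statement that reads view_bd_acc can be explained on its own, before and after creating the index. This sketch simply reuses the b subquery from the statement above and assumes nothing beyond running it standalone.

```sql
-- The "b" subquery from the original statement, run by itself so that
-- the timing of the nodes under view_bd_acc can be read in isolation.
explain (analyse, buffers, verbose, costs, timing)
select v.f1, v.f2, v.f3, v.f7, v.btype, v.bmname,
       sum(v.debitamount) as usedMoney
from view_bd_acc v
where v.unitsid = 825
  and v.year = 2022
group by v.f1, v.f2, v.f3, v.f7, v.btype, v.bmname;
```

Seq Scan nodes on large tables with a high "Rows Removed by Filter" count are the usual candidates for a new index.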

Query plan after adding the index: about 3.1 seconds saved

Hash Right Join  (cost=40002.45..40004.42 rows=7 width=377) (actual time=370.715..470.194 rows=73 loops=1)
  • An ordinary B-tree index does not help the LIKE pattern filter
    The plan still contains a sizable Seq Scan
Seq Scan on zgf.accounting_journal j  (cost=0.00..27531.84 rows=4415 width=137) (actual time=0.022..104.008 rows=70505 loops=1)
	Filter: ((split_part((j.subject)::text, ' '::text, 1) ~~ '71010101%'::text) OR (split_part((j.subject)::text, ' '::text, 1) ~~ '71010102%'::text))
Rows Removed by Filter: 372087

The filter applies LIKE pattern matching to an expression, so a GIN trigram index is created on that same expression.

create extension sys_trgm ;
create index accounting_journal_subjectpre on accounting_journal USING gin (split_part((subject), ' ', 1) gin_trgm_ops);

Query plan after adding the index: about 100 ms saved

 BitmapOr  (cost=806.92..806.92 rows=71038 width=0) (actual time=23.473..23.474 rows=0 loops=1)
   ->  Bitmap Index Scan on accounting_journal_subjectpre  (cost=0.00..329.41 rows=29522 width=0) (actual time=11.421..11.421 rows=70525 loops=1)
          Index Cond: (split_part((j.subject)::text, ' '::text, 1) ~~ '71010101%'::text)
   ->  Bitmap Index Scan on accounting_journal_subjectpre  (cost=0.00..443.37 rows=41516 width=0) (actual time=12.051..12.051 rows=41362 loops=1)
          Index Cond: (split_part((j.subject)::text, ' '::text, 1) ~~ '71010102%'::text)
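
Both patterns are anchored at the start of the string ('71010101%'), so a B-tree index with the text_pattern_ops operator class on the same expression is a possible alternative to the trigram index. This is a sketch based on PostgreSQL behaviour, which KingbaseES is assumed to follow. The table demo_journal and the index name are hypothetical.

```sql
-- Hypothetical stand-in for accounting_journal.
create table demo_journal (id int primary key, subject varchar(200));
insert into demo_journal
select g, (71010100 + g % 50)::text || ' subject name'
from generate_series(1, 100000) as g;

-- text_pattern_ops compares strings character by character, which lets a
-- B-tree serve left-anchored LIKE patterns regardless of the locale.
create index demo_journal_subjectpre_bt
    on demo_journal (split_part(subject, ' ', 1) text_pattern_ops);
analyse demo_journal;

-- Whether the planner actually picks the index depends on selectivity.
explain (costs off)
select * from demo_journal
where split_part(subject, ' ', 1) like '71010101%'
   or split_part(subject, ' ', 1) like '71010102%';
```

A trigram index remains the better fit when patterns are not anchored at the start, for example '%0101%'.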
  • Add an index
    The plan contains another sizable Seq Scan
Seq Scan on zgf.sys_t_department p  (cost=0.00..122.62 rows=58 width=21) (actual time=0.481..0.521 rows=58 loops=1)
    Filter: ((p.unitsid = '825'::numeric) AND ((p.useflag)::text = '1'::text))
    Rows Removed by Filter: 4717

Create a B-tree index

create index sys_t_department_unitsid on sys_t_department (unitsid);

After the change: about 0.1 ms saved

Bitmap Index Scan on sys_t_department_unitsid  (cost=0.00..4.72 rows=58 width=0) (actual time=0.011..0.011 rows=58 loops=1)
   Index Cond: (p.unitsid = '825'::numeric)

Expensive nodes

A few other nodes in the plan are slow and worth trying to optimize.
One of the expensive nodes:

Nested Loop  (cost=21761.45..52337.05 rows=1 width=1793) (actual time=161.838..293.567 rows=47 loops=1)
   join : accounting_journalassist , accounting_journal , accounting_voucher , etc.

Analysis: the view's filter conditions are written in a needlessly convoluted way, which wastes CPU time.

-- Original view definition
CREATE FORCE VIEW view_reportquery AS
 SELECT ...
   FROM ...
  WHERE ...
   AND (DECODE((to_char((v.withto)::text))::character varying, NULL::character varying, "numeric"(0), text_numeric(to_char((v.withto)::text))) = (0)::numeric) 
   AND (DECODE(v.sn, text_numeric(NULL::character varying), "numeric"(0), v.sn) > (0)::numeric) AND ((v.status)::text <> '4'::text))
  GROUP BY ...;
-- New view definition
CREATE OR REPLACE VIEW view_reportquery AS
SELECT ...
FROM ...
WHERE ...
  AND nvl(v.withto, 0) = 0
  AND v.sn > 0
  AND ...
GROUP BY ...
;
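
The simplification of the sn condition is safe because a WHERE clause keeps a row only when the condition is true, so mapping NULL to 0 before testing > 0 changes nothing. Below is a small check in standard SQL, with COALESCE standing in for the DECODE form (an assumption made so that the snippet also runs outside Oracle-compatible mode). The inline values are made up.

```sql
-- Compare the verbose and the simple predicate on NULL, zero and a
-- positive value; "is true" mirrors how WHERE treats NULL results.
select sn,
       (coalesce(sn, 0) > 0) is true as verbose_form_keeps_row,
       (sn > 0) is true              as simple_form_keeps_row
from (values (null::numeric), (0), (5)) as v(sn);
```

The two result columns agree on every row, which is why the shorter predicate can replace the longer one.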

After the change: about 150 ms saved

Nested Loop  (cost=0.84..7334.49 rows=1 width=1793) (actual time=32.992..40.572 rows=47 loops=1)
	join : accounting_journalassist , accounting_journal , accounting_voucher , etc.

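Optimizing the remaining nodes had little effect on performance.

Summary

  • An execution plan does not show every detail. Watch for abnormal jumps in elapsed time and in buffer counts, and treat them as the symptoms to investigate.
  • Simple expressions perform best.
  • Query optimization never really ends; the goal is a balance between query performance and compatibility.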