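A histogram on a column with good NDV causes a bad execution plan for equality predicates on that column
This test comes from a real case. The database suffered from frequent full table scans that triggered direct path reads, and DML activity then produced a large number of enq: KO - fast object checkpoint waits, which slowed the whole system down. Each problem window lasted about an hour, so tuning was required. We pulled out the SQL statements with the longest wait times and analyzed them one by one. For the statement with the worst waits, the full table scan could not be eliminated: judging by the column's data distribution the query only needs to return 3 rows, yet Oracle still chose a full scan. Below we simulate the situation to reproduce the root cause, and then use the method at the end of this article to eliminate the full table scan so that Oracle finds the correct execution plan.
Create the test table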
create table a_t1(id number,distrinct varchar2(70));
Insert the data
begin
for i in 1..100000 loop
insert into a_t1 values(i,'ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST'||i);
end loop;
commit;
end;
/
SCOTT@orcl1>select column_name,num_distinct,num_buckets,avg_col_len,histogram from user_tab_columns where table_name='A_T1';
COLUMN_NAME NUM_DISTINCT NUM_BUCKETS AVG_COL_LEN HISTOGRAM
------------------------------ ------------ ----------- ----------- ---------------
ID NONE
DISTRINCT NONE
SCOTT@orcl1>select count(distinct DISTRINCT) from a_t1;
COUNT(DISTINCT DISTRINCT)
------------------------
100000
Elapsed: 00:00:00.05
Create an index on column DISTRINCT
SCOTT@orcl1>create index idx_at1_DISTRINCT on a_t1(DISTRINCT);
Index created.
Elapsed: 00:00:00.43
Gather table statistics
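The gathering call itself is not shown at this step. A call along the following lines would be consistent with the output below (NUM_BUCKETS = 1, HISTOGRAM = NONE). The method_opt value is an assumption, not the recorded command:

```sql
-- Assumed call, not taken from the original session.
-- SIZE 1 asks DBMS_STATS for basic column statistics without histograms.
begin
  dbms_stats.gather_table_stats(
    ownname    => user,
    tabname    => 'A_T1',
    method_opt => 'for all columns size 1',
    cascade    => true);
end;
/
```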
SCOTT@orcl1>select column_name,num_distinct,num_buckets,avg_col_len,histogram from user_tab_columns where table_name='A_T1';
COLUMN_NAME NUM_DISTINCT NUM_BUCKETS AVG_COL_LEN HISTOGRAM
------------------------------ ------------ ----------- ----------- ---------------
ID 100000 1 5 NONE
DISTRINCT 99568 1 46 NONE
Elapsed: 00:00:00.01
SCOTT@orcl1>select table_name,column_name,avg_col_len,num_buckets,histogram from user_tab_col_statistics where table_name='A_T1';
TABLE_NAME COLUMN_NAME AVG_COL_LEN NUM_BUCKETS HISTOGRAM
------------------------------ ------------------------------ ----------- ----------- ---------------
A_T1 ID 5 1 NONE
A_T1 DISTRINCT 46 1 NONE
Elapsed: 00:00:00.02
There is no histogram at this point. Now run the query:
SCOTT@orcl1>select * from a_t1 where DISTRINCT='ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552';
ID DISTRINCT
---------- ----------------------------------------------------------------------
552 ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552
Elapsed: 00:00:00.02
Execution Plan
----------------------------------------------------------
Plan hash value: 1710520156
-------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                   |     1 |    51 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| A_T1              |     1 |    51 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IDX_AT1_DISTRINCT |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("DISTRINCT"='ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552')
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
        640  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
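This is the optimal plan, because only one row matches the predicate value. Next we gather statistics in a way that builds a histogram on the indexed column.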
SCOTT@orcl1>exec dbms_stats.gather_table_stats(ownname=>user,tabname=>'A_T1',estimate_percent=>10,method_opt=>'for all indexed columns' , cascade => true);
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.55
SCOTT@orcl1> select table_name,column_name,avg_col_len,num_buckets,histogram from user_tab_col_statistics where table_name='A_T1';
TABLE_NAME COLUMN_NAME AVG_COL_LEN NUM_BUCKETS HISTOGRAM
------------------------------ ------------------------------ ----------- ----------- ---------------
A_T1 ID 5 1 NONE
A_T1 DISTRINCT 46 1 FREQUENCY
Elapsed: 00:00:00.07
The column now has a frequency histogram. Look at the histogram's bucket data:
SCOTT@orcl1>select table_name,column_name,endpoint_number,endpoint_value from user_histograms where table_name='A_T1';
TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE
------------------------------ ------------------------------ --------------- --------------
A_T1 DISTRINCT 9829 3.3884E+35 <<<< all sampled values fall into one bucket
A_T1 ID 1
A_T1 ID 1 100000
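As a rough reading of these numbers (a sketch of the usual equality-predicate arithmetic, assumed here rather than taken from an optimizer trace): without a histogram the row estimate is the table's row count N divided by NUM_DISTINCT, whereas a frequency histogram uses the share of sampled rows that sit under the matching endpoint. With a single endpoint covering all 9829 sampled rows, that share is 1:

```latex
\begin{align*}
\text{rows}_{\text{no histogram}} &\approx \frac{N}{\text{NUM\_DISTINCT}} = \frac{100000}{99568} \approx 1 \\
\text{rows}_{\text{frequency}} &\approx N \cdot \frac{\text{rows under matching endpoint}}{\text{rows sampled}} = 100000 \cdot \frac{9829}{9829} = 100000
\end{align*}
```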
Check the execution plan again
SCOTT@orcl1>/
ID DISTRINCT
---------- ----------------------------------------------------------------------
552 ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
Plan hash value: 2172434975
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   100K|  4980K|   241   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| A_T1 |   100K|  4980K|   241   (2)| 00:00:03 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("DISTRINCT"='ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552')
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        822  consistent gets
          0  physical reads
          0  redo size
        636  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
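This time the plan is wrong: the CBO chose a full table scan.
Let us sort out what we have so far. Every value in column DISTRINCT of table a_t1 is unique, so for the predicate "DISTRINCT"='ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552' a lookup through the index should be the fastest path. The application team knew the data and created the index accordingly, which is the right approach. Yet Oracle did not use the index and went straight for a full table scan. We need to understand why: evidently Oracle considers the index path more expensive. Let us look at the plan when the index is forced with a hint.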
SCOTT@orcl1>select /*+ index(a_t1,IDX_AT1_DISTRINCT) */ * from a_t1 where DISTRINCT='ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552';
ID DISTRINCT
---------- ----------------------------------------------------------------------
552 ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 1710520156
-------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                   |   100K|  4980K| 20940   (1)| 00:04:12 |
|   1 |  TABLE ACCESS BY INDEX ROWID| A_T1              |   100K|  4980K| 20940   (1)| 00:04:12 |
|*  2 |   INDEX RANGE SCAN          | IDX_AT1_DISTRINCT |   100K|       |  1309   (1)| 00:00:16 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("DISTRINCT"='ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552')
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
        640  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
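Look at step 2, the INDEX RANGE SCAN on IDX_AT1_DISTRINCT: Rows is 100K and Cost is 1309. In other words, Oracle believes that fetching this one row through the index would return 100,000 rows. That clearly differs from what we expect. Only one row satisfies "DISTRINCT"='ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552', but Oracle thinks 100,000 rows do, which makes the index path look far too expensive. This gave us reason to suspect a problem on Oracle's side, and a search on MOS turned up the following note.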
Histogram of Character Columns Longer Than 32 Characters Causes Incorrect SQL Plan of Table Full Scan (Doc ID 2413826.1)
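That is, when a character column holds values longer than 32 bytes, the histogram stores only the first 32 bytes of each value. Once statistics gathering builds a histogram on such an indexed column, the histogram is based on just those first 32 bytes. In our test every value starts with the same 40-character prefix, so all rows look identical to the histogram, and the resulting cardinality estimate pushes Oracle to a full table scan.
Solution: delete the histogram on the column, which returns it to the histogram-free state it was in before the histogram was gathered.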
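The effect of the truncation can be checked directly against the test table. The query below is a small illustrative check, not part of the MOS note. It compares the number of distinct full values with the number of distinct 32-byte prefixes:

```sql
-- SUBSTRB takes a byte-based substring. The test data is single-byte,
-- so the first 32 bytes are also the first 32 characters.
select count(distinct DISTRINCT)                 as ndv_full,
       count(distinct substrb(DISTRINCT, 1, 32)) as ndv_first_32_bytes
  from a_t1;
```

For the data inserted above, every value begins with the same 40-character string, so the second count should come back as 1 while the first is 100000.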
SCOTT@orcl1>exec DBMS_STATS.DELETE_COLUMN_STATS(ownname=>'SCOTT',tabname=>'A_T1',colname=>'DISTRINCT',col_stat_type=>'HISTOGRAM');
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.07
SCOTT@orcl1>select table_name,column_name,avg_col_len,num_buckets,histogram from user_tab_col_statistics where table_name='A_T1';
TABLE_NAME COLUMN_NAME AVG_COL_LEN NUM_BUCKETS HISTOGRAM
------------------------------ ------------------------------ ----------- ----------- ---------------
A_T1 ID 5 1 NONE
A_T1 DISTRINCT 46 1 NONE
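Run the statement again and check the plan. Oracle no longer relies on a histogram, and given the column's well-distributed values the index is the best choice. With the histogram deleted, Oracle picks the correct plan.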
SCOTT@orcl1>select * from a_t1 where DISTRINCT='ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552';
ID DISTRINCT
---------- ----------------------------------------------------------------------
552 ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 1710520156
-------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                   |     1 |    51 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| A_T1              |     1 |    51 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IDX_AT1_DISTRINCT |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("DISTRINCT"='ABCDEFGHIJKLMNOPQRSTABCDEFGHIJKLMNOPQRST552')
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
        640  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
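Deleting the histogram lasts only until a later statistics run builds it again. One possible way to keep it away, sketched here under the assumption of Oracle 11g or later (where DBMS_STATS.SET_TABLE_PREFS is available), is to store a table-level METHOD_OPT preference that pins the column to SIZE 1. This is a suggestion, not a step taken in the original case:

```sql
-- Assumption: Oracle 11g or later.
-- SIZE AUTO leaves the other columns to Oracle's judgment;
-- SIZE 1 on DISTRINCT requests no histogram for that column.
begin
  dbms_stats.set_table_prefs(
    ownname => user,
    tabname => 'A_T1',
    pname   => 'METHOD_OPT',
    pvalue  => 'FOR ALL COLUMNS SIZE AUTO FOR COLUMNS DISTRINCT SIZE 1');
end;
/
```

The preference only applies to gather calls that do not pass method_opt themselves. A call with an explicit method_opt, such as the 'for all indexed columns' call used earlier, overrides it.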