
The performance gap caused by stale statistics (r5 notes, day 36)

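Today a customer reported that, while deploying a patch for one of their applications, a script had been running for more than an hour without finishing. The statement looked like this:

insert into em1_rater_00068_01
  (select * from em1_rater_00050_01_backup a
    where a.record_id <= 65971543
      and not exists (select b.record_id from em1_rater_00068_01 b
                       where a.record_id = b.record_id));

The execution plan was startling: it showed an estimate of 27T in the Bytes column, yet the estimated time was only about 35 seconds, and that estimate assumed a parallel degree of 4.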

-----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name                         | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT                          |                              |       |       |  2879 (100)|          |       |       |        |      |            |
|   1 |  PX COORDINATOR                           |                              |       |       |            |          |       |       |        |      |            |
|   2 |   PX SEND QC (RANDOM)                     | :TQ10003                     |    15M|    27T|  2879   (3)| 00:00:35 |       |       |  Q1,03 | P->S | QC (RAND)  |

Without parallelism, the execution plan looks even worse.

Execution Plan
----------------------------------------------------------
Plan hash value: 3489211022
--------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name                      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                           |    25M|    45T|    25M  (1)| 83:25:53 |       |       |
|   1 |  PARTITION RANGE ALL               |                           |  1251K|  2319G|   387   (1)| 00:00:05 |     1 |     5 |
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID| EM1_RATER_00050_01_BACKUP |  1251K|  2319G|   387   (1)| 00:00:05 |     1 |     5 |
|*  3 |    INDEX RANGE SCAN                | EM1_RATER_00050_01_BK_PK  |   225K|       |    26   (4)| 00:00:01 |     1 |     5 |
|   4 |     PARTITION RANGE ALL            |                           |     2 |    26 |     1   (0)| 00:00:01 |     1 |     5 |
|*  5 |      INDEX RANGE SCAN              | EM1_RATER_00068_01_PK     |     2 |    26 |     1   (0)| 00:00:01 |     1 |     5 |
--------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("A"."RECORD_ID"<=65971543)
       filter( NOT EXISTS (SELECT 0 FROM "EM1_RATER_00068_01" "B" WHERE "B"."RECORD_ID"=:B1))
   5 - access("B"."RECORD_ID"=:B1)
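
One quick way to see that these estimates cannot be right is to divide the Bytes column by the Rows column, since the Bytes estimate is roughly the row count times the row width the optimizer assumes. The sketch below does that arithmetic in Python for figures copied from the two plans above. The helper names are made up for illustration, and it assumes the K/M/G/T suffixes are binary multiples; decimal multiples would shift the result slightly but not change the order of magnitude.

```python
# Multipliers for the K/M/G/T suffixes in the plan output.
# Assumption: binary multiples for both the Rows and Bytes columns.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_plan_number(text):
    """Convert a plan figure such as '15M' or '2319G' to a plain integer."""
    text = text.strip().upper()
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def implied_row_width(rows, total_bytes):
    """Bytes per row implied by one plan line (Bytes divided by Rows)."""
    return parse_plan_number(total_bytes) / parse_plan_number(rows)


# (Rows, Bytes) pairs copied from the two plans above.
estimates = {
    "parallel plan, Id 2": ("15M", "27T"),
    "serial plan, Id 0": ("25M", "45T"),
    "serial plan, Id 2": ("1251K", "2319G"),
}

widths = {label: implied_row_width(r, b) for label, (r, b) in estimates.items()}
for label, width in widths.items():
    print(f"{label}: about {width / 1024**2:.2f} MiB per row")
```

Every line works out to roughly 1.8 to 1.9 MB per row, which is not a plausible row width for this kind of table and points at the statistics rather than at the data.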

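In reality this table holds only a few tens of GB of data, so tens of TB is simply impossible. The plan estimates are clearly far from reality. Checking the table's statistics showed that they were far off as well, so the first step was to gather statistics in parallel:

exec dbms_stats.gather_table_stats(OWNNAME=>null,tabname=>'EM1_RATER_00050_01_BACKUP',estimate_percent =>dbms_stats.auto_sample_size,METHOD_OPT =>'FOR ALL INDEXED COLUMNS SIZE 1',granularity=>'DEFAULT',cascade=>TRUE,degree=>8,block_sample=>TRUE);

PL/SQL procedure successfully completed.
Elapsed: 00:03:14.68

This is also a good opportunity to see how much computation goes on in the background while statistics are being gathered.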

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  8h0z7512pn17a, child number 0
-------------------------------------
/* SQL Analyze(0) */ select /*+  full(t)    parallel(t,8)
parallel_index(t,8) dbms_stats cursor_sharing_exact use_weak_name_resl
dynamic_sampling(0) no_monitoring no_substrb_pad
*/to_char(count("RECORD_ID")),to_char(substrb(dump(min("RECORD_ID"),16,0
,32),1,120)),to_char(substrb(dump(max("RECORD_ID"),16,0,32),1,120)),to_c
har(count("PERIOD_KEY")),to_char(substrb(dump(min("PERIOD_KEY"),16,0,32)
,1,120)),to_char(substrb(dump(max("PERIOD_KEY"),16,0,32),1,120)),to_char
(count("RECORD_STATUS")),to_char(count("RECORD_TYPE")),to_char(count("RE
SOLUTION_STATUS")),to_char(count("FIELD_00033_O")),to_char(count("FIELD_
00033_C")),to_char(count("FIELD_00279_O")),to_char(count("FIELD_00279_C"
)),to_char(count("FIELD_00436_O")),to_char(count("FIELD_00436_C")),to_ch
ar(count("FIELD_00361_O")),to_char(count("FIELD_00361_C")),to_char(count
("FIELD_00148_O")),to_char(count("FIELD_00148_C")),to_char(count("FIELD_
00341_O")),to_char(count("FIELD_00341_C")),to_char(count("FIELD_00116_O"
)),to_char(count("FIELD_00116。。。。。。

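If you are curious enough to look at the execution plan of the statistics-gathering statement itself at this point, it is even more startling: it shows 901T, which would be a truly massive amount of data. Presumably this estimate is still derived from the old statistics, since the gathering has not finished yet.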

Plan hash value: 2890548601
--------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                           |       |       |   251K(100)|          |       |       |        |      |            |
|   1 |  SORT AGGREGATE               |                           |     1 |  1933K|            |          |       |       |        |      |            |
|   2 |   PX COORDINATOR              |                           |       |       |            |          |       |       |        |      |            |
|   3 |    PX SEND QC (RANDOM)        | :TQ10000                  |     1 |  1933K|            |          |       |       |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE            |                           |     1 |  1933K|            |          |       |       |  Q1,00 | PCWP |            |
|   5 |      APPROXIMATE NDV AGGREGATE|                           |   500M|   901T|   251K (42)| 00:50:14 |       |       |  Q1,00 | PCWP |            |
|   6 |       PX BLOCK ITERATOR       |                           |   500M|   901T|   251K (42)| 00:50:14 |     1 |     5 |  Q1,00 | PCWC |            |
|*  7 |        TABLE ACCESS FULL      | EM1_RATER_00050_01_BACKUP |   500M|   901T|   251K (42)| 00:50:14 |     1 |     5 |  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   7 - access(:Z>=:Z AND :Z<=:Z)

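After a short wait of about three minutes everything was in place. Checking the execution plan of the statement again, the estimates dropped sharply.

A first look at the statement itself then suggested that a MINUS operation could be used to filter the data instead:

insert into em1_rater_00068_01
  (select b.* from em1_rater_00050_01_backup b,
          (select record_id from em1_rater_00050_01_backup where record_id <= 65971543
           minus
           select record_id from em1_rater_00068_01 where record_id <= 65971543) temp
    where b.record_id = temp.record_id);

Joining back against the already filtered data is then much cheaper. Without parallelism, the execution plan is as follows: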

Plan hash value: 3652964767
------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name                      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |                           |  9709K|    24G|       |  2412K  (1)| 08:02:31 |       |       |
|   1 |  LOAD TABLE CONVENTIONAL | EM1_RATER_00068_01        |       |       |       |            |          |       |       |
|*  2 |   HASH JOIN              |                           |  9709K|    24G|   231M|  2412K  (1)| 08:02:31 |       |       |
|   3 |    VIEW                  |                           |  9709K|   120M|       | 70833   (3)| 00:14:10 |       |       |
|   4 |     MINUS                |                           |       |       |       |            |          |       |       |
|   5 |      SORT UNIQUE         |                           |  9709K|    55M|   111M|            |          |       |       |
|   6 |       PARTITION RANGE ALL|                           |  9709K|    55M|       |  2560   (1)| 00:00:31 |     1 |     5 |
|*  7 |        INDEX RANGE SCAN  | EM1_RATER_00050_01_BK_PK  |  9709K|    55M|       |  2560   (1)| 00:00:31 |     1 |     5 |
|   8 |      SORT UNIQUE         |                           |    10M|    57M|   115M|            |          |       |       |
|   9 |       PARTITION RANGE ALL|                           |    10M|    57M|       |  3195   (1)| 00:00:39 |     1 |     5 |
|* 10 |        INDEX RANGE SCAN  | EM1_RATER_00068_01_PK     |    10M|    57M|       |  3195   (1)| 00:00:39 |     1 |     5 |
|  11 |    PARTITION RANGE ALL   |                           |  9709K|    24G|       |  1079K  (2)| 03:35:51 |     1 |     5 |
|  12 |     TABLE ACCESS FULL    | EM1_RATER_00050_01_BACKUP |  9709K|    24G|       |  1079K  (2)| 03:35:51 |     1 |     5 |
------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("B"."RECORD_ID"="TEMP"."RECORD_ID")
   7 - access("RECORD_ID"<=65971543)
  10 - access("RECORD_ID"<=65971543)
26 rows selected.

With parallelism enabled the execution plan is much better; according to preliminary testing the statement takes about 10 minutes.
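
Before swapping the NOT EXISTS form for the MINUS form, it is worth confirming on a small data set that both select the same rows. The following is a minimal sketch of such a check using Python's built-in sqlite3 module, not Oracle. The table names, the toy rows, and the `missing_ids` alias are made up for illustration, and SQLite's EXCEPT stands in for Oracle's MINUS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE backup_tab (record_id INTEGER PRIMARY KEY, payload TEXT);
    CREATE TABLE target_tab (record_id INTEGER PRIMARY KEY, payload TEXT);
""")
# Toy data: ids 1..10 in the backup table, a few of them already in the target.
conn.executemany("INSERT INTO backup_tab VALUES (?, ?)",
                 [(i, f"row{i}") for i in range(1, 11)])
conn.executemany("INSERT INTO target_tab VALUES (?, ?)",
                 [(i, f"row{i}") for i in (2, 4, 6, 20)])

LIMIT = 8  # plays the role of the record_id upper bound in the article

# Original shape: anti-join expressed with NOT EXISTS.
not_exists_rows = conn.execute("""
    SELECT a.* FROM backup_tab a
     WHERE a.record_id <= ?
       AND NOT EXISTS (SELECT b.record_id FROM target_tab b
                        WHERE a.record_id = b.record_id)
     ORDER BY a.record_id
""", (LIMIT,)).fetchall()

# Rewritten shape: compute the missing ids with a set difference, then join back.
minus_rows = conn.execute("""
    SELECT b.* FROM backup_tab b,
           (SELECT record_id FROM backup_tab WHERE record_id <= ?
            EXCEPT
            SELECT record_id FROM target_tab WHERE record_id <= ?) missing_ids
     WHERE b.record_id = missing_ids.record_id
     ORDER BY b.record_id
""", (LIMIT, LIMIT)).fetchall()

print(not_exists_rows == minus_rows)
print([row[0] for row in minus_rows])
```

A toy check like this only shows that the two shapes agree on one sample; it says nothing about how either form performs in Oracle, which still has to be judged from the real execution plans.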