Index pitfalls: indexes and max(), min()
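Having covered the relationship between indexes and count(*), let's look at another kind of aggregate query, max() and min(), and how it relates to indexes. Do you think these aggregates can use an index?
After the previous section, some of you may answer: "Yes, they can, but the indexed column has to be the primary key, or the query has to say where column is not null." That answer deserves credit, and it is correct. It seems the earlier material was not wasted. But which index scan method gets used? In the previous section it was INDEX FULL SCAN, as you will remember. If max() and min() are going to use the index, do they also go through INDEX FULL SCAN?
Think about how an index is structured. It runs from root to branch and finally to leaf, like a pyramid. The leaf level at the bottom of the pyramid is ordered: read from left to right, the values ascend (or descend, for a descending index). Given that, do max() and min() really need an INDEX FULL SCAN? Finding the head or the tail of the leaf level gives the smallest or largest value, so why walk through every leaf?
This is where another Oracle index scan type comes on stage: INDEX FULL SCAN (MIN/MAX). Note the extra (MIN/MAX) keyword. INDEX FULL SCAN (MIN/MAX) carries a stopkey mechanism: the scan starts at the leftmost or rightmost leaf block and stops as soon as it has read the first value.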
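To try the scan described above, you need a table with an index on the column being aggregated. This section does not show how ljb_test was created, so the following is only a minimal sketch under assumptions: that the table is a copy of dba_objects, and that idx_ljb_test is a plain single-column B-tree index on object_id. The statistics call is optional and is not something the source performs.

```sql
-- Hypothetical setup: the source does not show how ljb_test was built.
-- Assumes SELECT privilege on dba_objects (all_objects would serve as well).
create table ljb_test as select * from dba_objects;

-- Single-column B-tree index; the name matches the one seen in the plans.
create index idx_ljb_test on ljb_test (object_id);

-- Optional: gather statistics so the optimizer has real numbers to work with.
exec dbms_stats.gather_table_stats(user, 'LJB_TEST', cascade => true);

-- Check whether object_id ended up nullable in the copy.
desc ljb_test
```

Whether object_id is declared NOT NULL in the copy depends on the source view and the Oracle version, which is why the desc check is worth running before comparing plans.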
Look at the plan for the max() query. Sure enough, it uses INDEX FULL SCAN (MIN/MAX):
SQL> explain plan for select max(object_id) from ljb_test where object_id is not null;
Explained
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 613051030
--------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |     1 |    13 |     2   (0)
|   1 |  SORT AGGREGATE             |              |     1 |    13 |
|   2 |   FIRST ROW                 |              | 49190 |   624K|     2   (0)
|*  3 |    INDEX FULL SCAN (MIN/MAX)| IDX_LJB_TEST | 49190 |   624K|     2   (0)
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("OBJECT_ID" IS NOT NULL)
Note
   - dynamic sampling used for this statement
19 rows selected
Now the min() query. It also uses INDEX FULL SCAN (MIN/MAX):
SQL> explain plan for select min(object_id) from ljb_test where object_id is not null;
Explained
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 613051030
--------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |     1 |    13 |     2   (0)
|   1 |  SORT AGGREGATE             |              |     1 |    13 |
|   2 |   FIRST ROW                 |              | 49190 |   624K|     2   (0)
|*  3 |    INDEX FULL SCAN (MIN/MAX)| IDX_LJB_TEST | 49190 |   624K|     2   (0)
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("OBJECT_ID" IS NOT NULL)
Note
   - dynamic sampling used for this statement
19 rows selected
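EXPLAIN PLAN only reports the optimizer's estimated cost. To see how little work the scan does at run time, one option is SQL*Plus autotrace, which prints execution statistics such as consistent gets. The sketch below assumes the session has the privileges autotrace needs (for example the PLUSTRACE role). The index_ffs hint in the second statement is added here purely for comparison and does not come from the original text.

```sql
-- SQL*Plus: run each statement but print only its runtime statistics.
set autotrace traceonly statistics

-- Expected to descend root -> branch -> one leaf block and then stop.
select max(object_id) from ljb_test where object_id is not null;

-- For comparison: hint a read of every index block (INDEX FAST FULL SCAN).
select /*+ index_ffs(ljb_test idx_ljb_test) */ count(object_id)
  from ljb_test
 where object_id is not null;

set autotrace off
```

If the first statement really uses the MIN/MAX scan, its consistent gets should be roughly the height of the index, while the hinted statement should report a figure that grows with the size of the index. The exact numbers depend on the data and the Oracle version.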
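By now it should be clear why the plan for max() and min() uses INDEX FULL SCAN (MIN/MAX): given the right information, Oracle naturally picks this scan for this kind of query. I hope you also understand the principle behind that choice. Some may say there is no point in knowing this, since Oracle decides how to use the index on its own. In my view it never hurts to understand a bit more, especially the underlying principles. For instance, let me ask this: how does Oracle handle select min(object_id),max(object_id) from ljb_test where object_id is not null? What would you answer?
Let me run the experiment (many people guess it is still INDEX FULL SCAN (MIN/MAX)):
The result is below. The scan type is INDEX FAST FULL SCAN, and the (MIN/MAX) keyword is nowhere to be seen. What happened?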
SQL> explain plan for select min(object_id),max(object_id) from ljb_test where object_id is not null;
Explained
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1341606234
--------------------------------------------------------------------------------
| Id  | Operation             | Name         | Rows  | Bytes | Cost (%CPU)| Time
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |              |     1 |    13 |    61   (4)| 00:0
|   1 |  SORT AGGREGATE       |              |     1 |    13 |            |
|*  2 |   INDEX FAST FULL SCAN| IDX_LJB_TEST | 49190 |   624K|    61   (4)| 00:0
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("OBJECT_ID" IS NOT NULL)
Note
-----
   - dynamic sampling used for this statement
18 rows selected
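It turns out that this statement asks Oracle to get both values from the index in a single pass. INDEX FULL SCAN (MIN/MAX) stops after the first entry at one end of the index, so it cannot return two values at once. Oracle therefore has to fall back to INDEX FAST FULL SCAN, reading every leaf block of the index to pick up both values.
Once the principle is clear, the fix is simple. Rewrite the code as follows: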
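select (select max(object_id) from ljb_test where object_id is not null) c, (select min(object_id) from ljb_test where object_id is not null) b from dual;
Now the query finally uses INDEX FULL SCAN (MIN/MAX). You can see how powerful this scan is: it runs twice, yet the total cost is only 4, far below the cost of 61 for a single INDEX FAST FULL SCAN.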
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------
Plan hash value: 3189180828
----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |              |     1 |    26 |     4   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                 |              |     1 |    26 |     4   (0)| 00:00:01 |
|   2 |   VIEW                        |              |     1 |    13 |     2   (0)| 00:00:01 |
|   3 |    SORT AGGREGATE             |              |     1 |    13 |            |          |
|   4 |     FIRST ROW                 |              | 49190 |   624K|     2   (0)| 00:00:01 |
|*  5 |      INDEX FULL SCAN (MIN/MAX)| IDX_LJB_TEST | 49190 |   624K|     2   (0)| 00:00:01 |
|   6 |   VIEW                        |              |     1 |    13 |     2   (0)| 00:00:01 |
|   7 |    SORT AGGREGATE             |              |     1 |    13 |            |          |
|   8 |     FIRST ROW                 |              | 49190 |   624K|     2   (0)| 00:00:01 |
|*  9 |      INDEX FULL SCAN (MIN/MAX)| IDX_LJB_TEST | 49190 |   624K|     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   5 - filter("OBJECT_ID" IS NOT NULL)
   9 - filter("OBJECT_ID" IS NOT NULL)
Note
   - dynamic sampling used for this statement
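The plan above has two VIEW steps under a NESTED LOOPS join. That is the shape you would normally expect from joining two single-row inline views, not from scalar subqueries against dual. The statement below is a sketch of that inline-view form. It is an assumption about what could produce such a plan, not the statement the author actually ran, and the aliases max_id and min_id are made up for illustration.

```sql
-- Assumed inline-view rewrite: each view computes one aggregate on its own,
-- so each can be answered by a separate INDEX FULL SCAN (MIN/MAX).
explain plan for
select a.max_id, b.min_id
  from (select max(object_id) as max_id
          from ljb_test
         where object_id is not null) a,
       (select min(object_id) as min_id
          from ljb_test
         where object_id is not null) b;

select * from table(dbms_xplan.display);
```

Each inline view returns exactly one row, so the join between them is a one-row by one-row nested loop and adds almost nothing to the cost of the two index probes.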
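Summary: max() and min() are very commonly used SQL constructs, and in billing projects the reports that need them are everywhere. I hope you will build indexes for queries like these. Provided the column is guaranteed to be non-null, the query may be able to use INDEX FULL SCAN (MIN/MAX), which can improve query performance considerably. Beyond that, with a little thought you can rewrite a statement that cannot use INDEX FULL SCAN (MIN/MAX), such as select min(object_id),max(object_id) from ljb_test where object_id is not null, so that it can. I hope this encourages developers to make fuller use of the SQL they already know and write efficient statements.
Further thoughts: earlier I mentioned that INDEX FULL SCAN (MIN/MAX) carries a stopkey mechanism. Readers with some tuning background will recognize stopkey. It often shows up in the execution plans of pagination queries, and when it does, you can generally take it as a sign that the plan is a sound one.
Take, for example, select * from (select * from table where id= order by name desc) where rownum<11; This statement means: for a given value of id, sort by name and take the first 10 rows. It has two parts: id equals some value, and name in descending order. Suppose an index exists on (id, name desc). Inside that index, entries with the same id are stored in descending order of name, so the index satisfies both conditions at once and the query gets faster. Oracle only has to read 10 rowids from the index and then visit the table for those 10 rowids, which is bound to be quick. Pagination statements of this kind can therefore be sped up by building an index that mirrors what the statement asks for. If the where clause contains a non-equality condition, however, no index can satisfy both conditions (this is easy to see from the index structure), and you have to build a suitable index based on the selectivity of the columns instead.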
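The pagination case can be tried out with a small experiment. Everything in the sketch below is hypothetical: the table page_demo, its columns, the index name idx_page_demo and the bind variable :id are invented for illustration, since the original statement leaves both the table and the id value unspecified. The name column is declared NOT NULL here to keep the descending sort simple.

```sql
-- Hypothetical table and index, for illustration only.
create table page_demo (
  id      number        not null,
  name    varchar2(30)  not null,
  payload varchar2(100)
);

-- Equality column first, then the ORDER BY column in the same direction.
create index idx_page_demo on page_demo (id, name desc);

-- Top-10 query: equality on id, ordered by name descending.
explain plan for
select *
  from (select * from page_demo where id = :id order by name desc)
 where rownum < 11;

select * from table(dbms_xplan.display);
```

What you would hope to see is a COUNT STOPKEY step sitting above an index range scan on idx_page_demo, with no separate SORT ORDER BY step. Whether the optimizer actually chooses that plan depends on the Oracle version, the statistics and the data volume.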