【MySQL 原理分析】之 Trace 分析 order by 的索引原理

阿新 • • 發佈：2020-02-15

一、背景

昨天早上，交流群有一位同學提出了一個問題。看下圖：

我不是大佬，而且當時我自己的想法也只是猜測，所以並沒有回覆那位同學，只是接下來自己做了一個測試驗證一下。

他只簡單了說了一句話，就是同樣的sql，一個沒加 order by 就全表掃描，一個加了 order by 就走索引了。

我們可以仔細點看一下他提供的圖（主要分析子查詢即可，就是關於表 B 的查詢，因為只有表 B 的查詢前後不一致），我們可以先得出兩個前提：

1、首先可以肯定的是，where 條件中的 mobile 欄位是沒有索引的。因為沒有 order by 時，是全表掃描，如果 mobile 欄位有索引，查詢優化器必定會使用 mobile 欄位的索引。

2、其實重點不但在 order by，更重要的是在於 order by 後面跟著的欄位是表B 的主鍵 id。之所以判斷 id 為主鍵，是因為 explain 執行計劃裡看到使用了 PRIMARY 索引，即主鍵索引。

二、資料準備和場景重現

建立表 user：

CREATE TABLE `user` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(255) DEFAULT NULL,
  `age` int(11) DEFAULT NULL,
  `phone` varchar(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=100007 DEFAULT CHARSET=utf8;

準備資料：

看了一下截圖，資料量應該在10萬左右，我們也準備10萬資料，儘量做到一致。

delimiter ;
CREATE DEFINER=`root`@`localhost` PROCEDURE `iniData`()
begin
  declare i int;
  set i=1;
  while(i<=100000)do
    insert into user(name,age,phone) values('測試', i, 15627230000+i);
    set i=i+1;
  end while;
end;;
delimiter ;

call iniData();

執行 SQL ，檢視執行計劃：

explain select * from user where phone = '15627231000' limit 1;
explain select * from user where phone = '15627231000' order by id limit 1;

執行結果：

id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1   SIMPLE     user  (Null)     ALL  (Null) (Null) (Null) (Null) 99927  10  Using where

id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1   SIMPLE   user    (Null)     index   (Null)     PRIMARY  4   1   10  Using where

我們可以看到，執行計劃和那位同學的基本一致，都是第一條 SQL 全表掃描，第二條 SQL 是走了主鍵索引。

三、猜想和猜測著總結

只要加 order by 就走索引？

根據上面的執行計劃來看，明顯這位同學的表達是不對的，更重要的是因為 order by 後跟著的欄位是主鍵 id，所以才走了索引，走了主鍵索引。

我們可以試試用 age 欄位來排序，這時候肯定是沒有走索引的，因為我們壓根沒有為 age 欄位沒有建立索引。

explain select * from user where phone = '15627231000' order by age limit 1;

id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1   SIMPLE     user  (Null)     ALL  (Null) (Null) (Null) (Null) 99927  10  Using where; Using filesort

分析：

首先，我們看到 type 是 ALL，就是全表掃描，而且我們還留意到：Extra的值多了 using filesort，表明 MySQL 有檔案排序的操作。

我們可以拿 order by age 和 order by id 的執行計劃來對比一下。

1、explain 的 tepe 欄位：

首先，type 不一樣，一個是 index，表明利用了索引樹；一個是 ALL，表明是全表掃描。

2、explain 的 Extra 欄位：

第二，也是最重點的，它其實可以說明為何利用了主鍵索引。就是 Extra 欄位。

先說明一下正常的排序，Extra 都會有 Using filesort 來表明使用了檔案排序。

而明顯 order by id 是沒有這個，這是因為，索引樹本來就是一個帶有順序的資料結構，大家不瞭解的可以去看看 B+Tree 的介紹。查詢優化器正是利用了索引的順序性，使得 SQL 的執行計劃走主鍵索引樹來去掉原本需要的排序。

之前的大白話 MySQL 學習總結中也提到過查詢優化器。SQL 的執行計劃能有很多，並且結果是一樣的，但是為了提高效能，MySQL 的查詢優化器元件會為 SQL 制定一套最優的執行計劃。

階段總結：

查詢優化器幫我們制定的最優計劃是：充分利用主鍵索引的順序性，避免了全表掃描後還是需要排序操作。

當然了，我們不能自己只是根據現象做判斷，下面將利用 Trace 來檢視優化器追蹤的資訊，進一步的驗證我們的總結是沒問題的。

四、通過 Trace 分析來驗證

開啟和檢視 Trace

-- 開啟優化器跟蹤
set session optimizer_trace='enabled=on';
select * from user where phone = '15627231000' order by id limit 1;
-- 檢視優化器追蹤
select * from information_schema.optimizer_trace;

下面我們只看 TRACE 就行了。

{
  "steps": [
    {
      "join_preparation": {
        "select#": 1,
        "steps": [
          {
            "expanded_query": "/* select#1 */ select `user`.`id` AS `id`,`user`.`name` AS `name`,`user`.`age` AS `age`,`user`.`phone` AS `phone` from `user` where (`user`.`phone` = '15627231000') order by `user`.`id` limit 1"
          }
        ]
      }
    },
    {
      "join_optimization": { // 優化工作的主要階段
        "select#": 1,
        "steps": [
          //  .... 省略很多步驟
          {
            "reconsidering_access_paths_for_index_ordering": { // 重新考慮索引排序的訪問路徑
              "clause": "ORDER BY",
              "index_order_summary": {
                "table": "`user`",
                "index_provides_order": true,
                "order_direction": "asc",
                "index": "PRIMARY", // 排序的欄位為主鍵 id，有主鍵索引
                "plan_changed": true, // 改變執行計劃
                "access_type": "index"
              }
            }
          },
          {
            "refine_plan": [
              {
                "table": "`user`"
              }
            ]
          }
        ]
      }
    },
    {
      "join_explain": {
        "select#": 1,
        "steps": [
        ]
      }
    }
  ]
}

好了，在最後的那裡，我們看到了查詢優化器幫我們使用了主鍵索引。

所以，我們上面的猜想是正確的，因為 where 條件後的 phone 欄位沒有加上索引，所以到 order by id 時，查詢優化器發現可以利用主鍵索引所以來避免排序，所以最後就使用了主鍵索引。

那麼，按照上面的說法，如果 phone 欄位加上了索引，那麼最後應該就是走 phone 的索引而不是主鍵索引了。而且，SQL 調優有那麼一條建議：建議經常在 where 條件後出現的欄位加上索引來提高查詢效能。

下面我們來繼續驗證一下我們的猜想。

五、關於 where 條件欄位索引和 order by 欄位索引的選擇

1、給欄位 phone 增加索引：

2、執行 SQL ：

explain select * from user where phone = '15627231000' order by id limit 1;

3、結果：

我們可以看到，最後查詢優化器判斷 phone索引比主鍵索引更能提高效能，所以使用了 phone 的索引。

id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1   SIMPLE   user    (Null)     index index_phone index_phone   36  1   100 Using index condition

4、Trace進一步驗證：

最後，我們可以看到，查詢優化器否定了使用主鍵索引，不改變之前的執行計劃。

-- 開啟優化器跟蹤
set session optimizer_trace='enabled=on';
select * from user where phone = '15627231000' order by id limit 1;
-- 檢視優化器追蹤
select * from information_schema.optimizer_trace;

Trace 分析：

{
  "steps": [
    {
      "join_preparation": {
        "select#": 1,
        "steps": [
          {
            "expanded_query": "/* select#1 */ select `user`.`id` AS `id`,`user`.`name` AS `name`,`user`.`age` AS `age`,`user`.`phone` AS `phone` from `user` where (`user`.`phone` = '15627231000') order by `user`.`id` limit 1"
          }
        ]
      }
    },
    {
      "join_optimization": {
        "select#": 1,
        "steps": [
          //  .... 省略很多步驟
          {
            "considered_execution_plans": [
              {
                "plan_prefix": [
                ],
                "table": "`user`",
                "best_access_path": {
                  "considered_access_paths": [
                    {
                      "access_type": "ref",
                      "index": "index_phone",
                      "rows": 1,
                      "cost": 1.2,
                      "chosen": true
                    },
                    {
                      "access_type": "range",
                      "range_details": {
                        "used_index": "index_phone" // 使用 phone 的索引
                      },
                      "chosen": false,
                      "cause": "heuristic_index_cheaper"
                    }
                  ]
                },
                "condition_filtering_pct": 100,
                "rows_for_plan": 1,
                "cost_for_plan": 1.2,
                "chosen": true
              }
            ]
          },
          {
            "attaching_conditions_to_tables": {
              "original_condition": "(`user`.`phone` = '15627231000')",
              "attached_conditions_computation": [
              ],
              "attached_conditions_summary": [
                {
                  "table": "`user`",
                  "attached": null
                }
              ]
            }
          },
          {
            "clause_processing": {
              "clause": "ORDER BY",
              "original_clause": "`user`.`id`",
              "items": [
                {
                  "item": "`user`.`id`"
                }
              ],
              "resulting_clause_is_simple": true,
              "resulting_clause": "`user`.`id`"
            }
          },
          {
            "added_back_ref_condition": "((`user`.`phone` <=> '15627231000'))"
          },
          {
            "reconsidering_access_paths_for_index_ordering": { // 重新考慮索引排序的訪問路徑
              "clause": "ORDER BY",
              "index_order_summary": {
                "table": "`user`",
                "index_provides_order": true,
                "order_direction": "asc",
                "index": "index_phone",
                "plan_changed": false  // 不改變執行計劃
              }
            }
          },
          {
            "refine_plan": [
              {
                "table": "`user`",
                "pushed_index_condition": "(`user`.`phone` <=> '15627231000')",
                "table_condition_attached": null
              }
            ]
          }
        ]
      }
    },
    {
      "join_explain": {
        "select#": 1,
        "steps": [
        ]
      }
    }
  ]
}

六、最後總結

到這裡，分析就結束了，我們可以得出一個結論，當然了，只是基於上面的實驗所得：

1、SQL 帶有 order by ：

order by 後面的欄位有索引：
- where 條件後面的所有欄位都沒索引，則使用 order by 後面的欄位的索引。
- where 條件後面有欄位帶有索引，則使用 where 條件對應的欄位的索引。
order by 後面的欄位沒有索引：
- where 條件後面的所有欄位都沒索引，則全表掃描。
- where 條件後面有欄位帶有索引，則使用 where 條件後面的欄位的索引。

2、SQL 不帶 order by：

where 條件後面的所有欄位都沒索引，則全表掃描。
where 條件後面只要有欄位帶索引，則使用該欄位對應的索引。

最後我們也可以得出一個絕對的結論：查詢優化器是真的好使，哈哈哈！

七、題外話

其實上面的實驗需要大家對 MySQL 的索引原理有一定的瞭解，但是不用特別深。

如果大家感興趣的話，可以關注一下我現在寫的【大白話系列】MySQL 學習總結這一系列的文章，我會將自己學習 MySQL 後的學習總結分享在這裡