MySQL ORDER BY

阿新 • Published: 2020-07-16

order by

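In MySQL 5.7 and earlier, MySQL by default sorts the result of a GROUP BY col1, col2, ... query as if the query also contained ORDER BY col1, col2, .... MySQL 8.0 no longer sorts GROUP BY results implicitly, so the rest of this paragraph applies to the older versions. If you include an explicit ORDER BY clause with the same column list, MySQL optimizes it away without any speed penalty, although the sort still takes place. If a query contains GROUP BY but you want to avoid the overhead of sorting the result, you can suppress the sort by specifying ORDER BY NULL. For example: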

INSERT INTO foo
SELECT a, COUNT(*) FROM bar GROUP BY a ORDER BY NULL;

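The optimizer may still choose to use sorting to implement the grouping operation. ORDER BY NULL suppresses sorting of the result, not any prior sorting done by the grouping operation to determine the result.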

MySQL has three ways to carry out a sort:

sort By index
file sort
priority queue

sort By Index

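Because of how a B+tree is organized, the leaf nodes of a secondary index store the indexed column values together with the corresponding primary key values, in key order, with the leaves linked as a doubly linked list. In some cases MySQL can therefore use an index to satisfy an ORDER BY clause and avoid the extra sorting that a filesort operation involves.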

mysql> desc orderIndex;
+-------+------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra          |
+-------+------+------+-----+---------+----------------+
| id    | int  | NO   | PRI | NULL    | auto_increment |
| a     | int  | YES  | MUL | NULL    |                |
| b     | int  | YES  | MUL | NULL    |                |
| c     | int  | YES  |     | NULL    |                |
+-------+------+------+-----+---------+----------------+
4 rows in set (0.00 sec)

mysql> show index from orderIndex;
+------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table      | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| orderIndex |          0 | PRIMARY  |            1 | id          | A         |        4878 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| orderIndex |          1 | b        |            1 | b           | A         |        4878 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| orderIndex |          1 | a_2      |            1 | a           | A         |        5000 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| orderIndex |          1 | a_2      |            2 | b           | A         |        5000 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
+------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
4 rows in set (0.00 sec)

For example:

mysql> explain select b from orderIndex order by a limit 10;
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | orderIndex | NULL       | index | NULL          | a_2  | 10      | NULL |   10 |   100.00 | Using index |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

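An Extra value of just Using index means the query does not have to look up row data in the primary key index. The secondary index alone satisfies the query (a covering index), and no extra sort is needed. The same holds for the next example: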

mysql> explain select b from orderIndex order by a DESC limit 10;
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+----------------------------------+
| id | select_type | table      | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                            |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+----------------------------------+
|  1 | SIMPLE      | orderIndex | NULL       | index | NULL          | a_2  | 10      | NULL |   10 |   100.00 | Backward index scan; Using index |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+----------------------------------+
1 row in set, 1 warning (0.00 sec)
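
To make the idea concrete, here is a minimal Python sketch of why an ordered index can replace a sort. It is a toy model and not MySQL's implementation: the a_2 index is represented as a list of (a, b, id) tuples kept in sorted order, and the function names and sample rows are invented for illustration.

```python
# Toy model of the secondary index a_2(a, b). Entries are kept sorted by
# (a, b, id), which mimics the ordered leaf level of a B+tree.
# All names and data are illustrative assumptions, not MySQL internals.

def build_index(rows):
    """rows: iterable of dicts with keys id, a, b. Returns sorted index entries."""
    return sorted((r["a"], r["b"], r["id"]) for r in rows)

def select_b_order_by_a(index, limit, descending=False):
    """SELECT b ... ORDER BY a [DESC] LIMIT limit, answered from the index alone."""
    entries = reversed(index) if descending else iter(index)
    result = []
    for _a, b, _row_id in entries:   # walk the leaves in key order
        if len(result) == limit:
            break                    # LIMIT reached: stop scanning early
        result.append(b)
    return result

rows = [
    {"id": 1, "a": 30, "b": 3},
    {"id": 2, "a": 10, "b": 1},
    {"id": 3, "a": 20, "b": 2},
    {"id": 4, "a": 40, "b": 4},
]
index = build_index(rows)
print(select_b_order_by_a(index, 2))                    # forward scan
print(select_b_order_by_a(index, 2, descending=True))   # backward scan
```

The forward and backward walks correspond loosely to the plain index scan and the Backward index scan reported by EXPLAIN. In both cases the rows come out already ordered, so no sort step is needed.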

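Some queries cannot be served by a covering index yet can still rely on an index to avoid the extra sort. The index can be used even if ORDER BY does not match it exactly, as long as all unused portions of the index and all extra ORDER BY columns are constants in the WHERE clause. For example: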

mysql> explain select * from orderIndex where a=100 order by b;
+----+-------------+------------+------------+------+---------------+------+---------+-------+------+----------+-------+
| id | select_type | table      | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |
+----+-------------+------------+------------+------+---------------+------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | orderIndex | NULL       | ref  | a_2           | a_2  | 5       | const |    1 |   100.00 | NULL  |
+----+-------------+------------+------------+------+---------------+------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select * from orderIndex where a=100 order by b DESC;
+----+-------------+------------+------------+------+---------------+------+---------+-------+------+----------+---------------------+
| id | select_type | table      | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra               |
+----+-------------+------------+------------+------+---------------+------+---------+-------+------+----------+---------------------+
|  1 | SIMPLE      | orderIndex | NULL       | ref  | a_2           | a_2  | 5       | const |    1 |   100.00 | Backward index scan |
+----+-------------+------------+------------+------+---------------+------+---------+-------+------+----------+---------------------+
1 row in set, 1 warning (0.00 sec)
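
The constant-prefix case can be sketched the same way. This is a rough illustration under the same toy model (a sorted list of (a, b, id) tuples standing in for the a_2 index, with integer values for a). The helper name is hypothetical.

```python
from bisect import bisect_left

def where_a_order_by_b(index, a_value, descending=False):
    """Model of: SELECT ... WHERE a = a_value ORDER BY b [DESC].

    index is a sorted list of (a, b, id) tuples. All entries with the same
    a are contiguous and, within that range, already ordered by b.
    Assumes a holds integers, so (a_value + 1,) bounds the range.
    """
    lo = bisect_left(index, (a_value,))       # first entry with a >= a_value
    hi = bisect_left(index, (a_value + 1,))   # first entry with a > a_value
    matches = index[lo:hi]
    if descending:
        matches = matches[::-1]               # read the same range backwards
    # A real SELECT * would now fetch each row by its primary key (id).
    return [(b, row_id) for _a, b, row_id in matches]

index = sorted([(100, 7, 1), (100, 2, 2), (50, 9, 3), (100, 5, 4), (200, 1, 5)])
print(where_a_order_by_b(index, 100))
print(where_a_order_by_b(index, 100, descending=True))
```

Because a is fixed by the WHERE clause, the matching slice of the index is ordered by b alone, which is what lets the query skip the sort.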

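If the index does not contain all columns accessed by the query, the index is used only if index access is cheaper than other access methods.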

# The index contains every column the query reads; no extra sort is used
mysql> explain select a,b from orderIndex order by a, b;
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | orderIndex | NULL       | index | NULL          | a_2  | 10      | NULL | 5000 |   100.00 | Using index |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

# The index does not contain every column the query reads; an extra sort is used
mysql> explain select * from orderIndex order by a, b;
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table      | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra          |
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+----------------+
|  1 | SIMPLE      | orderIndex | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 5000 |   100.00 | Using filesort |
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+----------------+
1 row in set, 1 warning (0.00 sec)

Some queries, however, cannot be sorted by the index alone and need an extra sort:

# The index covers the query, but a filesort is still needed
mysql> explain select a,b from orderIndex order by b;
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-----------------------------+
| id | select_type | table      | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                       |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-----------------------------+
|  1 | SIMPLE      | orderIndex | NULL       | index | NULL          | a_2  | 10      | NULL | 5000 |   100.00 | Using index; Using filesort |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-----------------------------+
1 row in set, 1 warning (0.00 sec)

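Note that the results above can change under different conditions. For example, if the optimizer estimates that a full table scan followed by a sort is cheaper than walking the index and looking up each row in the table, it uses the extra sort.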

file sort

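As described above, when the index cannot satisfy ORDER BY, the sort has to be done by other means. MySQL uses filesort for this. Note that filesort does not necessarily create a temporary file: MySQL allocates an in-memory buffer for the filesort, whose size is controlled by the sort_buffer_size system variable. Each session can change its own value of this variable as needed, either to avoid excessive memory use or to allocate more memory. When the data to sort exceeds the sort buffer, it is spilled to several temporary files. This matches MySQL's general design: keep data in memory when it fits, fall back to disk files when it does not, and keep I/O to a minimum.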

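This gives two cases:

The data to sort is smaller than sort_buffer_size
The data to sort is larger than sort_buffer_size

The two cases are sorted differently. In the first case the sort happens entirely in memory, and the algorithm used depends on the statement. In the second case the data is split into several chunks, each chunk is sorted before it is written to a file, and the chunks are then combined with a merge sort. A simple check: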

mysql> set optimizer_trace='enabled=on';
mysql> select count(*) from orderIndex;
+----------+
| count(*) |
+----------+
|     5000 |
+----------+
1 row in set (0.02 sec)

mysql> show variables like 'sort_buffer_size';
+------------------+--------+
| Variable_name    | Value  |
+------------------+--------+
| sort_buffer_size | 262144 |
+------------------+--------+
1 row in set (0.00 sec)

filesort_priority_queue

The orderIndex table now holds 5000 rows and sort_buffer_size = 262144. Next, sort the whole table by column c. Because c has no index, this is bound to trigger a filesort:

mysql> select * from orderIndex order by c;
...
5000 rows in set (0.01 sec)

mysql> select * from information_schema.OPTIMIZER_TRACE\G;
{
...
  "join_execution": {
    "select#": 1,
    "steps": [
      {
        "sorting_table": "orderIndex",
        "filesort_information": [
          {
            "direction": "asc",
            "expression": "`orderIndex`.`c`"
          }
        ],
        "filesort_priority_queue_optimization": {
          "usable": false,
          "cause": "not applicable (no LIMIT)"
        },
        "filesort_execution": [
        ],
        "filesort_summary": {
          "memory_available": 262144,
          "key_size": 9,
          "row_size": 26,
          "max_rows_per_buffer": 7710,
          "num_rows_estimate": 11234,
          "num_rows_found": 5000,
          "num_initial_chunks_spilled_to_disk": 0,
          "peak_memory_used": 221184,
          "sort_algorithm": "std::stable_sort",
          "unpacked_addon_fields": "skip_heuristic",
          "sort_mode": "<fixed_sort_key, additional_fields>"
        }
      }
    ]
  }
...
}

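The optimizer trace shows row_size = 26 and num_initial_chunks_spilled_to_disk = 0: the table's data does not fill the sort buffer, so it is sorted directly in memory (sort_algorithm reports std::stable_sort). The filesort_priority_queue_optimization entry shows that the priority queue was not used, because the query has no LIMIT. Now run the query again with a LIMIT: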

mysql> select * from orderIndex order by c limit 5000;
... ...
5000 rows in set (0.00 sec)

mysql> select * from information_schema.OPTIMIZER_TRACE\G;
{
  ... ...
  "join_execution": {
    "select#": 1,
    "steps": [
      {
        "sorting_table": "orderIndex",
        "filesort_information": [
          {
            "direction": "asc",
            "expression": "`orderIndex`.`c`"
          }
        ],
        "filesort_priority_queue_optimization": {
          "limit": 5000,
          "chosen": true
        },
        "filesort_execution": [
        ],
        "filesort_summary": {
          "memory_available": 262144,
          "key_size": 9,
          "row_size": 26,
          "max_rows_per_buffer": 5001,
          "num_rows_estimate": 11234,
          "num_rows_found": 5000,
          "num_initial_chunks_spilled_to_disk": 0,
          "peak_memory_used": 170034,
          "sort_algorithm": "std::stable_sort",
          "unpacked_addon_fields": "using_priority_queue",
          "sort_mode": "<fixed_sort_key, additional_fields>"
        }
      }
    ]
  }
  ... ...
}

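This time filesort_priority_queue_optimization is chosen. In other words, the priority queue optimization applies only when the query has a LIMIT. The priority queue is implemented as a heap, and it works as follows:

Scan the table, inserting the select list columns from each selected row into the queue in sorted order. If the queue is full, bump out the last row in the sort order.
Return the first N rows from the queue. (If an offset was specified, skip the first offset rows and return the next N rows.)

The heap keeps the row that sorts last at its top so that it can be evicted cheaply: for an ascending sort that is the largest key (a max-heap), and for a descending sort the smallest key (a min-heap).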
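
The priority queue steps can be sketched in Python with a bounded heap. This is a simplified model for numeric sort keys, not MySQL's code. The function name and its parameters are assumptions made for this example.

```python
import heapq

def top_n(keys, limit, offset=0, descending=False):
    """Model of: ORDER BY key [DESC] LIMIT offset, limit with a bounded heap.

    Only offset + limit keys are ever held in memory. heapq is a min-heap,
    so each key is stored with a sign that puts the key sorting LAST in the
    requested order at the root, where it is cheap to evict.
    Works for numeric keys only, since it relies on negation.
    """
    capacity = offset + limit
    if capacity <= 0:
        return []
    heap = []
    for key in keys:
        entry = key if descending else -key
        if len(heap) < capacity:
            heapq.heappush(heap, entry)       # queue not full yet
        elif entry > heap[0]:
            heapq.heapreplace(heap, entry)    # bump out the row that sorts last
    ordered = sorted(heap, reverse=True)      # final order of the survivors
    result = [e if descending else -e for e in ordered]
    return result[offset:]                    # skip the first offset rows

print(top_n([5, 1, 4, 2, 3], limit=2))
print(top_n([5, 1, 4, 2, 3], limit=2, offset=1))
print(top_n([5, 1, 4, 2, 3], limit=2, descending=True))
```

The point of the design is that memory use depends on offset + limit and not on the number of rows scanned, which is why the optimization is only considered when a LIMIT is present.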

Optimization Using filesort

Now lower sort_buffer_size to simulate the case where the data to sort is larger than the sort buffer:

mysql> set sort_buffer_size=10240;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> show variables like 'sort_buffer_size';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| sort_buffer_size | 32768 |
+------------------+-------+
1 row in set (0.00 sec)

We asked for 10240, but that is below the minimum sort_buffer_size of 32768, so the value was set to the minimum instead.

mysql> select * from orderIndex order by c;
... ...
5000 rows in set (0.00 sec)

mysql> select * from information_schema.OPTIMIZER_TRACE\G;
{
...
  "join_execution": {
    "select#": 1,
    "steps": [
      {
        "sorting_table": "orderIndex",
        "filesort_information": [
          {
            "direction": "asc",
            "expression": "`orderIndex`.`c`"
          }
        ],
        "filesort_priority_queue_optimization": {
          "usable": false,
          "cause": "not applicable (no LIMIT)"
        },
        "filesort_execution": [
        ],
        "filesort_summary": {
          "memory_available": 32768,
          "key_size": 9,
          "row_size": 26,
          "max_rows_per_buffer": 1260,
          "num_rows_estimate": 11234,
          "num_rows_found": 5000,
          "num_initial_chunks_spilled_to_disk": 6,
          "peak_memory_used": 33256,
          "sort_algorithm": "std::stable_sort",
          "unpacked_addon_fields": "skip_heuristic",
          "sort_mode": "<fixed_sort_key, additional_fields>"
        }
      }
    ]
  }
...
}

Here we can clearly see num_initial_chunks_spilled_to_disk = 6 (the number of chunks that exist before merging starts), which shows that temporary files were used for an external sort. It works as follows:

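Read the rows that match the WHERE clause.
For each row, store in the sort buffer a tuple consisting of the sort key value and the columns referenced by the query.
When the sort buffer becomes full, sort the tuples by sort key value in memory and write them to a temporary file.
After merge-sorting the temporary file, retrieve the rows in sorted order, but read the columns required by the query directly from the sorted tuples rather than accessing the table a second time.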
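
The four steps above can be sketched as a small external merge sort in Python. This is a simplified model: in-memory lists stand in for the temporary files, the buffer limit is counted in rows rather than bytes, and the function and parameter names are invented for the example.

```python
import heapq

def filesort_with_addon_fields(rows, sort_col, predicate, buffer_rows):
    """Sort rows by sort_col, carrying the needed columns along with the key.

    rows: iterable of dicts. predicate: stand-in for the WHERE clause.
    buffer_rows: how many tuples fit in the sort buffer (must be >= 1).
    Returns (sorted rows, number of sorted runs produced).
    """
    runs, buffer = [], []
    for row in rows:
        if not predicate(row):                   # keep only rows matching WHERE
            continue
        buffer.append((row[sort_col], dict(row)))  # (sort key, query columns)
        if len(buffer) == buffer_rows:           # buffer full: sort and spill
            buffer.sort(key=lambda t: t[0])
            runs.append(buffer)
            buffer = []
    if buffer:                                   # flush the last partial buffer
        buffer.sort(key=lambda t: t[0])
        runs.append(buffer)
    merged = heapq.merge(*runs, key=lambda t: t[0])  # merge the sorted runs
    # Columns are read straight from the tuples; the table is not revisited.
    return [cols for _key, cols in merged], len(runs)

rows = [{"id": i, "c": c} for i, c in enumerate([5, 3, 8, 1, 9, 2, 7], start=1)]
result, run_count = filesort_with_addon_fields(rows, "c", lambda r: True, 3)
print([r["c"] for r in result], run_count)
```

When everything fits in one buffer the sketch produces a single run, which corresponds to the in-memory case where no temporary file is needed.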

Note that the flow above is the optimized algorithm, introduced after version 5.6. The original algorithm works as follows:

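Read all rows according to the key or by table scanning. Skip rows that do not match the WHERE clause.
For each row, store in the sort buffer a tuple consisting of a pair of values (the sort key value and the row ID).
If all pairs fit into the sort buffer, no temporary file is created. Otherwise, when the sort buffer becomes full, run a quicksort on it in memory and write it to a temporary file. Save a pointer to the sorted block.
Repeat the preceding steps until all rows have been read.
Do a multi-merge of up to MERGEBUFF (7) regions to one block in another temporary file. Repeat until all blocks from the first file are in the second file.
Repeat the following until there are fewer than MERGEBUFF2 (15) blocks left.
On the last multi-merge, only the row ID (the last part of the value pair) is written to a result file.
Read the rows in sorted order using the row IDs in the result file. To optimize this, read in a large block of row IDs, sort them, and use them to read the rows in sorted order into a row buffer. The row buffer size is the read_rnd_buffer_size system variable value. The code for this step is in the sql/records.cc source file.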
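
For contrast, here is a rough Python sketch of the row ID variant. It is deliberately simplified: it merges all runs in a single pass (no MERGEBUFF or MERGEBUFF2 staging), it does not batch the row lookups, and the table is modelled as a dict from row ID to row. All names are assumptions for illustration.

```python
import heapq

def filesort_with_row_ids(table, sort_col, predicate, buffer_rows):
    """Sort using only (sort key, row ID) pairs, then fetch rows by ID.

    table: dict mapping row_id -> row dict (stand-in for the base table).
    buffer_rows: how many pairs fit in the sort buffer (must be >= 1).
    """
    runs, buffer = [], []
    for row_id, row in table.items():
        if not predicate(row):                    # skip rows failing WHERE
            continue
        buffer.append((row[sort_col], row_id))    # only key and row ID are kept
        if len(buffer) == buffer_rows:
            runs.append(sorted(buffer))           # sort the full buffer, spill it
            buffer = []
    if buffer:
        runs.append(sorted(buffer))
    sorted_ids = [row_id for _key, row_id in heapq.merge(*runs)]
    # Second pass over the table: one lookup per row ID, in sorted order.
    return [table[row_id] for row_id in sorted_ids]

table = {i: {"id": i, "c": c} for i, c in enumerate([5, 3, 8, 1], start=1)}
result = filesort_with_row_ids(table, "c", lambda r: r["c"] > 1, 2)
print([r["id"] for r in result])
```

The tuples are smaller than in the addon-field variant, so more of them fit in the buffer, but every returned row costs a second read of the table.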

The max_length_for_sort_data system variable decides whether the original or the optimized algorithm is used. Let's try to simulate the original algorithm:

mysql> show variables like 'max_length_for_sort_data';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| max_length_for_sort_data | 4096  |
+--------------------------+-------+
1 row in set (0.01 sec)

mysql> set max_length_for_sort_data=16;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> show variables like 'max_length_for_sort_data';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| max_length_for_sort_data | 16    |
+--------------------------+-------+
1 row in set (0.00 sec)

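With max_length_for_sort_data set to 16, row_size = 26 > max_length_for_sort_data, which should trigger the original algorithm. Unfortunately this cannot be demonstrated, because the variable is deprecated as of MySQL 8.0.20 and no longer has any effect: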

mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.20    |
+-----------+
1 row in set (0.00 sec)

If we now query with LIMIT [M,] N and M + N is too large, the data can again exceed sort_buffer_size, and the sort has to be done outside memory:

mysql> select * from orderIndex order by c limit 2000, 20;

mysql> select * from information_schema.OPTIMIZER_TRACE\G;
{
... ...
    "join_execution": {
    "select#": 1,
    "steps": [
      {
        "sorting_table": "orderIndex",
        "filesort_information": [
          {
            "direction": "asc",
            "expression": "`orderIndex`.`c`"
          }
        ],
        "filesort_priority_queue_optimization": {
          "limit": 2020
        },
        "filesort_execution": [
        ],
        "filesort_summary": {
          "memory_available": 32768,
          "key_size": 9,
          "row_size": 26,
          "max_rows_per_buffer": 1260,
          "num_rows_estimate": 11234,
          "num_rows_found": 5000,
          "num_initial_chunks_spilled_to_disk": 6,
          "peak_memory_used": 33256,
          "sort_algorithm": "std::stable_sort",
          "unpacked_addon_fields": "skip_heuristic",
          "sort_mode": "<fixed_sort_key, additional_fields>"
        }
      }
    ]
    }
... ...
}

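The flow is roughly as follows (unverified):

Scan the whole table, repeating these steps: (1) select rows until the sort buffer is filled; (2) write the first N rows in the buffer (M + N rows if M was specified) to a merge file.
Sort the merge file and return the first N rows. (If M was specified, skip the first M rows and return the next N rows.)

References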
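
As a sanity check on the two steps above, here is a small Python sketch showing that keeping only the first M + N rows of each sorted buffer is enough to produce the right answer. It models the described flow only, with in-memory lists in place of the merge file, and the function name and parameters are hypothetical.

```python
import heapq
from itertools import islice

def order_by_limit_with_spill(keys, buffer_rows, limit, offset=0):
    """Model of: ORDER BY key LIMIT offset, limit when rows exceed the buffer.

    keys: list of sort keys. buffer_rows: rows per sort buffer (>= 1).
    """
    keep = offset + limit
    runs = []
    for start in range(0, len(keys), buffer_rows):
        chunk = sorted(keys[start:start + buffer_rows])  # (1) fill and sort buffer
        runs.append(chunk[:keep])                        # (2) keep first M + N rows
    merged = heapq.merge(*runs)                          # sort the merge file
    return list(islice(merged, offset, offset + limit))  # skip M, return N

keys = [9, 4, 7, 1, 8, 2, 6, 3, 5]
print(order_by_limit_with_spill(keys, buffer_rows=4, limit=2, offset=1))
```

Truncating each run is safe because any row beyond position M + N in its own sorted buffer already has at least M + N rows ahead of it, so it can never appear in the final result.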

http://mysql.babo.ist/#/en/order-by-optimization.html
