MySQL experiment: optimizing ORDER BY + LIMIT with an inner join, then improving it again by adding an index

阿新 • Published: 2020-07-06

# MySQL experiment: optimizing ORDER BY + LIMIT with an inner join, then improving it again by adding an index

While working on [optimizing two-argument LIMIT with a subquery](https://www.cnblogs.com/G-Aurora/p/13254473.html), I got the idea of testing `ORDER BY + LIMIT`, which is closer to what real production queries look like. Perhaps `ORDER BY + LIMIT` can be optimized in a similar way.

## Preparation

The experiment uses MySQL's official large sample database, employees ([see here for how to import it](https://www.cnblogs.com/G-Aurora/p/13171234.html)).

I will use its employees table. First, a look at the table structure and the number of rows:

```
mysql> desc employees;
+------------+---------------+------+-----+---------+-------+
| Field      | Type          | Null | Key | Default | Extra |
+------------+---------------+------+-----+---------+-------+
| emp_no     | int(11)       | NO   | PRI | NULL    |       |
| birth_date | date          | NO   |     | NULL    |       |
| first_name | varchar(14)   | NO   |     | NULL    |       |
| last_name  | varchar(16)   | NO   |     | NULL    |       |
| gender     | enum('M','F') | NO   |     | NULL    |       |
| hire_date  | date          | NO   |     | NULL    |       |
+------------+---------------+------+-----+---------+-------+
6 rows in set (0.00 sec)
```

```
mysql> select count(*) from employeed;
ERROR 1146 (42S02): Table 'employees.employeed' doesn't exist
mysql> select count(*) from employees;
+----------+
| count(*) |
+----------+
|   300024 |
+----------+
1 row in set (0.05 sec)
```

As we can see, only the primary key emp_no is indexed.

## The experiment

Reference material:

- [MySQL 5.7 reference manual: explanation of the EXPLAIN output fields](https://dev.mysql.com/doc/refman/5.7/en/explain-output.html)
- [MySQL reference manual: ORDER BY optimization in detail](https://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html)
- [Recommended blog post on the EXPLAIN fields in 5.7](https://blog.csdn.net/yhl_jxy/article/details/88570154)
- [Recommended blog post on EXPLAIN in older versions](https://www.cnblogs.com/butterfly100/archive/2018/01/15/8287569.html) (newer versions behave like `explain extended` by default)
- [Further reading on the EXPLAIN fields](https://blog.csdn.net/lzrit/article/details/81585941)
- [Explanation of the `key` column in MySQL EXPLAIN](https://www.cnblogs.com/yy20141204bb/p/8421338.html)

### Unoptimized order by + limit

```
mysql> select * from employees order by birth_date limit 200000,10;
+--------+------------+------------+------------+--------+------------+
| emp_no | birth_date | first_name | last_name  | gender | hire_date  |
+--------+------------+------------+------------+--------+------------+
| 498507 | 1960-09-24 | Perla      | Delgrange  | M      | 1989-12-08 |
| 494212 | 1960-09-25 | Susuma     | Baranowski | M      | 1989-05-15 |
| 496888 | 1960-09-25 | Rosalyn    | Rebaine    | M      | 1985-11-27 |
| 497766 | 1960-09-25 | Matt       | Atrawala   | F      | 1987-02-11 |
| 481404 | 1960-09-25 | Sanjeeva   | Eterovic   | F      | 1986-06-05 |
| 483269 | 1960-09-25 | Mitchel    | Pramanik   | F      | 1997-07-23 |
| 483270 | 1960-09-25 | Geoff      | Gulik      | F      | 1993-11-25 |
|  59683 | 1960-09-25 | Supot      | Millington | F      | 1991-06-03 |
| 101264 | 1960-09-25 | Mansur     | Atchley    | F      | 1990-05-22 |
|  92453 | 1960-09-25 | Khalid     | Trystram   | M      | 1993-11-10 |
+--------+------------+------------+------------+--------+------------+
10 rows in set (0.20 sec)
```

```
mysql> explain select * from employees order by birth_date limit 200000,10;
+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+----------------+
| id | select_type | table     | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra          |
+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+----------------+
|  1 | SIMPLE      | employees | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 299468 |   100.00 | Using filesort |
+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+----------------+
1 row in set, 1 warning (0.00 sec)
```

Without optimization the query does a full table scan (`type` = `ALL`) plus a filesort and takes 0.20 s.

### Inner join optimization

**The idea**: **use a subquery that selects only the primary key emp_no to find the keys of the rows that satisfy `order by birth_date limit 200000,10`, then use an inner join to fetch the full rows for just those 10 emp_no values.**

An inner join returns only the rows that match the join condition. Since `select emp_no from employees order by birth_date limit 200000,10` returns just 10 emp_no values, the joined result contains the full rows for those 10 keys and nothing else. My original reasoning was that the subquery would benefit from the primary key's index tree as a "covering index" and so save disk I/O. The EXPLAIN output below shows, however, that at this stage the subquery still scans the whole table and uses filesort, so the gain most likely comes from sorting and carrying only narrow (birth_date, emp_no) values instead of full rows. A real covering index only comes into play once birth_date is indexed, as in the next section.

```
mysql> select * from employees inner join (select emp_no from employees order by birth_date limit 200000,10) as temp_table using (emp_no);
+--------+------------+------------+-----------+--------+------------+
| emp_no | birth_date | first_name | last_name | gender | hire_date  |
+--------+------------+------------+-----------+--------+------------+
| 427365 | 1960-09-24 | Yuping     | Sethi     | M      | 1990-06-21 |
| 424219 | 1960-09-25 | Woody      | Bernini   | M      | 1989-03-10 |
| 469218 | 1960-09-25 | George     | Plotkin   | M      | 1992-02-19 |
| 404121 | 1960-09-25 | Domenico   | Birnbaum  | M      | 1993-08-01 |
| 404266 | 1960-09-25 | Quingbo    | Jervis    | F      | 1985-03-15 |
| 409133 | 1960-09-25 | Nitsan     | Kleiser   | F      | 1985-05-18 |
| 409558 | 1960-09-25 | Shunichi   | Hofting   | F      | 1992-07-06 |
| 412045 | 1960-09-25 | Kristin    | Bolotov   | F      | 1985-06-28 |
| 481404 | 1960-09-25 | Sanjeeva   | Eterovic  | F      | 1986-06-05 |
| 483269 | 1960-09-25 | Mitchel    | Pramanik  | F      | 1997-07-23 |
+--------+------------+------------+-----------+--------+------------+
10 rows in set (0.10 sec)
```

```
mysql> explain select * from employees inner join (select emp_no from employees order by birth_date limit 100000,10) as table_temp using (emp_no);
+----+-------------+------------+------------+--------+---------------+---------+---------+-------------------+--------+----------+----------------+
| id | select_type | table      | partitions | type   | possible_keys | key     | key_len | ref               | rows   | filtered | Extra          |
+----+-------------+------------+------------+--------+---------------+---------+---------+-------------------+--------+----------+----------------+
|  1 | PRIMARY     | <derived2> | NULL       | ALL    | NULL          | NULL    | NULL    | NULL              | 100010 |   100.00 | NULL           |
|  1 | PRIMARY     | employees  | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | table_temp.emp_no |      1 |   100.00 | NULL           |
|  2 | DERIVED     | employees  | NULL       | ALL    | NULL          | NULL    | NULL    | NULL              | 299468 |   100.00 | Using filesort |
+----+-------------+------------+------------+--------+---------------+---------+---------+-------------------+--------+----------+----------------+
3 rows in set, 1 warning (0.00 sec)
```

The query time is halved, from 0.20 s to 0.10 s. Two caveats: this EXPLAIN was run with an offset of 100000 rather than 200000, which is why the derived table shows 100010 rows; and the rows returned differ from those of the plain query, because birth_date is not unique and MySQL does not define the order among rows with the same sort value. In the EXPLAIN output:

- The third row has select_type `DERIVED`, meaning it is a query contained in the FROM clause. We can see that this subquery does not use an index either.
- `<derived2>` in the first row means that this row reads from the derived table produced by the query with id = N, here N = 2. In the second row, type `eq_ref` means that a primary key or unique key index is used by the join: for each key value coming from the other table, exactly one matching row is read. In other words, once the subquery has found the emp_no values, the temporary table it produced is joined to the employees table through the primary key.
- (This explanation only covers the meaning of the EXPLAIN fields. There does not seem to be a way to verify the optimization idea directly from it, so pointers from more experienced readers are welcome.)

### Adding an index on the sort column

Since the subquery in the inner join sorts by `birth_date` and then reads `emp_no`, adding an index on the sort column may improve performance further.

```
mysql> alter table employees add index birthdate_index (birth_date);
Query OK, 0 rows affected (0.75 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> desc employees;
+------------+---------------+------+-----+---------+-------+
| Field      | Type          | Null | Key | Default | Extra |
+------------+---------------+------+-----+---------+-------+
| emp_no     | int(11)       | NO   | PRI | NULL    |       |
| birth_date | date          | NO   | MUL | NULL    |       |
| first_name | varchar(14)   | NO   |     | NULL    |       |
| last_name  | varchar(16)   | NO   |     | NULL    |       |
| gender     | enum('M','F') | NO   |     | NULL    |       |
| hire_date  | date          | NO   |     | NULL    |       |
+------------+---------------+------+-----+---------+-------+
6 rows in set (0.00 sec)
```

Now we run both the unoptimized query and the inner join query again.

```
mysql> select * from employees order by birth_date limit 200000,10;
+--------+------------+------------+------------+--------+------------+
| emp_no | birth_date | first_name | last_name  | gender | hire_date  |
+--------+------------+------------+------------+--------+------------+
| 498507 | 1960-09-24 | Perla      | Delgrange  | M      | 1989-12-08 |
| 494212 | 1960-09-25 | Susuma     | Baranowski | M      | 1989-05-15 |
| 496888 | 1960-09-25 | Rosalyn    | Rebaine    | M      | 1985-11-27 |
| 497766 | 1960-09-25 | Matt       | Atrawala   | F      | 1987-02-11 |
| 481404 | 1960-09-25 | Sanjeeva   | Eterovic   | F      | 1986-06-05 |
| 483269 | 1960-09-25 | Mitchel    | Pramanik   | F      | 1997-07-23 |
| 483270 | 1960-09-25 | Geoff      | Gulik      | F      | 1993-11-25 |
|  59683 | 1960-09-25 | Supot      | Millington | F      | 1991-06-03 |
| 101264 | 1960-09-25 | Mansur     | Atchley    | F      | 1990-05-22 |
|  92453 | 1960-09-25 | Khalid     | Trystram   | M      | 1993-11-10 |
+--------+------------+------------+------------+--------+------------+
10 rows in set (0.20 sec)
```

```
mysql> select * from employees inner join (select emp_no from employees order by birth_date limit 200000,10) as temp_table using (emp_no);
+--------+------------+------------+------------+--------+------------+
| emp_no | birth_date | first_name | last_name  | gender | hire_date  |
+--------+------------+------------+------------+--------+------------+
| 498507 | 1960-09-24 | Perla      | Delgrange  | M      | 1989-12-08 |
|  23102 | 1960-09-25 | Hsiangchu  | Harbusch   | M      | 1986-03-14 |
|  29961 | 1960-09-25 | Susumu     | Munoz      | F      | 1989-12-31 |
|  32061 | 1960-09-25 | Dipankar   | Buescher   | M      | 1992-10-24 |
|  36216 | 1960-09-25 | Xianlong   | Rassart    | F      | 1987-09-05 |
|  37058 | 1960-09-25 | Khue       | Osgood     | M      | 1991-11-04 |
|  38365 | 1960-09-25 | Sariel     | Ramsak     | M      | 1993-02-26 |
|  39901 | 1960-09-25 | Jianhui    | Ushiama    | M      | 1985-12-03 |
|  59683 | 1960-09-25 | Supot      | Millington | F      | 1991-06-03 |
|  63784 | 1960-09-25 | Rosita     | Zyda       | M      | 1988-08-12 |
+--------+------------+------------+------------+--------+------------+
10 rows in set (0.03 sec)
```

The plain query did not get any faster, but the inner join query improved a lot: its time dropped from 0.10 s to 0.03 s. In my estimation, the re-optimized inner join should be able to cope with data on the order of a million rows (perhaps even tens of millions, since production hardware will surely be far better than my entry-level ECS instance).

Why the plain query did not benefit from `birthdate_index`: it is a `select *`, so even with indexes on emp_no and birth_date, every row still has to be looked up in the table to fetch the other columns. With an offset of 200000 that means roughly 200010 lookups, so the plain query still ends up doing a full table scan, which is consistent with the unchanged 0.20 s. The subquery in the join, by contrast, needs only birth_date and emp_no, and an InnoDB secondary index stores the primary key alongside the indexed column, so it can be answered from `birthdate_index` alone.

## Summary

The experiment shows that an inner join helps with order by + two-argument limit, and that adding a suitable index for the join's subquery improves performance further. From 0.2 s to 0.1 s and then to 0.03 s: a reduction of close to an order of magnitude is a big step. (That's a wrap~)

## Final notes

- EXPLAIN tells you nothing about triggers or stored procedures, or about how user-defined functions affect the query
- EXPLAIN does not take the various caches into account
- EXPLAIN does not show all of the optimization work MySQL does while executing the query
- Some of the statistics are estimates, not exact values
- In older versions EXPLAIN could only explain SELECT statements, and other statements had to be rewritten as SELECT to see their plan; since MySQL 5.6 (including the 5.7 used here) it also accepts DELETE, INSERT, REPLACE, and UPDATE
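
One follow-up on the experiment itself: the four result sets above do not contain the same rows. `birth_date` is not unique, and MySQL leaves the order among rows with equal sort values undefined, so each plan may cut a different "page" at offset 200000. A minimal sketch of a deterministic variant, assuming the same employees database and the `birthdate_index` created above (the alias `page` is only illustrative, and this variant has not been timed here):

```sql
-- Assumption: same employees sample database, with birthdate_index (birth_date) in place.
-- Adding emp_no as a tie-breaker gives rows with the same birth_date a fixed order,
-- so the same offset always selects the same 10 rows.
SELECT e.*
FROM employees AS e
INNER JOIN (
    SELECT emp_no
    FROM employees
    ORDER BY birth_date, emp_no   -- the primary key makes the sort order total
    LIMIT 200000, 10
) AS page USING (emp_no)
ORDER BY e.birth_date, e.emp_no;  -- a join alone does not guarantee output order

-- Check whether the subquery is still served from the index
-- (look for "Using index" in the Extra column).
EXPLAIN
SELECT emp_no
FROM employees
ORDER BY birth_date, emp_no
LIMIT 200000, 10;
```

Because an InnoDB secondary index stores the primary key after the indexed column, sorting by `birth_date, emp_no` should still be answerable from `birthdate_index` without an extra sort, but that depends on the optimizer, so it is worth confirming with the EXPLAIN statement rather than taking it for granted.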
