[Summary Series] Internet Server-Side Technology Stack: High Performance Through Database Indexes

阿新 • Published: 2020-11-08

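Introduction

Building the right indexes is one of the most effective ways to improve database query performance. This article summarizes the key concepts and practical techniques of database indexing.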

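Basics

In InnoDB, table data is stored in primary key order. If a table defines no primary key, InnoDB uses the first unique index whose columns are all NOT NULL (in index definition order) as the clustering key; if there is none, it generates a hidden row ID. In MySQL, indexes are implemented by the storage engine. The two main index types are ordered indexes and hash indexes: an ordered index is backed by a B+Tree, and a hash index by a hash table.

An index trades space for time: it reduces the amount of data scanned, avoids sorting, and turns random I/O into sequential I/O. The cost is more storage and more expensive inserts and updates. An ordered index supports: full-value match, leftmost-prefix match, column-prefix match, range match, exact match on the leading columns plus a range match on the next column, index-only queries, and sorting by index scan. A hash index supports only full-value match.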

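Ordered indexes

InnoDB's ordered index builds a B+ tree over the primary key values. Internal nodes store only primary key values, while leaf nodes store the full rows of the table. Keeping row data out of the internal nodes lets each node hold as many keys as possible, which keeps the tree shallow. A B+ tree is a balanced, ordered search tree; its height is usually between 2 and 4, which keeps the number of disk reads per lookup small. When an insert lands in a node that is already full, the node splits in two. Understanding B+ trees is essential to understanding ordered indexes.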
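To see why the height stays so small, the following is a rough back-of-the-envelope sketch in Python. The 16 KB page size is InnoDB's default; the average row size (100 bytes) and the size of one internal-node entry (14 bytes) are assumptions chosen only for illustration, and page headers and fill factor are ignored.

```python
import math

def btree_height(row_count, rows_per_leaf, keys_per_internal):
    """Number of levels (leaf level included) needed to hold row_count rows."""
    nodes = math.ceil(row_count / rows_per_leaf)  # number of leaf pages
    height = 1
    while nodes > 1:
        # Each internal page can point to keys_per_internal pages below it.
        nodes = math.ceil(nodes / keys_per_internal)
        height += 1
    return height

PAGE_SIZE = 16 * 1024      # InnoDB default page size
ASSUMED_ROW_BYTES = 100    # assumption: average row size
ASSUMED_ENTRY_BYTES = 14   # assumption: key plus child pointer

rows_per_leaf = PAGE_SIZE // ASSUMED_ROW_BYTES        # 163
keys_per_internal = PAGE_SIZE // ASSUMED_ENTRY_BYTES  # 1170

for n in (1_000, 8_000_000, 1_000_000_000):
    print(n, btree_height(n, rows_per_leaf, keys_per_internal))
```

Under these assumptions a thousand rows need 2 levels, 8 million rows need 3, and a billion rows need 4, which is in line with the 2-4 range quoted above.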

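Ordered indexes are either clustered or non-clustered.

Clustered index: the leaf nodes hold the data rows themselves. The indexed column values sit in the internal nodes and the row data in the leaf nodes. A clustered index can greatly improve the performance of I/O-intensive workloads. A table can have only one clustered index, usually on the primary key. The best insert pattern for a clustered index is in primary key order. Random-order inserts make maintaining the clustered index expensive: more lookups, frequent page splits, large amounts of data moved, and fragmentation.

Non-clustered index: internal nodes store values of the indexed columns, and leaf nodes store, alongside the indexed values, the primary key of the corresponding row. A lookup through a non-clustered index therefore takes two index searches: first find the primary key in the leaf node, then use it to find the row in the clustered index. Because it does not store the full row, a non-clustered index takes less space than the clustered index.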
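The two-step lookup can be modeled with a toy sketch in Python. A dict stands in for the clustered index and a sorted list of (value, primary key) pairs for the non-clustered index; the data and the function name find_by_s_id are made up for illustration and say nothing about InnoDB's actual page layout.

```python
import bisect

# Clustered index: primary key -> full row (a dict stands in for the B+ tree).
clustered = {
    1: {"id": 1, "s_id": "STU_2", "t_id": "TCH1"},
    2: {"id": 2, "s_id": "STU_1", "t_id": "TCH2"},
    3: {"id": 3, "s_id": "STU_2", "t_id": "TCH3"},
}

# Non-clustered index on s_id: sorted (s_id, primary key) pairs.
secondary = sorted((row["s_id"], pk) for pk, row in clustered.items())

def find_by_s_id(s_id):
    # First search: find the matching entries in the non-clustered index.
    start = bisect.bisect_left(secondary, (s_id,))
    primary_keys = []
    for key, pk in secondary[start:]:
        if key != s_id:
            break
        primary_keys.append(pk)
    # Second search: fetch each row from the clustered index by primary key.
    return [clustered[pk] for pk in primary_keys]

print([row["id"] for row in find_by_s_id("STU_2")])
```

The second step is the part a covering index avoids: when the query only needs values that are already in the non-clustered index, the clustered index is never consulted.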

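Hash indexes

A hash index is built on hashing and is very fast, but it only supports equality matches on the entire indexed key; it supports neither range queries nor sorting. The hash function can be crc32, or a portion of an md5 digest. A good hash index avoids heavy collisions without taking too much space; its selectivity depends on how often the hashed values of the column collide. The Memory engine supports hash indexes as well as B+Tree indexes. For long strings such as URLs, you can add a hashed column and index it; the query must then carry both the hash condition and the original column condition: where url = xxx and hashed_url = yyy .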
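The reason both conditions are needed can be shown with a small Python model. It uses zlib.crc32 as the hash, and the class name UrlTable and its methods are hypothetical; a dict keyed by the hash stands in for the index on hashed_url.

```python
import zlib
from collections import defaultdict

def url_hash(url):
    # CRC32 of the UTF-8 bytes, as an unsigned 32-bit integer.
    return zlib.crc32(url.encode("utf-8"))

class UrlTable:
    """Toy table with a hashed_url index (hypothetical, for illustration)."""

    def __init__(self):
        self._by_hash = defaultdict(list)  # hashed_url -> [(url, payload)]

    def insert(self, url, payload):
        self._by_hash[url_hash(url)].append((url, payload))

    def find(self, url):
        # Equivalent of: where hashed_url = crc32(url) and url = url.
        # The hash narrows the search; the url comparison removes collisions.
        candidates = self._by_hash.get(url_hash(url), [])
        return [payload for stored, payload in candidates if stored == url]

table = UrlTable()
table.insert("http://example.com/a", "page A")
table.insert("http://example.com/b", "page B")
print(table.find("http://example.com/a"))
```

Dropping the url comparison would return every row whose hash happens to collide, which is why the original column condition stays in the query.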

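For index values that are accessed very frequently, InnoDB builds an in-memory hash index on top of the B+ tree. This is called the adaptive hash index.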

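Development notes

Columns suited to indexing

Prefer high selectivity. The higher the ratio of distinct values to total values in a column, the higher its selectivity and the better it suits indexing. Column selectivity: count(distinct(col)) / count(col) . A unique index has selectivity 1. In the output of show index from tablename , Cardinality is an estimate of the number of distinct values in the index, which helps judge whether the index is worthwhile: if Cardinality is close to the table's row count, selectivity is high.

Note that for a single-column index this value is the Cardinality of that column, while for a composite index the value shown on each row is the Cardinality of the column prefix up to and including that column. In the output below, sid_index has 41659 and tid_index has 101, so sid_index is more selective than tid_index; the t_id row of stc_id_index shows 3443139, which is the Cardinality of the (s_id, t_id) prefix and is far higher than that of the single-column sid_index.

How do you find highly selective columns?

Qualitative analysis: columns whose values tend to be unique are highly selective, while columns whose values come from a small finite set are not. For example, an ID column is usually highly selective, whereas an age column is not.
Quantitative analysis: compute count(distinct(col)) / count(col) ; the closer the value is to 1, the higher the selectivity. Measurement is typically used to confirm or reject the qualitative guess.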

mysql> show index from student_courses;
+-----------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table           | Non_unique | Key_name     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| student_courses |          0 | PRIMARY      |            1 | id          | A         |     7764823 |     NULL | NULL   |      | BTREE      |         |               |
| student_courses |          1 | stc_id_index |            1 | s_id        | A         |       40417 |     NULL | NULL   |      | BTREE      |         |               |
| student_courses |          1 | stc_id_index |            2 | t_id        | A         |     3443139 |     NULL | NULL   |      | BTREE      |         |               |
| student_courses |          1 | stc_id_index |            3 | c_id        | A         |     7764823 |     NULL | NULL   |      | BTREE      |         |               |
| student_courses |          1 | sid_index    |            1 | s_id        | A         |       41659 |     NULL | NULL   |      | BTREE      |         |               |
| student_courses |          1 | tid_index    |            1 | t_id        | A         |         101 |     NULL | NULL   |      | BTREE      |         |               |
+-----------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
6 rows in set (0.00 sec)
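The count(distinct(col)) / count(col) measurement is easy to reproduce outside the database. The Python sketch below applies it to two small made-up columns; the sample values are assumptions for illustration only.

```python
def selectivity(values):
    """Distinct values divided by total values; 1.0 means every value is unique."""
    values = list(values)
    if not values:
        return 0.0
    return len(set(values)) / len(values)

# Made-up sample columns: an ID-like column and an age-like column.
ids = [101, 102, 103, 104, 105, 106, 107, 108]
ages = [18, 19, 18, 20, 19, 18, 20, 19]

print(selectivity(ids))   # every value distinct
print(selectivity(ages))  # only three distinct values among eight
```

The ID-like column scores 1.0 and the age-like column 0.375, matching the qualitative guess that IDs are highly selective and ages are not.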

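Building indexes

Start by listing every query you expect to run and the columns they filter on. Put highly selective columns leftmost and columns used in range conditions as far right as possible. Add columns to the composite index one by one from the left so that it covers as many of the queries as possible; several composite indexes may be needed. Finally, consider the columns in the SELECT list and the ORDER BY clause: where possible, let a covering index supply the column data and an index scan supply the sort order.

Composite indexes

A composite index is also a B+ tree whose key is a tuple. It works like a multi-level search, sharply narrowing the rows to scan and match at each level. Lookups follow the leftmost-prefix rule, so choosing the best column order matters. Note that creating a single-column index on each searched column is not the same as a composite index: each index can only be searched on its own column, and the results are then combined by index merge.

Composite index matching follows the leftmost-prefix rule. Matching stops as follows: reorder the search conditions by the composite index's column order; matching continues through equality conditions (including IN) and stops at the first range condition (such as >, <, BETWEEN, or LIKE). An index cannot be used when, in the where clause, the indexed column appears inside an expression or is passed to a function.

In practice, you may need composite indexes over the same columns in different orders to serve different queries.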

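Prefix indexes

A prefix index indexes long strings by their first N characters. BLOB and TEXT columns, and VARCHAR columns too long for the index key limit, can only be indexed with a prefix length. Choose a length whose selectivity is close to that of the full column without taking too much space. A prefix index cannot be used for GROUP BY or ORDER BY, nor as a covering index. If the suffix or some other part of the string is more selective, preprocess the value (for example, store it reversed) so that the selective part becomes a prefix; the idea is the same.

Steps for finding the best prefix length:

STEP1 - Find the TOP N most frequent values of the column, for example with select count(*) as c, col from tbl group by col order by c desc limit N ;

STEP2 - Starting from a reasonable length (say 3), compute the same TOP N counts on the prefix, for example on left(col, 3), and keep increasing the length until most of the prefix counts are close to the full-column counts from STEP1.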
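As a complement to the TOP N comparison, the search for a prefix length can also be driven by the selectivity criterion directly. The Python sketch below does that, which is a simpler variant and not the exact TOP N procedure; the 0.95 threshold and the sample URLs are assumptions.

```python
def prefix_selectivity(values, length):
    """Selectivity of the column if only the first `length` characters are indexed."""
    prefixes = [value[:length] for value in values]
    return len(set(prefixes)) / len(prefixes)

def shortest_good_prefix(values, ratio=0.95, start=3):
    """Shortest prefix length whose selectivity reaches `ratio` of the full column's.

    `values` must be a non-empty list of strings; `ratio` is an assumed threshold.
    """
    full = len(set(values)) / len(values)
    max_len = max(len(value) for value in values)
    for length in range(start, max_len + 1):
        if prefix_selectivity(values, length) >= ratio * full:
            return length
    return max_len

urls = [
    "http://a.com/1",
    "http://a.com/2",
    "http://b.com/1",
    "http://b.com/2",
]
print(shortest_good_prefix(urls))
```

For these URLs the shared "http://" prefix makes short lengths useless, and the function returns the full length of 14; this is the situation where reversing or hashing the value is worth considering.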

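Covering indexes

A covering index contains every column the query needs, which avoids a large number of disk reads and greatly improves performance. If a column appears frequently in the select list, consider adding it to a composite index so that queries can be served by the covering index. The deferred join technique also relies on covering indexes.

Sorting by index scan

An index can be used to sort results only when its column order matches the ORDER BY clause exactly and all columns are sorted in the same direction. One exception: leading columns may be pinned to constants. For example, with index (date, fans_id), the query where date = 'xxx' order by fans_id desc can still sort by index scan.

Index hints

FORCE INDEX(a) forces a SQL statement to use the given index.

MRR

Multi-Range Read, an optimization for range queries. MRR collects the secondary index entries it finds into a buffer, sorts them by primary key (turning random I/O into sequential I/O and reducing page replacement), and then reads the actual rows in primary key order. It applies to range, ref, and eq_ref access.

MRR is enabled by default and is controlled by flags in optimizer_switch . Setting mrr=on enables the optimization; with mrr_cost_based=on the optimizer still decides by cost whether to use it.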
SET @@optimizer_switch='mrr=on,mrr_cost_based=on';
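The effect of sorting by primary key before reading rows can be illustrated with a small Python simulation. It assumes, as a deliberate simplification, that rows are packed 100 per page in primary key order and that only the most recently read page stays cached; the primary key values are made up.

```python
def page_reads(primary_keys, rows_per_page):
    """Count page fetches when rows are read in the given order.

    Simplification: only the most recently read page is considered cached.
    """
    reads = 0
    last_page = None
    for pk in primary_keys:
        page = pk // rows_per_page
        if page != last_page:
            reads += 1
            last_page = page
    return reads

# Primary keys in the order the secondary index returns them (made-up values).
pks_in_index_order = [905, 12, 907, 15, 903, 11]

without_mrr = page_reads(pks_in_index_order, rows_per_page=100)
with_mrr = page_reads(sorted(pks_in_index_order), rows_per_page=100)
print(without_mrr, with_mrr)
```

In this toy case the unsorted order jumps between two pages on every row (6 fetches), while the sorted order reads each page once (2 fetches).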

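The "system account" problem

This occurs when one value of an indexed column appears far more often than any other, as with a system account that owns a large share of the rows. Avoid queries that filter on such a value.

Index experiments

Preparation

Preparing the table

Suppose there is a student course-selection table, defined as follows: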

## executed using root account
## mysql -uroot -p < /path/to/project.sql

## IF EXISTS avoids an error on the first run (requires MySQL 5.7 or later)
DROP USER IF EXISTS 'test'@'localhost';
drop database if exists test;

CREATE USER 'test'@'localhost' IDENTIFIED BY 'test';
create database test;
## the user already exists, so no IDENTIFIED BY clause is needed here
grant all privileges on test.* to 'test'@'localhost';

use test;
drop table if exists student_courses;
create table student_courses (
    id int(10) UNSIGNED not null primary key AUTO_INCREMENT comment 'AUTO_INCREMENT ID',
    s_id varchar(64) not null comment 'student ID',
    t_id varchar(64) not null comment 'teacher ID',
    room varchar(64) not null comment 'room name',
    c_id varchar(32) not null comment 'course ID',
    c_time int(10) not null comment 'course time',
    extra varchar(256) default '' comment 'extra info',
    gmt_create datetime DEFAULT CURRENT_TIMESTAMP,
    gmt_modified datetime DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

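Preparing the data

A Groovy script generates 8 million course-selection rows. Batched inserts load much faster than single-row inserts: with single-row inserts, each refresh of the row count showed a few thousand new rows, whereas with batched inserts each refresh showed about 200,000.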

package cc.lovesq.study.data

/**
 * Writes batched INSERT statements for student_courses to ./sql/stu_courses.sql.
 */
class StudentCoursesDataGenerator {

    private static final String STUDENT_PREFIX = "STU"
    private static final String TEACHER_PREFIX = "TCH"
    private static final String ROOM_PREFIX = "ROOM"
    private static final String COURSE_PREFIX = "CRE"

    private static final int TOTAL_ROWS = 8000000
    // Three months expressed in seconds
    private static final int THREE_MONTHS_SECONDS = 3 * 60 * 60 * 24 * 30

    // Fixed seed so that repeated runs generate the same IDs
    static Random random = new Random(47)

    static void main(args) {

        def file = new File("./sql/stu_courses.sql")
        file.parentFile.mkdirs()   // make sure the ./sql directory exists
        def batchSize = 50         // rows per INSERT statement

        file.withWriter { writer ->
            for (int i = 0; i < TOTAL_ROWS / batchSize; i++) {
                def rows = []
                for (int j = 0; j < batchSize; j++) {
                    def sId = STUDENT_PREFIX + "_" + random.nextInt(40000)
                    def tId = TEACHER_PREFIX + random.nextInt(100)
                    def room = ROOM_PREFIX + random.nextInt(50)
                    def cId = COURSE_PREFIX + random.nextInt(60)
                    // Whole epoch seconds within the last three months. The original
                    // subtracted seconds from milliseconds and emitted a double.
                    def cTime = System.currentTimeMillis().intdiv(1000) - random.nextInt(THREE_MONTHS_SECONDS)
                    rows << "('$sId', '$tId', '$room', '$cId', $cTime)"
                }
                writer.write("insert into student_courses(s_id, t_id, room, c_id, c_time) values " + rows.join(",") + ";\n")
            }
        }
    }
}

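A sample of the generated data is shown below. It comes from the original run of the script, which wrote c_time as a floating-point literal such as 1.604717694E9 :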

insert into student_courses(s_id, t_id, room, c_id, c_time) values ('STU_29258', 'TCH55', 'ROOM43', 'CRE41', 1.604717694E9),('STU_429', 'TCH68', 'ROOM0', 'CRE42', 1.604714673E9),('STU_38288', 'TCH28', 'ROOM1', 'CRE49', 1.604719218E9),('STU_7278', 'TCH98', 'ROOM11', 'CRE20', 1.604712414E9),('STU_8916', 'TCH40', 'ROOM11', 'CRE42', 1.604715357E9),('STU_17383', 'TCH6', 'ROOM25', 'CRE10', 1.604718551E9),('STU_27674', 'TCH4', 'ROOM0', 'CRE6', 1.604714485E9),('STU_30896', 'TCH33', 'ROOM34', 'CRE4', 1.604716917E9),('STU_28303', 'TCH41', 'ROOM38', 'CRE52', 1.604716827E9),('STU_8689', 'TCH85', 'ROOM42', 'CRE46', 1.604713881E9),('STU_2447', 'TCH68', 'ROOM4', 'CRE35', 1.604713422E9),('STU_10354', 'TCH16', 'ROOM22', 'CRE36', 1.604713187E9),('STU_29257', 'TCH34', 'ROOM2', 'CRE17', 1.604717763E9),('STU_17242', 'TCH80', 'ROOM48', 'CRE1', 1.60471313E9),('STU_17052', 'TCH65', 'ROOM4', 'CRE9', 1.604711894E9),('STU_12209', 'TCH58', 'ROOM8', 'CRE43', 1.604712827E9),('STU_1246', 'TCH94', 'ROOM20', 'CRE4', 1.604715802E9),('STU_33533', 'TCH61', 'ROOM8', 'CRE8', 1.604718404E9),('STU_14367', 'TCH79', 'ROOM5', 'CRE42', 1.604714165E9),('STU_28037', 'TCH99', 'ROOM21', 'CRE13', 1.604718321E9),('STU_31909', 'TCH28', 'ROOM3', 'CRE36', 1.604718883E9),('STU_16994', 'TCH1', 'ROOM19', 'CRE3', 1.604719329E9),('STU_25382', 'TCH34', 'ROOM12', 'CRE26', 1.604714293E9),('STU_21718', 'TCH55', 'ROOM15', 'CRE40', 1.604715585E9),('STU_36228', 'TCH17', 'ROOM1', 'CRE17', 1.604716797E9),('STU_24146', 'TCH62', 'ROOM2', 'CRE12', 1.604714202E9),('STU_36499', 'TCH11', 'ROOM42', 'CRE14', 1.604718307E9),('STU_30843', 'TCH16', 'ROOM35', 'CRE6', 1.604717656E9),('STU_32930', 'TCH15', 'ROOM23', 'CRE33', 1.604718313E9),('STU_12921', 'TCH3', 'ROOM13', 'CRE35', 1.604711955E9),('STU_16669', 'TCH83', 'ROOM20', 'CRE58', 1.604717105E9),('STU_10225', 'TCH1', 'ROOM26', 'CRE5', 1.60471344E9),('STU_9399', 'TCH98', 'ROOM31', 'CRE45', 1.604714572E9),('STU_17332', 'TCH25', 'ROOM10', 'CRE31', 1.604713764E9),('STU_38771', 'TCH10', 
'ROOM10', 'CRE11', 1.604716834E9),('STU_9529', 'TCH16', 'ROOM30', 'CRE10', 1.604718969E9),('STU_32513', 'TCH36', 'ROOM40', 'CRE44', 1.604714399E9),('STU_38907', 'TCH34', 'ROOM31', 'CRE33', 1.604716016E9),('STU_31551', 'TCH13', 'ROOM35', 'CRE28', 1.604716906E9),('STU_39883', 'TCH39', 'ROOM46', 'CRE23', 1.604719006E9),('STU_34965', 'TCH47', 'ROOM45', 'CRE10', 1.604713917E9),('STU_12265', 'TCH85', 'ROOM46', 'CRE11', 1.604714663E9),('STU_9348', 'TCH22', 'ROOM4', 'CRE14', 1.604712076E9),('STU_38391', 'TCH35', 'ROOM29', 'CRE37', 1.60471538E9),('STU_25424', 'TCH78', 'ROOM23', 'CRE3', 1.604717869E9),('STU_39334', 'TCH25', 'ROOM14', 'CRE48', 1.604717478E9),('STU_26085', 'TCH17', 'ROOM16', 'CRE23', 1.604718913E9),('STU_35483', 'TCH16', 'ROOM6', 'CRE5', 1.604712875E9),('STU_28009', 'TCH77', 'ROOM47', 'CRE39', 1.604716687E9),('STU_15094', 'TCH71', 'ROOM23', 'CRE18', 1.604712238E9);

You can check the size of the table data:

mysql> select CONCAT(ROUND(SUM(DATA_LENGTH) / (1024 * 1024 * 1024),3),' GB') as TABLE_SIZE from information_schema.TABLES where information_schema.TABLES.TABLE_NAME='student_courses'\G

*************************** 1. row ***************************

TABLE_SIZE: 0.538 GB

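Running the experiments

Adding an index to a bare table

With no index at all on the bare table, a search by s_id takes 2.94s; after adding the sid_index index, the same search takes no more than 0.01s.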

select * from student_courses where s_id = 'STU_17242';
194 rows in set (2.94 sec)

ALTER TABLE `student_courses` ADD INDEX sid_index ( `s_id` );

select * from student_courses where s_id = 'STU_17242';
194 rows in set (0.01 sec)

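The columns of the explain output mean the following:

select_type: the query type; SIMPLE means a simple SELECT query;
type: the access type for the table. const matches at most one row, usually a lookup by primary key; ref matches a small number of rows through a non-unique index (or a prefix of a unique one); range is a range scan, used for <>, >, <, <=, >=, IN, BETWEEN, LIKE ; index scans the whole index tree, which for a large index is nearly as expensive as a full table scan; ALL is a full table scan.
possible_keys and key: the indexes that could be used and the one actually used.
ref: which columns or constants are compared against the index named in key.
rows: the estimated number of rows to examine.
filtered: the estimated percentage of examined rows that remain after the table conditions are applied.
Extra: additional information. Using where means the where clause is applied to filter rows; Using index means the needed columns can be read from the index tree alone; Using filesort means an extra sort pass is required; Using MRR means the MRR optimization is used for the range read.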

mysql> explain select * from student_courses where id = 5;
+----+-------------+-----------------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table           | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+-----------------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | student_courses | NULL       | const | PRIMARY       | PRIMARY | 4       | const |    1 |   100.00 | NULL  |
+----+-------------+-----------------+------------+-------+---------------+---------+---------+-------+------+----------+-------+

mysql> explain select * from student_courses where s_id = 'STU_17242';
+----+-------------+-----------------+------------+------+---------------+-----------+---------+-------+------+----------+-------+
| id | select_type | table           | partitions | type | possible_keys | key       | key_len | ref   | rows | filtered | Extra |
+----+-------------+-----------------+------------+------+---------------+-----------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | student_courses | NULL       | ref  | sid_index     | sid_index | 194     | const |  194 |   100.00 | NULL  |
+----+-------------+-----------------+------------+------+---------------+-----------+---------+-------+------+----------+-------+

mysql> explain select count(id) from student_courses;
+----+-------------+-----------------+------------+-------+---------------+----------+---------+------+---------+----------+-------------+
| id | select_type | table           | partitions | type  | possible_keys | key      | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+-----------------+------------+-------+---------------+----------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | student_courses | NULL       | index | NULL          | tc_index | 292     | NULL | 7785655 |   100.00 | Using index |
+----+-------------+-----------------+------------+-------+---------------+----------+---------+------+---------+----------+-------------+

Applying a function to the indexed column prevents the index from being used:

select * from student_courses where REPLACE(s_id,"STU_","") = '17242';

mysql> explain select * from student_courses where REPLACE(s_id,"STU_","") = '17242';
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table           | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | student_courses | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 7764823 |   100.00 | Using where |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+

Consider the following query. It uses the sid_index index, and the where clause then filters the 194 matching rows down to the final 3. Since sid_index already eliminates the vast majority of rows, adding an index on t_id looks unnecessary.

select * from student_courses where t_id = 'TCH86' and s_id = 'STU_17242';

mysql> explain select * from student_courses where t_id = 'TCH86' and s_id = 'STU_17242';
+----+-------------+-----------------+------------+------+---------------+-----------+---------+-------+------+----------+-------------+
| id | select_type | table           | partitions | type | possible_keys | key       | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-----------------+------------+------+---------------+-----------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | student_courses | NULL       | ref  | sid_index     | sid_index | 194     | const |  194 |    10.00 | Using where |
+----+-------------+-----------------+------------+------+---------------+-----------+---------+-------+------+----------+-------------+

Now drop sid_index and add tid_index to see what happens. Because t_id has low selectivity, tid_index still leaves 80,000+ rows after filtering, and both queries take about 0.4s.

ALTER TABLE student_courses drop index sid_index;

ALTER TABLE student_courses add index tid_index(t_id);

select * from student_courses where t_id = 'TCH86';
80195 rows in set (0.45 sec)

select * from student_courses where t_id = 'TCH86' and s_id = 'STU_17242';
3 rows in set (0.40 sec)

mysql> explain select * from student_courses where t_id = 'TCH86' and s_id = 'STU_17242';
+----+-------------+-----------------+------------+------+---------------+-----------+---------+-------+--------+----------+-------------+
| id | select_type | table           | partitions | type | possible_keys | key       | key_len | ref   | rows   | filtered | Extra       |
+----+-------------+-----------------+------------+------+---------------+-----------+---------+-------+--------+----------+-------------+
|  1 | SIMPLE      | student_courses | NULL       | ref  | tid_index     | tid_index | 194     | const | 151664 |    10.00 | Using where |
+----+-------------+-----------------+------------+------+---------------+-----------+---------+-------+--------+----------+-------------+

mysql> select count(distinct s_id) / count(*) as s_id_selectivity, count(distinct t_id) / count(*) as t_id_selectivity  from student_courses;
+------------------+------------------+
| s_id_selectivity | t_id_selectivity |
+------------------+------------------+
|           0.0050 |           0.0000 |
+------------------+------------------+
1 row in set (10.11 sec)

This shows that the more selective the indexed column, the greater the performance gain.
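The measured row counts follow directly from how the data was generated. As a rough check in Python, assuming values are uniformly distributed (which is what the generator's random.nextInt(40000) and random.nextInt(100) produce), the expected number of rows per value is simply the total divided by the number of distinct values.

```python
def expected_rows_per_value(total_rows, distinct_values):
    """Average number of rows sharing one value, assuming a uniform distribution."""
    return total_rows / distinct_values

TOTAL_ROWS = 8_000_000    # rows written by the generator
DISTINCT_S_ID = 40_000    # random.nextInt(40000)
DISTINCT_T_ID = 100       # random.nextInt(100)

rows_per_s_id = expected_rows_per_value(TOTAL_ROWS, DISTINCT_S_ID)
rows_per_t_id = expected_rows_per_value(TOTAL_ROWS, DISTINCT_T_ID)
print(rows_per_s_id, rows_per_t_id)

# Selectivity is the inverse view of the same numbers.
print(round(DISTINCT_S_ID / TOTAL_ROWS, 4), round(DISTINCT_T_ID / TOTAL_ROWS, 4))
```

The estimates of 200 rows per s_id and 80,000 rows per t_id sit close to the observed 194 and 80195, and the rounded selectivities match the 0.0050 and 0.0000 in the query result.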

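Composite index

Consider the following query. Using tid_index alone it still takes about 0.4s. With a composite index on (t_id, c_id) it takes 0.03s. The composite key matches both columns in one index lookup, so far fewer rows have to be fetched from the table than when filtering on t_id alone. The cost is more index storage.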

select * from student_courses where t_id = 'TCH86' and c_id = 'CRE33';
1423 rows in set (0.41 sec)

ALTER TABLE student_courses add index tid_cid_index(t_id, c_id);

select * from student_courses where t_id = 'TCH86' and c_id = 'CRE33';
1423 rows in set (0.03 sec)

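Building on the first scenario, a common choice is to put several business IDs into one composite index (s_id, t_id, c_id), so that queries on (s_id), (s_id, t_id), and (s_id, t_id, c_id) can all use it. Because s_id is highly selective, a single-column index on s_id may be enough (saving index space). A separate composite index on (t_id, c_id) is still needed, because (s_id, t_id, c_id) cannot serve queries that filter only on t_id and c_id: by the leftmost-prefix rule, s_id must be present.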

ALTER TABLE student_courses add index stc_id_index(s_id, t_id, c_id);
-- or: ALTER TABLE student_courses add index sid_index(s_id);

ALTER TABLE student_courses add index tid_cid_index(t_id, c_id);

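Composite indexes are the key to speeding up multi-condition queries, and the leftmost-prefix rule is one of the most important rules for using them. Reorder the query conditions by the index definition order (a,b,c,d,e) and compare them one by one:

If every condition is an equality, the order in which the conditions are written does not matter; just reorder them by the index definition. For example, a=1 and b=2 is equivalent to b=2 and a=1. The order may differ, but the columns must be present: b=2 and c=3 cannot use the composite index (a,b,c,d,e) at all, because a is missing.
If the conditions skip a column of the index, matching stops there. For example, b=2 and a=1 and d=4 can only use (a,b), because c is missing.
If a column has a range condition, matching stops at that column and later conditions cannot narrow the index lookup. For example, b=2 and a=1 and d=4 and c > 2 can only use (a,b,c), because c is a range condition.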
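The three rules can be condensed into a few lines of Python. This is a simplified model of the leftmost-prefix rule as described here, not of the optimizer itself; the function name usable_prefix is hypothetical.

```python
def usable_prefix(index_columns, equality_columns, range_columns=()):
    """Index columns usable for the lookup under the leftmost-prefix rule."""
    used = []
    for column in index_columns:
        if column in equality_columns:
            used.append(column)   # equality: keep matching further columns
        elif column in range_columns:
            used.append(column)   # range: usable, but matching stops here
            break
        else:
            break                 # column missing from the conditions: stop
    return used

index = ["a", "b", "c", "d", "e"]
print(usable_prefix(index, {"b", "a"}))              # b=2 and a=1
print(usable_prefix(index, {"b", "c"}))              # b=2 and c=3
print(usable_prefix(index, {"b", "a", "d"}))         # b=2 and a=1 and d=4
print(usable_prefix(index, {"b", "a", "d"}, {"c"}))  # plus a range condition on c
```

The four calls return (a,b), nothing, (a,b), and (a,b,c) respectively, matching the examples in the rules above.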

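In EXPLAIN output, the key_len, ref and filtered columns show which index columns were applied. To check whether a particular column's condition uses the index, you can also remove that condition and compare ref and rows between the two EXPLAIN results. When only part of the index is used for the lookup and the remaining conditions on indexed columns are evaluated inside the index, Extra shows Using index condition . Suppose only the (s_id, t_id, c_id) composite index exists; show index from student_courses; lists the indexes on the table.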

mysql> show index from student_courses;
+-----------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table           | Non_unique | Key_name     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| student_courses |          0 | PRIMARY      |            1 | id          | A         |     7764823 |     NULL | NULL   |      | BTREE      |         |               |
| student_courses |          1 | stc_id_index |            1 | s_id        | A         |       40417 |     NULL | NULL   |      | BTREE      |         |               |
| student_courses |          1 | stc_id_index |            2 | t_id        | A         |     3443139 |     NULL | NULL   |      | BTREE      |         |               |
| student_courses |          1 | stc_id_index |            3 | c_id        | A         |     7764823 |     NULL | NULL   |      | BTREE      |         |               |
+-----------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

The statements below show how each query uses the composite index:

// Full table scan: the composite index cannot be used
mysql> explain select * from student_courses where c_id = 'CRE3' and t_id = 'TCH21'; 
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table           | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | student_courses | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 7764823 |     1.00 | Using where |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+

// Uses (s_id, t_id, c_id): all conditions are equalities and all columns are present, so their order in the query does not matter
mysql> explain select * from student_courses where s_id = 'STU_18528' and c_id = 'CRE3' and t_id = 'TCH21';
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------+
| id | select_type | table           | partitions | type | possible_keys | key          | key_len | ref               | rows | filtered | Extra |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------+
|  1 | SIMPLE      | student_courses | NULL       | ref  | stc_id_index  | stc_id_index | 486     | const,const,const |    1 |   100.00 | NULL  |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------+


// Uses (s_id, t_id), hence ref = const,const
mysql> explain select * from student_courses where s_id = 'STU_18528' and t_id = 'TCH21';
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------+------+----------+-------+
| id | select_type | table           | partitions | type | possible_keys | key          | key_len | ref         | rows | filtered | Extra |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------+------+----------+-------+
|  1 | SIMPLE      | student_courses | NULL       | ref  | stc_id_index  | stc_id_index | 388     | const,const |    2 |   100.00 | NULL  |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------+------+----------+-------+



// Uses only (s_id), because t_id does not appear
mysql> explain select * from student_courses where s_id = 'STU_18528' and c_id = 'CRE3';
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------+------+----------+-----------------------+
| id | select_type | table           | partitions | type | possible_keys | key          | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | student_courses | NULL       | ref  | stc_id_index  | stc_id_index | 194     | const |  195 |    10.00 | Using index condition |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------+------+----------+-----------------------+



// The first query uses (s_id, t_id, c_id), presumably because in ('TCH21') with a single value is optimized into an equality condition; the second uses (s_id, t_id).
mysql> explain select * from student_courses where s_id = 'STU_18528' and c_id = 'CRE3' and t_id in ( 'TCH21');
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------+
| id | select_type | table           | partitions | type | possible_keys | key          | key_len | ref               | rows | filtered | Extra |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------+
|  1 | SIMPLE      | student_courses | NULL       | ref  | stc_id_index  | stc_id_index | 486     | const,const,const |    1 |   100.00 | NULL  |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select * from student_courses where s_id = 'STU_18528' and c_id = 'CRE3' and t_id > 'TCH21';
+----+-------------+-----------------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------+
| id | select_type | table           | partitions | type  | possible_keys | key          | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-----------------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | student_courses | NULL       | range | stc_id_index  | stc_id_index | 388     | NULL |  171 |    10.00 | Using index condition |
+----+-------------+-----------------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------+
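The key_len values in the EXPLAIN outputs above (194, 388, 486) are a reliable way to tell how many columns of the composite index were used, and they can be reproduced by hand. The Python sketch below assumes the table definition used in this experiment: utf8 columns (up to 3 bytes per character), VARCHAR with a 2-byte length prefix, and NOT NULL columns; the function name is hypothetical.

```python
def varchar_key_len(max_chars, bytes_per_char=3, nullable=False):
    """Bytes EXPLAIN reports in key_len for one VARCHAR index column.

    Assumes utf8 (3 bytes per character), plus 2 bytes for the length
    prefix and 1 extra byte if the column is nullable.
    """
    return max_chars * bytes_per_char + 2 + (1 if nullable else 0)

s_id_len = varchar_key_len(64)  # s_id varchar(64) not null
t_id_len = varchar_key_len(64)  # t_id varchar(64) not null
c_id_len = varchar_key_len(32)  # c_id varchar(32) not null

print(s_id_len)                        # only s_id used
print(s_id_len + t_id_len)             # (s_id, t_id) used
print(s_id_len + t_id_len + c_id_len)  # (s_id, t_id, c_id) used
```

These sums give 194, 388 and 486, so a key_len of 388 means the lookup stopped after t_id even though the index has three columns.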

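Covering index

An index is covering when every column in the select list appears in the composite index. Of the two statements below, the second is served by a covering index, shown by Extra = Using index : the column data is read directly from the index without going back to the table rows.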

mysql> explain select * from student_courses where s_id = 'STU_18528' and c_id = 'CRE3' and t_id= 'TCH21';
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------+
| id | select_type | table           | partitions | type | possible_keys | key          | key_len | ref               | rows | filtered | Extra |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------+
|  1 | SIMPLE      | student_courses | NULL       | ref  | stc_id_index  | stc_id_index | 486     | const,const,const |    1 |   100.00 | NULL  |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------+

mysql> explain select s_id, t_id from student_courses where s_id = 'STU_18528' and c_id = 'CRE3' and t_id= 'TCH21';
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------------+
| id | select_type | table           | partitions | type | possible_keys | key          | key_len | ref               | rows | filtered | Extra       |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------------+
|  1 | SIMPLE      | student_courses | NULL       | ref  | stc_id_index  | stc_id_index | 486     | const,const,const |    1 |   100.00 | Using index |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------------+
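Whether a query can be served by a covering index comes down to a set comparison. The Python sketch below models it for this table; it also counts the primary key as available, since InnoDB secondary indexes store the primary key value in their leaf entries. The function name is hypothetical.

```python
def is_covering(referenced_columns, index_columns, primary_key_columns=("id",)):
    """True if every column the query references can be read from the index.

    InnoDB secondary indexes also carry the primary key columns.
    """
    available = set(index_columns) | set(primary_key_columns)
    return set(referenced_columns) <= available

stc_id_index = ["s_id", "t_id", "c_id"]
all_columns = ["id", "s_id", "t_id", "room", "c_id", "c_time",
               "extra", "gmt_create", "gmt_modified"]

print(is_covering(["s_id", "t_id", "c_id"], stc_id_index))  # select s_id, t_id ... where s_id, c_id, t_id
print(is_covering(all_columns, stc_id_index))                # select *
```

The first call returns True and the second False, which corresponds to Using index appearing only for the query that selects s_id and t_id.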

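Sorting by index scan

An index can also supply sort order, which reduces random I/O and speeds up sorting. Index sorting applies in the following three cases:

The index column order matches the ORDER BY clause exactly, and all columns are sorted in the same direction;
When joining several tables, every column referenced by ORDER BY belongs to the first table;
The leading columns are fixed by equality conditions, and the ORDER BY columns follow the remaining index columns in order.

When index sorting is used for a query with no WHERE condition, type = index ; whenever index sorting cannot be used, Extra shows Using filesort .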
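The single-table rules can be written down as a small Python predicate. It models only the rules as stated here (same direction for all columns, ORDER BY columns continuing the index after any equality-fixed leading columns) and ignores joins; the function name is hypothetical.

```python
def can_sort_by_index(index_columns, order_by, equality_columns=()):
    """order_by is a list of (column, direction) pairs, direction 'asc' or 'desc'."""
    if not order_by:
        return True
    # All ORDER BY columns must share one direction.
    if len({direction.lower() for _, direction in order_by}) > 1:
        return False
    # Skip the leading index columns that are fixed by equality conditions.
    index_columns = list(index_columns)
    position = 0
    while position < len(index_columns) and index_columns[position] in equality_columns:
        position += 1
    # The ORDER BY columns must continue the index from that position.
    order_columns = [column for column, _ in order_by]
    return index_columns[position:position + len(order_columns)] == order_columns

index = ["s_id", "t_id", "c_id"]
print(can_sort_by_index(index, [("s_id", "desc"), ("t_id", "desc"), ("c_id", "desc")]))
print(can_sort_by_index(index, [("s_id", "asc"), ("t_id", "desc")]))
print(can_sort_by_index(index, [("t_id", "asc")], equality_columns={"s_id"}))
print(can_sort_by_index(index, [("c_id", "desc")], equality_columns={"s_id"}))
```

The calls return True, False, True and False: the last one fails because ordering by c_id after fixing only s_id skips t_id.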

// Index sort applied: all ORDER BY columns follow the index column order and share one direction
mysql> explain select * from student_courses order by s_id desc, t_id desc, c_id desc limit 10;
+----+-------------+-----------------+------------+-------+---------------+--------------+---------+------+------+----------+-------+
| id | select_type | table           | partitions | type  | possible_keys | key          | key_len | ref  | rows | filtered | Extra |
+----+-------------+-----------------+------------+-------+---------------+--------------+---------+------+------+----------+-------+
|  1 | SIMPLE      | student_courses | NULL       | index | NULL          | stc_id_index | 486     | NULL |   10 |   100.00 | NULL  |
+----+-------------+-----------------+------------+-------+---------------+--------------+---------+------+------+----------+-------+


// Index sort not applied: the ORDER BY columns follow the index column order, but the directions differ
mysql> explain select * from student_courses order by s_id asc, t_id desc limit 10;
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+----------------+
| id | select_type | table           | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra          |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+----------------+
|  1 | SIMPLE      | student_courses | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 7764823 |   100.00 | Using filesort |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+----------------+



// Index sort not applied: the ORDER BY column order (t_id, s_id) differs from the index column order (s_id, t_id, c_id)
mysql> explain select * from student_courses order by t_id desc, s_id desc limit 10;
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+----------------+
| id | select_type | table           | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra          |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+----------------+
|  1 | SIMPLE      | student_courses | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 7764823 |   100.00 | Using filesort |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+----------------+



// Index sort applied: s_id is fixed by equality, so s_id plus the ORDER BY column t_id follows the index definition order
mysql> explain select s_id, t_id from student_courses where s_id = 'STU_18528' order by t_id;
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------+------+----------+--------------------------+
| id | select_type | table           | partitions | type | possible_keys | key          | key_len | ref   | rows | filtered | Extra                    |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------+------+----------+--------------------------+
|  1 | SIMPLE      | student_courses | NULL       | ref  | stc_id_index  | stc_id_index | 194     | const |  195 |   100.00 | Using where; Using index |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------+------+----------+--------------------------+



// Index sort applied: s_id and t_id are fixed by equality, so together with the ORDER BY column c_id they follow the index definition order
mysql> explain select s_id, t_id from student_courses where s_id = 'STU_18528' and t_id = 'TCH21' order by c_id desc;
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------+------+----------+--------------------------+
| id | select_type | table           | partitions | type | possible_keys | key          | key_len | ref         | rows | filtered | Extra                    |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------+------+----------+--------------------------+
|  1 | SIMPLE      | student_courses | NULL       | ref  | stc_id_index  | stc_id_index | 388     | const,const |    2 |   100.00 | Using where; Using index |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------+------+----------+--------------------------+



// Index sort does not apply: the query combines s_id with c_id and skips t_id, so it no longer matches the column order of the index (note the Using filesort in Extra)
mysql> explain select s_id, t_id from student_courses where s_id = 'STU_18528' order by c_id desc;
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------+------+----------+------------------------------------------+
| id | select_type | table           | partitions | type | possible_keys | key          | key_len | ref   | rows | filtered | Extra                                    |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------+------+----------+------------------------------------------+
|  1 | SIMPLE      | student_courses | NULL       | ref  | stc_id_index  | stc_id_index | 194     | const |  195 |   100.00 | Using where; Using index; Using filesort |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------+------+----------+------------------------------------------+
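The difference between the two queries above can be reproduced with a toy model. The sketch below assumes stc_id_index is defined on (s_id, t_id, c_id) and represents it as a sorted list of tuples. The rows and the helper names (c_ids_in_index_order, is_sorted) are made up for illustration and are not part of MySQL. It only shows the ordering argument: once every column in front of c_id is fixed by equality, the matching entries are already in c_id order (and can be read backwards for desc); when t_id is left free, they are not.

```python
# Toy model of a composite index on (s_id, t_id, c_id): a list of tuples kept
# in sorted order, the way a B+Tree keeps its leaf entries.
# All rows below are invented for illustration.
index_entries = sorted([
    ("STU_1", "TCH21", "C03"),
    ("STU_1", "TCH21", "C01"),
    ("STU_1", "TCH22", "C02"),
    ("STU_1", "TCH23", "C01"),
    ("STU_2", "TCH21", "C05"),
])


def c_ids_in_index_order(s_id, t_id=None):
    """Return c_id values in the order an index scan would encounter them."""
    return [
        c for (s, t, c) in index_entries
        if s == s_id and (t_id is None or t == t_id)
    ]


def is_sorted(values):
    """True if values are in non-decreasing order."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Equality on s_id AND t_id: matching entries are contiguous and ordered by c_id.
both_fixed = c_ids_in_index_order("STU_1", "TCH21")
# Equality on s_id only: entries are ordered by (t_id, c_id), not by c_id alone.
only_s_id = c_ids_in_index_order("STU_1")

print(both_fixed, is_sorted(both_fixed))
print(only_s_id, is_sorted(only_s_id))
```

In the second case the c_id values come out of the index unordered, which is why an extra sort step (Using filesort) is needed.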

MRR

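MySQL does not use MRR (Multi-Range Read) when the optimizer estimates that it would cost too much. In that case you can either force the index or tell MySQL to use MRR regardless of cost, by setting optimizer_switch to 'mrr=on,mrr_cost_based=off'. In the examples below, the range ending at t_id < 'TCH24' uses MRR, but the wider range ending at t_id < 'TCH32' does not: the optimizer skips tc_index altogether and falls back to a full table scan (type ALL). Forcing the index with FORCE INDEX(tc_index) makes the query use tc_index again, and with it MRR.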

mysql> explain select * from student_courses where t_id >= 'TCH21' and t_id < 'TCH24';
+----+-------------+-----------------+------------+-------+---------------+----------+---------+------+--------+----------+----------------------------------+
| id | select_type | table           | partitions | type  | possible_keys | key      | key_len | ref  | rows   | filtered | Extra                            |
+----+-------------+-----------------+------------+-------+---------------+----------+---------+------+--------+----------+----------------------------------+
|  1 | SIMPLE      | student_courses | NULL       | range | tc_index      | tc_index | 194     | NULL | 508500 |   100.00 | Using index condition; Using MRR |
+----+-------------+-----------------+------------+-------+---------------+----------+---------+------+--------+----------+----------------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select * from student_courses where t_id >= 'TCH21' and t_id < 'TCH32';
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table           | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | student_courses | NULL       | ALL  | tc_index      | NULL | NULL    | NULL | 7785655 |    27.09 | Using where |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select * from student_courses FORCE INDEX(tc_index) where t_id >= 'TCH21' and t_id < 'TCH32';
+----+-------------+-----------------+------------+-------+---------------+----------+---------+------+---------+----------+----------------------------------+
| id | select_type | table           | partitions | type  | possible_keys | key      | key_len | ref  | rows    | filtered | Extra                            |
+----+-------------+-----------------+------------+-------+---------------+----------+---------+------+---------+----------+----------------------------------+
|  1 | SIMPLE      | student_courses | NULL       | range | tc_index      | tc_index | 194     | NULL | 2109100 |   100.00 | Using index condition; Using MRR |
+----+-------------+-----------------+------------+-------+---------------+----------+---------+------+---------+----------+----------------------------------+
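As a rough illustration of what MRR gains when it is used, the sketch below models a secondary index on t_id whose entries point at primary keys, and compares fetching rows in index order with fetching them after sorting the primary keys first, which is the core idea of MRR. Everything in it is an assumption made for the example: ROWS_PER_PAGE, the table contents, and the "page switch" count used as a stand-in for random I/O. It is not MySQL's actual cost model.

```python
# Toy model of MRR (Multi-Range Read). ROWS_PER_PAGE, the index contents and
# the page-switch metric are illustrative assumptions, not MySQL internals.
ROWS_PER_PAGE = 4


def page_of(primary_key):
    """Clustered-index page holding a row, assuming rows are packed by primary key."""
    return primary_key // ROWS_PER_PAGE


def count_page_switches(primary_keys):
    """Count how often consecutive row lookups land on a different page."""
    switches = 0
    previous_page = None
    for pk in primary_keys:
        page = page_of(pk)
        if page != previous_page:
            switches += 1
            previous_page = page
    return switches


# Secondary index on t_id: entries are (t_id, primary_key), sorted by t_id.
# The primary keys come out in an order unrelated to their physical position.
secondary_index = sorted([
    ("TCH21", 13), ("TCH21", 2), ("TCH22", 9), ("TCH22", 0),
    ("TCH23", 14), ("TCH23", 5), ("TCH23", 1), ("TCH24", 7),
])


def range_scan(low, high):
    """Primary keys of entries with low <= t_id < high, in index order."""
    return [pk for (t_id, pk) in secondary_index if low <= t_id < high]


matched = range_scan("TCH21", "TCH24")
without_mrr = count_page_switches(matched)        # look up rows in index order
with_mrr = count_page_switches(sorted(matched))   # sort primary keys first

print(without_mrr, with_mrr)
```

Sorting has its own cost and needs buffer space, which is one reason the decision to use MRR is cost-based by default.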

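Summary

Databases are the software developers deal with most often, and indexes are central to accessing them efficiently. It is worth the effort to understand how indexes work and to design indexes that fit the queries you actually run.

Index fundamentals:

Build efficient indexes based on the query conditions;
Understand the leftmost-prefix matching rule and define the best composite indexes accordingly;
Use covering indexes and index scans wherever possible.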

