MySQL Index Invalidation Caused by Implicit Conversion

阿新 • Published: 2022-01-08

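Common causes of index invalidation:

1. The indexed column in the condition is not "clean": a function or an arithmetic operation is applied to it

2. Implicit type conversion: string to number, and other type conversions

3. Implicit character set conversion: the conversion goes toward the character set with the larger encoded length, so that no data is truncated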

I. Common Index Invalidation Scenarios

root@test 10:50 > show create table t_num\G
*************************** 1. row ***************************
       Table: t_num
Create Table: CREATE TABLE `t_num` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `c1` int(11) NOT NULL,
  `c2` varchar(10) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `ix_c1` (`c1`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8mb4

root@test 10:51 > select * from t_num;
+----+----+----+
| id | c1 | c2 |
+----+----+----+
|  1 | -2 | -2 |
|  2 | -1 | -1 |
|  3 |  0 |  0 |
|  4 |  1 |  1 |
|  5 |  2 |  2 |
+----+----+----+

# Add an index on column c1 (the SHOW CREATE TABLE output above already reflects it)
root@test 10:52 > alter table t_num add index ix_c1(c1);

# With a plain equality predicate, the index is used
root@test 10:55 > explain select * from t_num where c1 = -1;
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key   | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | t_num | NULL       | ref  | ix_c1         | ix_c1 | 4       | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+

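1. Function Applied to the Condition Column

# Wrapping c1 in abs() in the WHERE clause gives type=ALL: a full table scan, with abs() evaluated and compared in the Server layer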
root@test 10:58 > explain select * from t_num where abs(c1) = 1;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | t_num | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    5 |   100.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+

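As shown above, applying a function to an indexed column, that is, leaving the column in the WHERE condition "not clean", may break the ordering of the index values (the index tree is organized in order of the c1 values), so the optimizer gives up searching the index tree.

However, a function on the condition column does not always mean a full table scan; the optimizer does not abandon the index on that column entirely.

# Selecting only id and c1 gives type=index: the ix_c1 index is used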
root@test 10:59 > explain select id,c1 from t_num where abs(c1) = 1;
+----+-------------+-------+------------+-------+---------------+-------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type  | possible_keys | key   | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-------+------------+-------+---------------+-------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | t_num | NULL       | index | NULL          | ix_c1 | 4       | NULL |    5 |   100.00 | Using where; Using index |
+----+-------------+-------+------------+-------+---------------+-------+---------+------+------+----------+--------------------------+

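As shown above, the ix_c1 index tree stores the c1 key values together with the primary key id in its leaf entries. The function on c1 rules out an index lookup, but the optimizer can still choose to traverse the whole index tree as a covering index (Using index): no lookup back to the clustered index is needed, and the id and c1 values are returned to the Server layer, which then evaluates abs() and applies the WHERE filter.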
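
When the function has a simple inverse, the predicate can often be rewritten so that c1 stays bare on the left-hand side. The following is a minimal sketch against the t_num table above, not something from the original test session. The index name ix_abs_c1 is hypothetical, and the functional index variant assumes MySQL 8.0.13 or later.

```sql
-- abs(c1) = 1 is equivalent to c1 being -1 or 1, so the column can stay bare.
-- The optimizer can then consider ix_c1 for a lookup; verify with EXPLAIN.
explain select * from t_num where c1 in (-1, 1);

-- Alternative on MySQL 8.0.13+: index the expression itself.
-- (ix_abs_c1 is a hypothetical name used only for illustration.)
alter table t_num add index ix_abs_c1 ((abs(c1)));
explain select * from t_num where abs(c1) = 1;
```

The rewrite needs no schema change, so it is usually the first thing to try; the functional index is an option when the expression cannot be inverted by hand.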

2. Arithmetic on the Condition Column

# Arithmetic on c1 in the WHERE condition
root@test 11:03 > explain select * from t_num where c1 + 1 = 2;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | t_num | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    5 |   100.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+

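As shown above, although "+1" does not break the ordering of the c1 index, the optimizer still does not use the index for a fast lookup. So keep the indexed column on the left of the equals sign free of arithmetic, and move the computation to the constant side.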
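
A minimal sketch of that rewrite against the same t_num table (not part of the original session): the arithmetic is moved to the constant side, where MySQL can evaluate it once before choosing an access path.

```sql
-- c1 + 1 = 2 rewritten so that c1 stands alone; 2 - 1 is a constant expression.
explain select * from t_num where c1 = 2 - 1;
```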

3. Implicit Type Conversion

# Add an index on column c2
root@test 12:30 > alter table t_num add index ix_c2(c2);

# With a string literal (note: c2 is a varchar column), the index is used
root@test 12:30 > explain select * from t_num where c2 = "2";
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key   | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | t_num | NULL       | ref  | ix_c2         | ix_c2 | 42      | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+

# Dropping the quotes around the right-hand value compares a string with a number, and the index is no longer used
root@test 12:30 > explain select * from t_num where c2 = 2;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | t_num | NULL       | ALL  | ix_c2         | NULL | NULL    | NULL |    5 |    20.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+

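As shown above, c2 is a varchar column, so this is a comparison between a string and a number. In this case MySQL converts the string to a number; the predicate behaves roughly like CAST(c2 AS SIGNED) = 2 (strictly speaking, MySQL compares a string and a number as double-precision floating-point values). That amounts to applying a function to the condition column, so the optimizer gives up the index tree lookup.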
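
A hedged sketch of the usual fix, assuming the t_num table above: give the constant the column's type, either by quoting it or by casting the constant rather than the column. These statements were not part of the original session.

```sql
-- Cast the constant, not the column, so c2 is compared as a string.
explain select * from t_num where c2 = cast(2 as char);

-- In MySQL 5.7 and later, SHOW WARNINGS right after EXPLAIN shows the statement
-- as the optimizer rewrote it, plus any notes about conversions it ran into.
show warnings;
```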

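4. Implicit Character Set Conversion

# Create table t_cou with essentially the same structure as t_num above; the only different setting is the table character set, CHARSET=utf8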
root@test 14:02 > show create table t_cou\G
*************************** 1. row ***************************
       Table: t_cou
Create Table: CREATE TABLE `t_cou` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `c1` int(11) NOT NULL,
  `c2` varchar(10) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `ix_c1` (`c1`),
  KEY `ix_c2` (`c2`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8

root@test 14:02 > insert into t_cou select * from t_num;

# Join t_num and t_cou on column c2
root@test 14:03 > select n.* from t_num n
    -> join t_cou c
    -> on n.c2 = c.c2
    -> where n.c1 = 1;
+----+----+----+
| id | c1 | c2 |
+----+----+----+
|  4 |  1 | 1  |
+----+----+----+

root@test 14:23 > explain select n.* from t_num n join t_cou c  on n.c2 = c.c2 where c.c1 = 1;
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key   | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | c     | NULL       | ref  | ix_c1         | ix_c1 | 4       | const |    1 |   100.00 | NULL                  |
|  1 | SIMPLE      | n     | NULL       | ref  | ix_c2         | ix_c2 | 42      | func  |    1 |   100.00 | Using index condition |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-----------------------+
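# Execution plan analysis:
# 1. Table c is accessed first, using ix_c1 to locate one row
# 2. The c2 value of that row is then used to probe table n; t_cou is the driving table and t_num is the driven table
# 3. ref=func means a function is applied, namely n.c2 = CONVERT(c.c2 USING utf8mb4)
# 4. Using index condition: while reading ix_c2, the pushed-down condition filters the entries, and only matching entries trigger a lookup back to the table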

root@test 14:23 > explain select n.* from t_num n join t_cou c  on n.c2 = c.c2 where n.c1 = 1;
+----+-------------+-------+------------+-------+---------------+-------+---------+-------+------+----------+-----------------------------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys | key   | key_len | ref   | rows | filtered | Extra                                                           |
+----+-------------+-------+------------+-------+---------------+-------+---------+-------+------+----------+-----------------------------------------------------------------+
|  1 | SIMPLE      | n     | NULL       | ref   | ix_c1,ix_c2   | ix_c1 | 4       | const |    1 |   100.00 | NULL                                                            |
|  1 | SIMPLE      | c     | NULL       | index | NULL          | ix_c2 | 32      | NULL  |    5 |   100.00 | Using where; Using index; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+-------+---------------+-------+---------+-------+------+----------+-----------------------------------------------------------------+
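# Execution plan analysis:
# 1. Table n is accessed first, using ix_c1 to locate one row
# 2. The c2 value of that row is then used to probe table c; t_num is the driving table and t_cou is the driven table
# 3. The same n.c2 = c.c2 again converts the character set of c.c2, so a function is applied to the indexed column of the driven table and the index lookup is lost
# 4. BNL: during the join, rows of the driving table are read into the join buffer; because the driven table's join column has no usable index, it is scanned in full (here as a full scan of ix_c2), and each row is compared against the join buffer, with matches returned as the result set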

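As shown above, analysing the execution plans with t_num and t_cou each acting as driving and driven table leads to the following summary:

When strings in the two different character sets utf8mb4 and utf8 are compared, MySQL first converts the utf8 string to utf8mb4 and then compares. Why? utf8mb4 is a superset of utf8, and when performing an implicit conversion MySQL converts "in the direction of increasing data length" so that no data is corrupted by truncation during the conversion.
During a table join, a function applied to the indexed column of the driven table causes a full scan of the driven table.

Optimization approaches:

Change the join columns so that they share one character set
Work on the driving table instead: convert its join column to the character set of the driven table's join column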

root@test 18:09 > explain select n.* from t_num n join t_cou c  on convert(n.c2 using utf8) = c.c2 where n.c1 = 1;
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key   | key_len | ref   | rows | filtered | Extra                    |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+--------------------------+
|  1 | SIMPLE      | n     | NULL       | ref  | ix_c1         | ix_c1 | 4       | const |    1 |   100.00 | NULL                     |
|  1 | SIMPLE      | c     | NULL       | ref  | ix_c2         | ix_c2 | 32      | func  |    1 |   100.00 | Using where; Using index |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+--------------------------+
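
The first approach, unifying the character set, can be sketched as follows. This is an illustration under the assumption that t_cou may be altered and that utf8mb4 is the target; it was not part of the original test session. Note that the explicit convert(n.c2 using utf8) shown above cannot represent 4-byte characters, which is one reason to align both columns on utf8mb4 instead.

```sql
-- Change only the join column; NOT NULL must be restated when using MODIFY.
alter table t_cou modify c2 varchar(10) character set utf8mb4 not null;

-- Or convert every character column of the table in one statement:
-- alter table t_cou convert to character set utf8mb4;

-- With both c2 columns in utf8mb4, no CONVERT() is needed on either side.
explain select n.* from t_num n join t_cou c on n.c2 = c.c2 where n.c1 = 1;
```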

II. Type Conversion

1. String to Integer

# A string that starts with a non-numeric character always converts to 0
root@test 18:44 > select convert("abc", unsigned integer);
+----------------------------------+
| convert("abc", unsigned integer) |
+----------------------------------+
|                                0 |
+----------------------------------+
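# 'abc' = 0 evaluates to true, so using the matching type on the right of the equals sign matters: 0 matches every value in the column that starts with a non-numeric character, while '0' matches only the string 0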
root@test 18:44 > select 'abc' = 0;
+-----------+
| 'abc' = 0 |
+-----------+
|         1 |
+-----------+

# A string that starts with digits is truncated at the first non-digit character
root@test 18:45 > select convert("123abc", unsigned integer);
+-------------------------------------+
| convert("123abc", unsigned integer) |
+-------------------------------------+
|                                 123 |
+-------------------------------------+
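
The practical consequence for queries on a varchar column can be sketched as below, assuming the t_num table from section I. The statements are illustrative and were not run in the original session.

```sql
-- The leading digits are kept and the rest is dropped, so this comparison is true (returns 1).
select '123abc' = 123;

-- Numeric literal: every c2 value is converted to a number before comparing,
-- so any value with no leading numeric part would also compare equal to 0.
select * from t_num where c2 = 0;

-- String literal: a plain string comparison that can use ix_c2.
select * from t_num where c2 = '0';
```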

2. Date and Time Type Conversion

root@test 19:11 > show create table time_demo\G
*************************** 1. row ***************************
       Table: time_demo
Create Table: CREATE TABLE `time_demo` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `c1` datetime DEFAULT NULL,
  `c2` date DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `ix_c1` (`c1`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8mb4

root@test 19:15 > select count(*) from time_demo;
+----------+
| count(*) |
+----------+
|       11 |
+----------+

root@test 19:16 > select * from time_demo limit 4;
+----+---------------------+------------+
| id | c1                  | c2         |
+----+---------------------+------------+
|  1 | 2022-01-08 00:01:01 | 2022-01-08 |
|  2 | 2022-01-06 23:01:01 | 2022-01-06 |
|  3 | 2022-01-06 00:00:00 | 2022-01-06 |
|  4 | 2022-01-08 00:00:00 | 2022-01-08 |
+----+---------------------+------------+

# 1. date to datetime: 00:00:00 is appended
root@test 19:11 > select * from time_demo where c1 between "2022-01-06" and "2022-01-08";
+----+---------------------+------------+
| id | c1                  | c2         |
+----+---------------------+------------+
|  2 | 2022-01-06 23:01:01 | 2022-01-06 |
|  3 | 2022-01-06 00:00:00 | 2022-01-06 |
|  4 | 2022-01-08 00:00:00 | 2022-01-08 |
+----+---------------------+------------+
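# Result analysis: c1 is a datetime column, so for the comparison the date values in BETWEEN ... AND are converted to datetime
# i.e. where c1 between "2022-01-06 00:00:00" and "2022-01-08 00:00:00";
# same as where c1 >= "2022-01-06 00:00:00" and c1 <= "2022-01-08 00:00:00";
# which is why row 1 (2022-01-08 00:01:01) is not returned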
root@test 19:42 > explain select * from time_demo where c1 between "2022-01-06" and "2022-01-08";
+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+----------+-----------------------+
| id | select_type | table     | partitions | type  | possible_keys | key   | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | time_demo | NULL       | range | ix_c1         | ix_c1 | 6       | NULL |    3 |   100.00 | Using index condition |
+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+----------+-----------------------+
# Formatting a date as a datetime
root@test 19:23 > select date_format("2022-01-08","%Y-%m-%d %H:%i:%s");
+-----------------------------------------------+
| date_format("2022-01-08","%Y-%m-%d %H:%i:%s") |
+-----------------------------------------------+
| 2022-01-08 00:00:00                           |
+-----------------------------------------------+
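
Because a date used as the upper bound becomes midnight of that day, rows later on the same day fall outside a BETWEEN range. A common way to cover whole days, sketched here as an assumption about the intended query rather than something from the original session, is a half-open range:

```sql
-- Covers all of 2022-01-06 through 2022-01-08; c1 stays bare, so ix_c1 remains usable.
select * from time_demo
where c1 >= '2022-01-06'
  and c1 <  '2022-01-09';
```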

# 2. datetime to date: only the date part is kept
root@test 19:47 > select date(c1) from time_demo limit 1;
+------------+
| date(c1)   |
+------------+
| 2022-01-06 |
+------------+

# 3. date to time: meaningless, the result is simply 00:00:00

@author: http://www.cnblogs.com/geaozhang/
