MySQL Transaction: Deadlocks Caused by Lock Upgrades in MySQL
阿新 • Published: 2022-04-07
Test environment
Server version: 5.7.26-29-log Percona Server (GPL)
transaction_isolation: REPEATABLE-READ
Test data
/* 1. Table structure */
CREATE TABLE t (
  id BIGINT UNSIGNED NOT NULL PRIMARY KEY COMMENT 'id, no business meaning',
  account_id VARCHAR (64) NOT NULL COMMENT 'user id; account_id may repeat across apps',
  type TINYINT UNSIGNED NOT NULL COMMENT 'balance type 1: available balance',
  balance BIGINT UNSIGNED NOT NULL DEFAULT 0 COMMENT 'balance',
  state INT UNSIGNED NOT NULL DEFAULT 1 COMMENT 'account state 1:NORMAL; 2:FROZE',
  UNIQUE KEY uk_account (account_id, type)
) ENGINE = INNODB DEFAULT CHARSET utf8mb4 COMMENT 'test';

/* 2. The UNIQUE INDEX is uk_account(account_id, type) */

/* 3. Insert data */
insert into t values(1,'1',1,100,1);
insert into t values(2,'2',1,100,1);
insert into t values(3,'3',1,100,1);
insert into t values(4,'4',1,100,1);
insert into t values(5,'5',1,100,1);

/* 4. Query all rows */
select * from t;
+----+------------+------+---------+-------+
| id | account_id | type | balance | state |
+----+------------+------+---------+-------+
|  1 | 1          |    1 |     100 |     1 |
|  2 | 2          |    1 |     100 |     1 |
|  3 | 3          |    1 |     100 |     1 |
|  4 | 4          |    1 |     100 |     1 |
|  5 | 5          |    1 |     100 |     1 |
+----+------------+------+---------+-------+
Test scenario 1
Session 1 starts a transaction and runs the following (succeeds):
BEGIN;
SELECT * FROM t
WHERE account_id = '1'
AND TYPE =1
FOR UPDATE;
Session 2 starts a transaction and runs the following (blocked):
BEGIN;
SELECT * FROM t
WHERE account_id = '1'
AND TYPE =1
FOR UPDATE;
Session 3 starts a transaction and runs the following (blocked):
BEGIN;
SELECT * FROM t
WHERE account_id = '1'
AND TYPE =1
FOR UPDATE;
View the blocked transactions:
## View blocked transaction information
SELECT
  p2.`ID` blocked_process_id,
  p2.`HOST` blocked_host,
  p2.`USER` blocked_user,
  r.trx_id blocked_trx_id,
  r.trx_state as blocked_trx_state,
  r.trx_started as blocked_trx_started,
  TIMESTAMPDIFF(SECOND,r.trx_wait_started,CURRENT_TIMESTAMP) blocked_wait_seconds,
  r.trx_query blocked_query,
  concat('index: ',l.lock_table,'.',m.`lock_index`,', lock_type:',m.`lock_type`,',lock_mode:',m.`lock_mode`) as blocked_lock_info,
  m.lock_data blocked_lock_data,
  p.`ID` blocking_process_id,
  p.`HOST` blocking_host,
  p.`USER` blocking_user,
  b.trx_id blocking_trx_id,
  b.trx_state as blocking_trx_state,
  b.trx_started as blocking_trx_started,
  b.trx_query blocking_query,
  IF (p.COMMAND = 'Sleep', CONCAT(p.TIME,' seconds'), 0) blocking_thread_idle_seconds,
  CONCAT('kill ',p.`ID`,';') kill_sql
FROM information_schema.INNODB_LOCK_WAITS w
INNER JOIN information_schema.INNODB_TRX b
  ON b.trx_id = w.blocking_trx_id
INNER JOIN information_schema.INNODB_TRX r
  ON r.trx_id = w.requesting_trx_id
INNER JOIN information_schema.INNODB_LOCKS l
  ON w.blocking_lock_id = l.lock_id AND l.`lock_trx_id`=b.`trx_id`
INNER JOIN information_schema.INNODB_LOCKS m
  ON m.`lock_id`=w.`requested_lock_id` AND m.`lock_trx_id`=r.`trx_id`
INNER JOIN information_schema.PROCESSLIST p
  ON p.ID = b.trx_mysql_thread_id
INNER JOIN information_schema.PROCESSLIST p2
  ON p2.ID = r.trx_mysql_thread_id
ORDER BY blocked_wait_seconds DESC ;

*************************** 1. row ***************************
blocked_process_id: 155692
blocked_host: 127.0.0.1:52486
blocked_user: wenjiag.gao
blocked_trx_id: 157794
blocked_trx_state: LOCK WAIT
blocked_trx_started: 2022-04-07 15:32:38
blocked_wait_seconds: 40
blocked_query: SELECT * FROM t WHERE account_id = '1' AND TYPE =1 FOR UPDATE
blocked_lock_info: index: `test_db`.`t`.uk_account, lock_type:RECORD,lock_mode:X
blocked_lock_data: '1', 1
blocking_process_id: 155690
blocking_host: 127.0.0.1:52384
blocking_user: wenjiag.gao
blocking_trx_id: 157793
blocking_trx_state: RUNNING
blocking_trx_started: 2022-04-07 15:32:34
blocking_query: NULL
blocking_thread_idle_seconds: 102 seconds
kill_sql: kill 155690;
*************************** 2. row ***************************
blocked_process_id: 155691
blocked_host: 127.0.0.1:52480
blocked_user: wenjiag.gao
blocked_trx_id: 157795
blocked_trx_state: LOCK WAIT
blocked_trx_started: 2022-04-07 15:32:48
blocked_wait_seconds: 35
blocked_query: SELECT * FROM t WHERE account_id = '1' AND TYPE =1 FOR UPDATE
blocked_lock_info: index: `test_db`.`t`.uk_account, lock_type:RECORD,lock_mode:X
blocked_lock_data: '1', 1
blocking_process_id: 155692
blocking_host: 127.0.0.1:52486
blocking_user: wenjiag.gao
blocking_trx_id: 157794
blocking_trx_state: LOCK WAIT
blocking_trx_started: 2022-04-07 15:32:38
blocking_query: SELECT * FROM t WHERE account_id = '1' AND TYPE =1 FOR UPDATE
blocking_thread_idle_seconds: 0
kill_sql: kill 155692;
*************************** 3. row ***************************
blocked_process_id: 155691
blocked_host: 127.0.0.1:52480
blocked_user: wenjiag.gao
blocked_trx_id: 157795
blocked_trx_state: LOCK WAIT
blocked_trx_started: 2022-04-07 15:32:48
blocked_wait_seconds: 35
blocked_query: SELECT * FROM t WHERE account_id = '1' AND TYPE =1 FOR UPDATE
blocked_lock_info: index: `test_db`.`t`.uk_account, lock_type:RECORD,lock_mode:X
blocked_lock_data: '1', 1
blocking_process_id: 155690
blocking_host: 127.0.0.1:52384
blocking_user: wenjiag.gao
blocking_trx_id: 157793
blocking_trx_state: RUNNING
blocking_trx_started: 2022-04-07 15:32:34
blocking_query: NULL
blocking_thread_idle_seconds: 102 seconds
kill_sql: kill 155690;
3 rows in set, 3 warnings (0.00 sec)
View the lock information:
## Include lock information in the status output
SET GLOBAL innodb_status_output_locks = ON;
## View transaction information
SHOW ENGINE INNODB STATUS \G

---TRANSACTION 157795, ACTIVE 57 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 155691, OS thread handle 139785549563648, query id 622725 127.0.0.1 wenjiag.gao statistics
SELECT * FROM t WHERE account_id = '1' AND TYPE =1 FOR UPDATE
------- TRX HAS BEEN WAITING 4 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 30 page no 4 n bits 72 index uk_account of table `test_db`.`t` trx id 157795 lock_mode X locks rec but not gap waiting
------------------
TABLE LOCK table `test_db`.`t` trx id 157795 lock mode IX
RECORD LOCKS space id 30 page no 4 n bits 72 index uk_account of table `test_db`.`t` trx id 157795 lock_mode X locks rec but not gap waiting
---TRANSACTION 157794, ACTIVE 67 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 155692, OS thread handle 139785549838080, query id 622724 127.0.0.1 wenjiag.gao statistics
SELECT * FROM t WHERE account_id = '1' AND TYPE =1 FOR UPDATE
------- TRX HAS BEEN WAITING 9 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 30 page no 4 n bits 72 index uk_account of table `test_db`.`t` trx id 157794 lock_mode X locks rec but not gap waiting
------------------
TABLE LOCK table `test_db`.`t` trx id 157794 lock mode IX
RECORD LOCKS space id 30 page no 4 n bits 72 index uk_account of table `test_db`.`t` trx id 157794 lock_mode X locks rec but not gap waiting
---TRANSACTION 157793, ACTIVE 71 sec
3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 155690, OS thread handle 139785550108416, query id 622719 127.0.0.1 wenjiag.gao
TABLE LOCK table `test_db`.`t` trx id 157793 lock mode IX
RECORD LOCKS space id 30 page no 4 n bits 72 index uk_account of table `test_db`.`t` trx id 157793 lock_mode X locks rec but not gap
RECORD LOCKS space id 30 page no 3 n bits 72 index PRIMARY of table `test_db`.`t` trx id 157793 lock_mode X locks rec but not gap
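From the blocked-transaction and lock information above we can conclude:
- Session 1's statement acquires the row lock on (account_id, type) = ('1', 1) in the unique index uk_account, and the row lock on (id) = (1) in the primary key index.
- Session 2's statement is blocked by session 1 and waits for the row lock on (account_id, type) = ('1', 1) in uk_account.
- Session 3's statement is blocked by both session 1 and session 2 and waits for the row lock on (account_id, type) = ('1', 1) in uk_account.
Session 2 has not acquired the row lock on ('1', 1) in uk_account (it is still waiting for it). However, session 3 requests the same lock that session 2 is waiting for and issued its request after session 2, so MySQL reports session 3 as blocked by session 2 as well.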
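The wait chain described above can also be read in a more compact form. The query below is a minimal sketch that assumes the sys schema is installed (it ships with MySQL 5.7 and Percona Server 5.7 by default); the choice of columns is only illustrative and was not part of the original test.

```sql
-- Assumption: the sys schema is available on this server.
-- Each row is one "waiting transaction -> blocking transaction" edge.
SELECT waiting_pid,
       waiting_trx_id,
       waiting_lock_mode,
       blocking_pid,
       blocking_trx_id,
       blocking_lock_mode,
       locked_index,
       wait_age
FROM sys.innodb_lock_waits
ORDER BY wait_started;
```

Because this view is built on the same INNODB_LOCK_WAITS data, a waiter that queued behind another waiter (session 3 behind session 2 here) should appear as its own row, just as in the longer query.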
Test scenario 2
Session 1 starts a transaction and runs the following (succeeds):
BEGIN;
SELECT * FROM t
WHERE account_id = '1'
AND TYPE =1
FOR UPDATE;
Session 2 starts a transaction and runs the following (blocked):
BEGIN;
SELECT * FROM t
WHERE account_id = '1'
AND TYPE =1
FOR UPDATE;
Session 1 continues with the following (succeeds):
UPDATE t
SET state = 2
WHERE account_id = '1';
Session 2 hits a deadlock and is rolled back:
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
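To see which locks were involved when the error above is raised, the server's own deadlock report can be consulted. The statements below are a small sketch using standard InnoDB facilities; they were not part of the original test, and changing a global variable requires sufficient privileges.

```sql
-- The most recent deadlock is reported in the
-- "LATEST DETECTED DEADLOCK" section of the output.
SHOW ENGINE INNODB STATUS \G

-- Optional: also write every detected deadlock to the MySQL error log,
-- not just the latest one.
SET GLOBAL innodb_print_all_deadlocks = ON;
```

The report lists, for each transaction in the cycle, the lock it holds and the lock it is waiting for, which can be compared with the analysis that follows.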
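Analysis, based on the results of the previous test case:
- Session 1's first statement acquires the row lock on (account_id, type) = ('1', 1) in the unique index uk_account, and the row lock on (id) = (1) in the primary key index.
- Session 2's statement is blocked by session 1 and waits for the row lock on (account_id, type) = ('1', 1) in uk_account.
- Session 1's second statement has the condition account_id = '1', so it can only use the first column, account_id, of the unique index uk_account (account_id, type). Because it runs under the REPEATABLE-READ isolation level, it needs the following locks on uk_account:
  - The gap lock before the record ('1', 1), which prevents other transactions from inserting a record with account_id = '1' before it, such as ('1', 0).
  - The row lock on the record ('1', 1), which prevents other transactions from modifying it.
  - The gap lock after the record ('1', 1), which prevents other transactions from inserting a record with account_id = '1' after it, such as ('1', 2).
  - The row lock on ('1', 1) and the gap lock before it together form a next-key lock; the gap after the record is covered by a gap lock taken on the next index record.
- Session 1's second statement therefore requests a next-key lock covering the record ('1', 1) in uk_account, which is stronger than the record-only lock it already holds. Session 2 requested a lock on the same record earlier and is still queued, so session 1's new request is blocked by session 2's waiting request. This triggers MySQL's deadlock detection, which finds a cycle:
  - Session 2 waits for session 1 to release the row lock on ('1', 1) in uk_account.
  - Session 1 waits for session 2 to give up its queued request for the row lock on ('1', 1) in uk_account.
- Deadlock detection picks session 2 as the victim and rolls it back.
- Once session 2 is rolled back, it no longer blocks session 1's second statement, so session 1 obtains the lock and the statement succeeds.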
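One way to avoid the upgrade described above is to keep the second statement on the same full unique-key predicate as the first, so that the record lock already held is enough. The following is a sketch of that variant for session 1. It was not run in the original test, and it assumes that updating only the ('1', 1) row is what the application intends.

```sql
-- Session 1, variant of test scenario 2 (illustrative, untested here).
BEGIN;

SELECT * FROM t
WHERE account_id = '1'
  AND type = 1
FOR UPDATE;

-- Same unique-key predicate as the SELECT above: only the record lock on
-- uk_account ('1', 1) is needed, which this transaction already holds,
-- so no next-key lock (and no lock upgrade) should be requested.
UPDATE t
SET state = 2
WHERE account_id = '1'
  AND type = 1;

COMMIT;
```

If the update really must cover every row with account_id = '1', another option is to take the wider lock in the first statement by using the same account_id-only predicate there, so that later statements do not need anything stronger.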
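In earlier MySQL versions, even when a transaction had already acquired a lock in a previous operation, a later operation that needed a "larger" lock on the same record issued a new lock request, which could be queued, rather than being granted the lock immediately. This was fixed in MySQL 8.0.18: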
InnoDB: A deadlock was possible when a transaction tries to upgrade a record lock to a next key lock. (Bug #23755664, Bug #82127)
PS: If session 1's second statement is still SELECT * FROM t WHERE account_id = '1' AND TYPE =1 FOR UPDATE;, no lock upgrade is needed and it is not blocked by session 2.
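When repeating test scenario 2 on MySQL 8.0.18 or later, note that information_schema.INNODB_LOCKS and INNODB_LOCK_WAITS no longer exist in 8.0. The query below is a sketch of how the held and waiting locks could be inspected there instead. The schema name test_db is taken from the output shown earlier, and the column selection is only a suggestion.

```sql
-- MySQL 8.0 only: lock information lives in performance_schema.
-- LOCK_STATUS is GRANTED or WAITING; LOCK_MODE distinguishes a
-- record-only lock (X,REC_NOT_GAP) from a next-key lock (X).
SELECT engine_transaction_id,
       index_name,
       lock_type,
       lock_mode,
       lock_status,
       lock_data
FROM performance_schema.data_locks
WHERE object_schema = 'test_db'
  AND object_name = 't'
ORDER BY engine_transaction_id, index_name;
```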