【案例】利用innodb_force_recovery 解決MySQL服務器crash無法重啟問題
一 背景
某一創業的朋友的主機因為磁盤陣列損壞機器crash,重啟MySQL服務時 報如下錯誤:
-
InnoDB: Reading tablespace information from the .ibd files...
-
InnoDB: Restoring possible half-written data pages from the doublewrite
-
InnoDB: buffer...
-
InnoDB: Doing recovery: scanned up to log sequence number 9120034833
-
150125 16:12:51 InnoDB: Starting an apply batch of log records to the database...
-
InnoDB: Progress in percents: 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 150125 16:12:51 [ERROR] mysqld got signal 11 ;
-
This could be because you hit a bug. It is also possible that this binary
-
or one of the libraries it was linked against is corrupt, improperly built,
-
or misconfigured. This error can also be caused by malfunctioning hardware.
-
To report this bug, see http://kb.askmonty.org/en/reporting-bugs
-
We will try our best to scrape up some info that will hopefully help
-
diagnose the problem, but since we have already crashed,
-
something is definitely wrong and this may fail.
-
Server version: 5.5.37-MariaDB-log
-
key_buffer_size=268435456
-
read_buffer_size=1048576
-
max_used_connections=0
-
max_threads=1002
-
thread_count=0
-
It is possible that mysqld could use up to
-
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2332093 K bytes of memory
- 41 Hope that.
二 分析
主要關註 mysqld got signal 11 的問題,從日誌內容分析來看,數據庫在機器crash 導致日誌文件損壞,重啟之後無法正常恢復,更無法正常對外提供服務。
三 解決
因為日誌已經損壞,這裏采用非常規手段,首先修改innodb_force_recovery參數,使mysqld跳過恢復步驟,將mysqld 啟動,將數據導出來然後重建數據庫。
innodb_force_recovery可以設置為1-6,大的數字包含前面所有數字的影響。
1. (SRV_FORCE_IGNORE_CORRUPT):忽略檢查到的corrupt頁。
2. (SRV_FORCE_NO_BACKGROUND):阻止主線程的運行,如主線程需要執行full purge操作,會導致crash。
3. (SRV_FORCE_NO_TRX_UNDO):不執行事務回滾操作。
4. (SRV_FORCE_NO_IBUF_MERGE):不執行插入緩沖的合並操作。
5. (SRV_FORCE_NO_UNDO_LOG_SCAN):不查看重做日誌,InnoDB存儲引擎會將未提交的事務視為已提交。
6. (SRV_FORCE_NO_LOG_REDO):不執行前滾的操作。
註意
a 當設置參數值大於0後,可以對表進行select,create,drop操作,但insert,update或者delete這類操作是不允許的。
b 當innodb_purge_threads 和 innodb_force_recovery一起設置會出現一種loop現象:
-
150125 17:07:42 InnoDB: Waiting for the background threads to start
-
150125 17:07:43 InnoDB: Waiting for the background threads to start
-
150125 17:07:44 InnoDB: Waiting for the background threads to start
-
150125 17:07:45 InnoDB: Waiting for the background threads to start
-
150125 17:07:46 InnoDB: Waiting for the background threads to start
- 150125 17:07:47 InnoDB: Waiting for the background threads to start
在my.cnf中修改以下兩個參數
innodb_force_recovery=6
innodb_purge_thread=0
重啟MySQL
-
150125 17:10:47 [Note] Crash recovery finished.
-
150125 17:10:47 [Note] Server socket created on IP: ‘0.0.0.0‘.
-
150125 17:10:47 [Note] Event Scheduler: Loaded 0 events
-
150125 17:10:47 [Note] /vdata/webserver/mysql/bin/mysqld: ready for connections.
- Version: ‘5.5.37-MariaDB-log‘ socket: ‘/tmp/mysql.sock‘ port: 3306 Source distribution
立即對數據庫做邏輯導出 ,完成之後將innodb_force_recovery設置為0 ,innodb_purge_thread=1 ,然後重建數據庫 。
另外 MySQL 版本 5.5以及之前 ,當innodb_purge_threads =1,innodb_force_recovery >1 的情況會出現上文提到的循環報warning 問題(=1 沒有問題),
原因:
MySQL 的源代碼中顯示 當innodb_purge_threads 和 innodb_force_recovery一起設置會出現loop循環
-
while (srv_shutdown_state == SRV_SHUTDOWN_NONE) {
-
if (srv_thread_has_reserved_slot(SRV_MASTER) == ULINT_UNDEFINED
-
|| (srv_n_purge_threads == 1
-
&& srv_thread_has_reserved_slot(SRV_WORKER)
-
== ULINT_UNDEFINED)) {
-
ut_print_timestamp(stderr);
- fprintf(stderr, " InnoDB: Waiting for the background threads to start\n");
-
os_thread_sleep(1000000);
-
} else {
-
break;
-
}
- }
所以當需要設置innodb_force_recovery>1的時候需要關閉 innodb_purge_threads,設置為0(默認)。
四 小結
MySQL crash 或者 MySQL 數據庫服務器 crash 會導致各種各樣的問題 ,比如主備之間的error 1594 (5.6 版本開啟crash-safe ,會最大程度上避免 error 1594的問題,以後會寫5.6新特性介紹該功能 ),error 1236, 日誌損壞,數據文件損壞 ,等等,本案例只是其中的一種,細心從日誌中找的相關錯誤提示,逐步解決即可。
【案例】利用innodb_force_recovery 解決MySQL服務器crash無法重啟問題