027_[Important # Cluster recovery steps] MySQL Group Replication Got fatal error 1236 - CrazyPig's Tech Blog - CSDN Blog
MySQL Group Replication Got fatal error 1236
2017-03-13 14:30:03 ZzzCrazyPig
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/d6619309/article/details/61917089
Problem scenario
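An MGR (MySQL Group Replication) cluster had been set up. A secondary node left the group (stop group_replication) and was to be added back a long time later. In the meantime the primary had processed many writes, and the older binlogs had been purged.
Root cause analysis
The primary's binary logs had been deleted (by an explicit purge or by scheduled expiry; manually running rm on the files on disk is out of scope here). The secondary could not read those binary logs, so after several failed recovery attempts (each one picking a donor to recover from) it left the replication group. The errors on the secondary look roughly like this: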
2017-03-13T10:00:26.358681+08:00 355 [ERROR] Error reading packet from server for channel 'group_replication_recovery': The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires. (server_errno=1236)
2017-03-13T10:00:26.358720+08:00 355 [ERROR] Slave I/O for channel 'group_replication_recovery': Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.', Error_code: 1236
...
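The condition behind error 1236 can be checked before trying to rejoin. The following is a minimal sketch, not part of the original post: it assumes the mysql client can reach the donor, with connection options supplied through a hypothetical MYSQL variable, and donor_can_serve_joiner is an illustrative helper name. It uses the built-in GTID_SUBSET() function: if the donor's gtid_purged is a subset of the joiner's gtid_executed, the donor has not purged anything the joiner still needs.

```shell
# Client command used to reach the donor; override it with socket/user
# options as needed (hypothetical variable, not from the original post).
MYSQL="${MYSQL:-mysql}"

# donor_can_serve_joiner JOINER_GTID_EXECUTED
# Succeeds (exit status 0) when every GTID the donor has purged is already
# present on the joiner, so recovery from the donor's binlogs can work.
donor_can_serve_joiner() {
    joiner_executed=$1
    result=$($MYSQL -N -B -e \
        "SELECT GTID_SUBSET(@@GLOBAL.gtid_purged, '${joiner_executed}')")
    [ "$result" = "1" ]
}
```

The argument is the value of SELECT @@GLOBAL.gtid_executed taken on the joiner. A non-zero exit status means the joiner has to be re-provisioned from a dump, as described in the steps that follow.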
Solution approach
Use mysqldump on the MGR primary to dump the data, including gtid_purged, and apply the dump on the secondary.
Steps
mysqldump --all-databases --set-gtid-purged=ON --single-transaction -uroot -S"/mysql_data/data/mgrtest1/mysql_mgrtest1.sock" -P24801 -p > /mysql_data/data/mgrtest1_alldb.sql
After entering the password and pressing Enter, the following is printed:
Warning: A partial dump from a server that has GTIDs will by default include the GTIDs of all transactions, even those that changed suppressed parts of the database. If you don't want to restore GTIDs, pass --set-gtid-purged=OFF. To make a complete dump, pass --all-databases --triggers --routines --events.
mysqldump: Couldn't execute 'SAVEPOINT sp': The MySQL server is running with the --transaction-write-set-extraction!=OFF option so it cannot execute this statement (1290)
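Related MySQL bug: https://bugs.mysql.com/bug.php?id=81494
As the message indicates, a node running in Group Replication mode does not support SAVEPOINT, but mysqldump needs to execute that statement during the dump, in addition to writing out the set @@gtid_purged=XXX statement.
Following the workaround given in the bug report, run on the primary: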
mysql> set global transaction_write_set_extraction=OFF;
ERROR 3093 (HY000): The write set algorithm cannot be changed when Group replication is running.
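The primary has to stop group_replication first (if other ONLINE secondaries exist at this point, running mysqldump on one of them is probably the better choice).
If the node being stopped is the primary, first decide whether the other ONLINE secondaries need to be taken out of the group beforehand; if so, stop all the secondaries first and then stop the primary.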
mysql> stop group_replication;
Query OK, 0 rows affected (8.74 sec)
mysql> set global transaction_write_set_extraction=OFF;
Query OK, 0 rows affected (0.00 sec)
Running the mysqldump command again now dumps the data successfully.
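Before copying the file anywhere, it may be worth confirming that the dump really carries the GTID information. The following is a minimal sketch, with dump_has_gtid_purged as a hypothetical helper name: it only looks for the SET @@GLOBAL.GTID_PURGED statement that mysqldump writes when --set-gtid-purged=ON takes effect, and does not validate the GTID set itself.

```shell
# dump_has_gtid_purged DUMP_FILE
# Succeeds when the dump file contains a SET @@GLOBAL.GTID_PURGED statement.
# -F treats the pattern as a fixed string, -i ignores case, -q suppresses output.
dump_has_gtid_purged() {
    grep -q -i -F 'SET @@GLOBAL.GTID_PURGED=' "$1"
}
```

For the dump produced above this would be called as: dump_has_gtid_purged /mysql_data/data/mgrtest1_alldb.sql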
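Copy the dump file to the machine hosting the secondary, log in to the secondary, and run the following (reset master clears the secondary's own binary logs and gtid_executed, which allows the SET @@GLOBAL.GTID_PURGED statement in the dump to be applied):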
reset master;
source ${your_sql_file}
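Restore the primary. The two group_replication_bootstrap_group statements are appropriate only when no other member is still running the group, for example because every node was stopped; if other members are still ONLINE, skip them and run only START GROUP_REPLICATION, because bootstrapping again would create a second, separate group: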
mysql> set global transaction_write_set_extraction=XXHASH64;
Query OK, 0 rows affected (0.00 sec)
mysql> SET GLOBAL group_replication_bootstrap_group=ON;
Query OK, 0 rows affected (0.01 sec)
mysql> START GROUP_REPLICATION;
Query OK, 0 rows affected (1.06 sec)
mysql> SET GLOBAL group_replication_bootstrap_group=OFF;
Query OK, 0 rows affected (0.00 sec)
Finally, on the secondary, run:
start group_replication;
At this point the secondary should be able to join the group normally.
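Whether the join actually succeeded can be read from the performance_schema.replication_group_members table. The following is a minimal sketch, assuming the mysql client is reachable through a hypothetical MYSQL variable; all_members_online is an illustrative helper name, not something from the original post.

```shell
# Client command used to reach any group member; override it with
# socket/user options as needed (hypothetical variable).
MYSQL="${MYSQL:-mysql}"

# all_members_online
# Succeeds when the group reports at least one member and every reported
# MEMBER_STATE is exactly ONLINE (not RECOVERING, OFFLINE, ERROR, ...).
all_members_online() {
    states=$($MYSQL -N -B -e \
        "SELECT MEMBER_STATE FROM performance_schema.replication_group_members")
    [ -n "$states" ] || return 1
    # grep -v -x selects lines that are not exactly ONLINE; finding any
    # such line means at least one member is not ready yet.
    if printf '%s\n' "$states" | grep -q -v -x 'ONLINE'; then
        return 1
    fi
    return 0
}
```

A node that has just been started usually shows RECOVERING for a while before it turns ONLINE, so the check may need to be repeated after a short wait.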
References
Quoted from http://lefred.be/content/mysql-group-replication-limitations-savepoints/, where the comments section says:
Savepoints are also used when executing mysqldump with --single-transaction option.
http://dev.mysql.com/doc/refman/5.7/en/mysqldump.html#option_mysqldump_single-transaction
That means if you want to do a consistent non blocking mysqldump on a node of the group you need to put the node out of the cluster first.
e.g.
mysql> -- Put the node out of the Group Replication cluster
mysql> STOP group_replication;
mysql> SET GLOBAL transaction_write_set_extraction=OFF;
$ # Dump the entire node (instance)
$ mysqldump --all-databases --triggers --routines --events --single-transaction > /mysqldump/dump_YYYYMMDD.sql
mysql> -- Bring back the node into the cluster
mysql> SET GLOBAL transaction_write_set_extraction=XXHASH64;
mysql> START group_replication;