1. 程式人生 > 其它 >技術分享 | binlog 實用解析工具 my2sql

技術分享 | binlog 實用解析工具 my2sql

技術標籤:技術分享my2sqlbinlognyflush

作者:趙黎明
愛可生 MySQL DBA 團隊成員,Oracle 10g OCM,MySQL 5.7 OCP,擅長資料庫效能問題診斷、事務與鎖問題的分析等,負責處理客戶 MySQL 及我司自研 DMP 平臺日常運維中的問題,對開源資料庫相關技術非常感興趣。
本文來源:原創投稿
*愛可生開源社群出品,原創內容未經授權不得隨意使用,轉載請聯絡小編並註明來源。

前言

大部分 DBA 應該都已經熟悉並使用過一些閃回工具,諸如:binlog2sql、MyFlash。
今天要介紹的是另一款基於 Go 編寫的 binlog 解析工具:my2sql,他的同門師兄還有 my2fback、binlog_inspector(binlog_rollback)。

為什麼不直接稱其為閃回工具呢?因為閃回只是它其中一個功能,除此之外,還可用於從 binlog 中解析出執行過的 SQL 語句用於資料補償,或者對線上執行的事務進行分析(捕獲大/長事務)。
Github 地址:https://github.com/liuhr/my2sql

對比

  • binlog2sql:Python 編寫(執行時需要有 Python2.7、Python3.4+ 的環境),用於生成回滾/前滾 SQL 進行資料恢復/補償
  • MyFlash:C 語言編寫(需要動態編譯成二級制指令碼後執行),用於生成反向 binlog 檔案(二進位制)進行資料恢復
  • my2sql:Go 語言編寫(可直接下載 linux 二進位制版本執行),除了閃回,還提供了前滾和事務分析的功能

主要引數

  • -work-type:指定工作型別(前滾、閃回、事務分析),合法值分別為:2sql(預設)、rollback、stats
  • -sql:過濾 DML 語句的型別,合法值為:insert、update、delete
  • -ignorePrimaryKeyForInsert:對於 work-type 為 2sql 的 insert 操作,忽略主鍵(適合大量資料匯入的場景)
  • -big-trx-row-limit int:判定為大事務的閾值(預設 500 行),合法值區間:10-30000 行
  • -long-trx-seconds int:判定為長事務的閾值(預設 300 秒),合法值區間:1-3600 秒
  • -databases:過濾庫,預設為全部庫
  • -tables:過濾表,預設為全部表
  • -start-file:指定開始的 binlog 檔案
  • -start-pos:指定 binlog 檔案中開始的點位
  • -start-datetime:指定開始的時間
  • -stop-datetime:指定結束的時間
  • -output-dir:指定檔案生成目錄
  • -output-toScreen:指定輸出到螢幕
  • -tl:指定時區(time location),預設為 local(Asia/Shanghai)

應用場景

場景 1:閃回 DML 誤操作資料

-- 建立測試表
mysql> show create table t\G
*************************** 1. row ***************************
       Table: t
Create Table: CREATE TABLE `t` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `k` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=10001 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
​
mysql> select count(*) from t;
+----------+
| count(*) |
+----------+
|    10000 |
+----------+
1 row in set (0.00 sec)
​
-- 檢視測試表校驗值
mysql> checksum table t;
+-------+------------+
| Table | Checksum   |
+-------+------------+
| zlm.t | 1966401574 |
+-------+------------+
1 row in set (0.02 sec)

-- 檢視當前binlog
mysql> show master status;
+------------------+----------+--------------+------------------+----------------------------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                            |
+------------------+----------+--------------+------------------+----------------------------------------------+
| mysql-bin.000003 | 15994082 |              |                  | 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:1-23123 |
+------------------+----------+--------------+------------------+----------------------------------------------+
1 row in set (0.00 sec)

-- 刪除5000條記錄
mysql> delete from t where id<5001;
Query OK, 5000 rows affected (0.04 sec)
​
-- 執行my2sql生成回滾語句(rollback模式)
04:38 PM dmp1 (master) ~# ./my2sql -user zlm -password zlm -host 10.186.60.62 -port 3332 -work-type rollback -start-file mysql-bin.000003 -start-pos 15994082 --add-extraInfo -output-dir /tmp/my2sql_test
[2020/12/25 16:38:17] [info] binlogsyncer.go:144 create BinlogSyncer with config {1113306 mysql 10.186.60.62 3332 zlm   utf8 false false <nil> false Local false 0 0s 0s 0 false false 0}
[2020/12/25 16:38:17] [info] events.go:58 start thread 1 to generate redo/rollback sql
[2020/12/25 16:38:17] [info] events.go:58 start thread 2 to generate redo/rollback sql
[2020/12/25 16:38:17] [info] events.go:208 start thread to write redo/rollback sql into file
[2020/12/25 16:38:17] [info] binlogsyncer.go:360 begin to sync binlog from position (mysql-bin.000003, 15994082)
[2020/12/25 16:38:17] [info] stats_process.go:166 start thread to analyze statistics from binlog
[2020/12/25 16:38:17] [info] repl.go:15 start to get binlog from mysql
[2020/12/25 16:38:17] [info] binlogsyncer.go:777 rotate to (mysql-bin.000003, 15994082)
[2020/12/25 16:38:17] [info] events.go:242 finish processing mysql-bin.000003 16002473
[2020/12/25 16:38:22] [info] repl.go:83 deadline exceeded.
[2020/12/25 16:38:22] [info] repl.go:17 finish getting binlog from mysql
[2020/12/25 16:38:22] [info] stats_process.go:266 exit thread to analyze statistics from binlog
[2020/12/25 16:38:22] [info] events.go:183 exit thread 1 to generate redo/rollback sql
[2020/12/25 16:38:22] [info] events.go:183 exit thread 2 to generate redo/rollback sql
[2020/12/25 16:38:22] [info] events.go:257 finish writing rollback sql into tmp files, start to revert content order of tmp files
[2020/12/25 16:38:22] [info] rollback_process.go:15 start thread 1 to revert rollback sql files
[2020/12/25 16:38:22] [info] rollback_process.go:41 start to revert tmp file /tmp/my2sql_test/.rollback.3.sql into /tmp/my2sql_test/rollback.3.sql
[2020/12/25 16:38:22] [info] rollback_process.go:156 finish reverting tmp file /tmp/my2sql_test/.rollback.3.sql into /tmp/my2sql_test/rollback.3.sql
[2020/12/25 16:38:22] [info] rollback_process.go:25 exit thread 1 to revert rollback sql files
[2020/12/25 16:38:22] [info] events.go:270 finish reverting content order of tmp files
[2020/12/25 16:38:22] [info] events.go:275 exit thread to write redo/rollback sql into file

-- 檢視生成的檔案
04:40 PM dmp1 /tmp/my2sql_test# ll
total 1228
-rw-r--r-- 1 root root     251 Dec 25 16:38 biglong_trx.txt
-rw-r--r-- 1 root root     288 Dec 25 16:38 binlog_status.txt
-rw-r--r-- 1 root root 1246880 Dec 25 16:38 rollback.3.sql

## 其中rollback.x.sql就是我們閃回資料需要的sql檔案(x對應mysql-binlog.00000x的檔案編號x),由於執行命令時只指定了binlog開始的位置,例項中後續執行的DML事務也都會被記錄
## 另外2個檔案分別是從binlog中獲取到的binlog狀態和事務資訊,之後的案例會詳細展示說明,此處略過

-- 檢視生成的回滾SQL文字檔案
04:42 PM dmp1 /tmp/my2sql_test# tail -5 rollback.3.sql 
INSERT INTO `zlm`.`t` (`id`,`k`,`c`,`pad`) VALUES (4,5027,'54133149494-75722987476-23015721680-47254589498-40242947469-55055884969-23675271222-20181439230-74473404563-55407972672','88488171626-98596569412-94026374972-58040528656-38000028170');
INSERT INTO `zlm`.`t` (`id`,`k`,`c`,`pad`) VALUES (3,4990,'51185622598-89397522786-28007882305-52050087550-68686337807-48942386476-96555734557-05264042377-33586177817-31986479495','00592560354-80393027097-78244247549-39135306455-88936868384');
INSERT INTO `zlm`.`t` (`id`,`k`,`c`,`pad`) VALUES (2,5025,'13241531885-45658403807-79170748828-69419634012-13605813761-77983377181-01582588137-21344716829-87370944992-02457486289','28733802923-10548894641-11867531929-71265603657-36546888392');
INSERT INTO `zlm`.`t` (`id`,`k`,`c`,`pad`) VALUES (1,5015,'68487932199-96439406143-93774651418-41631865787-96406072701-20604855487-25459966574-28203206787-41238978918-19503783441','22195207048-70116052123-74140395089-76317954521-98694025897');
# datetime=2020-12-25_16:33:30 database=zlm table=t binlog=mysql-bin.000003 startpos=15994218 stoppos=16002473

05:10 PM dmp1 /tmp/my2sql_test# cat rollback.3.sql |grep "INSERT INTO"|wc -l
5000

## 可以看到,該閃回SQL檔案中有5000個INSERT語句,正好對應之前刪除的5000條記錄
## 閃回SQL檔案中的“# datetime=...”這行的內容就是加了引數-add-extrainfo後加入的附加資訊,可以獲取每個語句執行的具體時間和點位
## 在實際情況中,從binlog中解析出來的事務會很複雜,為了便於分析,建議加上過濾庫、表的引數-databases和-tables,這樣生成的SQL檔案也會小很多

-- 資料恢復(將誤刪資料匯入)
05:17 PM dmp1 (master) ~# mysql32 < /tmp/my2sql_test/rollback.3.sql
mysql: [Warning] Using a password on the command line interface can be insecure.
05:19 PM dmp1 (master) ~# mysql32 -Ne "select count(*) from zlm.t;checksum table zlm.t;"
mysql: [Warning] Using a password on the command line interface can be insecure.
+-------+
| 10000 |
+-------+
+-------+------------+
| zlm.t | 1966401574 |
+-------+------------+

## 5000條刪除的資料匯入表後,表的checksum與刪除前一致,說明該在表上沒有進行過其他DML操作
## 如果記錄數一致而checksum不一致,則認為恢復後的資料仍然是不一致的,需要確認是否需要做更多的閃回操作

場景 2:手動補償主從異常切換後的資料不一致

-- 用sysbench給一個持續寫入
01:34 AM dmp1 /usr/local/sysbench/share/sysbench# sysbench /usr/local/sysbench/share/sysbench/oltp_read_write.lua --db-driver=mysql --tables=1 --table_size=10000 --mysql-host=10.186.60.62 --mysql-port=3332 --mysql-db=zlm --mysql-user=zlm --mysql-password=zlm --report-interval=2 --threads=10 --time=600 --skip-trx=on --mysql-ignore-errors=1062,1213 --db-ps-mode=disable run
sysbench 1.0.17 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 10
Report intermediate results every 2 second(s)
Initializing random number generator from current time


Initializing worker threads...

Threads started!

[ 2s ] thds: 10 tps: 712.84 qps: 12964.69 (r/w/o: 10095.03/2855.81/13.85) lat (ms,95%): 27.17 err/s: 5.44 reconn/s: 0.00
[ 4s ] thds: 10 tps: 733.05 qps: 13289.89 (r/w/o: 10342.52/2939.79/7.57) lat (ms,95%): 28.16 err/s: 4.54 reconn/s: 0.00
... 略

-- 確認從庫正常複製後停止IO執行緒
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.186.60.62
                  Master_User: zlm
                  Master_Port: 3332
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000013
          Read_Master_Log_Pos: 60687481
               Relay_Log_File: relay-bin.000009
                Relay_Log_Pos: 60685775
        Relay_Master_Log_File: mysql-bin.000013
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 60685562
              Relay_Log_Space: 594023093
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 623332
                  Master_UUID: 1d7ef0f4-4593-11eb-9f04-02000aba3c3e
             Master_Info_File: /data/mysql/mysql3332/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Reading event from the relay log
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:58752-915239
            Executed_Gtid_Set: 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:1-915235
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

mysql> stop slave io_thread;
Query OK, 0 rows affected (0.01 sec)

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: 10.186.60.62
                  Master_User: zlm
                  Master_Port: 3332
                Connect_Retry: 60
              Master_Log_File:mysql-bin.000013
          Read_Master_Log_Pos: 78769827
               Relay_Log_File: relay-bin.000009
                Relay_Log_Pos: 78770040
        Relay_Master_Log_File: mysql-bin.000013
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos:78769827
              Relay_Log_Space: 612105439
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 623332
                  Master_UUID: 1d7ef0f4-4593-11eb-9f04-02000aba3c3e
             Master_Info_File: /data/mysql/mysql3332/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:58752-941653
            Executed_Gtid_Set: 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:1-941653
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

-- 停止sysbench寫入
[ 120s ] thds: 10 tps: 820.27 qps: 14836.91 (r/w/o: 11540.32/3284.59/12.00) lat (ms,95%): 22.28 err/s: 4.00 reconn/s: 0.00
[ 122s ] thds: 10 tps: 718.49 qps: 13009.74 (r/w/o: 10126.80/2874.44/8.50) lat (ms,95%): 24.83 err/s: 4.50 reconn/s: 0.00
[ 124s ] thds: 10 tps: 781.74 qps: 14178.87 (r/w/o: 11029.89/3137.97/11.00) lat (ms,95%): 23.52 err/s: 6.50 reconn/s: 0.00
^C

## 此時主庫的資料是比從庫多的(可以看作是非同步複製中的IO執行緒延遲),假設又正好發生了高可用切換,從庫被切成了主庫,我們需要手動補償新主上缺失的這部分資料(如果高可用切換工具沒有這部分實現邏輯的話)

-- 確認後續(缺失)事務的起始點位
從show slave status的輸出,我們可以獲知以下資訊:
1、新主已執行過的事務集合:
1d7ef0f4-4593-11eb-9f04-02000aba3c3e:1-941653
2、對應舊主上的點位:
Master_Log_File:mysql-bin.000013
Exec_Master_Log_Pos: 78769827

-- 檢視舊主當前GTID資訊
mysql> show master status;
+------------------+-----------+--------------+------------------+------------------------------------------------+
| File             | Position  | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                              |
+------------------+-----------+--------------+------------------+------------------------------------------------+
| mysql-bin.000013 | 151283749 |              |                  | 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:1-1047585 |
+------------------+-----------+--------------+------------------+------------------------------------------------+
1 row in set (0.00 sec)

## 相差了105932個事務,binlog從77M增長到150M左右

-- 也可以離線解析binlog檔案來確認file+pos的點位與GTID是否是對應
/usr/local/mysql5732/bin/mysqlbinlog -vv --base64-output=decode-rows mysql-bin.000013|less
... 略
#201226  1:36:31 server id 623332  end_log_pos 78769827 CRC32 0xe5032fcf        Xid = 4003981
COMMIT/*!*/;
# at78769827   //當1d7ef0f4-4593-11eb-9f04-02000aba3c3e:941653的事務COMMIT後,pos點位就停在這裡,是一致的
#201226  1:36:31 server id 623332  end_log_pos 78769892 CRC32 0x6cec07fc        GTID    last_committed=115069   sequence_number=115070  rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= '1d7ef0f4-4593-11eb-9f04-02000aba3c3e:941654'/*!*/;
# at 78769892
#201226  1:36:31 server id 623332  end_log_pos 78769963 CRC32 0x0b9b5557        Query   thread_id=89    exec_time=0     error_code=0
SET TIMESTAMP=1608917791/*!*/;
BEGIN
/*!*/;
# at 78769963
#201226  1:36:31 server id 623332  end_log_pos 78770228 CRC32 0x75e40978        Rows_query
# INSERT INTO sbtest1 (id, k, c, pad) VALUES (5020, 5013, '79652507036-05590009094-10370692577-33401396318-81508361252-10613546461-82822929332-17272183925-71915791860-00345159222', '25450417435-19336936168-49193845527-09907338597-56878802246')
# at 78770228
#201226  1:36:31 server id 623332  end_log_pos 78770284 CRC32 0xfc0d0d16        Table_map: `zlm`.`sbtest1` mapped to number 204
# at 78770284
#201226  1:36:31 server id 623332  end_log_pos 78770509 CRC32 0xe1d44365        Write_rows: table id 204 flags: STMT_END_F
### INSERT INTO `zlm`.`sbtest1`
### SET
###   @1=5020 /* INT meta=0 nullable=0 is_null=0 */
###   @2=5013 /* INT meta=0 nullable=0 is_null=0 */
###   @3='79652507036-05590009094-10370692577-33401396318-81508361252-10613546461-82822929332-17272183925-71915791860-00345159222' /* STRING(360) meta=61032 nullable=0 is_null=0 */
###   @4='25450417435-19336936168-49193845527-09907338597-56878802246' /* STRING(180) meta=65204 nullable=0 is_null=0 */
# at 78770509
#201226  1:36:31 server id 623332  end_log_pos 78770540 CRC32 0x1adbcf14        Xid = 4003993
COMMIT/*!*/;
# at 78770540
#201226  1:36:31 server id 623332  end_log_pos 78770605 CRC32 0x52d9646f        GTID    last_committed=115070   sequence_number=115071  rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= '1d7ef0f4-4593-11eb-9f04-02000aba3c3e:941655'/*!*/;
# at 78770605
#201226  1:36:31 server id 623332  end_log_pos 78770676 CRC32 0x7ed98e8d        Query   thread_id=85    exec_time=0     error_code=0
SET TIMESTAMP=1608917791/*!*/;
BEGIN
/*!*/;
# at 78770676
#201226  1:36:31 server id 623332  end_log_pos 78770738 CRC32 0x28584bc5        Rows_query
# UPDATE sbtest1 SET k=k+1 WHERE id=6248

-- 生成前滾SQL(從mysql-bin.000013的78769827開始)
01:55 PM dmp1 (master) ~# ./my2sql -user zlm -password zlm -host 10.186.60.62 -port 3332 -work-type 2sql -start-file mysql-bin.000013 -start-pos78769827 --add-extraInfo -output-dir /tmp/my2sql_test
[2020/12/26 01:54:51] [info] binlogsyncer.go:144 create BinlogSyncer with config {1113306 mysql 10.186.60.62 3332 zlm   utf8 false false <nil> false Local false 0 0s 0s 0 false false 0}
[2020/12/26 01:54:51] [info] binlogsyncer.go:360 begin to sync binlog from position (mysql-bin.000013, 78769827)
[2020/12/26 01:54:51] [info] events.go:208 start thread to write redo/rollback sql into file
[2020/12/26 01:54:51] [info] events.go:58 start thread 1 to generate redo/rollback sql
[2020/12/26 01:54:51] [info] events.go:58 start thread 2 to generate redo/rollback sql
[2020/12/26 01:54:51] [info] stats_process.go:166 start thread to analyze statistics from binlog
[2020/12/26 01:54:51] [info] repl.go:15 start to get binlog from mysql
[2020/12/26 01:54:51] [info] binlogsyncer.go:777 rotate to (mysql-bin.000013, 78769827)
[2020/12/26 01:54:51] [info] events.go:242 finish processing mysql-bin.000013 78770509
[2020/12/26 01:54:52] [info] events.go:242 finish processing mysql-bin.000013 89256973
[2020/12/26 01:54:53] [info] events.go:242 finish processing mysql-bin.000013 99742835
[2020/12/26 01:54:54] [info] events.go:242 finish processing mysql-bin.000013 110229245
[2020/12/26 01:54:55] [info] events.go:242 finish processing mysql-bin.000013 120715455
[2020/12/26 01:54:56] [info] events.go:242 finish processing mysql-bin.000013 131201391
[2020/12/26 01:54:57] [info] events.go:242 finish processing mysql-bin.000013 141687468
[2020/12/26 01:55:03] [info] repl.go:83 deadline exceeded.
[2020/12/26 01:55:03] [info] repl.go:17 finish getting binlog from mysql
[2020/12/26 01:55:03] [info] stats_process.go:266 exit thread to analyze statistics from binlog
[2020/12/26 01:55:03] [info] events.go:183 exit thread 1 to generate redo/rollback sql
[2020/12/26 01:55:03] [info] events.go:183 exit thread 2 to generate redo/rollback sql
[2020/12/26 01:55:03] [info] events.go:272 finish writing redo/forward sql into file
[2020/12/26 01:55:03] [info] events.go:275 exit thread to write redo/rollback sql into file
​
-- 檢查生成的新檔案
01:54 AM dmp1 /tmp/my2sql_test# ll -lrt
total 25732
-rw-r--r-- 1 root root      107 Dec 26 01:54 biglong_trx.txt
-rw-r--r-- 1 root root 26339348 Dec 26 01:55 forward.13.sql
-rw-r--r-- 1 root root      432 Dec 26 01:55 binlog_status.txt

01:56 AM dmp1 /tmp/my2sql_test# cat binlog_status.txt 
binlog            starttime           stoptime            startpos   stoppos    inserts  updates  deletes  database        table               
mysql-bin.000013  2020-12-26_01:36:31 2020-12-26_01:37:00 78770228   143726338  23713    47466    23715    zlm             sbtest1             
mysql-bin.000013  2020-12-26_01:37:01 2020-12-26_01:37:04 143726770  151283718  2755     5530     2753     zlm             sbtest1             
01:57 AM dmp1 /tmp/my2sql_test# cat biglong_trx.txt 
binlog            starttime           stoptime            startpos   stoppos    rows     duration   tables

-- 計算事務個數
01:57 AM dmp1 /tmp/my2sql_test# mysql -Ne "select 23713+47466+23715+2755+5530+2753;"
+--------+
| 105932 |
+--------+
02:01 AM dmp1 /tmp/my2sql_test# cat forward.13.sql |grep INSERT|wc -l
26468
02:03 AM dmp1 /tmp/my2sql_test# cat forward.13.sql |grep UPDATE|wc -l
52996
02:03 AM dmp1 /tmp/my2sql_test# cat forward.13.sql |grep DELETE|wc -l
26468
02:04 AM dmp1 /tmp/my2sql_test# mysql32 -Ne "select 26468+52996+26468;"
mysql: [Warning] Using a password on the command line interface can be insecure.
+--------+
| 105932 |
+--------+
01:56 AM dmp1 /tmp/my2sql_test# scp forward.13.sql 10.186.60.68:/tmp
forward.13.sql                                                                                                                                               100%   25MB 140.9MB/s   00:00

## 前滾SQL檔案有26M,拷貝到新主(60.68)
## 檢視生成的binlog_status.txt檔案,會統計每個時間段(POS)區間內相關庫表所產生的DML次數
## biglong_trx.txt檔案記錄的是大/長事務,此檔案為空,說明沒有大/長事務
## 由檔案中記錄的DML總執行次數可知,確實是執行了105932 個事務,與之前估算的一致

-- 將差異資料匯入新主
02:05 AM dmp2 (master) ~# mysql32 < /tmp/forward.13.sql 
mysql: [Warning] Using a password on the command line interface can be insecure.
02:05 AM dmp2 (master) ~#
​
-- 檢視從庫的事務寫入情況
mysql> show slave status\G
... 略 
           Retrieved_Gtid_Set: 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:58752-941653
            Executed_Gtid_Set: 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:1-941653,
3f4e72ab-46af-11eb-9bac-02000aba3c44:1-105932    //新主寫入了105932個事務,與之前統計的值一致
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

-- 校驗舊主和新主的測試表
## 舊主
mysql> checksum table sbtest1;
+-------------+-----------+
| Table       | Checksum  |
+-------------+-----------+
| zlm.sbtest1 | 670442058 |
+-------------+-----------+
1 row in set (0.01 sec)

## 新主
mysql> checksum table zlm.sbtest1;
+-------------+-----------+
| Table       | Checksum  |
+-------------+-----------+
| zlm.sbtest1 | 670442058 |
+-------------+-----------+
1 row in set (0.02 sec)
​
## 由於本次測試中僅僅是對sbtest1表執行了DML操作,可以認為新主缺失的資料已經得到補償
## 在真實環境中,補償資料可能會比較麻煩一些,因為會涉及到很多庫表的操作,但原理是一樣的

場景 3:線上事務分析

3.1 準備

-- 建立一張500w行的大表並持續寫入
02:38 AM dmp1 /usr/local/sysbench/share/sysbench# sysbench /usr/local/sysbench/share/sysbench/oltp_read_write.lua --db-driver=mysql --tables=1 --table_size=5000000 --mysql-host=10.186.60.68 --mysql-port=3332 --mysql-db=zlm --mysql-user=zlm --mysql-password=zlm --report-interval=2 --threads=10 --time=600 --skip-trx=on --mysql-ignore-errors=1062,1213 --db-ps-mode=disable prepare
sysbench 1.0.17 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
... 略

-- 檢視主從延遲
## 主庫正在執行的事務
mysql> show master status;
+------------------+-----------+--------------+------------------+-----------------------------------------------------------------------------------------------+
| File             | Position  | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                                                                             |
+------------------+-----------+--------------+------------------+-----------------------------------------------------------------------------------------------+
| mysql-bin.000018 | 215205384 |              |                  | 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:1-941653,
3f4e72ab-46af-11eb-9bac-02000aba3c44:1-1335429 |
+------------------+-----------+--------------+------------------+-----------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

## 從庫回放狀態
11:22 AM dmp1 (master) ~# mysql32 -e "show slave status\G"|grep Running |grep -v "Running_State"|awk '{print $2}' && mysql32 -e "show slave status\G"|grep Behind |awk '{print $2}'
mysql: [Warning] Using a password on the command line interface can be insecure.
Yes
Yes
mysql: [Warning] Using a password on the command line interface can be insecure.
0

## 此時並沒有發現SQL執行緒有延遲(Senconds_Behind_Master=0)

-- 用my2sql工具分析mysql-bin.000018這個binlog
11:30 AM dmp1 (master) ~# ./my2sql -user zlm -password zlm -host 10.186.60.68 -port 3332 -work-type stats -start-file mysql-bin.000018 -start-pos 194 -output-dir /tmp/my2sql_test
[2020/12/28 11:30:35] [info] binlogsyncer.go:144 create BinlogSyncer with config {1113306 mysql 10.186.60.68 3332 zlm   utf8 false false <nil> false Local false 0 0s 0s 0 false false 0}
[2020/12/28 11:30:35] [info] binlogsyncer.go:360 begin to sync binlog from position (mysql-bin.000018, 194)
[2020/12/28 11:30:35] [info] stats_process.go:166 start thread to analyze statistics from binlog
[2020/12/28 11:30:35] [info] repl.go:15 start to get binlog from mysql
[2020/12/28 11:30:35] [info] binlogsyncer.go:777 rotate to (mysql-bin.000018, 194)
[2020/12/28 11:30:44] [info] repl.go:83 deadline exceeded.
[2020/12/28 11:30:44] [info] repl.go:17 finish getting binlog from mysql
[2020/12/28 11:30:44] [info] stats_process.go:266 exit thread to analyze statistics from binlog

-- 檢視生成檔案的內容
11:32 AM dmp1 /tmp/my2sql_test# cat binlog_status.txt 
binlog            starttime           stoptime            startpos   stoppos    inserts  updates  deletes  database        table               
mysql-bin.000018  2020-12-28_11:24:43 2020-12-28_11:25:12 390        40472959   14704    29399    14703    zlm             sbtest1             
mysql-bin.000018  2020-12-28_11:25:13 2020-12-28_11:25:42 40473397   83094395   15482    30964    15481    zlm             sbtest1             
mysql-bin.000018  2020-12-28_11:25:43 2020-12-28_11:26:12 83094627   124446683  15020    30043    15020    zlm             sbtest1             
mysql-bin.000018  2020-12-28_11:26:13 2020-12-28_11:26:42 124446910  165109718  14771    29540    14771    zlm             sbtest1             
mysql-bin.000018  2020-12-28_11:26:43 2020-12-28_11:27:12 165110068  205873372  14806    29615    14807    zlm             sbtest1             
mysql-bin.000018  2020-12-28_11:27:13 2020-12-28_11:27:19 205873604  215205353  3391     6778     3390     zlm             sbtest1             
11:32 AM dmp1 /tmp/my2sql_test# cat biglong_trx.txt 
binlog            starttime           stoptime            startpos   stoppos    rows     duration   tables

## 在18的binlog中寫入了大量的事務,從11:24:43(390)開始,到11:27:19(215205353),由於沒有指定停止的位置,列印的內容就截止到執行my2sql的時間點

3.2 大事務分析

-- 將主庫上測試表的主鍵刪除
mysql> show create table sbtest1\G
*************************** 1. row ***************************
       Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `k` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=5000001 DEFAULT CHARSET=utf8
1 row in set (0.01 sec)

mysql> alter table sbtest1 modify id int not null,drop primary key;
Query OK, 5000000 rows affected (1 min 33.31 sec)
Records: 5000000  Duplicates: 0  Warnings: 0

mysql> show create table sbtest1\G
*************************** 1. row ***************************
       Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
  `id` int(11) NOT NULL,
  `k` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  KEY `k_1` (`k`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

-- 主庫執行刪除400萬行記錄
mysql> select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
|  5000000 |
+----------+
1 row in set (3.05 sec)

mysql> delete from sbtest1 where id<4000001;
Query OK, 4000000 rows affected (3 min 7.85 sec)

-- 確定主庫當前事務執行到的點位
mysql> show master status;select @@server_uuid;
+------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                                                                             |
+------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------------------+
| mysql-bin.000019 |      194 |              |                  | 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:1-941653,
3f4e72ab-46af-11eb-9bac-02000aba3c44:1-1398898 |
+------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

+--------------------------------------+
| @@server_uuid                        |
+--------------------------------------+
| 3f4e72ab-46af-11eb-9bac-02000aba3c44 |
+--------------------------------------+
1 row in set (0.00 sec)

## 很明顯,這是一個大事務,主庫執行查詢全表記錄用了3s,執行刪除400w行記錄用了3.785s
## 因為要模擬主從延遲,先把測試表的主鍵刪除,再進行資料刪除

-- 檢視主從延遲
12:57 PM dmp1 (master) ~# mysql32 -e "show slave status\G"|grep Running |grep -v "Running_State"|awk '{print $2}' && mysql32 -e "show slave status\G"|grep Behind |awk '{print $2}'
mysql: [Warning] Using a password on the command line interface can be insecure.
Yes
Yes
mysql: [Warning] Using a password on the command line interface can be insecure.
281
... 略
12:58 PM dmp1 (master) ~# mysql32 -e "show slave status\G"|grep Running |grep -v "Running_State"|awk '{print $2}' && mysql32 -e "show slave status\G"|grep Behind |awk '{print $2}'
mysql: [Warning] Using a password on the command line interface can be insecure.
Yes
Yes
mysql: [Warning] Using a password on the command line interface can be insecure.
441

## 現在已經開始出現延遲,並且Seconds_Behind_Master的值會越來越大(當主庫上的表沒有主鍵或唯一鍵時,從庫回放時需要全表掃描來定位每一行記錄,記錄越多,這個過程越慢,最終導致主從延遲)

-- 執行my2sql,指定工作模式為事務分析(指定10000為大事務閾值)
01:10 PM dmp1 (master) ~# ./my2sql -user zlm -password zlm -host 10.186.60.68 -port 3332 -work-type stats -start-file mysql-bin.000018 -big-trx-row-limit 10000 -output-dir /tmp/my2sql_test
[2020/12/28 13:11:18] [info] stats_process.go:166 start thread to analyze statistics from binlog
[2020/12/28 13:11:18] [info] binlogsyncer.go:144 create BinlogSyncer with config {1113306 mysql 10.186.60.68 3332 zlm   utf8 false false <nil> false Local false 0 0s 0s 0 false false 0}
[2020/12/28 13:11:18] [info] binlogsyncer.go:360 begin to sync binlog from position (mysql-bin.000018, 4)
[2020/12/28 13:11:18] [info] repl.go:15 start to get binlog from mysql
[2020/12/28 13:11:18] [info] binlogsyncer.go:777 rotate to (mysql-bin.000018, 4)
[2020/12/28 13:11:27] [info] binlogsyncer.go:777 rotate to (mysql-bin.000019, 4)
[2020/12/28 13:11:27] [info] binlogsyncer.go:777 rotate to (mysql-bin.000019, 4)
[2020/12/28 13:11:32] [info] repl.go:83 deadline exceeded.
[2020/12/28 13:11:32] [info] repl.go:17 finish getting binlog from mysql
[2020/12/28 13:11:32] [info] stats_process.go:266 exit thread to analyze statistics from binlog

-- 檢視生成的檔案內容
01:13 PM dmp1 /tmp/my2sql_test# ll
total 8
-rw-r--r-- 1 root root  260 Dec 28 13:11 biglong_trx.txt
-rw-r--r-- 1 root root 1440 Dec 28 13:11 binlog_status.txt
01:13 PM dmp1 /tmp/my2sql_test# cat binlog_status.txt 
binlog            starttime           stoptime            startpos   stoppos    inserts  updates  deletes  database        table               
mysql-bin.000018  2020-12-28_11:24:43 2020-12-28_11:25:12 390        40472959   14704    29399    14703    zlm             sbtest1             
mysql-bin.000018  2020-12-28_11:25:13 2020-12-28_11:25:42 40473397   83094395   15482    30964    15481    zlm             sbtest1             
mysql-bin.000018  2020-12-28_11:25:43 2020-12-28_11:26:12 83094627   124446683  15020    30043    15020    zlm             sbtest1             
mysql-bin.000018  2020-12-28_11:26:13 2020-12-28_11:26:42 124446910  165109718  14771    29540    14771    zlm             sbtest1             
mysql-bin.000018  2020-12-28_11:26:43 2020-12-28_11:27:12 165110068  205873372  14806    29615    14807    zlm             sbtest1             
mysql-bin.000018  2020-12-28_11:27:13 2020-12-28_11:27:19 205873604  215205353  3391     6778     3390     zlm             sbtest1             
mysql-bin.000018  2020-12-28_12:37:02 2020-12-28_12:37:31 215205990  249240812  30410    11715    5787     sbtest          sbtest1             
mysql-bin.000018  2020-12-28_12:37:32 2020-12-28_12:37:40 249241253  260168297  7263     5533     2758     sbtest          sbtest1             
mysql-bin.000018  2020-12-28_12:53:51 2020-12-28_12:53:51 260168687  1023424583 0        0        4000000  zlm             sbtest1             
01:13 PM dmp1 /tmp/my2sql_test# cat biglong_trx.txt 
binlog            starttime           stoptime            startpos   stoppos    rows     duration   tables
mysql-bin.000018  2020-12-28_12:53:51 2020-12-28_12:53:51 260168556  1023424614 4000000  0          [zlm.sbtest1(inserts=0, updates=0, deletes=4000000)]

## 由於主庫仍然在寫mysql-bin.000018這個binlog檔案,還未切換,我們繼續以它為起始位置分析
## 執行命令的時候用了-big-trx-row-limit 10000來指定超過1w行的DML為大事務;如果不指定該引數,預設超過500行就會被統計
## 由生成的結果檔案得知,在12:53:51的時候,在zml.sbtest1表上有一個deletes=4000000的大事務操作,與之前的操作相對應

3.3 長事務分析

-- 建立一個測試表,顯式開啟10s事務
mysql> show master status;
+------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                                                                             |
+------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------------------+
| mysql-bin.000019 |      380 |              |                  | 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:1-941653,
3f4e72ab-46af-11eb-9bac-02000aba3c44:1-1398899 |
+------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> create table t1(id int);
Query OK, 0 rows affected (0.01 sec)

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into t1(id) values(1);select sleep(10);commit;
Query OK, 1 row affected (0.00 sec)

+-----------+
| sleep(10) |
+-----------+
|         0 |
+-----------+
1 row in set (10.00 sec)

Query OK, 0 rows affected (0.00 sec)

mysql> select count(*) from t1;
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.00 sec)

-- 執行my2sql,工作模式選擇事務分析(指定5s為長事務閾值)
02:03 PM dmp1 (master) ~# ./my2sql -user zlm -password zlm -host 10.186.60.68 -port 3332 -work-type stats -start-file mysql-bin.000019 -long-trx-seconds 5 -output-dir /tmp/my2sql_test
[2020/12/28 14:10:01] [info] binlogsyncer.go:144 create BinlogSyncer with config {1113306 mysql 10.186.60.68 3332 zlm   utf8 false false <nil> false Local false 0 0s 0s 0 false false 0}
[2020/12/28 14:10:01] [info] binlogsyncer.go:360 begin to sync binlog from position (mysql-bin.000019, 4)
[2020/12/28 14:10:01] [info] stats_process.go:166 start thread to analyze statistics from binlog
[2020/12/28 14:10:01] [info] repl.go:15 start to get binlog from mysql
[2020/12/28 14:10:01] [info] binlogsyncer.go:777 rotate to (mysql-bin.000019, 4)
[2020/12/28 14:10:06] [info] repl.go:83 deadline exceeded.
[2020/12/28 14:10:06] [info] repl.go:17 finish getting binlog from mysql
[2020/12/28 14:10:06] [info] stats_process.go:266 exit thread to analyze statistics from binlog

-- 檢視生成的檔案內容
02:10 PM dmp1 /tmp/my2sql_test# ll
total 8
-rw-r--r-- 1 root root 249 Dec 28 14:10 biglong_trx.txt
-rw-r--r-- 1 root root 288 Dec 28 14:10 binlog_status.txt
02:10 PM dmp1 /tmp/my2sql_test# cat binlog_status.txt 
binlog            starttime           stoptime            startpos   stoppos    inserts  updates  deletes  database        table               
mysql-bin.000019  2020-12-28_14:09:02 2020-12-28_14:09:02 728        812        1        0        0        zlm             t1                  
02:10 PM dmp1 /tmp/my2sql_test# cat biglong_trx.txt 
binlog            starttime           stoptime            startpos   stoppos    rows     duration   tables
mysql-bin.000019  2020-12-28_14:09:02 2020-12-28_14:09:12 605        843        1        10         [zlm.t1(inserts=1, updates=0, deletes=0)]

## 首先,確定事務開始的binlog位置為mysql-bin.000019,從這個檔案開始解析
## 用引數-long-trx-seconds指定長事務的閾值為10s,只要超過這個值的事務就會被統計;如果不指定該引數,預設執行超過5min的事務會被統計,建議實際使用的時候指定這個引數,並設定較小的閾值進行分析
## 生成的統計檔案中記錄了剛才執行的10s事務(即zlm.t1表上的insert操作)

限制

  • my2sql 是模擬一個從庫去線上獲取主庫 binlog,然後進行解析,因此執行操作的資料庫使用者需要具有 SELECT,REPLICATION SALVE,REPLICATION CLIENT 的許可權。
  • 與 binlog2sql、MyFlash 差不多,my2sql 目前也不支援 8.0;閃回功能需要開啟 binlog_format=row,binlog_row_image=full;只能閃回 DML 操作,不支援 DDL 的閃回。
  • 無法離線解析 binlog(MyFlash 支援)。
  • 不能以 GTID 事務為單位進行解析(MyFlash 支援),具體 file+pos 點位需要先通過手工解析 binlog 後確認。
  • 閃回/前滾 SQL 中,沒有提供具體的 begin/commit 的位置,使用時無法分隔事務,需要人工判斷。
  • 使用事務分析功能時,只能給出具體的大/長事務發生時間、點位、涉及的物件和操作型別,不能給出具體的 SQL 語句,完整的語句仍然需要去 binlog 中進行檢視(需設定 binlog_rows_query_log_events=on)

總結

  • my2sql 是一款比較實用的 binlog 解析工具,除了能閃回 DML 的誤操作外,還能通過生成前滾 SQL 進行資料補償、利用事務分析功能來排查主從延遲問題、捕捉長時間不提交的事務等
  • my2sql 基於 Go 語言編寫,直接提供了 Linux 二進位制版本,對環境無依賴,使用便捷
  • my2sql 效能較好,解析 binlog 時生成閃回/前滾 SQL 的效率較高(對比 binlog2sql),作者號稱能達到 50-60 倍左右,有興趣的朋友可進行一輪效能對比測試
  • MyFlash 解析 binlog 的效率也比 binlog2sql 高,但生成的回滾檔案仍然是二進位制格式的,需要依賴 mysqlbinlog 來進行處理,同時,也無法直觀地對反解析後的回滾內容進行業務驗證,my2sql 的出現正好彌補了這兩者的不足