Incident: a Duplicate entry error broke MySQL master-slave replication

mysql> show slave status\G 
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 1.1.1.1
                  Master_User: repl_user
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000036
          Read_Master_Log_Pos: 1072421316
               Relay_Log_File: xx_DB_0_127-relay-bin.000107
                Relay_Log_Pos: 733952079
        Relay_Master_Log_File: mysql-bin.000036
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: xx
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1062
                   Last_Error: Error 'Duplicate entry '45489-1' for key 'pk_tbl_UserDeviceProfile'' on query. Default database: 'xx'. Query: 'insert tbl_GroupProfileVer(Uin,type,Ver,CreateDate,UpdateDate) value(45489,1,1,1386571046,1386571046)'
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 733951933
              Relay_Log_Space: 1072421668
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 1062
               Last_SQL_Error: Error 'Duplicate entry '45489-1' for key 'pk_tbl_UserDeviceProfile'' on query. Default database: 'xx'. Query: 'insert tbl_GroupProfileVer(Uin,type,Ver,CreateDate,UpdateDate) value(45489,1,1,1386571046,1386571046)'
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1
1 row in set (0.00 sec)

The business ran into trouble again today, and I am worn out...

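The root cause is in the application logic: when a lookup finds no row, the code inserts one directly. That is fine when it runs against the master. But if the same insert is ever executed directly on the slave, the row already exists there when the master's insert arrives through replication, and the slave's SQL thread stops with a duplicate-key error (1062), as in the output above.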

Recovery: delete the duplicate row on the slave, restart the slave, then sit back and wait for Seconds_Behind_Master to drop to 0.

mysql> delete from tbl_GroupProfileVer where Uin=45489 and type=1;
mysql> stop slave;
mysql> start slave;

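The incident also exposed the fact that our MySQL replication status is not monitored. I had long meant to set that up, and it never got past the "meant to" stage... Murphy's law confirmed once again.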

So: write a monitoring script and drop it into crontab (a few minutes of work, and I still put it off until something broke).

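#!/bin/bash
# Collect the values of Slave_IO_Running and Slave_SQL_Running, in that order.
# The pattern ends with ":" so other fields containing "Running"
# (such as Slave_SQL_Running_State in newer MySQL versions) are not picked up.
array=($(mysql -uroot -e "show slave status\G" | grep -E "Slave_(IO|SQL)_Running:" | awk '{print $2}'))
# Replication is healthy only when both the IO thread and the SQL thread are running.
if [ "${array[0]}" == "Yes" ] && [ "${array[1]}" == "Yes" ]
then
        echo "slave is OK"
else
        echo "MySQL Slave Error!!!"
fi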
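
The script above only prints a message, so cron has nothing to act on. As a rough sketch of one way to extend it (not what we actually run), the check can be moved into a function that reads the `show slave status\G` text from stdin, reports the error number or the lag, and returns a non-zero exit status on failure. The function name `check_slave_status` and the output format are my own assumptions for illustration.

```shell
#!/bin/bash
# check_slave_status: hypothetical helper (not part of MySQL).
# Reads the text of `show slave status\G` from stdin.
# Prints one summary line; exit status is 0 when both threads run, 1 otherwise.
check_slave_status() {
    awk -F': ' '
        { gsub(/^[ \t]+/, "", $1) }              # strip the left padding of the field name
        $1 == "Slave_IO_Running"      { io = $2 }
        $1 == "Slave_SQL_Running"     { sql = $2 }
        $1 == "Last_SQL_Errno"        { errno = $2 }
        $1 == "Seconds_Behind_Master" { lag = $2 }
        END {
            if (io == "Yes" && sql == "Yes") {
                printf "OK lag=%s\n", lag
                exit 0
            }
            printf "ERROR io=%s sql=%s errno=%s\n", io, sql, errno
            exit 1
        }
    '
}

# Intended use (assumes the same passwordless root login as the script above):
#   mysql -uroot -e "show slave status\G" | check_slave_status
```

Because the function reads stdin, it can be tried against saved output without touching a live server. With the status shown at the top of this post it would report the stopped SQL thread together with error 1062, and the non-zero exit status lets the calling cron job decide how to raise the alarm.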