[ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():765: Will never receive state. Need
一. 問題概述:
PXC第三個節點起不來,啟動報錯如下:
[[email protected] local]# /usr/local/mysql/support-files/mysql.server start
ERROR! MySQL (Percona XtraDB Cluster) is not running, but lock file (/var/lock/subsys/mysql) exists
Stale sst_in_progress file in datadir
Starting MySQL (Percona XtraDB Cluster)State transfer in progress, setting sleep higher
./usr/local/mysql/bin/mysqld_safe: line 218: 3559 Aborted nohup /usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/data/mysql/mysql3306/data --plugin-dir=/usr/local/mysql/lib/mysql/plugin --user=mysql --wsrep-provider=/usr/local/mysql/lib/libgalera_smm.so --log-error=/data/mysql/mysql3306/data/error.log --open-files-limit=65535 --pid-file=/usr/local/mysql/mysqlcentos03.pid --socket=/tmp/mysql.sock --port=3306 --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1 < /dev/null > /dev/null 2>&1
ERROR! The server quit without updating PID file (/usr/local/mysql/mysqlcentos03.pid).
ERROR! MySQL (Percona XtraDB Cluster) server startup failed!
[[email protected] local]#
在我看來其實看不出什麼,去看錯誤日誌
二.問題分析
2017-08-31T06:14:33.067169Z 0 [Note] WSREP: Save the discovered primary-component to disk2017-08-31T06:14:33.563179Z 0 [Note] WSREP: gcomm: connected
2017-08-31T06:14:33.563247Z 0 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2017-08-31T06:14:33.563306Z 0 [Note] WSREP: Waiting for SST/IST to complete.
2017-08-31T06:14:33.563476Z 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 3
2017-08-31T06:14:33.563495Z 0 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
2017-08-31T06:14:33.563520Z 0 [Note] WSREP: STATE EXCHANGE: sent state msg: a7e982b2-8e13-11e7-a955-2a9cccd9faad
2017-08-31T06:14:33.563526Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: a7e982b2-8e13-11e7-a955-2a9cccd9faad from 1 (mysqlcentos02)
2017-08-31T06:14:33.563531Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: a7e982b2-8e13-11e7-a955-2a9cccd9faad from 0 (mysqlcentos01)
2017-08-31T06:14:33.565017Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: a7e982b2-8e13-11e7-a955-2a9cccd9faad from 2 (mysqlcentos03)
2017-08-31T06:14:33.565044Z 0 [Note] WSREP: Quorum results:
component = PRIMARY,
conf_id = 18,
members = 2/3 (primary/total),
act_id = 2,
last_appl. = -1,
protocols = 0/7/3 (gcs/repl/appl),
group UUID = 0ce2bfc9-8e01-11e7-be90-3a29ba99b973
2017-08-31T06:14:33.565147Z 0 [Note] WSREP: Flow-control interval: [173, 173]
2017-08-31T06:14:33.565152Z 0 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 2)
2017-08-31T06:14:33.565291Z 2 [Note] WSREP: State transfer required:
Local state: 00000000-0000-0000-0000-000000000000:-1
2017-08-31T06:14:33.565306Z 2 [Note] WSREP: New cluster view: global state: 0ce2bfc9-8e01-11e7-be90-3a29ba99b973:2, view# 19: Primary, number of
nodes: 3, my index: 2, protocol version 3
2017-08-31T06:14:33.565310Z 2 [Warning] WSREP: Gap in state sequence. Need state transfer.
2017-08-31T06:14:33.565429Z 0 [Note] WSREP: Initiating SST/IST transfer on JOINER side (wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.194
.41.228' --datadir '/data/mysql/mysql3306/data/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --parent '22140' --binlog '/data/mysql/
mysql3306/logs/mysql-bin' )
2017-08-31T06:14:33.854454Z WSREP_SST: [WARNING] Found a stale sst_in_progress file: /data/mysql/mysql3306/data//sst_in_progress
2017-08-31T06:14:33.881703Z 2 [Note] WSREP: Prepared SST/IST request: xtrabackup-v2|10.194.41.228:4444/xtrabackup_sst//1 這裡是請求的伺服器資訊,也就是本機的資訊
2017-08-31T06:14:33.881727Z 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2017-08-31T06:14:33.881746Z 2 [Note] WSREP: REPL Protocols: 7 (3, 2)
2017-08-31T06:14:33.881753Z 2 [Note] WSREP: Assign initial position for certification: 2, protocol version: 3
2017-08-31T06:14:33.881793Z 0 [Note] WSREP: Service thread queue flushed.
2017-08-31T06:14:33.882165Z 2 [Note] WSREP: Check if state gap can be serviced using IST
2017-08-31T06:14:33.882180Z 2 [Note] WSREP: Local UUID: 00000000-0000-0000-0000-000000000000 != Group UUID: 0ce2bfc9-8e01-11e7-be90-3a29ba99b973
2017-08-31T06:14:33.882216Z 2 [Note] WSREP: State gap can't be serviced using IST. Switching to SST
2017-08-31T06:14:33.885746Z 0 [Note] WSREP: Member 2.0 (mysqlcentos03) requested state transfer from '*any*'. Selected 0.0 (mysqlcentos01)(SYNCED
) as donor.
2017-08-31T06:14:33.885763Z 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 2)
2017-08-31T06:14:33.885815Z 2 [Note] WSREP: Requesting state transfer: success, donor: 0
2017-08-31T06:14:33.885829Z 2 [Note] WSREP: GCache history reset: old(00000000-0000-0000-0000-000000000000:0) -> new(0ce2bfc9-8e01-11e7-be90-3a29
ba99b973:2)
2017/08/31 14:14:33 socat[22534] E bind(3, {AF=2 0.0.0.0:4444}, 16): Address already in use
2017-08-31T06:14:33.921670Z WSREP_SST: [ERROR] ******************* FATAL ERROR **********************
2017-08-31T06:14:33.922872Z WSREP_SST: [ERROR] Error while getting data from donor node: exit codes: 1 0
2017-08-31T06:14:33.924010Z WSREP_SST: [ERROR] ******************************************************
2017-08-31T06:14:33.925313Z WSREP_SST: [ERROR] Cleanup after exit with status:32
2017-08-31T06:14:33.947934Z 0 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.194.41.228' --da
tadir '/data/mysql/mysql3306/data/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --parent '22140' --binlog '/data/mysql/mysql3306/log
s/mysql-bin' : 32 (Broken pipe) 這裡注意下,這是你本節點的資訊,仔細核對有誤錯誤,比如IP,目錄等
2017-08-31T06:14:33.947974Z 0 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
2017-08-31T06:14:33.947994Z 0 [ERROR] WSREP: SST script aborted with error 32 (Broken pipe)
2017-08-31T06:14:33.948060Z 0 [ERROR] WSREP: SST failed: 32 (Broken pipe)
2017-08-31T06:14:33.948071Z 0 [ERROR] Aborting
2017-08-31T06:14:33.948076Z 0 [Note] WSREP: Signalling cancellation of the SST request.
2017-08-31T06:14:33.948179Z 0 [Note] WSREP: SST request was cancelled
2017-08-31T06:14:33.948193Z 0 [Note] Giving 2 client threads a chance to die gracefully
2017-08-31T06:14:33.948228Z 2 [Note] WSREP: Closing send monitor...
2017-08-31T06:14:33.948233Z 2 [Note] WSREP: Closed send monitor.
2017-08-31T06:14:33.948240Z 2 [Note] WSREP: gcomm: terminating thread
2017-08-31T06:14:33.948246Z 2 [Note] WSREP: gcomm: joining thread
2017-08-31T06:14:33.948406Z 2 [Note] WSREP: gcomm: closing backend
2017-08-31T06:14:33.949833Z 2 [Note] WSREP: Current view of cluster as seen by this node
view (view_id(NON_PRIM,0ce23ed2,19)
memb {
a79ba5c4,0
}
joined {
}
left {
}
partitioned {
0ce23ed2,0
8eb4e337,0
}
)
2017-08-31T06:14:33.949896Z 2 [Note] WSREP: Current view of cluster as seen by this node
view ((empty))
2017-08-31T06:14:33.950265Z 2 [Note] WSREP: gcomm: closed
2017-08-31T06:14:33.950308Z 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2017-08-31T06:14:33.950324Z 0 [Note] WSREP: Flow-control interval: [100, 100]
2017-08-31T06:14:33.950330Z 0 [Note] WSREP: Received NON-PRIMARY.
2017-08-31T06:14:33.950334Z 0 [Note] WSREP: Shifting JOINER -> OPEN (TO: 2)
2017-08-31T06:14:33.950342Z 0 [Note] WSREP: Received self-leave message.
2017-08-31T06:14:33.950348Z 0 [Note] WSREP: Flow-control interval: [0, 0]
2017-08-31T06:14:33.950352Z 0 [Note] WSREP: Received SELF-LEAVE. Closing connection.
2017-08-31T06:14:33.950356Z 0 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 2)
2017-08-31T06:14:33.950364Z 0 [Note] WSREP: RECV thread exiting 0: Success
2017-08-31T06:14:33.950737Z 2 [Note] WSREP: recv_thread() joined.
2017-08-31T06:14:33.950759Z 2 [Note] WSREP: Closing replication queue.
2017-08-31T06:14:33.950765Z 2 [Note] WSREP: Closing slave action queue.
2017-08-31T06:14:33.950798Z 2 [Note] WSREP: New cluster view: global state: 00000000-0000-0000-0000-000000000000:0, view# -1: non-Primary, number
of nodes: 0, my index: -1, protocol version -1
2017-08-31T06:14:33.950806Z 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2017-08-31T06:14:33.950822Z 2 [Note] WSREP: applier thread exiting (code:6)
2017-08-31T06:14:35.948418Z 0 [Note] WSREP: Waiting for active wsrep applier to exit
2017-08-31T06:14:35.948463Z 1 [Note] WSREP: rollbacker thread exiting
2017-08-31T06:14:35.948548Z 0 [Note] WSREP: Service disconnected.
2017-08-31T06:14:35.948555Z 0 [Note] WSREP: Waiting to close threads......
2017-08-31T06:14:40.954027Z 0 [Note] WSREP: Some threads may fail to exit.
2017-08-31T06:14:40.954079Z 0 [Note] Binlog end
2017-08-31T06:14:40.954287Z 0 [Note] /usr/local/mysql/bin/mysqld: Shutdown complete
2017-08-31T06:15:20.513818Z WSREP_SST: [ERROR] ******************* FATAL ERROR **********************
2017-08-31T06:15:20.514703Z WSREP_SST: [ERROR] Possible timeout in receving first data from donor in
/data/pxc/Percona-XtraDB-Cluster-5.7.18-rel15-29.20.1.Linux.x86_64.ssl101/bin/wsrep_sst_xtrabackup-v2: line 920: gtid/keyring stage: No such file
or directory
2017-08-31T06:15:20.516231Z WSREP_SST: [ERROR] Cleanup after exit with status:1
Log of wsrep recovery (--wsrep-recover): 這只是表象
2017-08-31T06:18:39.716440Z 0 [Warning] option 'table_definition_cache': unsigned value 40 adjusted to 400
2017-08-31T06:18:39.858858Z 0 [Note] --secure-file-priv is set to NULL. Operations related to importing and exporting data are disabled
2017-08-31T06:18:39.858897Z 0 [Note] /usr/local/mysql/bin/mysqld (mysqld 5.7.18-15-29.20-log) starting as process 24185 ...
2017-08-31T06:18:39.863914Z 0 [Note] InnoDB: PUNCH HOLE support available
2017-08-31T06:18:39.863940Z 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2017-08-31T06:18:39.863944Z 0 [Note] InnoDB: Uses event mutexes
2017-08-31T06:18:39.863948Z 0 [Note] InnoDB: GCC builtin __sync_synchronize() is used for memory barrier
2017-08-31T06:18:39.863950Z 0 [Note] InnoDB: Compressed tables use zlib 1.2.3
2017-08-31T06:18:39.863953Z 0 [Note] InnoDB: Using Linux native AIO
2017-08-31T06:18:39.863967Z 0 [Note] InnoDB: Adjusting innodb_buffer_pool_instances from 8 to 1 since innodb_buffer_pool_size is less than 1024 M
iB
2017-08-31T06:18:43.890529Z 2 [Note] WSREP: Prepared SST/IST request: xtrabackup-v2|10.194.41.228:4444/xtrabackup_sst//1
2017-08-31T06:18:43.890562Z 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2017-08-31T06:18:43.890585Z 2 [Note] WSREP: REPL Protocols: 7 (3, 2)
2017-08-31T06:18:43.890595Z 2 [Note] WSREP: Assign initial position for certification: 2, protocol version: 3
2017-08-31T06:18:43.890638Z 0 [Note] WSREP: Service thread queue flushed.
2017-08-31T06:18:43.890836Z 2 [Note] WSREP: Check if state gap can be serviced using IST
2017-08-31T06:18:43.890856Z 2 [Note] WSREP: Local UUID: 00000000-0000-0000-0000-000000000000 != Group UUID: 0ce2bfc9-8e01-11e7-be90-3a29ba99b973
2017-08-31T06:18:43.890904Z 2 [Note] WSREP: State gap can't be serviced using IST. Switching to SST
2017-08-31T06:18:43.893050Z 0 [Note] WSREP: Member 1.0 (mysqlcentos03) requested state transfer from '*any*'. Selected 0.0 (mysqlcentos01)(SYNCED
) as donor.
2017-08-31T06:18:43.893066Z 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 2)
2017-08-31T06:18:43.893226Z 2 [Note] WSREP: Requesting state transfer: success, donor: 0
2017-08-31T06:18:43.893246Z 2 [Note] WSREP: GCache history reset: old(00000000-0000-0000-0000-000000000000:0) -> new(0ce2bfc9-8e01-11e7-be90-3a29
ba99b973:2)
2017-08-31T06:18:44.275679Z 0 [Warning] WSREP: 0.0 (mysqlcentos01): State transfer to 1.0 (mysqlcentos03) failed: -32 (Broken pipe)
2017-08-31T06:18:44.275702Z 0 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():765: Will never receive state. Need to abort. 這裡報錯顯示無法加入
2017-08-31T06:18:44.275715Z 0 [Note] WSREP: gcomm: terminating thread
2017-08-31T06:18:44.275722Z 0 [Note] WSREP: gcomm: joining thread
2017-08-31T06:18:44.275929Z 0 [Note] WSREP: gcomm: closing backend
2017-08-31T06:18:44.278216Z 0 [Note] WSREP: Current view of cluster as seen by this node
view (view_id(NON_PRIM,0ce23ed2,21)
memb {
3cc5181e,0
}
joined {
}
left {
}
partitioned {
0ce23ed2,0
8eb4e337,0
}
)
2017-08-31T06:18:44.278263Z 0 [Note] WSREP: Current view of cluster as seen by this node
view ((empty))
2017-08-31T06:18:44.278442Z 0 [Note] WSREP: gcomm: closed
2017-08-31T06:18:44.278459Z 0 [Note] WSREP: /usr/local/mysql/bin/mysqld: Terminated.
2017-08-31T06:20:23.709892Z WSREP_SST: [ERROR] ******************* FATAL ERROR **********************
2017-08-31T06:20:23.710711Z WSREP_SST: [ERROR] Possible timeout in receving first data from donor in
/data/pxc/Percona-XtraDB-Cluster-5.7.18-rel15-29.20.1.Linux.x86_64.ssl101/bin/wsrep_sst_xtrabackup-v2: line 920: gtid/keyring stage: No such file
or directory
2017-08-31T06:20:23.712208Z WSREP_SST: [ERROR] Cleanup after exit with status:1