MHA Setup and Failure Maintenance
MHA is a simple, convenient, and reliable high-availability architecture for MySQL; I will not go into a detailed introduction here. What follows is a tutorial I found online, with some modifications of my own:
Outline of the steps
(1) Environment overview
(2) Set up passwordless SSH login among the four hosts with ssh-keygen
(3) Install mha4mysql-node, mha4mysql-manager and the required Perl packages
(4) Set up master/slave replication among master, slave1 and slave2
(5) Configure the MHA file on the manager host
(6) Verify passwordless SSH with the masterha_check_ssh tool
(7) Verify MySQL replication with the masterha_check_repl tool
(8) Start the MHA manager and watch its log file
(9) Test whether failover happens automatically when the master goes down
(1) Environment overview
1. Host layout
Change the hostname on CentOS 7:
hostnamectl set-hostname master

192.168.56.121 master
192.168.56.122 slave1   # standby master
192.168.56.123 slave2
192.168.56.124 manager
Add the IPs and hostnames to /etc/hosts on every machine.
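With the host layout above, the /etc/hosts fragment on all four machines would look like this:

```
192.168.56.121 master
192.168.56.122 slave1
192.168.56.123 slave2
192.168.56.124 manager
```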
Allow the MySQL port through the firewall on each host:
iptables -I INPUT -s 0/0 -p tcp --dport 3306 -j ACCEPT
This rule accepts TCP traffic from any source (0/0) to destination port 3306 on the INPUT chain. Verify it:
iptables -L -n | grep 3306
ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:3306
(2) Set up passwordless SSH login among the four hosts with ssh-keygen
1. Generate the keys
[master,slave1,slave2,manager]
ssh-keygen -t rsa
[slave1,slave2,manager]
scp .ssh/id_rsa.pub master:/root/.ssh/slave1.pub
scp .ssh/id_rsa.pub master:/root/.ssh/slave2.pub
scp .ssh/id_rsa.pub master:/root/.ssh/manager.pub
2. On the master, append the public keys to /root/.ssh/authorized_keys with cat xxx >> authorized_keys, then distribute the file
[master]
cat ~/.ssh/*.pub >> ~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys slave1:/root/.ssh/authorized_keys
scp ~/.ssh/authorized_keys slave2:/root/.ssh/authorized_keys
scp ~/.ssh/authorized_keys manager:/root/.ssh/authorized_keys
(3) Install the mha4mysql-node and mha4mysql-manager packages
1. Install mha4mysql-node
[manager,master,slave1,slave2]
yum -y install perl-DBD-MySQL
yum -y install perl-Config-Tiny
yum -y install perl-Log-Dispatch
yum -y install perl-Parallel-ForkManager
rpm -ivh mha4mysql-node-0.55-0.el6.noarch.rpm
2. Install mha4mysql-manager
[manager]
yum -y install perl
yum -y install cpan
rpm -ivh mha4mysql-manager-0.55-0.el6.noarch.rpm
If any other dependency is missing, just yum install it.
(4) Set up master/slave replication among master, slave1 and slave2
(omitted)
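Since the post skips this step, here is a minimal sketch of a classic binlog-based setup consistent with the MHA config used later (repl_user=rep, repl_password=repl). It assumes each server has a unique server-id and log-bin enabled in my.cnf; the binlog file and position are placeholders you would read from SHOW MASTER STATUS on the master, and the statements themselves are my reconstruction, not the original author's.

```sql
-- On the master (and slave1, the candidate master): create the replication user.
-- User/password follow repl_user/repl_password in app1.cnf.
GRANT REPLICATION SLAVE ON *.* TO 'rep'@'192.168.56.%' IDENTIFIED BY 'repl';
FLUSH PRIVILEGES;

-- On slave1 and slave2: point replication at the master.
-- File/position are placeholders from SHOW MASTER STATUS on the master.
CHANGE MASTER TO
  MASTER_HOST='192.168.56.121',
  MASTER_USER='rep',
  MASTER_PASSWORD='repl',
  MASTER_LOG_FILE='mysql-bin.000001',
  MASTER_LOG_POS=120;
START SLAVE;
```

Note that the CHANGE MASTER examples later in the post use master_user='repl'; whichever user actually exists must match repl_user in app1.cnf.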
(5) Configure the MHA file on the manager host
[manager]
1. Create the directories and the configuration file
mkdir -p /masterha/app1
mkdir /etc/masterha
vi /etc/masterha/app1.cnf

[server default]
user=root
password=root
manager_workdir=/masterha/app1
manager_log=/masterha/app1/manager.log
remote_workdir=/masterha/app1
ssh_user=root
repl_user=rep
repl_password=repl
ping_interval=1

[server1]
hostname=192.168.56.122
master_binlog_dir=/var/lib/mysql
candidate_master=1
#relay_log_purge=0

[server2]
hostname=192.168.56.121
master_binlog_dir=/var/lib/mysql
candidate_master=1

[server3]
hostname=192.168.56.123
master_binlog_dir=/var/lib/mysql
no_master=1
#relay_log_purge=0
(6) Verify passwordless SSH with the masterha_check_ssh tool
[manager]
masterha_check_ssh --conf=/etc/masterha/app1.cnf

[root@manager ~]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
Thu Feb 23 12:00:24 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Feb 23 12:00:24 2017 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
Thu Feb 23 12:00:24 2017 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Thu Feb 23 12:00:24 2017 - [info] Starting SSH connection tests..
Thu Feb 23 12:00:25 2017 - [debug]
Thu Feb 23 12:00:24 2017 - [debug]  Connecting via SSH from root@192.168.56.122(192.168.56.122:22) to root@192.168.56.121(192.168.56.121:22)..
Thu Feb 23 12:00:25 2017 - [debug]   ok.
Thu Feb 23 12:00:25 2017 - [debug]  Connecting via SSH from root@192.168.56.122(192.168.56.122:22) to root@192.168.56.123(192.168.56.123:22)..
Thu Feb 23 12:00:25 2017 - [debug]   ok.
Thu Feb 23 12:00:25 2017 - [debug]
Thu Feb 23 12:00:25 2017 - [debug]  Connecting via SSH from root@192.168.56.121(192.168.56.121:22) to root@192.168.56.122(192.168.56.122:22)..
Warning: Permanently added '192.168.56.121' (ECDSA) to the list of known hosts.
Thu Feb 23 12:00:25 2017 - [debug]   ok.
Thu Feb 23 12:00:25 2017 - [debug]  Connecting via SSH from root@192.168.56.121(192.168.56.121:22) to root@192.168.56.123(192.168.56.123:22)..
Thu Feb 23 12:00:25 2017 - [debug]   ok.
Thu Feb 23 12:00:26 2017 - [debug]
Thu Feb 23 12:00:25 2017 - [debug]  Connecting via SSH from root@192.168.56.123(192.168.56.123:22) to root@192.168.56.122(192.168.56.122:22)..
Warning: Permanently added '192.168.56.123' (ECDSA) to the list of known hosts.
Thu Feb 23 12:00:26 2017 - [debug]   ok.
Thu Feb 23 12:00:26 2017 - [debug]  Connecting via SSH from root@192.168.56.123(192.168.56.123:22) to root@192.168.56.121(192.168.56.121:22)..
Thu Feb 23 12:00:26 2017 - [debug]   ok.
Thu Feb 23 12:00:26 2017 - [info] All SSH connection tests passed successfully.
[root@manager ~]#
(7) Verify MySQL replication with the masterha_check_repl tool
[manager]
masterha_check_repl --conf=/etc/masterha/app1.cnf

[root@manager mysql]# masterha_check_repl --conf=/etc/masterha/app1.cnf
Thu Feb 23 14:37:05 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Feb 23 14:37:05 2017 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
Thu Feb 23 14:37:05 2017 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Thu Feb 23 14:37:05 2017 - [info] MHA::MasterMonitor version 0.55.
Thu Feb 23 14:37:05 2017 - [info] Dead Servers:
Thu Feb 23 14:37:05 2017 - [info] Alive Servers:
Thu Feb 23 14:37:05 2017 - [info]   master(192.168.56.121:3306)
Thu Feb 23 14:37:05 2017 - [info]   slave1(192.168.56.122:3306)
Thu Feb 23 14:37:05 2017 - [info]   slave2(192.168.56.123:3306)
Thu Feb 23 14:37:05 2017 - [info] Alive Slaves:
....... (output omitted)
Thu Feb 23 14:37:08 2017 - [info] Connecting to root@slave2(slave2:22)..
  Creating directory /masterha/app1.. done.
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /tmp, up to mysql-relay-bin.000004
    Temporary relay log file is /tmp/mysql-relay-bin.000004
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure. done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Thu Feb 23 14:37:08 2017 - [info] Slaves settings check done.
Thu Feb 23 14:37:08 2017 - [info]
master (current master)
 +--slave1
 +--slave2
Thu Feb 23 14:37:08 2017 - [info] Checking replication health on slave1..
Thu Feb 23 14:37:08 2017 - [info]  ok.
Thu Feb 23 14:37:08 2017 - [info] Checking replication health on slave2..
Thu Feb 23 14:37:08 2017 - [info]  ok.
Thu Feb 23 14:37:08 2017 - [warning] master_ip_failover_script is not defined.
Thu Feb 23 14:37:08 2017 - [warning] shutdown_script is not defined.
Thu Feb 23 14:37:08 2017 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.
(8) Start the MHA manager and watch its log file
[manager]
masterha_manager --conf=/etc/masterha/app1.cnf
tail -f /masterha/app1/manager.log
(9) Test whether failover happens automatically after the master goes down
1. Stop the MySQL service on the master
[master]
[root@master ~]# service mysql stop
Shutting down MySQL..... SUCCESS!
[root@master ~]#

[manager]
2. After the master goes down, watch /masterha/app1/manager.log:
tail -f /masterha/app1/manager.log
The log file shows:
----- Failover Report -----

app1: MySQL Master failover master to slave1 succeeded

Master master is down!

Check MHA Manager logs at manager:/masterha/app1/manager.log for details.

Started automated(non-interactive) failover.
The latest slave slave1(192.168.56.122:3306) has all relay logs for recovery.
Selected slave1 as a new master.
slave1: OK: Applying all logs succeeded.
slave2: This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
slave2: OK: Applying all logs succeeded. Slave started, replicating from slave1.
slave1: Resetting slave info succeeded.
Master failover to slave1(192.168.56.122:3306) completed successfully.
The report above shows that the master was switched over successfully.
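The failover verdict at the end of the report can also be checked mechanically. A minimal sketch (my addition, not from the original post), using a sample file standing in for /masterha/app1/manager.log:

```shell
# Hedged sketch: grep the manager log for the failover verdict.
# A sample file stands in for /masterha/app1/manager.log here.
LOG=/tmp/manager.log.sample
cat > "$LOG" <<'EOF'
Master failover to slave1(192.168.56.122:3306) completed successfully.
EOF
if grep -q 'completed successfully' "$LOG"; then
  echo "failover OK"
else
  echo "failover FAILED"
fi
```

The same grep can drive an alerting hook once pointed at the real log path from app1.cnf.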
Issues to watch during the switchover
1. The switchover automatically turns off read_only on the new master.
2. After a switchover you must manually delete /masterha/app1/app1.failover.complete before a second failover test can run.
3. Once a failover happens, the manager process exits and no further failover is possible until the failed database is brought back into the MHA setup.
4. When the original master rejoins MHA, it can only be configured as a slave. Before running

change master to master_host='192.168.56.122', master_user='repl', master_password='repl', master_log_file='mysql-bin.000010', master_log_pos=120;

you must first run reset slave.
5. There are several ways to hand over the IP address. Here MHA itself manages a virtual IP through an IP alias, which keeps the database state and the business IP consistent during a switchover: once the manager is started, the VIP is aliased onto the current master automatically. Keepalived, by contrast, can only health-check port 3306; it cannot check things like the Slave-SQL and Slave-IO threads of MySQL replication, so it can easily make a wrong switchover decision.
6. Note: second-tier (cascading) slaves must have log_slave_updates enabled.
7. A manual switchover requires the master_ip_online_change_script script to be defined first; otherwise only MySQL is switched and the IP address is not moved. The script can be adapted from the bundled template.
8. Setting no_master=1 keeps a node from ever being promoted to master.
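For note 5, the IP-alias handover inside an MHA failover script boils down to removing the VIP from the dead master and adding it on the new one. A dry-run sketch (my addition: the commands are echoed, not executed, and the VIP 192.168.56.200 and device eth0 are hypothetical, not from the original post):

```shell
# Hypothetical VIP and interface; adjust to your network.
VIP=192.168.56.200
DEV=eth0
# In a real master_ip_failover script these would run via ssh on the old/new master;
# they are echoed here as a dry run.
echo "/sbin/ip addr del $VIP/24 dev $DEV   # on the dead master (if reachable)"
echo "/sbin/ip addr add $VIP/24 dev $DEV   # on the new master"
echo "/sbin/arping -c 3 -A -I $DEV $VIP    # refresh neighbors' ARP caches"
```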
Restoring the cluster
① On the manager, delete the app1.failover.complete file
cd /masterha/app1
rm -f app1.failover.complete
② Start the MySQL service on the original master node
service mysql start
③ On the manager node, check replication; it reports an error
masterha_check_repl --conf=/etc/masterha/app1.cnf
Thu Feb 23 15:00:56 2017 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln604] There are 2 non-slave servers! MHA manages at most one non-slave server. Check configurations.
④ Check the current binlog position on slave1 (the new master)
mysql> show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000010
         Position: 120
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set:
1 row in set (0.00 sec)
⑤ Configure node 121 (the original master) as the new slave and start replication
change master to master_host='192.168.56.122', master_user='repl', master_password='repl', master_log_file='mysql-bin.000010', master_log_pos=120;
mysql> start slave;
Check replication again on the manager node; this time it succeeds:
masterha_check_repl --conf=/etc/masterha/app1.cnf
Note: after the steps above, node 121 has rejoined the cluster as a slave, but any data written to 122/123 while 121 was down is missing on 121. You therefore need to take a backup from the new master and import it on 121 first, and only then start replication.
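A dry-run sketch of that re-seed (my addition: commands echoed, not executed), using the root credentials from app1.cnf; with MySQL 5.6-era tools, --master-data=2 records the binlog coordinates in the dump as a comment for the subsequent CHANGE MASTER TO:

```shell
# Dry-run sketch: re-seed node 121 from the new master (slave1) before starting replication.
NEW_MASTER=192.168.56.122
echo "mysqldump -h $NEW_MASTER -uroot -proot --all-databases --single-transaction --master-data=2 > /tmp/full.sql"
echo "mysql -uroot -proot < /tmp/full.sql   # run on 192.168.56.121"
```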
⑥ Start MHA
nohup masterha_manager --conf=/etc/masterha/app1.cnf > /masterha/app1/mha_manager.log 2>&1 &
Switching back:
By the same logic, if everything above is configured correctly, stopping the MySQL process on the current master makes MHA switch the master straight back to the original node.
Source: http://blog.itpub.net/24742969/viewspace-2134542/