
Recovering Accidentally Deleted Kubernetes Data from an Etcd Backup

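An accidental deletion or a machine crash can wipe out Etcd data or leave one node's Etcd data in an inconsistent state. Don't panic: this walkthrough shows how to get the data back. Having to restore after an accidental deletion is unavoidable in real environments. The scenario here is that the Pods in two namespaces were deleted and the data for those namespaces has to be recovered.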

Back up Etcd

ETCDCTL_API=3 etcdctl snapshot save snap.db --endpoints=https://172.16.230.84:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/kubernetes/ssl/etcd.pem --key=/etc/kubernetes/ssl/etcd-key.pem
{"level":"info","ts":1628844708.147097,"caller":"snapshot/v3_snapshot.go:119","msg":"created temporary db file","path":"snap.db.part"}
{"level":"info","ts":"2021-08-13T16:51:48.169+0800","caller":"clientv3/maintenance.go:200","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":1628844708.1696196,"caller":"snapshot/v3_snapshot.go:127","msg":"fetching snapshot","endpoint":"https://172.16.230.84:2379"}
{"level":"info","ts":"2021-08-13T16:51:48.441+0800","caller":"clientv3/maintenance.go:208","msg":"completed snapshot read; closing"}
{"level":"info","ts":1628844708.467766,"caller":"snapshot/v3_snapshot.go:142","msg":"fetched snapshot","endpoint":"https://172.16.230.84:2379","size":"16 MB","took":0.320511222}
{"level":"info","ts":1628844708.46799,"caller":"snapshot/v3_snapshot.go:152","msg":"saved","path":"snap.db"}
Snapshot saved at snap.db

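A snapshot is only useful if it can be read back. As a minimal sketch, the backup step could be wrapped in a small function that names the file by date and then inspects it with etcdctl snapshot status. The endpoint and certificate paths are the ones from the command above; BACKUP_DIR, DRY_RUN, and the function names are assumptions made for illustration, not part of the original setup.

```shell
#!/bin/sh
# Sketch: take a dated Etcd snapshot and inspect it afterwards.
# ENDPOINT and CERT_DIR come from the article's backup command;
# BACKUP_DIR, DRY_RUN and all function names are hypothetical.
ENDPOINT="https://172.16.230.84:2379"
CERT_DIR="/etc/kubernetes/ssl"
BACKUP_DIR="${BACKUP_DIR:-/var/backups/etcd}"

# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Build a snapshot path from a date stamp such as 20210813-165148.
snapshot_path() {
    echo "$BACKUP_DIR/snap-$1.db"
}

# backup_etcd STAMP: save a snapshot from one member, then show its status.
backup_etcd() {
    snap=$(snapshot_path "$1")
    run env ETCDCTL_API=3 etcdctl snapshot save "$snap" \
        --endpoints="$ENDPOINT" \
        --cacert="$CERT_DIR/ca.pem" \
        --cert="$CERT_DIR/etcd.pem" \
        --key="$CERT_DIR/etcd-key.pem" || return 1
    # Prints hash, revision, total keys and size of the saved file.
    run env ETCDCTL_API=3 etcdctl snapshot status "$snap" -w table
}
```

With the defaults this only prints the two commands. A real run would look like DRY_RUN=0 backup_etcd "$(date +%Y%m%d-%H%M%S)".
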
Stop the kube-apiserver service on all masters

systemctl stop kube-apiserver

Stop Etcd on all three masters

systemctl stop etcd

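Restore the backup

Directory paths differ between environments; run systemctl status etcd to see the Etcd configuration parameters. Pay particular attention to the values of name, initial-cluster, initial-cluster-token, initial-advertise-peer-urls, and data-dir. The restore writes a fresh data-dir and expects the target not to hold existing data, so move the old directory aside first if it is still there.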
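
If Etcd is started with command-line flags, those five values can be read off the command line that systemctl status etcd shows. The helper below is a small sketch of that idea. flag_value and the sample command line (including the /usr/local/bin/etcd path) are hypothetical, and a setup that uses a config file or environment file would need a different approach.

```shell
#!/bin/sh
# Sketch: pull restore-relevant flags out of an Etcd command line.
# Handles both "--flag=value" and "--flag value"; values containing
# spaces are not supported. flag_value is a hypothetical helper.

# flag_value FLAG CMDLINE: print the value of --FLAG, or nothing if absent.
flag_value() (
    set -f                      # no globbing while word-splitting CMDLINE
    want="--$1"
    prev=""
    for word in $2; do
        case "$word" in
            "$want="*) echo "${word#"$want="}"; exit 0 ;;
        esac
        if [ "$prev" = "$want" ]; then
            echo "$word"
            exit 0
        fi
        prev=$word
    done
)

# Sample command line built from the values used in this article.
cmdline='/usr/local/bin/etcd --name=etcd1 --data-dir=/var/lib/etcd --initial-cluster-token=k8s_etcd --initial-advertise-peer-urls=https://192.168.0.25:2380'
for f in name data-dir initial-cluster-token initial-advertise-peer-urls; do
    printf '%s = %s\n' "$f" "$(flag_value "$f" "$cmdline")"
done
```

Matching on the exact flag name keeps initial-cluster from being confused with initial-cluster-token.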

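  • On the first Etcd node, check ETCDCTL_API=3, the name value, the IP addresses, the path to the snapshot file, and the data-dir path. The commands below call the snapshot snapshot.db; use whatever name you saved it under (snap.db in the backup step above).
export ETCDCTL_API=3
Single-line form, whose parameters can be edited directly in the terminal: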
etcdctl snapshot restore snapshot.db --name etcd1 --initial-cluster "etcd1=https://192.168.0.25:2380,etcd2=https://192.168.0.26:2380,etcd3=https://192.168.0.28:2380" --initial-cluster-token k8s_etcd --initial-advertise-peer-urls https://192.168.0.25:2380 --data-dir=/var/lib/etcd
The same command, split across lines for readability:
etcdctl snapshot restore snapshot.db --name etcd1 \
--initial-cluster "etcd1=https://192.168.0.25:2380,etcd2=https://192.168.0.26:2380,etcd3=https://192.168.0.28:2380" \
--initial-cluster-token k8s_etcd \
--initial-advertise-peer-urls https://192.168.0.25:2380 \
--data-dir=/var/lib/etcd

2021-01-19 11:17:06.773113 I | mvcc: restore compact to 96139
2021-01-19 11:17:06.800086 I | etcdserver/membership: added member 7370b1d3dc967c [https://192.168.0.25:2380] to cluster e4d7f96e88cc9d71
2021-01-19 11:17:06.800159 I | etcdserver/membership: added member 2ef3cfc4ca48ad38 [https://192.168.0.26:2380] to cluster e4d7f96e88cc9d71
2021-01-19 11:17:06.800190 I | etcdserver/membership: added member 3a0c86c4c744477c [https://192.168.0.28:2380] to cluster e4d7f96e88cc9d71
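  • Restore the data on the second and third Etcd nodes in the same way, again adjusting ETCDCTL_API=3, the name value, the IP addresses, the path to the snapshot file, and the data-dir path.
export ETCDCTL_API=3
Single-line form, whose parameters can be edited directly in the terminal: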
etcdctl snapshot restore snapshot.db --name etcd2 --initial-cluster "etcd1=https://192.168.0.25:2380,etcd2=https://192.168.0.26:2380,etcd3=https://192.168.0.28:2380" --initial-cluster-token k8s_etcd --initial-advertise-peer-urls https://192.168.0.26:2380 --data-dir=/var/lib/etcd
The same command, split across lines for readability:
etcdctl snapshot restore snapshot.db --name etcd2 \
--initial-cluster "etcd1=https://192.168.0.25:2380,etcd2=https://192.168.0.26:2380,etcd3=https://192.168.0.28:2380" \
--initial-cluster-token k8s_etcd \
--initial-advertise-peer-urls https://192.168.0.26:2380 \
--data-dir=/var/lib/etcd

2021-01-19 11:19:59.857363 I | mvcc: restore compact to 96139
2021-01-19 11:19:59.873793 I | etcdserver/membership: added member 7370b1d3dc967c [https://192.168.0.25:2380] to cluster e4d7f96e88cc9d71
2021-01-19 11:19:59.873837 I | etcdserver/membership: added member 2ef3cfc4ca48ad38 [https://192.168.0.26:2380] to cluster e4d7f96e88cc9d71
2021-01-19 11:19:59.873852 I | etcdserver/membership: added member 3a0c86c4c744477c [https://192.168.0.28:2380] to cluster e4d7f96e88cc9d71

export ETCDCTL_API=3
Single-line form, whose parameters can be edited directly in the terminal:
etcdctl snapshot restore snapshot.db --name etcd3 --initial-cluster "etcd1=https://192.168.0.25:2380,etcd2=https://192.168.0.26:2380,etcd3=https://192.168.0.28:2380" --initial-cluster-token k8s_etcd --initial-advertise-peer-urls https://192.168.0.28:2380 --data-dir=/var/lib/etcd
The same command, split across lines for readability:
etcdctl snapshot restore snapshot.db --name etcd3 \
--initial-cluster "etcd1=https://192.168.0.25:2380,etcd2=https://192.168.0.26:2380,etcd3=https://192.168.0.28:2380" \
--initial-cluster-token k8s_etcd \
--initial-advertise-peer-urls https://192.168.0.28:2380 \
--data-dir=/var/lib/etcd

2021-01-19 11:22:21.423215 I | mvcc: restore compact to 96139
2021-01-19 11:22:21.438319 I | etcdserver/membership: added member 7370b1d3dc967c [https://192.168.0.25:2380] to cluster e4d7f96e88cc9d71
2021-01-19 11:22:21.438357 I | etcdserver/membership: added member 2ef3cfc4ca48ad38 [https://192.168.0.26:2380] to cluster e4d7f96e88cc9d71
2021-01-19 11:22:21.438371 I | etcdserver/membership: added member 3a0c86c4c744477c [https://192.168.0.28:2380] to cluster e4d7f96e88cc9d71
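
The three restore commands above differ only in --name and --initial-advertise-peer-urls, so they can be generated from the node name and IP. The following is a sketch under that assumption. The cluster values are copied from the commands above, while restore_cmd and DRY_RUN are hypothetical names.

```shell
#!/bin/sh
# Sketch: build the per-node restore command from node name and IP.
# Cluster values are copied from the article's commands;
# restore_cmd and DRY_RUN are hypothetical.
INITIAL_CLUSTER="etcd1=https://192.168.0.25:2380,etcd2=https://192.168.0.26:2380,etcd3=https://192.168.0.28:2380"
CLUSTER_TOKEN="k8s_etcd"
DATA_DIR="/var/lib/etcd"
SNAPSHOT="snapshot.db"

# restore_cmd NAME IP: print the command (DRY_RUN=1, default) or run it.
restore_cmd() {
    # Replace the positional parameters with the full command line.
    set -- env ETCDCTL_API=3 etcdctl snapshot restore "$SNAPSHOT" \
        --name "$1" \
        --initial-cluster "$INITIAL_CLUSTER" \
        --initial-cluster-token "$CLUSTER_TOKEN" \
        --initial-advertise-peer-urls "https://$2:2380" \
        --data-dir="$DATA_DIR"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Each node runs only its own line, for example on 192.168.0.26:
restore_cmd etcd2 192.168.0.26
```

Keeping the shared values in one place makes it harder for the three nodes to end up with different initial-cluster or token settings.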
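  • Start Etcd on all three nodes (systemctl start etcd on each), then check cluster health.
Single-line form, whose parameters can be edited directly in the terminal: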
etcdctl --cacert=/etc/ssl/etcd/ssl/ca.pem --cert=/etc/ssl/etcd/ssl/node-node3.pem --key=/etc/ssl/etcd/ssl/node-node3-key.pem --endpoints=https://192.168.0.25:2379,https://192.168.0.26:2379,https://192.168.0.28:2379 endpoint health
The same command, split across lines for readability:
etcdctl --cacert=/etc/ssl/etcd/ssl/ca.pem \
--cert=/etc/ssl/etcd/ssl/node-node3.pem \
--key=/etc/ssl/etcd/ssl/node-node3-key.pem \
--endpoints=https://192.168.0.25:2379,https://192.168.0.26:2379,https://192.168.0.28:2379 \
endpoint health

https://192.168.0.28:2379 is healthy: successfully committed proposal: took = 11.664519ms
https://192.168.0.26:2379 is healthy: successfully committed proposal: took = 5.04665ms
https://192.168.0.25:2379 is healthy: successfully committed proposal: took = 1.837265ms
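
Right after a start, the members may need a few seconds before they report healthy, so the check is often worth repeating. Below is a rough sketch of a retry wrapper around the health command shown above. retry and check_health are hypothetical helper names, and the attempt count and delay are arbitrary choices.

```shell
#!/bin/sh
# Sketch: repeat a check until it passes or the attempts run out.
# retry and check_health are hypothetical helper names.

# retry ATTEMPTS DELAY COMMAND...: 0 on first success, 1 if all attempts fail.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $i/$attempts failed" >&2
        i=$((i + 1))
        # Sleep only if another attempt follows.
        [ "$i" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}

# The health command from the article; it exits non-zero if an endpoint is unhealthy.
check_health() {
    ETCDCTL_API=3 etcdctl \
        --cacert=/etc/ssl/etcd/ssl/ca.pem \
        --cert=/etc/ssl/etcd/ssl/node-node3.pem \
        --key=/etc/ssl/etcd/ssl/node-node3-key.pem \
        --endpoints=https://192.168.0.25:2379,https://192.168.0.26:2379,https://192.168.0.28:2379 \
        endpoint health
}
```

A call such as retry 10 3 check_health would try up to ten times, three seconds apart, before giving up.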

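Summary

Backing up a Kubernetes cluster mostly means backing up the Etcd cluster. When restoring, the main thing to get right is the order of the steps:

Stop kube-apiserver --> stop Etcd --> restore data --> start Etcd --> start kube-apiserver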
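
The order matters across nodes as well: each phase should finish on every master before the next phase begins. One way to sketch that is a loop over phases and hosts. This assumes SSH access to the three masters and that kube-apiserver runs on the same hosts as Etcd; MASTERS, on_all, and restore_sequence are hypothetical names, and by default the script only prints what it would run.

```shell
#!/bin/sh
# Sketch of the restore order across three masters.
# Assumes SSH access and kube-apiserver on the Etcd hosts.
# MASTERS, on_all, restore_sequence and DRY_RUN are hypothetical.
MASTERS="192.168.0.25 192.168.0.26 192.168.0.28"

# on_all COMMAND...: run one command on every master before moving on.
on_all() {
    for host in $MASTERS; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh $host $*"
        else
            ssh "$host" "$@" || return 1
        fi
    done
}

# Stop at the first phase that fails.
restore_sequence() {
    on_all systemctl stop kube-apiserver &&
    on_all systemctl stop etcd &&
    echo "# run the etcdctl snapshot restore command on each node here" &&
    on_all systemctl start etcd &&
    on_all systemctl start kube-apiserver
}

restore_sequence
```

The restore step is left as a placeholder because its arguments differ per node.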

Note: when backing up an Etcd cluster, a snapshot from a single member is enough. When restoring, use that same snapshot file on every node.

Reference: https://mp.weixin.qq.com/s/4b2COdr5q4SFfJTy3wl8gA