Deploying a k8s SSL cluster in practice, part 4: deploying the etcd cluster
https://github.com/opsnull/follow-me-install-kubernetes-cluster
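Thanks to the author of the guide above for sharing it so generously.
The cluster environment has been built successfully and is up and running.
This article records the errors encountered during deployment and the detailed steps taken. If you want to compare against it, read and test the parts in order.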
4.1
Download and distribute the binary package
[root@k8s-master kubernetes]# wget https://github.com/coreos/etcd/releases/download/v3.3.7/etcd-v3.3.7-linux-amd64.tar.gz
[root@k8s-master kubernetes]# ls
etcd-v3.3.7-linux-amd64.tar.gz  kubernetes  kubernetes-client-linux-amd64.tar.gz  kubernetes-src.tar.gz
[root@k8s-master kubernetes]# tar zxvf etcd-v3.3.7-linux-amd64.tar.gz
[root@k8s-master kubernetes]# ls
etcd-v3.3.7-linux-amd64
Distribute to all nodes
[root@k8s-master kubernetes]# cp etcd-v3.3.7-linux-amd64/etcd* /opt/k8s/bin
[root@k8s-master kubernetes]# scp etcd-v3.3.7-linux-amd64/etcd* root@k8s-node1:/opt/k8s/bin
etcd                                          100%   18MB  91.6MB/s   00:00
etcdctl                                       100%   15MB  96.1MB/s   00:00
[root@k8s-master kubernetes]# scp etcd-v3.3.7-linux-amd64/etcd* root@k8s-node2:/opt/k8s/bin
etcd                                          100%   18MB  92.2MB/s   00:00
etcdctl                                       100%   15MB  92.3MB/s   00:00
[root@k8s-master kubernetes]#
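With more than a couple of nodes, the repeated scp calls above can be wrapped in a loop. This is a minimal sketch rather than the method used in this article: the NODES variable and the distribute helper are names introduced here for illustration, and it assumes root can SSH to each node as in the transcript.

```shell
# Hypothetical helper: copy files to the same directory on every node.
# NODES is an assumed variable; the host names are the ones used in this article.
NODES="k8s-node1 k8s-node2"

distribute() {
    dest=$1          # target directory on the remote nodes
    shift            # remaining arguments are the local files to copy
    for node in $NODES; do
        scp "$@" "root@${node}:${dest}" || return 1
    done
}

# Usage (run from the directory that holds the unpacked release):
#   distribute /opt/k8s/bin etcd-v3.3.7-linux-amd64/etcd*
```

The helper stops at the first failed copy, so a node that is down is noticed immediately instead of being skipped silently.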
4.2
Create the etcd certificate and private key
Create the certificate signing request
[root@k8s-master etcd]# cat etcd-csr.json
{
  "CN": "etcd",
  "hosts": [
    "127.0.0.1",
    "192.168.1.92",
    "192.168.1.93",
    "192.168.1.95"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "SZ",
      "L": "SZ",
      "O": "k8s",
      "OU": "4Paradigm"
    }
  ]
}
[root@k8s-master etcd]#
The hosts field lists the etcd node IPs or domain names that are authorized to use this certificate; here the IPs of all three etcd cluster nodes are included.
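Because a node whose IP is missing from hosts will fail TLS verification later, it can be worth checking the file before signing. The sketch below is an illustration only: csr_lists_ips is a hypothetical helper name, and it does a plain string match on the quoted IP anywhere in the file rather than parsing the JSON.

```shell
# Hypothetical helper: succeed only if every given IP appears, quoted, in the CSR file.
# This is a plain string match, not real JSON parsing.
csr_lists_ips() {
    csr_file=$1
    shift
    for ip in "$@"; do
        if ! grep -qF "\"${ip}\"" "$csr_file"; then
            echo "missing from hosts: ${ip}" >&2
            return 1
        fi
    done
}

# Usage:
#   csr_lists_ips etcd-csr.json 192.168.1.92 192.168.1.93 192.168.1.95
```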
Generate the certificate and private key
[root@k8s-master etcd]# cfssl gencert -ca=/etc/kubernetes/cert/ca.pem -ca-key=/etc/kubernetes/cert/ca-key.pem -config=/etc/kubernetes/cert/ca-config.json -profile=kubernetes etcd-csr.json | cfssljson -bare etcd
[root@k8s-master etcd]# ls
etcd.csr  etcd-csr.json  etcd-key.pem  etcd.pem
[root@k8s-master etcd]#
Distribute the certificate and private key to the nodes
[root@k8s-master etcd]# cp etcd* /etc/etcd/cert/
[root@k8s-master etcd]# scp etcd* root@k8s-node1:/etc/etcd/cert/
etcd.csr                                      100% 1054     1.5MB/s   00:00
etcd-csr.json                                 100%  213   350.8KB/s   00:00
etcd-key.pem                                  100% 1679     2.5MB/s   00:00
etcd.pem                                      100% 1415     2.3MB/s   00:00
[root@k8s-master etcd]# scp etcd* root@k8s-node2:/etc/etcd/cert/
etcd.csr                                      100% 1054     1.2MB/s   00:00
etcd-csr.json                                 100%  213   296.9KB/s   00:00
etcd-key.pem                                  100% 1679     2.6MB/s   00:00
etcd.pem                                      100% 1415     2.5MB/s   00:00
[root@k8s-master etcd]#
4.3
Create the systemd unit template file for etcd
Note: the \\ sequences must be changed to a single \ (a line continuation).
[root@k8s-master etcd]# cat etcd.service.template
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos
[Service]
User=k8s
Type=notify
WorkingDirectory=/var/lib/etcd/
ExecStart=/opt/k8s/bin/etcd \
  --data-dir=/var/lib/etcd \
  --name=##NODE_NAME## \
  --cert-file=/etc/etcd/cert/etcd.pem \
  --key-file=/etc/etcd/cert/etcd-key.pem \
  --trusted-ca-file=/etc/kubernetes/cert/ca.pem \
  --peer-cert-file=/etc/etcd/cert/etcd.pem \
  --peer-key-file=/etc/etcd/cert/etcd-key.pem \
  --peer-trusted-ca-file=/etc/kubernetes/cert/ca.pem \
  --peer-client-cert-auth \
  --client-cert-auth \
  --listen-peer-urls=https://##NODE_IP##:2380 \
  --initial-advertise-peer-urls=https://##NODE_IP##:2380 \
  --listen-client-urls=https://##NODE_IP##:2379,http://127.0.0.1:2379 \
  --advertise-client-urls=https://##NODE_IP##:2379 \
  --initial-cluster-token=etcd-cluster-0 \
  --initial-cluster=${ETCD_NODES} \
  --initial-cluster-state=new
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
[root@k8s-master etcd]#
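User: run the service as the k8s account;
WorkingDirectory, --data-dir: set the working directory and the data directory to /var/lib/etcd; this directory must be created before the service is started;
--name: the node name; when --initial-cluster-state is new, the value of --name must appear in the --initial-cluster list;
--cert-file, --key-file: the certificate and private key the etcd server uses when communicating with clients;
--trusted-ca-file: the CA certificate that signed the client certificates, used to verify client certificates;
--peer-cert-file, --peer-key-file: the certificate and private key etcd uses when communicating with its peers;
--peer-trusted-ca-file: the CA certificate that signed the peer certificates, used to verify peer certificates;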
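The --initial-cluster value referenced by ${ETCD_NODES} in the template is a comma-separated list of name=peer-URL entries, and every node's --name has to appear in it. As a minimal sketch (build_initial_cluster is a hypothetical helper, not part of etcd or of the upstream guide), the string can be assembled from name=ip pairs like this:

```shell
# Hypothetical helper: turn "name=ip" pairs into the --initial-cluster value.
# Port 2380 is the peer port used throughout this article.
build_initial_cluster() {
    result=""
    for pair in "$@"; do
        name=${pair%%=*}     # text before the first "="
        ip=${pair#*=}        # text after the first "="
        # Prepend a comma only when result already holds an entry.
        result="${result:+${result},}${name}=https://${ip}:2380"
    done
    printf '%s\n' "$result"
}

# Usage:
#   ETCD_NODES=$(build_initial_cluster k8s-master=192.168.1.92 k8s-node1=192.168.1.93 k8s-node2=192.168.1.95)
```

Building the list in one place keeps the node names identical on every node, which matters because a --name that is absent from the list prevents a new cluster from bootstrapping.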
Distribute the generated systemd unit file, then replace ##NODE_NAME## and ##NODE_IP## in the copy on each node
[root@k8s-master etcd]# cp etcd.service.template /etc/systemd/system/etcd.service
[root@k8s-master etcd]# scp etcd.service.template root@k8s-node1:/etc/systemd/system/etcd.service
etcd.service.template                         100% 1038     1.1MB/s   00:00
[root@k8s-master etcd]# scp etcd.service.template root@k8s-node2:/etc/systemd/system/etcd.service
etcd.service.template                         100% 1038     1.2MB/s   00:00
[root@k8s-master etcd]#
## Edit the file on each node
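Instead of editing each copy by hand, the two placeholders can be filled in with sed before the file is copied out. This is a sketch under the assumption that node names and IPs contain no "/" characters; render_unit is a hypothetical helper name, and it does not expand ${ETCD_NODES}, which still has to be set separately.

```shell
# Hypothetical helper: fill in the ##NODE_NAME## and ##NODE_IP## placeholders.
# Writes the rendered unit to stdout; the template itself is left untouched.
render_unit() {
    template=$1
    node_name=$2
    node_ip=$3
    sed -e "s/##NODE_NAME##/${node_name}/g" \
        -e "s/##NODE_IP##/${node_ip}/g" \
        "$template"
}

# Usage:
#   render_unit etcd.service.template k8s-node1 192.168.1.93 > etcd-k8s-node1.service
```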
4.4
Start etcd
[root@k8s-master ~]# systemctl daemon-reload && systemctl enable etcd && systemctl restart etcd
Startup fails with an error
Aug 20 16:40:29 k8s-master systemd: etcd.service holdoff time over, scheduling restart.
Aug 20 16:40:29 k8s-master systemd: Starting Etcd Server...
Aug 20 16:40:29 k8s-master etcd: etcd Version: 3.3.7
Aug 20 16:40:29 k8s-master etcd: Git SHA: 56536de55
Aug 20 16:40:29 k8s-master etcd: Go Version: go1.9.6
Aug 20 16:40:29 k8s-master etcd: Go OS/Arch: linux/amd64
Aug 20 16:40:29 k8s-master etcd: setting maximum number of CPUs to 1, total number of available CPUs is 1
Aug 20 16:40:29 k8s-master etcd: peerTLS: cert = /etc/etcd/cert/etcd.pem, key = /etc/etcd/cert/etcd-key.pem, ca = , trusted-ca = /etc/kubernetes/cert/ca.pem, client-cert-auth = true, crl-file =
Aug 20 16:40:29 k8s-master etcd: open /etc/etcd/cert/etcd-key.pem: permission denied
Aug 20 16:40:29 k8s-master systemd: etcd.service: main process exited, code=exited, status=1/FAILURE
Aug 20 16:40:29 k8s-master systemd: Failed to start Etcd Server.
Aug 20 16:40:29 k8s-master systemd: Unit etcd.service entered failed state.
Aug 20 16:40:29 k8s-master systemd: etcd.service failed.
[root@k8s-master ~]#
The cause is clear: "/etc/etcd/cert/etcd-key.pem: permission denied" means etcd has no permission to read the key.
[root@k8s-master cert]# pwd
/etc/etcd/cert
[root@k8s-master cert]# ll
total 16
-rw-r--r-- 1 root root 1054 Aug 20 15:39 etcd.csr
-rw-r--r-- 1 root root  213 Aug 20 15:39 etcd-csr.json
-rw------- 1 root root 1679 Aug 20 15:39 etcd-key.pem
-rw-r--r-- 1 root root 1415 Aug 20 15:39 etcd.pem
[root@k8s-master cert]#
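etcd is started as user k8s, but the files are owned by root and etcd-key.pem is mode 600, so k8s cannot read it; the files also have no x permission. Changing the owner is what lets k8s read the key; for regular files the x bit is not required for reading.
Change the ownership and permissions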
[root@k8s-master etc]# chown -R k8s /etc/etcd/cert/
[root@k8s-master cert]# chmod +x -R /etc/etcd/cert/
[root@k8s-master cert]# ll
total 16
-rwxr-xr-x 1 k8s root 1054 Aug 20 15:39 etcd.csr
-rwxr-xr-x 1 k8s root  213 Aug 20 15:39 etcd-csr.json
-rwx--x--x 1 k8s root 1679 Aug 20 15:39 etcd-key.pem
-rwxr-xr-x 1 k8s root 1415 Aug 20 15:39 etcd.pem
The permissions on /etc/kubernetes/cert/ are wrong as well
[root@k8s-master cert]# cd /etc/kubernetes/cert/
[root@k8s-master cert]# ll
total 20
-rw-r--r-- 1 root root  292 Aug 16 16:05 ca-config.json
-rw-r--r-- 1 root root  993 Aug 16 16:05 ca.csr
-rw-r--r-- 1 root root  201 Aug 16 16:05 ca-csr.json
-rw------- 1 root root 1675 Aug 16 16:05 ca-key.pem
-rw-r--r-- 1 root root 1338 Aug 16 16:05 ca.pem
[root@k8s-master kubernetes]# chown -R k8s /etc/kubernetes/cert/
[root@k8s-master kubernetes]# chmod -R +x /etc/kubernetes/cert
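The permission-denied error arises because the k8s user cannot read the key files, so a tighter alternative to chmod +x is to change the owner and keep private keys at mode 600. The following is a sketch, not what was run in this article: tighten_cert_modes is a hypothetical helper, and it assumes private keys are the files whose names end in -key.pem, as they do here.

```shell
# Hypothetical helper: private keys readable by the owner only, everything else 644.
# Ownership still has to be changed separately as root, e.g. chown -R k8s DIR.
tighten_cert_modes() {
    dir=$1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue          # skip subdirectories and unmatched globs
        case "$f" in
            *-key.pem) chmod 600 "$f" ;; # private key: owner read/write only
            *)         chmod 644 "$f" ;; # certificates and CSRs: world-readable
        esac
    done
}

# Usage (as root, after chown -R k8s on both directories):
#   tighten_cert_modes /etc/etcd/cert
#   tighten_cert_modes /etc/kubernetes/cert
```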
The unit file that starts successfully is shown below:
[root@k8s-master cert]# cat /etc/systemd/system/etcd.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos
[Service]
User=k8s
Type=notify
WorkingDirectory=/var/lib/etcd/
ExecStart=/opt/k8s/bin/etcd \
  --data-dir=/var/lib/etcd \
  --name=k8s-master \
  --cert-file=/etc/etcd/cert/etcd.pem \
  --key-file=/etc/etcd/cert/etcd-key.pem \
  --trusted-ca-file=/etc/kubernetes/cert/ca.pem \
  --peer-cert-file=/etc/etcd/cert/etcd.pem \
  --peer-key-file=/etc/etcd/cert/etcd-key.pem \
  --peer-trusted-ca-file=/etc/kubernetes/cert/ca.pem \
  --peer-client-cert-auth \
  --client-cert-auth \
  --listen-peer-urls=https://192.168.1.92:2380 \
  --initial-advertise-peer-urls=https://192.168.1.92:2380 \
  --listen-client-urls=https://192.168.1.92:2379,http://127.0.0.1:2379 \
  --advertise-client-urls=https://192.168.1.92:2379 \
  --initial-cluster-token=etcd-cluster-0 \
  --initial-cluster=k8s-master=https://192.168.1.92:2380,k8s-node1=https://192.168.1.93:2380,k8s-node2=https://192.168.1.95:2380 \
  --initial-cluster-state=new
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
[root@k8s-master cert]#
4.5
Verify the etcd cluster
It reports errors:
[root@k8s-master ~]# etcdctl cluster-health
failed to check the health of member 64fe8a986fbba907 on https://192.168.1.95:2379: Get https://192.168.1.95:2379/health: dial tcp 192.168.1.95:2379: getsockopt: no route to host
member 64fe8a986fbba907 is unreachable: [https://192.168.1.95:2379] are all unreachable
failed to check the health of member 9eddf87b04c89943 on https://192.168.1.93:2379: Get https://192.168.1.93:2379/health: dial tcp 192.168.1.93:2379: getsockopt: no route to host
member 9eddf87b04c89943 is unreachable: [https://192.168.1.93:2379] are all unreachable
failed to check the health of member d71352a6aad35c57 on https://192.168.1.92:2379: Get https://192.168.1.92:2379/health: x509: certificate signed by unknown authority
member d71352a6aad35c57 is unreachable: [https://192.168.1.92:2379] are all unreachable
cluster is unavailable
[root@k8s-master ~]#
[root@k8s-master ~]# etcdctl member list
client: etcd cluster is unavailable or misconfigured; error #0: client: endpoint https://192.168.1.95:2379 exceeded header timeout
; error #1: client: endpoint https://192.168.1.93:2379 exceeded header timeout
; error #2: x509: certificate signed by unknown authority
[root@k8s-master ~]#
logs
[root@k8s-master ~]# cat /var/log/messages
Aug 20 18:06:36 k8s-master etcd: health check for peer 64fe8a986fbba907 could not connect: dial tcp 192.168.1.95:2380: getsockopt: no route to host
Aug 20 18:06:36 k8s-master etcd: health check for peer 9eddf87b04c89943 could not connect: dial tcp 192.168.1.93:2380: getsockopt: no route to host
Aug 20 18:06:36 k8s-master etcd: failed to reach the peerURL(https://192.168.1.95:2380) of member 64fe8a986fbba907 (Get https://192.168.1.95:2380/version: dial tcp 192.168.1.95:2380: getsockopt: no route to host)
Aug 20 18:06:36 k8s-master etcd: cannot get the version of member 64fe8a986fbba907 (Get https://192.168.1.95:2380/version: dial tcp 192.168.1.95:2380: getsockopt: no route to host)
Aug 20 18:06:36 k8s-master etcd: failed to reach the peerURL(https://192.168.1.93:2380) of member 9eddf87b04c89943 (Get https://192.168.1.93:2380/version: dial tcp 192.168.1.93:2380: getsockopt: no route to host)
Aug 20 18:06:36 k8s-master etcd: cannot get the version of member 9eddf87b04c89943 (Get https://192.168.1.93:2380/version: dial tcp 192.168.1.93:2380: getsockopt: no route to host)
Aug 20 18:06:39 k8s-master etcd: rejected connection from "192.168.1.92:50868" (error "remote error: tls: bad certificate", ServerName "")
Aug 20 18:06:40 k8s-master etcd: failed to reach the peerURL(https://192.168.1.95:2380) of member 64fe8a986fbba907 (Get https://192.168.1.95:2380/version: dial tcp 192.168.1.95:2380: getsockopt: no route to host)
Aug 20 18:06:40 k8s-master etcd: cannot get the version of member 64fe8a986fbba907 (Get https://192.168.1.95:2380/version: dial tcp 192.168.1.95:2380: getsockopt: no route to host)
Aug 20 18:06:40 k8s-master etcd: failed to reach the peerURL(https://192.168.1.93:2380) of member 9eddf87b04c89943 (Get https://192.168.1.93:2380/version: dial tcp 192.168.1.93:2380: getsockopt: no route to host)
Aug 20 18:06:40 k8s-master etcd: cannot get the version of member 9eddf87b04c89943 (Get https://192.168.1.93:2380/version: dial tcp 192.168.1.93:2380: getsockopt: no route to host)
Aug 20 18:06:41 k8s-master etcd: health check for peer 64fe8a986fbba907 could not connect: dial tcp 192.168.1.95:2380: getsockopt: no route to host
Aug 20 18:06:41 k8s-master etcd: health check for peer 9eddf87b04c89943 could not connect: dial tcp 192.168.1.93:2380: getsockopt: no route to host
Aug 20 18:06:42 k8s-master etcd: rejected connection from "192.168.1.92:50902" (error "remote error: tls: bad certificate", ServerName "")
Aug 20 18:06:44 k8s-master etcd: failed to reach the peerURL(https://192.168.1.95:2380) of member 64fe8a986fbba907 (Get https://192.168.1.95:2380/version: dial tcp 192.168.1.95:2380: getsockopt: no route to host)
Aug 20 18:06:44 k8s-master etcd: cannot get the version of member 64fe8a986fbba907 (Get https://192.168.1.95:2380/version: dial tcp 192.168.1.95:2380: getsockopt: no route to host)
[root@k8s-master ~]#
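Troubleshooting approach:
Possible causes:
A mistake in the configuration file
Certificates
Network
Firewall blocking the ports
Test them one by one.
A telnet check of ports 2379 and 2380 showed that the firewall had not been turned off.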
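The same reachability check can be scripted where telnet is not installed. This is a minimal sketch assuming bash and the coreutils timeout command are available on the node; port_open is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if a TCP connection to HOST:PORT can be opened.
# Relies on bash's /dev/tcp pseudo-device and the coreutils timeout command.
port_open() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Usage: probe the etcd client and peer ports on every node.
#   for ip in 192.168.1.92 192.168.1.93 192.168.1.95; do
#       for port in 2379 2380; do
#           port_open "$ip" "$port" || echo "unreachable: $ip:$port"
#       done
#   done
```

If the hosts run firewalld (an assumption; the article does not say which firewall is in use), running firewall-cmd --permanent --add-port=2379-2380/tcp followed by firewall-cmd --reload opens the two ports without turning the firewall off entirely.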
After turning off the firewall and testing again, errors are still reported:
Aug 21 09:04:02 k8s-node1 etcd: rejected connection from "192.168.1.92:36138" (error "remote error: tls: bad certificate", ServerName "")
Aug 21 09:04:19 k8s-node1 etcd: rejected connection from "192.168.1.93:51698" (error "remote error: tls: bad certificate", ServerName "")
[root@k8s-master ~]# etcdctl cluster-health
failed to check the health of member 64fe8a986fbba907 on https://192.168.1.95:2379: Get https://192.168.1.95:2379/health: x509: certificate signed by unknown authority
member 64fe8a986fbba907 is unreachable: [https://192.168.1.95:2379] are all unreachable
failed to check the health of member 9eddf87b04c89943 on https://192.168.1.93:2379: Get https://192.168.1.93:2379/health: x509: certificate signed by unknown authority
member 9eddf87b04c89943 is unreachable: [https://192.168.1.93:2379] are all unreachable
failed to check the health of member d71352a6aad35c57 on https://192.168.1.92:2379: Get https://192.168.1.92:2379/health: x509: certificate signed by unknown authority
member d71352a6aad35c57 is unreachable: [https://192.168.1.92:2379] are all unreachable
cluster is unavailable
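This error points to a certificate problem.
Some research showed that etcdctl reports exactly this error when it is run without the certificates; with the certificates supplied, the check succeeds, as shown below: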
[root@k8s-master cert]# etcdctl --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/etcd/cert/etcd.pem --key-file=/etc/etcd/cert/etcd-key.pem --endpoints=https://192.168.1.92:2379,https://192.168.1.93:2379,https://192.168.1.95:2379 cluster-health
member 64fe8a986fbba907 is healthy: got healthy result from https://192.168.1.95:2379
member 9eddf87b04c89943 is healthy: got healthy result from https://192.168.1.93:2379
member d71352a6aad35c57 is healthy: got healthy result from https://192.168.1.92:2379
cluster is healthy
[root@k8s-master cert]#
[root@k8s-node2 ~]# etcdctl --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/etcd/cert/etcd.pem --key-file=/etc/etcd/cert/etcd-key.pem --endpoints=https://192.168.1.92:2379,https://192.168.1.93:2379,https://192.168.1.95:2379 member list
64fe8a986fbba907: name=k8s-node2 peerURLs=https://192.168.1.95:2380 clientURLs=https://192.168.1.95:2379 isLeader=true
9eddf87b04c89943: name=k8s-node1 peerURLs=https://192.168.1.93:2380 clientURLs=https://192.168.1.93:2379 isLeader=false
d71352a6aad35c57: name=k8s-master peerURLs=https://192.168.1.92:2380 clientURLs=https://192.168.1.92:2379 isLeader=false
[root@k8s-node2 ~]#
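Since every etcdctl call needs the same three TLS flags, a small wrapper keeps them from being forgotten. The sketch below uses only the flags and paths that appear in the commands above; ETCD_ENDPOINTS and etcdctl_tls are hypothetical names introduced for illustration.

```shell
# Hypothetical wrapper: etcdctl with the TLS flags and endpoints from this article.
ETCD_ENDPOINTS="https://192.168.1.92:2379,https://192.168.1.93:2379,https://192.168.1.95:2379"

etcdctl_tls() {
    # "$@" passes the subcommand and its arguments through unchanged.
    etcdctl \
        --ca-file=/etc/kubernetes/cert/ca.pem \
        --cert-file=/etc/etcd/cert/etcd.pem \
        --key-file=/etc/etcd/cert/etcd-key.pem \
        --endpoints="$ETCD_ENDPOINTS" \
        "$@"
}

# Usage:
#   etcdctl_tls cluster-health
#   etcdctl_tls member list
```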
Run a few commands to try it out
Create on master
[root@k8s-master cert]# etcdctl --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/etcd/cert/etcd.pem --key-file=/etc/etcd/cert/etcd-key.pem mkdir test
[root@k8s-master cert]# etcdctl --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/etcd/cert/etcd.pem --key-file=/etc/etcd/cert/etcd-key.pem mkdir ls
[root@k8s-master cert]# etcdctl --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/etcd/cert/etcd.pem --key-file=/etc/etcd/cert/etcd-key.pem ls
/test
/ls
Retrieve on node2
[root@k8s-master cert]# etcdctl --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/etcd/cert/etcd.pem --key-file=/etc/etcd/cert/etcd-key.pem ls
/test
/ls
The data has been synchronized.
4.6
Check carefully who owns the files and whether the execute (x) permission is set, comparing each node against a working one.