Kubernetes binary deployment: kube-controller-manager reported Unhealthy on a multi-master k8s cluster
阿新 • Published: 2018-12-29
1. Symptoms
When deploying a highly available k8s cluster from binaries, with multiple masters, the kube-controller-manager service is reported as Unhealthy:
[[email protected] system]# kubectl get cs
NAME                 STATUS      MESSAGE              ERROR
scheduler            Healthy     ok
controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: net/http: HTTP/1.x transport connection broken: malformed HTTP response "\x15\x03\x01\x00\x02\x02"
etcd-1               Healthy     {"health":"true"}
etcd-0               Healthy     {"health":"true"}
etcd-2               Healthy     {"health":"true"}
Checking the status of the kube-controller-manager service shows that the process is running, but its log contains repeated TLS handshake errors:
[[email protected] system]# systemctl status kube-controller-manager -l
● kube-controller-manager.service - Kubernetes Controller Manager
   Loaded: loaded (/etc/systemd/system/kube-controller-manager.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2018-12-29 03:56:00 EST; 31min ago
     Docs: https://github.com/GoogleCloudPlatform/kubernetes
 Main PID: 126295 (kube-controller)
    Tasks: 8
   Memory: 8.4M
   CGroup: /system.slice/kube-controller-manager.service
           └─126295 /usr/local/bin/kube-controller-manager --port=0 --secure-port=10252 --bind-address=127.0.0.1 --kubeconfig=/etc/kubernetes/cert/kube-controller-manager.kubeconfig --authentication-kubeconfig=/etc/kubernetes/cert/kube-controller-manager.kubeconfig --service-cluster-ip-range=10.254.0.0/16 --cluster-name=kubernetes --cluster-signing-cert-file=/etc/kubernetes/cert/ca.pem --cluster-signing-key-file=/etc/kubernetes/cert/ca-key.pem --experimental-cluster-signing-duration=8760h --root-ca-file=/etc/kubernetes/cert/ca.pem --service-account-private-key-file=/etc/kubernetes/cert/ca-key.pem --leader-elect=true --feature-gates=RotateKubeletServerCertificate=true --controllers=*,bootstrapsigner,tokencleaner --horizontal-pod-autoscaler-use-rest-clients=true --horizontal-pod-autoscaler-sync-period=10s --tls-cert-file=/etc/kubernetes/cert/kube-controller-manager.pem --tls-private-key-file=/etc/kubernetes/cert/kube-controller-manager-key.pem --use-service-account-credentials=true --alsologtostderr=true --logtostderr=false --log-dir=/var/log/kubernetes --v=2

Dec 29 03:56:00 ceph-01 kube-controller-manager[126295]: I1229 03:56:00.395082 126295 flags.go:33] FLAG: --version="false"
Dec 29 03:56:00 ceph-01 kube-controller-manager[126295]: I1229 03:56:00.395093 126295 flags.go:33] FLAG: --vmodule=""
Dec 29 03:56:00 ceph-01 kube-controller-manager[126295]: W1229 03:56:00.819583 126295 authentication.go:296] Cluster doesn't provide requestheader-client-ca-file in configmap/extension-apiserver-authentication in kube-system, so request-header client certificate authentication won't work.
Dec 29 03:56:00 ceph-01 kube-controller-manager[126295]: W1229 03:56:00.820210 126295 authorization.go:146] No authorization-kubeconfig provided, so SubjectAccessReview of authorization tokens won't work.
Dec 29 03:56:00 ceph-01 kube-controller-manager[126295]: I1229 03:56:00.820252 126295 controllermanager.go:151] Version: v1.13.1
Dec 29 03:56:00 ceph-01 kube-controller-manager[126295]: I1229 03:56:00.822080 126295 secure_serving.go:116] Serving securely on 127.0.0.1:10252
Dec 29 03:56:00 ceph-01 kube-controller-manager[126295]: I1229 03:56:00.822954 126295 leaderelection.go:205] attempting to acquire leader lease kube-system/kube-controller-manager...
Dec 29 03:57:44 ceph-01 kube-controller-manager[126295]: I1229 03:57:44.753997 126295 log.go:172] http: TLS handshake error from 127.0.0.1:40918: tls: first record does not look like a TLS handshake
Dec 29 03:57:46 ceph-01 kube-controller-manager[126295]: I1229 03:57:46.558093 126295 log.go:172] http: TLS handshake error from 127.0.0.1:40948: tls: first record does not look like a TLS handshake
Dec 29 04:08:35 ceph-01 kube-controller-manager[126295]: I1229 04:08:35.872211 126295 log.go:172] http: TLS handshake error from 127.0.0.1:43564: tls: first record does not look like a TLS handshake
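The six bytes quoted in the "malformed HTTP response" message are not HTTP at all: they have the layout of a TLS record header. As a rough illustration, the sketch below decodes them by hand. The helper name decode_tls_record is hypothetical and only covers the fields needed here (content type, record version, length, and the alert level).

```shell
# decode_tls_record is a hypothetical helper, not part of kubectl or any TLS tool.
# Arguments: the record bytes as hex, one byte per argument.
#   $1 content type, $2/$3 record version, $4/$5 payload length, $6 first payload byte
decode_tls_record() {
    ctype=$((0x$1)); major=$((0x$2)); minor=$((0x$3))
    length=$(( (0x$4 << 8) | 0x$5 ))
    case "$ctype" in
        20) kind="change_cipher_spec" ;;
        21) kind="alert" ;;
        22) kind="handshake" ;;
        23) kind="application_data" ;;
        *)  kind="unknown" ;;
    esac
    printf 'type=%s version=%d.%d length=%d' "$kind" "$major" "$minor" "$length"
    # In an alert record the first payload byte is the level: 1 = warning, 2 = fatal.
    if [ "$kind" = "alert" ] && [ -n "${6:-}" ]; then
        case "$((0x$6))" in
            1) printf ' level=warning' ;;
            2) printf ' level=fatal' ;;
        esac
    fi
    printf '\n'
}

# Bytes taken from the kubectl error message: \x15\x03\x01\x00\x02\x02
decode_tls_record 15 03 01 00 02 02
```

Read this way, the reply is a fatal TLS alert, which matches the "first record does not look like a TLS handshake" lines in the log: a plain http request arrived on a port that only speaks https.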
2. Solution
We suspected a configuration problem in the systemd service file of kube-controller-manager:
[[email protected] system]# cat kube-controller-manager.service
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
[Service]
ExecStart=/usr/local/bin/kube-controller-manager \
--port=0 \
--secure-port=10252 \
--bind-address=127.0.0.1 \
--kubeconfig=/etc/kubernetes/cert/kube-controller-manager.kubeconfig \
--authentication-kubeconfig=/etc/kubernetes/cert/kube-controller-manager.kubeconfig \
--service-cluster-ip-range=10.254.0.0/16 \
--cluster-name=kubernetes \
--cluster-signing-cert-file=/etc/kubernetes/cert/ca.pem \
--cluster-signing-key-file=/etc/kubernetes/cert/ca-key.pem \
--experimental-cluster-signing-duration=8760h \
--root-ca-file=/etc/kubernetes/cert/ca.pem \
--service-account-private-key-file=/etc/kubernetes/cert/ca-key.pem \
--leader-elect=true \
--feature-gates=RotateKubeletServerCertificate=true \
--controllers=*,bootstrapsigner,tokencleaner \
--horizontal-pod-autoscaler-use-rest-clients=true \
--horizontal-pod-autoscaler-sync-period=10s \
--tls-cert-file=/etc/kubernetes/cert/kube-controller-manager.pem \
--tls-private-key-file=/etc/kubernetes/cert/kube-controller-manager-key.pem \
--use-service-account-credentials=true \
--alsologtostderr=true \
--logtostderr=false \
--log-dir=/var/log/kubernetes \
--v=2
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
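The service file sets --port=0, --secure-port=10252 and --bind-address=127.0.0.1. These three options do the following:
- --port=0: disables the insecure http listener for /metrics requests; the --address parameter is then ignored and --bind-address takes effect
- --secure-port=10252, --bind-address=127.0.0.1: listens for https /metrics requests on port 10252, on the loopback interface only (--bind-address=0.0.0.0 would listen on all network interfaces)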
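To spot this combination of options before it shows up in kubectl get cs, a small check over the unit file can help. The following is only a sketch: cm_healthz_scheme is a hypothetical function name, and the rule it applies (insecure port disabled and secure port moved onto 10252 means that port answers https only) is inferred from the behaviour described in this post.

```shell
# cm_healthz_scheme FILE
# Hypothetical helper: prints "https" when the unit file disables the insecure
# port (--port=0) and puts the secure port on 10252, otherwise prints "http".
cm_healthz_scheme() {
    unit_file=$1
    if grep -qE -- '(^|[[:space:]])--port=0([[:space:]]|$)' "$unit_file" &&
       grep -qE -- '(^|[[:space:]])--secure-port=10252([[:space:]]|$)' "$unit_file"; then
        echo https
    else
        echo http
    fi
}
```

For example, cm_healthz_scheme /etc/systemd/system/kube-controller-manager.service prints https for the service file shown above. Since the health check in this cluster requests http://127.0.0.1:10252/healthz, an https answer is a warning sign.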
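The health check behind kubectl get cs, however, requests http://127.0.0.1:10252/healthz over plain http (see the error message in section 1), while with these options port 10252 only answers https. That mismatch explains both the malformed HTTP response and the TLS handshake errors in the log. We therefore remove these three lines, so that port 10252 serves plain http again: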
[[email protected] system]# cat kube-controller-manager.service
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
[Service]
ExecStart=/usr/local/bin/kube-controller-manager \
--kubeconfig=/etc/kubernetes/cert/kube-controller-manager.kubeconfig \
--authentication-kubeconfig=/etc/kubernetes/cert/kube-controller-manager.kubeconfig \
--service-cluster-ip-range=10.254.0.0/16 \
--cluster-name=kubernetes \
--cluster-signing-cert-file=/etc/kubernetes/cert/ca.pem \
--cluster-signing-key-file=/etc/kubernetes/cert/ca-key.pem \
--experimental-cluster-signing-duration=8760h \
--root-ca-file=/etc/kubernetes/cert/ca.pem \
--service-account-private-key-file=/etc/kubernetes/cert/ca-key.pem \
--leader-elect=true \
--feature-gates=RotateKubeletServerCertificate=true \
--controllers=*,bootstrapsigner,tokencleaner \
--horizontal-pod-autoscaler-use-rest-clients=true \
--horizontal-pod-autoscaler-sync-period=10s \
--tls-cert-file=/etc/kubernetes/cert/kube-controller-manager.pem \
--tls-private-key-file=/etc/kubernetes/cert/kube-controller-manager-key.pem \
--use-service-account-credentials=true \
--alsologtostderr=true \
--logtostderr=false \
--log-dir=/var/log/kubernetes \
--v=2
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
Reload systemd and restart the service:
[[email protected] system]# systemctl daemon-reload
[[email protected] system]# systemctl restart kube-controller-manager
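After the restart it can take a moment before the health endpoint answers. The sketch below polls it until it returns ok. Both probe and wait_healthz are hypothetical helper names; the URL is the one from the kubectl error message, and the use of curl is an assumption about what is installed on the master.

```shell
# probe URL: thin wrapper around curl so the HTTP client can be swapped out.
probe() {
    curl -fsS --max-time 2 "$1"
}

# wait_healthz URL [ATTEMPTS]
# Polls the URL once per second until the body is exactly "ok".
# Returns 0 on success, 1 if all attempts fail (default: 10 attempts).
wait_healthz() {
    url=$1
    attempts=${2:-10}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if [ "$(probe "$url" 2>/dev/null)" = "ok" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}
```

A possible call is wait_healthz http://127.0.0.1:10252/healthz 30 && echo "controller-manager is answering over http".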
3. Verify that the cluster components are healthy
[[email protected] system]# kubectl get cs
NAME                 STATUS    MESSAGE              ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health":"true"}
etcd-1               Healthy   {"health":"true"}
etcd-2               Healthy   {"health":"true"}
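When this check is repeated on every master, reading the table by eye gets tedious. As a minimal sketch, the function below filters the kubectl get cs output down to the components whose STATUS column is not Healthy. unhealthy_components is a hypothetical name, and the parsing assumes the column layout shown above.

```shell
# unhealthy_components: reads `kubectl get cs` output on stdin, skips the
# header row, and prints the NAME of every row whose STATUS is not "Healthy".
unhealthy_components() {
    awk 'NR > 1 && $2 != "Healthy" { print $1 }'
}
```

Usage: kubectl get cs | unhealthy_components. Empty output means every listed component is healthy.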