
3. k8s Cluster Availability Verification and Tuning (Chapter 1: k8s High-Availability Cluster Installation)

Author: 北京小遠
Source: http://www.cnblogs.com/bj-xy/
Reference course: Kubernetes全棧架構師 (Kubernetes Full-Stack Architect)
Unauthorized reproduction is prohibited; any repost must credit the source, otherwise the author reserves the right to pursue legal liability!


I. Cluster Availability Verification

After the k8s installation completes, verify that the cluster is actually usable: pod, service, and node communication all have to work correctly.

1.1 Check the basic components

kubectl get node
kubectl get po -n kube-system
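All nodes should be Ready and all kube-system pods Running. As a quick triage helper (an addition, not in the original text), this filter lists any kube-system pods that are not Running:
kubectl get po -n kube-system --field-selector=status.phase!=Running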

1.2 Create a test pod

cat<<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - name: busybox
    image: busybox:1.28
    command:
      - sleep
      - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
EOF
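Before running the DNS tests below, it helps to wait until the pod is actually Ready (the 120s timeout is an arbitrary choice):
kubectl wait --for=condition=Ready pod/busybox --timeout=120s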

1.3 Test connectivity

1. Test whether the pod can resolve the kubernetes svc in its own namespace
kubectl get svc  (view the kubernetes svc)
kubectl exec busybox -n default -- nslookup kubernetes  (resolve it)

Result:
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes
Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local

2. Test whether the pod can resolve a svc in another namespace
kubectl get svc -n kube-system
kubectl exec busybox -n default -- nslookup kube-dns.kube-system
Result:
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      kube-dns.kube-system
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

3. Test that every node can reach port 443 of the kubernetes svc and port 53 of the kube-dns service
kubectl get svc  (get the cluster ip)
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   24d
Run on every node:
telnet 10.96.0.1 443

Test connectivity to kube-dns:
kubectl  get svc -n kube-system
NAME             TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE
kube-dns         ClusterIP   10.96.0.10      <none>        53/UDP,53/TCP,9153/TCP   21d
metrics-server   ClusterIP   10.111.181.43   <none>        443/TCP                  21d

Run on every node:
telnet 10.96.0.10  53
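telnet is often absent from minimal installs; nc is an equivalent check. This alternative is an addition, not part of the original steps, and flag support varies by nc implementation:
# TCP checks against the API server and kube-dns cluster IPs
nc -zv 10.96.0.1 443
nc -zv 10.96.0.10 53
# kube-dns also serves UDP 53 (-u; -z support varies by nc variant)
nc -zvu 10.96.0.10 53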

4. Test pod-to-pod communication
kubectl get po -n kube-system -owide  (list the pods and their IPs)

Exec into one of the calico-node containers:
kubectl exec -it calico-node-2zkq8 -n kube-system -- bash
Then take the IP of a calico-node pod on a different node and ping it to verify connectivity.

Also test connectivity to the busybox test pod (a concrete sketch follows below):
kubectl get po -owide
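A concrete run might look like the following; the pod name and IP are illustrative and will differ in your cluster:
# note the busybox pod IP from the -owide output, e.g. 172.16.32.5
kubectl get po busybox -o wide
# ping it from a calico-node pod running on a different node
kubectl exec -it calico-node-2zkq8 -n kube-system -- ping -c 3 172.16.32.5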

5. Delete the test pod
kubectl delete po busybox
Confirm the deletion:
kubectl get po

II. Parameter Optimization

2.1 Container runtime parameter tuning

Run on every machine:
cat > /etc/docker/daemon.json <<EOF
{
        "exec-opts": ["native.cgroupdriver=systemd"],
        "log-driver": "json-file",
        "log-opts": {
                "max-size": "50m",
                "max-file": "3"
        },
        "registry-mirrors": ["https://ufkb7xyg.mirror.aliyuncs.com"],
        "max-concurrent-downloads": 10,
        "max-concurrent-uploads": 5,
        "live-restore": true
}
EOF

systemctl daemon-reload
systemctl restart docker
#remove containers in the exited state
docker rm $(docker ps -q -f status=exited)


Parameter explanation:
exec-opts: switch docker's CgroupDriver to systemd
log-driver: log driver (json-file format)
log-opts: maximum log file size and number of rotated files
registry-mirrors: registry mirror address used for image pulls
max-concurrent-downloads: maximum concurrent layer downloads per pull
max-concurrent-uploads: maximum concurrent layer uploads per push
live-restore: keep containers running while the docker daemon restarts
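Two quick checks confirm the daemon picked up the new settings (a sanity check added here, not in the original text):
# should print: Cgroup Driver: systemd
docker info | grep -i "cgroup driver"
# should print: true
docker info -f '{{.LiveRestoreEnabled}}'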

2.2 controller-manager parameter tuning

Configure on the master nodes:
vim /usr/lib/systemd/system/kube-controller-manager.service
#add the flag below to the ExecStart command line
--cluster-signing-duration=43800h0m0s \

systemctl daemon-reload
systemctl restart kube-controller-manager

Sets the validity period of automatically issued certificates (5 years at most):
--cluster-signing-duration
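A way to confirm the running process picked up the flag (assumes the systemd unit name used above):
# the flag should appear in the live command line
ps -ef | grep [k]ube-controller-manager | tr ' ' '\n' | grep cluster-signing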

2.3 kubelet parameter tuning

Run on every machine:
vim  /etc/systemd/system/kubelet.service.d/10-kubelet.conf
Environment="KUBELET_EXTRA_ARGS=--node-labels=node.kubernetes.io/node='' --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384    --image-pull-progress-deadline=30m " 

Explanation:
--tls-cipher-suites=: restrict the TLS cipher suites, since the defaults include weak ciphers
--image-pull-progress-deadline: deadline for an image pull before it is canceled, to avoid endless retries


vim /opt/kubernetes/kubelet-conf.yml
rotateServerCertificates: true
allowedUnsafeSysctls:
 - "net.core*"
 - "net.ipv4.*"
kubeReserved:
  cpu: "100m"
  memory: 300Mi
  ephemeral-storage: 10Gi
systemReserved:
  cpu: "100m"
  memory: 300Mi
  ephemeral-storage: 10Gi
  
Explanation:
rotateServerCertificates: automatically request and rotate the kubelet server certificate
allowedUnsafeSysctls: kernel parameters that pods are allowed to modify
kubeReserved: resources reserved for k8s components, sized to the machine
systemReserved: resources reserved for system daemons, sized to the machine

systemctl daemon-reload
systemctl restart kubelet
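With rotateServerCertificates: true, the kubelet files serving-certificate CSRs that can sit Pending until approved; whether approval is automatic depends on your cluster setup, so this check is an assumption worth making:
# list CSRs and approve any pending kubelet serving certificates
kubectl get csr
kubectl certificate approve <csr-name>   # <csr-name> is a placeholder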

2.4 Set the ROLES of the master nodes

Label each master node with the master role (k8s1 shown; repeat for every master):
kubectl label node k8s1 node-role.kubernetes.io/master=""
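A quick verification (not in the original text): the ROLES column should now show master for each labeled node.
kubectl get node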

III. Configure coredns Autoscaling (optional)

References (linked in the original post): the official documentation, the git project, and the official resource manifest.

1. Resource manifest
cat > dns-horizontal-autoscaler_v1.9.yaml << EOF
kind: ServiceAccount
apiVersion: v1
metadata:
  name: kube-dns-autoscaler
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: system:kube-dns-autoscaler
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["list", "watch"]
  - apiGroups: [""]
    resources: ["replicationcontrollers/scale"]
    verbs: ["get", "update"]
  - apiGroups: ["apps"]
    resources: ["deployments/scale", "replicasets/scale"]
    verbs: ["get", "update"]
# Remove the configmaps rule once below issue is fixed:
# kubernetes-incubator/cluster-proportional-autoscaler#16
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "create"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: system:kube-dns-autoscaler
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
subjects:
  - kind: ServiceAccount
    name: kube-dns-autoscaler
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: system:kube-dns-autoscaler
  apiGroup: rbac.authorization.k8s.io

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-dns-autoscaler
  namespace: kube-system
  labels:
    k8s-app: kube-dns-autoscaler
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  selector:
    matchLabels:
      k8s-app: kube-dns-autoscaler
  template:
    metadata:
      labels:
        k8s-app: kube-dns-autoscaler
      annotations:
        seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
    spec:
      priorityClassName: system-cluster-critical
      securityContext:
        supplementalGroups: [ 65534 ]
        fsGroup: 65534
      nodeSelector:
        kubernetes.io/os: linux
      containers:
      - name: autoscaler
        image: registry.cn-beijing.aliyuncs.com/yuan-k8s/cluster-proportional-autoscaler-amd64:1.8.1
        resources:
            requests:
                cpu: "20m"
                memory: "10Mi"
        command:
          - /cluster-proportional-autoscaler
          - --namespace=kube-system
          - --configmap=kube-dns-autoscaler
          - --target=deployment/coredns
          - --default-params={"linear":{"coresPerReplica":16,"nodesPerReplica":4,"min":2,"max":4,"preventSinglePointFailure":true,"includeUnschedulableNodes":true}}
          - --logtostderr=true
          - --v=2
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      serviceAccountName: kube-dns-autoscaler
EOF

2. Parameter explanation
Parameters in --default-params:
min: minimum number of coredns replicas
max: maximum number of coredns replicas
coresPerReplica: schedule one replica per this many CPU cores in the cluster
nodesPerReplica: schedule one replica per this many nodes in the cluster
preventSinglePointFailure: when set to true and the cluster has more than one node, the controller ensures at least 2 replicas
includeUnschedulableNodes: when set to true, replicas scale with the total node count; otherwise only schedulable nodes are counted
In linear mode the replica count works out to max(ceil(cores/coresPerReplica), ceil(nodes/nodesPerReplica)), clamped between min and max.

Note:
- --target= must match the name of the DNS Deployment defined in your coredns/kube-dns manifest

If the autoscaler reports that the ConfigMap is missing, create the resource below:
cat > dns-autoscaler-ConfigMap.yaml << EOF
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-dns-autoscaler  # must match the --configmap= flag in the Deployment above
  namespace: kube-system
data:
  linear: |-
    {
      "coresPerReplica": 16,
      "nodesPerReplica": 4,
      "min": 2, 
      "preventSinglePointFailure": true
    }
EOF
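Apply the fallback ConfigMap (file name as created above):
kubectl apply -f dns-autoscaler-ConfigMap.yaml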

3. Create the resources
kubectl apply -f dns-horizontal-autoscaler_v1.9.yaml

4. Verify
kubectl get po -n kube-system
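Beyond checking that the autoscaler pod is Running, its logs show the scaling decisions it makes (the label selector comes from the manifest above):
# inspect the autoscaler's scaling decisions
kubectl logs -n kube-system -l k8s-app=kube-dns-autoscaler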