
Building a k8s cluster with RKE & installing Rancher 2.5.8 in HA with Helm 3

Introduction to the HPA controller

When resource usage runs high, we can scale a Pod out or in manually with a command like this:

$ kubectl -n luffy scale deployment myblog --replicas=2

But this is a manual operation. In a real project we want the cluster to sense the load automatically and scale out on its own. Kubernetes provides a resource object for exactly this: Horizontal Pod Autoscaling, HPA for short.

Basic principle: HPA monitors the load of all Pods managed by the target controller and decides from those changes whether the number of replicas needs to be adjusted.
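For reference, the scaling decision the HPA controller makes (as documented upstream) boils down to a single formula, where ceil() rounds up to the next integer:

desiredReplicas = ceil( currentReplicas * currentMetricValue / desiredMetricValue )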

The HPA API comes in two versions:

  • autoscaling/v1: supports scaling on CPU metrics only; the stable version
  • autoscaling/v2beta1: additionally supports scaling on memory or user-defined metrics

How do we obtain Pod metrics?

  • k8s before 1.8: heapster (fully deprecated in 1.11)
  • k8s 1.8 and later: metrics-server

Question: why was heapster, used previously, deprecated in favour of metrics-server?

In the heapster era, the apiserver forwarded metric requests directly to the in-cluster heapster service via apiserver proxy, and this proxy approach has problems:

  • http://kubernetes_master_address/api/v1/namespaces/namespace_name/services/service_name[:port_name]/proxy

  • proxy merely forwards requests; it is normally used for troubleshooting, is not particularly stable, and the version behind it is not controllable

  • heapster's API does not have the full authentication/authorization and client integration that the apiserver offers

  • Pod metrics are core metrics (HPA depends on them) and should have the same first-class status as Pods themselves; that is, metrics should exist as a resource, e.g. in the form of metrics.k8s.io, known as the Metric API

So starting with version 1.8 the project began phasing out heapster and introduced the Metric API concept described above. metrics-server is the official implementation of that concept: it fetches metrics from the kubelet and replaces heapster.

Metrics Server

Metrics Server exposes the monitoring data through the standard Kubernetes API; for example, the metrics of a single Pod can be fetched from:

https://192.168.136.10:6443/apis/metrics.k8s.io/v1beta1/namespaces/<namespace-name>/pods/<pod-name>

# https://192.168.136.10:6443/api/v1/namespaces/luffy/pods?limit=500
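The same data can also be queried through the apiserver with kubectl get --raw, without crafting URLs and certificates by hand; a couple of examples (they will only return data once metrics-server, installed below, is running; the luffy namespace is the one used throughout this article):

$ kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
$ kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/luffy/pods"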

The current collection flow:

Metric Server

The official description:

...
Metric server collects metrics from the Summary API, exposed by Kubelet on each node.

Metrics Server registered in the main API server through Kubernetes aggregator, which was introduced in Kubernetes 1.7
...
Installation

Official code repository: https://github.com/kubernetes-sigs/metrics-server

Depending on your cluster setup, you may also need to change flags passed to the Metrics Server container. Most useful flags:

  • --kubelet-preferred-address-types - The priority of node address types used when determining an address for connecting to a particular node (default [Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP])
  • --kubelet-insecure-tls - Do not verify the CA of serving certificates presented by Kubelets. For testing purposes only.
  • --requestheader-client-ca-file - Specify a root certificate bundle for verifying client certificates on incoming requests.
$ wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml

Modify the args:

...
      containers:
      - name: metrics-server
        image: registry.aliyuncs.com/google_containers/metrics-server-amd64:v0.3.6
        imagePullPolicy: IfNotPresent
        args:
          - --cert-dir=/tmp
          - --secure-port=4443
          - --kubelet-insecure-tls
          - --kubelet-preferred-address-types=InternalIP
...

Run the installation:

$ kubectl create -f components.yaml

$ kubectl -n kube-system get pods

$ kubectl top nodes
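Pod-level metrics should come back as well once the pipeline is healthy, for example in the namespace used in this article:

$ kubectl -n luffy top pods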
Metric collection by the kubelet

Whether it is heapster or metrics-server, both are only relays and aggregators of the data: both call the kubelet API to obtain it. Inside the kubelet the actual collection is done by the cAdvisor module, and you can fetch the monitoring data from port 10250 on a node:

Example call:

$ curl -k  -H "Authorization: Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6InhXcmtaSG5ZODF1TVJ6dUcycnRLT2c4U3ZncVdoVjlLaVRxNG1wZ0pqVmcifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi10b2tlbi1xNXBueiIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJhZG1pbiIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6ImViZDg2ODZjLWZkYzAtNDRlZC04NmZlLTY5ZmE0ZTE1YjBmMCIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlcm5ldGVzLWRhc2hib2FyZDphZG1pbiJ9.iEIVMWg2mHPD88GQ2i4uc_60K4o17e39tN0VI_Q_s3TrRS8hmpi0pkEaN88igEKZm95Qf1qcN9J5W5eqOmcK2SN83Dd9dyGAGxuNAdEwi0i73weFHHsjDqokl9_4RGbHT5lRY46BbIGADIphcTeVbCggI6T_V9zBbtl8dcmsd-lD_6c6uC2INtPyIfz1FplynkjEVLapp_45aXZ9IMy76ljNSA8Uc061Uys6PD3IXsUD5JJfdm7lAt0F7rn9SdX1q10F2lIHYCMcCcfEpLr4Vkymxb4IU4RCR8BsMOPIO_yfRVeYZkG4gU2C47KwxpLsJRrTUcUXJktSEPdeYYXf9w" https://localhost:10250/metrics

Although the kubelet provides the metrics endpoint, the actual collection logic lives in the built-in cAdvisor module. In early versions cAdvisor was a standalone component; starting with k8s 1.12 the separate cAdvisor listening port was removed, and all monitoring data is served exclusively through the kubelet API.
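Concretely, the same kubelet port also serves the cAdvisor metrics and the Summary API that metrics-server consumes; two example calls (the token is the same kind of bearer token used above, abbreviated here as $TOKEN):

$ curl -k -H "Authorization: Bearer $TOKEN" https://localhost:10250/metrics/cadvisor
$ curl -k -H "Authorization: Bearer $TOKEN" https://localhost:10250/stats/summary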

When cAdvisor gathers metrics it actually calls the runc/libcontainer library, and libcontainer is a wrapper around the cgroup files. In other words, cAdvisor is also just a forwarder: its data comes from the cgroup files.

The values in the cgroup files are the ultimate source of the monitoring data, for example:

  • mem usage:

    • for a docker container, it comes from /sys/fs/cgroup/memory/docker/[containerId]/memory.usage_in_bytes

    • for a pod, it comes from /sys/fs/cgroup/memory/kubepods/besteffort/pod[podId]/memory.usage_in_bytes or

      /sys/fs/cgroup/memory/kubepods/burstable/pod[podId]/memory.usage_in_bytes

  • if no memory limit is set, Limit = machine_mem; otherwise it comes from
    /sys/fs/cgroup/memory/docker/[id]/memory.limit_in_bytes

  • memory utilisation = memory.usage_in_bytes / memory.limit_in_bytes (see the sketch below)
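As a quick sanity check, that ratio can be computed straight from the cgroup files; a minimal sketch for a docker container under cgroup v1 (replace [containerId] with a real container ID):

# read current usage and limit from the container's memory cgroup
usage=$(cat /sys/fs/cgroup/memory/docker/[containerId]/memory.usage_in_bytes)
limit=$(cat /sys/fs/cgroup/memory/docker/[containerId]/memory.limit_in_bytes)
# integer memory utilisation in percent
echo $(( usage * 100 / limit ))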

Metrics data flow:

Question:

Metrics Server is a standalone service that only implements its own API internally; how is that API exposed in the standard Kubernetes API format?
kube-aggregator

The kube-aggregator and the Metrics Server implementation

kube-aggregator is an extension mechanism for the apiserver's API: it lets you write your own service and register it into the Kubernetes API, i.e. an extension (aggregated) API.

Define an APIService object:

apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.luffy.k8s.io
spec:
  group: luffy.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: service-A       # must be reachable over https
    namespace: luffy
    port: 443   
  version: v1beta1
  versionPriority: 100

Kubernetes will then automatically proxy requests for the following URL pattern on our behalf:

proxyPath := "/apis/" + apiService.Spec.Group + "/" + apiService.Spec.Version

That is, https://192.168.136.10:6443/apis/luffy.k8s.io/v1beta1/xxxx is forwarded to our service-A, and service-A only needs to serve https://service-A/apis/luffy.k8s.io/v1beta1/xxxx.

Let's look at how metrics-server implements this:

$ kubectl get apiservice 
NAME                       SERVICE                      AVAILABLE                      
v1beta1.metrics.k8s.io   kube-system/metrics-server		True

$ kubectl get apiservice v1beta1.metrics.k8s.io -oyaml
...
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
    port: 443
  version: v1beta1
  versionPriority: 100
...

$ kubectl -n kube-system get svc metrics-server
NAME             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
metrics-server   ClusterIP   10.110.111.146   <none>        443/TCP   11h

$ curl -k  -H "Authorization: Bearer xxxx" https://10.110.111.146
{
  "paths": [
    "/apis",
    "/apis/metrics.k8s.io",
    "/apis/metrics.k8s.io/v1beta1",
    "/healthz",
    "/healthz/healthz",
    "/healthz/log",
    "/healthz/ping",
    "/healthz/poststarthook/generic-apiserver-start-informers",
    "/metrics",
    "/openapi/v2",
    "/version"
  ]
}

# https://192.168.136.10:6443/apis/metrics.k8s.io/v1beta1/namespaces/<namespace-name>/pods/<pod-name>
# 
$ curl -k  -H "Authorization: Bearer xxxx" https://10.110.111.146/apis/metrics.k8s.io/v1beta1/namespaces/luffy/pods/myblog-5d9ff54d4b-4rftt

$ curl -k  -H "Authorization: Bearer xxxx" https://192.168.136.10:6443/apis/metrics.k8s.io/v1beta1/namespaces/luffy/pods/myblog-5d9ff54d4b-4rftt

HPA in practice
Autoscaling based on CPU

Create the hpa object:

# Option 1
$ cat hpa-myblog.yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-myblog-cpu
  namespace: luffy
spec:
  maxReplicas: 3
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myblog
  targetCPUUtilizationPercentage: 10

# Option 2
$ kubectl -n luffy autoscale deployment myblog --cpu-percent=10 --min=1 --max=3

The Deployment must have resource requests configured; otherwise no metrics can be collected for it and HPA cannot scale it.
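For reference, a requests block on the container is all HPA needs to compute utilisation; a minimal sketch (the values are only illustrative, not taken from the actual myblog Deployment):

        resources:
          requests:
            cpu: 50m
            memory: 64Mi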
Verification:

$ yum -y install httpd-tools
$ kubectl -n luffy get svc myblog
myblog   ClusterIP   10.104.245.225   <none>        80/TCP    6d18h

# Scale down to 1 replica first so the effect shows up sooner
$ kubectl -n luffy scale deploy myblog --replicas=1

# Simulate 1000 concurrent users requesting the page 100,000 times in total
$ ab -n 100000 -c 1000 http://10.104.245.225/blog/index/

$ kubectl get hpa
$ kubectl -n luffy get pods
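While ab is running, appending -w keeps the listing open so you can watch the replica count climb:

$ kubectl -n luffy get hpa -w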

After the load drops, there is a default 5-minute scale-down window, which can be tuned with the following controller-manager flag:

--horizontal-pod-autoscaler-downscale-stabilization

The value for this option is a duration that specifies how long the autoscaler has to wait before another downscale operation can be performed after the current one has completed. The default value is 5 minutes (5m0s).

Scaling down is a gradual process: the value is the interval the autoscaler waits after the current downscale completes before the next one. For example, going from 3 replicas down to 1 involves waiting roughly 2 * 5 min = 10 minutes in between.
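Since this series builds the cluster with RKE, the flag would typically go into cluster.yml and be applied with rke up; a sketch under that assumption (the 2m0s value is only an example):

services:
  kube-controller:
    extra_args:
      horizontal-pod-autoscaler-downscale-stabilization: "2m0s"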

Autoscaling based on memory

Create the hpa object:

$ cat hpa-demo-mem.yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-demo-mem
  namespace: luffy
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-demo-mem
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: memory
      targetAverageUtilization: 30

Load-generation script:

$ cat increase-mem-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: increase-mem-config
  namespace: luffy
data:
  increase-mem.sh: |
    #!/bin/bash  
    mkdir /tmp/memory  
    mount -t tmpfs -o size=40M tmpfs /tmp/memory  
    dd if=/dev/zero of=/tmp/memory/block  
    sleep 60 
    rm /tmp/memory/block  
    umount /tmp/memory  
    rmdir /tmp/memory

Test Deployment:

$ cat hpa-demo-mem-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hpa-demo-mem
  namespace: luffy
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      volumes:
      - name: increase-mem-script
        configMap:
          name: increase-mem-config
      containers:
      - name: nginx
        image: nginx:alpine
        ports:
        - containerPort: 80
        volumeMounts:
        - name: increase-mem-script
          mountPath: /etc/script
        resources:
          requests:
            memory: 50Mi
            cpu: 50m
        securityContext:
          privileged: true

Test:

$ kubectl create -f increase-mem-config.yaml
$ kubectl create -f hpa-demo-mem.yaml
$ kubectl create -f hpa-demo-mem-deploy.yaml

$ kubectl -n luffy exec -ti hpa-demo-mem-7fc75bf5c8-xx424 sh
/ # sh /etc/script/increase-mem.sh


# Watch the hpa and the pods
$ kubectl -n luffy get hpa
$ kubectl -n luffy get po
Autoscaling based on custom metrics

Besides autoscaling on CPU and memory, we can also scale on custom metrics. For this we need Prometheus Adapter: Prometheus monitors the application load and all kinds of cluster metrics, and Prometheus Adapter lets us use the metrics Prometheus has collected to drive scaling policies. Those metrics are exposed through the APIServer, so HPA resource objects can consume them directly.
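As a preview, an HPA driven by a custom metric looks almost the same as the memory example above; a minimal sketch, assuming Prometheus Adapter is already installed and exposes a per-Pod metric named http_requests for the target Deployment (the metric name, target value, and Deployment name are all hypothetical):

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-demo-custom
  namespace: luffy
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-demo-custom
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Pods
    pods:
      metricName: http_requests
      targetAverageValue: 10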


Architecture diagram: