Kubernetes Deployment Rolling Updates: What You May Have Misunderstood

Author: [email protected]

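Abstract: The Kubernetes Deployment rolling update mechanism differs from the ReplicationController rolling update. A Deployment rollout also provides rollout status queries, rollout history, and rollback, which makes it the preferred way to roll out applications on Kubernetes. This post walks through the features that are easy to overlook or misunderstand.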

Fields related to rolling update in a Deployment definition

Take the frontend Deployment below as an example, and pay particular attention to .spec.minReadySeconds, .spec.strategy.rollingUpdate.maxSurge, and .spec.strategy.rollingUpdate.maxUnavailable.

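# Note: extensions/v1beta1 Deployments were removed in Kubernetes 1.16; newer clusters need apps/v1 plus a spec.selector.
apiVersion: extensions/v1beta1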
kind: Deployment
metadata:
  name: frontend
spec:
  minReadySeconds: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 3
      maxUnavailable: 2
  replicas: 25
  template:
    metadata:
      labels:
        app: guestbook
        tier: frontend
    spec:
      containers:
      - name: php-redis
        image: gcr.io/google-samples/gb-frontend:v4
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        env:
        - name: GET_HOSTS_FROM
          value: dns
          # If your cluster config does not include a dns service, then to
          # instead access environment variables to find service host
          # info, comment out the 'value: dns' line above, and uncomment the
          # line below:
          # value: env
        ports:
        - containerPort: 80
  • .spec.minReadySeconds: a newly created Pod is considered Available only after it has stayed Ready for at least .spec.minReadySeconds.
  • .spec.strategy.rollingUpdate.maxSurge: specifies the maximum number of Pods that can be created over the desired number of Pods. The value cannot be 0 if maxUnavailable is 0. It can be an integer or a percentage, and defaults to 25% of the desired number of Pods. When the new ReplicaSet is scaled up, a percentage is converted to the allowed maxSurge by rounding up (for example, 3.4 becomes 4).
  • .spec.strategy.rollingUpdate.maxUnavailable: specifies the maximum number of Pods that can be unavailable during the update process. The value cannot be 0 if maxSurge is 0. It can be an integer or a percentage, and defaults to 25% of the desired number of Pods. When the old ReplicaSet is scaled down, a percentage is converted to the allowed maxUnavailable by rounding down (for example, 3.6 becomes 3).

Therefore, during a Deployment rollout, the number of Available Pods must not fall below desired Pods - maxUnavailable, and the total number of Pods must not exceed desired Pods + maxSurge.
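
The two bounds above can be worked out by hand, but a small sketch makes the rounding rules concrete. The helper names resolve and rollout_bounds are hypothetical, not part of any Kubernetes client, and rejecting the "both zero" case with an exception is a simplification of what the API server does.

```python
import math

def resolve(value, desired, round_up):
    """Resolve an int or a percentage string such as "25%" to a Pod count."""
    if isinstance(value, str) and value.endswith("%"):
        raw = desired * int(value[:-1]) / 100
        # maxSurge rounds up, maxUnavailable rounds down
        return math.ceil(raw) if round_up else math.floor(raw)
    return int(value)

def rollout_bounds(desired, max_surge="25%", max_unavailable="25%"):
    """Return (minimum Available Pods, maximum total Pods) during a rollout."""
    surge = resolve(max_surge, desired, round_up=True)
    unavailable = resolve(max_unavailable, desired, round_up=False)
    if surge == 0 and unavailable == 0:
        # Simplification: the two values may not both be 0
        raise ValueError("maxSurge and maxUnavailable cannot both be 0")
    return desired - unavailable, desired + surge

print(rollout_bounds(25, 3, 2))  # (23, 28), the frontend example
print(rollout_bounds(25))        # (19, 32), with the 25% defaults
```

For the frontend Deployment this gives a floor of 23 Available Pods and a ceiling of 28 Pods in total, which are the numbers used in the rest of this post.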

The rolling update process

Note: A Deployment’s rollout is triggered if and only if the Deployment’s pod template (that is, .spec.template) is changed, for example if the labels or container images of the template are updated. Other updates, such as scaling the Deployment, do not trigger a rollout.

We continue with the Deployment above and consider the most common case, updating the image (releasing a new version):

kubectl set image deploy frontend php-redis=gcr.io/google-samples/gb-frontend:v3 --record

After set image, the Deployment's Pod template changes, which triggers a rollout. We only consider the RollingUpdate strategy here (Kubernetes also supports the Recreate strategy). Use kubectl get rs -w to watch the ReplicaSets change.

[root@master03 ~]# kubectl get rs -w
NAME                  DESIRED   CURRENT   READY     AGE
frontend-3114648124   25        25        25        14m
frontend-3099797709   0         0         0         1h
frontend-3099797709   0         0         0         1h
frontend-3099797709   3         0         0         1h
frontend-3114648124   23        25        25        17m
frontend-3099797709   5         0         0         1h
frontend-3114648124   23        25        25        17m
frontend-3114648124   23        23        23        17m
frontend-3099797709   5         0         0         1h
frontend-3099797709   5         3         0         1h
frontend-3099797709   5         5         0         1h
frontend-3099797709   5         5         1         1h
frontend-3114648124   22        23        23        17m
frontend-3099797709   5         5         2         1h
frontend-3114648124   22        23        23        17m
frontend-3114648124   22        22        22        17m
frontend-3099797709   6         5         2         1h
frontend-3114648124   21        22        22        17m
frontend-3099797709   6         5         2         1h
frontend-3114648124   21        22        22        17m
frontend-3099797709   7         5         2         1h
frontend-3099797709   7         6         2         1h
frontend-3114648124   21        21        21        17m
frontend-3099797709   7         6         2         1h
frontend-3099797709   7         7         2         1h
frontend-3099797709   7         7         2         1h
frontend-3099797709   7         7         3         1h
frontend-3099797709   7         7         4         1h
frontend-3114648124   20        21        21        17m
frontend-3099797709   8         7         4         1h
frontend-3114648124   20        21        21        17m
frontend-3114648124   20        20        20        17m
frontend-3099797709   8         7         4         1h
frontend-3099797709   8         8         4         1h
frontend-3099797709   8         8         5         1h
frontend-3114648124   19        20        20        17m
frontend-3099797709   9         8         5         1h
frontend-3114648124   19        20        20        17m
frontend-3099797709   9         8         5         1h
frontend-3099797709   9         9         5         1h
frontend-3114648124   19        19        19        17m
frontend-3099797709   9         9         5         1h
frontend-3114648124   18        19        19        18m
frontend-3099797709   10        9         5         1h
frontend-3114648124   18        19        19        18m
frontend-3099797709   10        9         5         1h
frontend-3114648124   18        18        18        18m
frontend-3099797709   10        10        5         1h
frontend-3099797709   10        10        5         1h
frontend-3114648124   18        18        18        18m
frontend-3099797709   10        10        6         1h
frontend-3099797709   10        10        6         1h
frontend-3114648124   17        18        18        18m
frontend-3114648124   17        18        18        18m
frontend-3099797709   11        10        6         1h
frontend-3099797709   11        10        6         1h
frontend-3114648124   17        17        17        18m
frontend-3099797709   11        11        6         1h

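Notes:
1. frontend-3114648124 is the original RS (called OldRS), and frontend-3099797709 is the newly created RS (called NewRS; it may also be an existing old RS if the same template has been deployed before).
2. maxSurge=3, maxUnavailable=2, desired replicas=25

  • NewRS creates maxSurge (3) Pods, which reaches the upper limit on the Pod count, desired replicas + maxSurge (28).
  • The controller does not wait for the Pods created by NewRS to become Ready. It immediately deletes maxUnavailable (2) Pods from OldRS, so even in the worst case the number of Ready Pods is still desired replicas - maxUnavailable (23).
  • The rest of the process is not fixed. Whenever some new Pods become Ready, the same number of old Pods can be deleted. Whenever old Pods have been deleted and the total Pod count is below the upper limit of desired replicas + maxSurge, more new Pods are created.
  • The rolling update continues this way until the number of new Pods reaches desired replicas. Once they are all Ready, all remaining old Pods are deleted, and the rollout is complete.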
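
The first few steps of the trace above can be reproduced with a simplified model. This is only a sketch of the arithmetic described in the notes, not the Deployment controller's actual code. The function names are hypothetical, and the model assumes that every old Pod is Available.

```python
def next_new_replicas(desired, max_surge, new_replicas, old_replicas):
    """Scale the new ReplicaSet up as far as the total Pod ceiling allows."""
    room = desired + max_surge - (new_replicas + old_replicas)
    return min(desired, new_replicas + max(room, 0))

def next_old_replicas(desired, max_unavailable, new_available, old_replicas):
    """Scale the old ReplicaSet down without going below the availability floor.

    Assumes all old Pods are currently Available.
    """
    spare = new_available + old_replicas - (desired - max_unavailable)
    return max(0, old_replicas - max(spare, 0))

desired, surge, unavailable = 25, 3, 2
new, old = 0, 25
new = next_new_replicas(desired, surge, new, old)      # 0 -> 3
old = next_old_replicas(desired, unavailable, 0, old)  # 25 -> 23
new = next_new_replicas(desired, surge, new, old)      # 3 -> 5
old = next_old_replicas(desired, unavailable, 1, old)  # one new Pod Available: 23 -> 22
new = next_new_replicas(desired, surge, new, old)      # 5 -> 6
print(new, old)  # 6 22
```

These values line up with the DESIRED column in the watch output: the new RS goes 3, 5, 6 while the old RS goes 23, 22.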

What happens when rolling updates are triggered back to back on the same Deployment?

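Consider this case: after starting a rolling update, the user triggers another rolling update without waiting for the first one to finish. What happens to the rollout in the background? Does it turn into a mess?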

Let's look at the same example:

# When deploy frontend is running stably on v2 (frontend-888714875):
[[email protected] ~]# kubectl get rs -w
NAME                DESIRED   CURRENT   READY     AGE

==== Run kubectl set image deploy frontend php-redis=gcr.io/google-samples/gb-frontend:v3 --record ====
---- Note: v3 --> frontend-776431694 ----
frontend-776431694   0         0         0         6h
frontend-776431694   0         0         0         6h
frontend-776431694   3         0         0         6h
frontend-888714875   23        25        25        5h
frontend-776431694   5         0         0         6h
frontend-888714875   23        25        25        5h
frontend-888714875   23        23        23        5h
frontend-776431694   5         0         0         6h
frontend-776431694   5         3         0         6h
frontend-776431694   5         5         0         6h
frontend-776431694   5         5         1         6h
frontend-776431694   5         5         2         6h
frontend-776431694   5         5         3         6h
frontend-776431694   5         5         4         6h
frontend-776431694   5         5         4         6h
frontend-888714875   22        23        23        5h
frontend-776431694   6         5         4         6h
frontend-888714875   22        23        23        5h
frontend-888714875   22        22        22        5h
frontend-776431694   6         5         4         6h
frontend-776431694   6         6         4         6h
frontend-776431694   6         6         4         6h
frontend-888714875   19        22        22        5h
frontend-776431694   9         6         4         6h
frontend-888714875   19        22        22        5h
frontend-776431694   9         6         4         6h
frontend-888714875   19        19        19        5h
frontend-776431694   9         9         4         6h
frontend-888714875   19        19        19        5h

==== Run kubectl set image deploy frontend php-redis=gcr.io/google-samples/gb-frontend:v4 --record ====
---- Note: v4 --> frontend-3099797709 ----

frontend-3099797709   0         0         0         6h
frontend-3099797709   0         0         0         6h
frontend-776431694   4         9         4         6h
frontend-3099797709   5         0         0         6h
frontend-3099797709   5         0         0         6h
frontend-3099797709   5         5         0         6h
frontend-776431694   4         9         4         6h
frontend-776431694   4         4         4         6h
frontend-3099797709   5         5         0         6h
frontend-3099797709   5         5         1         6h
frontend-3099797709   5         5         2         6h
frontend-3099797709   5         5         3         6h
frontend-3099797709   5         5         4         6h
frontend-3099797709   5         5         4         6h
frontend-776431694   2         4         4         6h
frontend-3099797709   7         5         4         6h
frontend-776431694   2         4         4         6h
frontend-776431694   2         2         2         6h
frontend-776431694   2         2         2         6h
frontend-3099797709   7         5         4         6h
frontend-776431694   0         2         2         6h
frontend-3099797709   7         7         4         6h
frontend-776431694   0         2         2         6h
frontend-3099797709   9         7         4         6h
frontend-776431694   0         0         0         6h
frontend-3099797709   9         7         4         6h
frontend-3099797709   9         9         4         6h
frontend-776431694   0         0         0         6h
frontend-3099797709   9         9         4         6h
frontend-3099797709   9         9         5         6h
frontend-3099797709   9         9         6         6h
frontend-3099797709   9         9         7         6h
frontend-888714875   17        19        19        5h
frontend-3099797709   11        9         7         6h
frontend-888714875   17        19        19        5h
frontend-888714875   17        17        17        5h
frontend-3099797709   11        9         7         6h
frontend-888714875   16        17        17        5h
frontend-3099797709   11        11        7         6h
frontend-3099797709   12        11        7         6h
frontend-888714875   16        17        17        5h
frontend-888714875   16        16        16        5h
frontend-3099797709   12        11        7         6h
frontend-3099797709   12        12        7         6h
frontend-3099797709   12        12        8         6h
frontend-3099797709   12        12        8         6h
frontend-888714875   15        16        16        5h
frontend-3099797709   13        12        8         6h
frontend-888714875   15        16        16        5h
frontend-888714875   15        15        15        5h
frontend-3099797709   13        12        8         6h
frontend-3099797709   13        13        8         6h
frontend-3099797709   13        13        8         6h
frontend-3099797709   13        13        9         6h
frontend-3099797709   13        13        10        6h
frontend-888714875   14        15        15        5h
frontend-3099797709   14        13        10        6h
frontend-888714875   14        15        15        5h
frontend-888714875   14        14        14        5h
frontend-3099797709   14        13        10        6h
frontend-888714875   14        14        14        5h
frontend-3099797709   14        14        11        6h
frontend-3099797709   14        14        12        6h
frontend-3099797709   14        14        12        6h
frontend-3099797709   14        14        12        6h
frontend-888714875   11        14        14        5h
frontend-3099797709   17        14        12        6h
frontend-888714875   11        14        14        5h
frontend-3099797709   17        14        12        6h
frontend-888714875   11        11        11        5h
frontend-3099797709   17        17        12        6h
frontend-888714875   11        11        11        5h
frontend-3099797709   17        17        12        6h
frontend-3099797709   17        17        13        6h
frontend-3099797709   17        17        14        6h
frontend-3099797709   17        17        14        6h
frontend-888714875   10        11        11        5h
frontend-3099797709   18        17        14        6h
frontend-888714875   10        11        11        5h
frontend-888714875   10        10        10        5h
frontend-3099797709   18        17        14        6h
frontend-3099797709   18        18        14        6h
frontend-3099797709   18        18        15        6h
frontend-888714875   9         10        10        5h
frontend-3099797709   18        18        16        6h
frontend-888714875   9         10        10        5h
frontend-3099797709   19        18        16        6h
frontend-3099797709   19        18        16        6h
frontend-888714875   9         9         9         5h
frontend-888714875   7         9         9         5h
frontend-3099797709   19        18        16        6h
frontend-888714875   7         9         9         5h
frontend-3099797709   21        18        16        6h
frontend-888714875   7         9         9         5h
frontend-3099797709   21        19        16        6h
frontend-888714875   7         7         7         5h
frontend-3099797709   21        21        16        6h
frontend-888714875   7         7         7         5h
frontend-3099797709   21        21        17        6h
frontend-3099797709   21        21        18        6h
frontend-3099797709   21        21        18        6h
frontend-888714875   5         7         7         5h
frontend-888714875   5         7         7         5h
frontend-3099797709   23        21        18        6h
frontend-888714875   5         5         5         5h
frontend-3099797709   23        21        18        6h
frontend-3099797709   23        23        18        6h
frontend-3099797709   23        23        18        6h
frontend-3099797709   23        23        19        6h
frontend-3099797709   23        23        20        6h
frontend-3099797709   23        23        20        6h
frontend-888714875   3         5         5         5h
frontend-3099797709   25        23        20        6h
frontend-888714875   3         5         5         5h
frontend-888714875   3         3         3         5h
frontend-3099797709   25        23        20        6h
frontend-888714875   3         3         3         5h
frontend-3099797709   25        25        20        6h
frontend-3099797709   25        25        21        6h
frontend-3099797709   25        25        22        6h
frontend-3099797709   25        25        22        6h
frontend-888714875   2         3         3         5h
frontend-888714875   2         3         3         5h
frontend-888714875   2         2         2         5h
frontend-888714875   2         2         2         5h
frontend-3099797709   25        25        23        6h
frontend-888714875   1         2         2         5h
frontend-888714875   1         2         2         5h
frontend-888714875   1         1         1         5h
frontend-3099797709   25        25        23        6h
frontend-888714875   0         1         1         5h
frontend-888714875   0         1         1         5h
frontend-888714875   0         0         0         5h
frontend-3099797709   25        25        24        6h
frontend-3099797709   25        25        25        6h
frontend-3099797709   25        25        25        6h

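Notes:
Deployment frontend is running stably on v2 (RS: frontend-888714875). kubectl set image then triggers a rolling update to v3 (RS: frontend-776431694). When the v3 RS has been scaled up to a desired count of 9 with 4 Pods Ready, the user runs kubectl set image again, triggering a rolling update to v4 (RS: frontend-3099797709).

Note that in my setup the v4 RS was created first, then the v3 RS, then the v2 RS. Sorted by creation time from newest to oldest, the RSs are therefore v2 --> v3 --> v4.

  • The rollout from v2 to v3 follows the process described in the previous section.
  • Once the new rollout is triggered, the v2 RS, which is the newest RS by creation time (excluding v4), stays put and is not scaled down any further.
  • The v4 rollout then replaces the 9 Pods already scaled up in the v3 RS, the oldest RS (again excluding v4), upgrading all v3 Pods to v4.
  • Finally, the v4 RS continues the rolling update and upgrades all the old Pods of the v2 RS to v4.
  • Throughout the whole process, the maxSurge and maxUnavailable constraints must be respected at every step.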

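Now imagine a more complex scenario: while v4 is still replacing the half-rolled-out v3 RS, the user triggers yet another rolling update, this time to v5. What happens?
Don't worry, the principle is the same. A Deployment rolling update always replaces the oldest RS first, then works through the newer RSs one by one, until the newest of the old RSs has been scaled down to 0. At that point the rollout is complete.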
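
The "oldest RS first" rule can be sketched as follows. The function name, the tuple layout, and the budget value are all assumptions for illustration. In particular, the budget of 5 is simply picked to mirror the step in the trace where the v3 RS drops from 9 to 4 while v2 stays at 19. It is not derived from the controller's real calculation.

```python
def scale_down_oldest_first(old_sets, budget):
    """Remove up to `budget` Pods from old ReplicaSets, oldest first.

    old_sets: list of (name, created_at, replicas) tuples.
    Returns a dict mapping each name to its new replica count.
    """
    result = {}
    for name, _created_at, replicas in sorted(old_sets, key=lambda rs: rs[1]):
        removed = min(replicas, budget)
        budget -= removed
        result[name] = replicas - removed
    return result

old_sets = [
    ("v2", 3, 19),  # created last in the author's setup, so the newest old RS
    ("v3", 2, 9),   # created before v2, so it is replaced first
]
print(scale_down_oldest_first(old_sets, budget=5))  # {'v3': 4, 'v2': 19}
```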

Understanding rollout pause and resume

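Many people may still believe that once the user runs kubectl rollout pause deploy/frontend during a rolling update, the in-progress rollout stops immediately, and that kubectl rollout resume deploy/frontend then continues the unfinished rollout. If that is what you think, you are quite mistaken!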

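kubectl rollout pause only prevents the next rollout from being triggered. What does that mean? In the scenario above, the rollout that is already running does not stop; it carries on normally until it completes. The next time the user triggers a rollout, the Deployment does not actually start the rolling update. It waits until the user runs kubectl rollout resume, and only then does the rollout really start. This is the behavior observed on the cluster used for this post. Since kubectl's own help text for rollout pause states that paused resources will not be reconciled by a controller, it is worth confirming on your own Kubernetes version.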

The relationship between ReplicaSets and rollout history

First, you need to know about --record:
Setting the kubectl flag --record to true allows you to record current command in the annotations of the resources being created or updated.

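By default, every revision created through kubectl xxxx --record is persisted by Kubernetes in etcd, which inevitably consumes resources. More importantly, over time, kubectl get rs may return hundreds or even thousands of leftover RSs, which is bewildering to look at.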

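In production, it is best to set the Deployment's .spec.revisionHistoryLimit to cap the number of retained revisions, for example at 15. A rollback usually only needs to go back to one of the most recent few revisions. (apps/v1 Deployments default this field to 10.)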
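
To illustrate what such a limit means for rollback, here is a deliberately simplified sketch. The function name is hypothetical, and it ignores details such as the controller only cleaning up ReplicaSets that have already been scaled down to zero.

```python
def retained_revisions(old_revisions, history_limit):
    """Return the old revisions that would still be available for rollback."""
    if history_limit <= 0:
        return []
    # Keep only the most recent `history_limit` revisions
    return sorted(old_revisions)[-history_limit:]

print(retained_revisions([1, 2, 3, 4, 5], 3))  # [3, 4, 5]
```

With a limit of 3, revisions 1 and 2 are gone, so kubectl rollout undo --to-revision=1 would no longer have a ReplicaSet to roll back to.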

The following command returns all recorded revisions of a Deployment:

$ kubectl rollout history deployment/nginx-deployment
deployments "nginx-deployment"
REVISION    CHANGE-CAUSE
1           kubectl create -f docs/user-guide/nginx-deployment.yaml --record
2           kubectl set image deployment/nginx-deployment nginx=nginx:1.9.1
3           kubectl set image deployment/nginx-deployment nginx=nginx:1.91

Then run rollout undo to roll back to the revision specified by --to-revision.

kubectl rollout undo deployment/nginx-deployment --to-revision=2
deployment "nginx-deployment" rolled back

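In fact, each revision recorded in rollout history corresponds one-to-one with a ReplicaSet. If you manually delete a ReplicaSet, the corresponding rollout history entry is deleted as well, which means you can no longer roll back to that revision.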

The mapping between rollout history and ReplicaSets can be found in the revision field returned by kubectl describe rs $RSNAME. That revision is the same one shown by rollout history.

How rollback works

By running rollout undo with --to-revision, the user can roll the Deployment back to a specific revision.

kubectl rollout undo deploy frontend --to-revision=7

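Watching the RSs in the backend shows that a rollback also proceeds as a rolling update and must respect the same maxSurge and maxUnavailable constraints. It does not delete all Pods at once and then create all the new Pods at once.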

[root@master01 ~]# kubectl get rs -w
NAME                   DESIRED   CURRENT   READY     AGE
frontend-888714875   3         0         0         23h
frontend-776431694   8         10        10        23h
frontend-888714875   5         0         0         23h
frontend-776431694   8         10        10        23h
frontend-776431694   8         8         8         23h
frontend-888714875   5         0         0         23h
frontend-888714875   5         3         0         23h
frontend-888714875   5         5         0         23h
frontend-888714875   5         5         1         23h
frontend-888714875   5         5         2         23h
frontend-888714875   5         5         4         23h
frontend-776431694   6         8         8         23h
frontend-888714875   5         5         4         23h
frontend-888714875   5         5         5         23h
frontend-776431694   6         8         8         23h
frontend-888714875   7         5         5         23h
frontend-776431694   6         6         6         23h
frontend-776431694   3         6         6         23h
frontend-888714875   10        5         5         23h
frontend-776431694   3         6         6         23h
frontend-776431694   3         3         3         23h
frontend-888714875   10        5         5         23h
frontend-776431694   3         3         3         23h
frontend-888714875   10        7         5         23h
frontend-888714875   10        10        5         23h
frontend-888714875   10        10        6         23h
frontend-888714875   10        10        7         23h
frontend-888714875   10        10        8         23h
frontend-888714875   10        10        8         23h
frontend-888714875   10        10        9         23h
frontend-888714875   10        10        9         23h
frontend-888714875   10        10        9         23h
frontend-776431694   0         3         3         23h
frontend-776431694   0         3         3         23h
frontend-776431694   0         0         0         23h
frontend-888714875   10        10        10        23h
frontend-888714875   10        10        10        23h
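
One way to confirm the surge constraint from a trace like the one above is to replay the rows and track the sum of the DESIRED column. The sketch below uses a seven-row excerpt of that trace. The function name is hypothetical, and desired=10 with max_surge=3 are assumptions inferred from the output (the rollback ends at 10 replicas), not values stated for this run.

```python
def check_trace(lines, desired, max_surge):
    """Replay `kubectl get rs -w` rows and report the peak sum of DESIRED.

    Returns (peak, peak <= desired + max_surge).
    """
    latest = {}  # most recent DESIRED value per ReplicaSet
    peak = 0
    for line in lines:
        name, want, _current, _ready, _age = line.split()
        latest[name] = int(want)
        peak = max(peak, sum(latest.values()))
    return peak, peak <= desired + max_surge

trace = """\
frontend-776431694   8         10        10        23h
frontend-888714875   5         0         0         23h
frontend-776431694   6         8         8         23h
frontend-888714875   7         5         5         23h
frontend-776431694   3         6         6         23h
frontend-888714875   10        5         5         23h
frontend-776431694   0         3         3         23h
""".splitlines()

print(check_trace(trace, desired=10, max_surge=3))  # (13, True)
```

The peak of 13 equals desired replicas + maxSurge under those assumptions, so the rollback never asks for more Pods than a forward rolling update would.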

Summary

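This post covered the features of Deployment rolling update that are easy to overlook or misunderstand. If your reaction after reading it is "Well, of course that's how it works!", then you already know the Deployment controller very well.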

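  • Introduced the fields related to rolling update in a Deployment definition;
  • Described the rolling update process;
  • Explained what happens when rolling updates are triggered back to back on the same Deployment;
  • Clarified what rollout pause and resume actually do;
  • Explained the relationship between ReplicaSets and rollout history;
  • Showed that rollback uses the same mechanism as a rolling update.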