Kubernetes Rolling Upgrade

Background

Kubernetes is an excellent tool for managing clusters of containerized applications; once you adopt objects like ReplicaSet, which automatically maintain an application's lifecycle events, it shows container application management at its best. Among the many features of container application management, the one that best demonstrates Kubernetes' powerful cluster management capability is the rolling upgrade.

The essence of a rolling upgrade is that the service remains available throughout the upgrade, so the outside world never notices the process. The process passes through three states: all old instances, a mix of old and new instances, and all new instances. The number of old instances gradually decreases while the number of new instances gradually increases, until the old instance count reaches zero and the new instance count reaches the desired target.

Rolling upgrades in Kubernetes

Kubernetes uses ReplicaSet (RS for short) to manage Pod instances. If the number of Pods in the cluster is below the target, the RS starts new Pods; if there are too many, it deletes the surplus Pods according to its policy. Deployment exploits exactly this behavior: by adjusting the Pod counts of two ReplicaSets, it implements the upgrade.
A rolling upgrade is a smooth, gradual upgrade during which the service remains available. This is a key step for Kubernetes as a platform that manages applications as services. Services are everywhere and consumed on demand; that is the original intent of cloud computing. For a PaaS platform, abstracting applications into services that span the whole cluster and are available anytime, anywhere is its ultimate mission.
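
Before diving into the source, here is a minimal, self-contained sketch (a toy model only, not the actual controller code) of the idea: at each reconcile step the new ReplicaSet grows toward the desired count and the old one shrinks, while the total never exceeds the desired count plus a surge allowance.

package main

import "fmt"

// rollStep is a toy model of one reconcile step: grow the new ReplicaSet as long
// as the total stays within desired+maxSurge, otherwise shrink the old one.
func rollStep(oldReplicas, newReplicas, desired, maxSurge int) (int, int) {
    if newReplicas < desired {
        headroom := desired + maxSurge - (oldReplicas + newReplicas)
        if headroom > 0 {
            step := headroom
            if desired-newReplicas < step {
                step = desired - newReplicas
            }
            return oldReplicas, newReplicas + step
        }
    }
    if oldReplicas > 0 {
        oldReplicas--
    }
    return oldReplicas, newReplicas
}

func main() {
    oldRS, newRS := 3, 0 // start with all old instances
    for oldRS > 0 || newRS < 3 {
        oldRS, newRS = rollStep(oldRS, newRS, 3, 1)
        fmt.Printf("old=%d new=%d\n", oldRS, newRS)
    }
}

With desired = 3 and maxSurge = 1, the output walks through the three states described above: all old, a shrinking/growing mix, and finally all new instances.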
1. ReplicaSet
Everyone is already familiar with the concept of an RS, so let's look at how it appears in the Kubernetes source code.

type ReplicaSetController struct {
    kubeClient clientset.Interface
    podControl controller.PodControlInterface

    // internalPodInformer is used to hold a personal informer.  If we're using
    // a normal shared informer, then the informer will be started for us.  If
    // we have a personal informer, we must start it ourselves. If you start
    // the controller using NewReplicationManager(passing SharedInformer), this
    // will be null
    internalPodInformer framework.SharedIndexInformer

    // A ReplicaSet is temporarily suspended after creating/deleting these many replicas.
    // It resumes normal action after observing the watch events for them.
    burstReplicas int
    // To allow injection of syncReplicaSet for testing.
    syncHandler func(rsKey string) error

    // A TTLCache of pod creates/deletes each rc expects to see.
    expectations *controller.UIDTrackingControllerExpectations

    // A store of ReplicaSets, populated by the rsController
    rsStore cache.StoreToReplicaSetLister
    // Watches changes to all ReplicaSets
    rsController *framework.Controller
    // A store of pods, populated by the podController
    podStore cache.StoreToPodLister
    // Watches changes to all pods
    podController framework.ControllerInterface
    // podStoreSynced returns true if the pod store has been synced at least once.
    // Added as a member to the struct to allow injection for testing.
    podStoreSynced func() bool

    lookupCache *controller.MatchingCache

    // Controllers that need to be synced
    queue *workqueue.Type

    // garbageCollectorEnabled denotes if the garbage collector is enabled. RC
    // manager behaves differently if GC is enabled.
    garbageCollectorEnabled bool
}

This struct lives in pkg/controller/replicaset. Here we can see the most important members of the RS controller. One of them is podControl, the object that operates on Pods; as the name suggests, it controls the lifecycle of the Pods under the RS. Let's look at the methods PodControlInterface provides.

// PodControlInterface is an interface that knows how to add or delete pods
// created as an interface to allow testing.
type PodControlInterface interface {
    // CreatePods creates new pods according to the spec.
    CreatePods(namespace string, template *api.PodTemplateSpec, object runtime.Object) error
    // CreatePodsOnNode creates a new pod according to the spec on the specified node.
    CreatePodsOnNode(nodeName, namespace string, template *api.PodTemplateSpec, object runtime.Object) error
    // CreatePodsWithControllerRef creates new pods according to the spec, and sets object as the pod's controller.
    CreatePodsWithControllerRef(namespace string, template *api.PodTemplateSpec, object runtime.Object, controllerRef *api.OwnerReference) error
    // DeletePod deletes the pod identified by podID.
    DeletePod(namespace string, podID string, object runtime.Object) error
    // PatchPod patches the pod.
    PatchPod(namespace, name string, data []byte) error
}
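
As a rough illustration of how the RS controller uses this interface, here is a simplified sketch assuming the same packages as the snippets above (api, extensions, controller); the real logic lives in the controller's manageReplicas, and the helper name and error handling below are illustrative only. A sync compares the number of active Pods with rs.Spec.Replicas and creates or deletes Pods through podControl.

// reconcileReplicas is a simplified, hypothetical illustration of the idea:
// compare the number of active Pods with the desired replica count and
// converge on it through PodControlInterface.
func reconcileReplicas(rs *extensions.ReplicaSet, activePods []*api.Pod,
    podControl controller.PodControlInterface) error {
    diff := len(activePods) - int(rs.Spec.Replicas)
    switch {
    case diff < 0:
        // Too few Pods: create new ones from the ReplicaSet's Pod template.
        for i := 0; i < -diff; i++ {
            if err := podControl.CreatePods(rs.Namespace, &rs.Spec.Template, rs); err != nil {
                return err
            }
        }
    case diff > 0:
        // Too many Pods: delete the surplus.
        for _, pod := range activePods[:diff] {
            if err := podControl.DeletePod(rs.Namespace, pod.Name, rs); err != nil {
                return err
            }
        }
    }
    return nil
}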

Here we can see that the RS has complete control over its Pods. There are two watches, rsController and podController, which watch changes to ReplicaSets and Pods respectively (through the API server, which is backed by etcd). One member deserves special mention: syncHandler, which every controller has. Each controller watches for changes and uses a sync to reconcile the state of the watched objects; note that syncHandler is only a delegate, and the actual handler is assigned when the controller is created. This pattern applies not only to the RS controller but to the other controllers as well.
The following code shows the watch wiring more clearly.

rsc.rsStore.Store, rsc.rsController = framework.NewInformer(
        &cache.ListWatch{
            ListFunc: func(options api.ListOptions) (runtime.Object, error) {
                return rsc.kubeClient.Extensions().ReplicaSets(api.NamespaceAll).List(options)
            },
            WatchFunc: func(options api.ListOptions) (watch.Interface, error) {
                return rsc.kubeClient.Extensions().ReplicaSets(api.NamespaceAll).Watch(options)
            },
        },
        &extensions.ReplicaSet{},
        // TODO: Can we have much longer period here?
        FullControllerResyncPeriod,
        framework.ResourceEventHandlerFuncs{
            AddFunc:    rsc.enqueueReplicaSet,
            UpdateFunc: rsc.updateRS,
            // This will enter the sync loop and no-op, because the replica set has been deleted from the store.
            // Note that deleting a replica set immediately after scaling it to 0 will not work. The recommended
            // way of achieving this is by performing a `stop` operation on the replica set.
            DeleteFunc: rsc.enqueueReplicaSet,
        },
    )

Whenever the watch observes a change to an object, the controller takes the corresponding action; concretely, the object's key is added to the work queue, updated, or taken off the queue. Pods are handled in the same fashion.

podInformer.AddEventHandler(framework.ResourceEventHandlerFuncs{
        AddFunc: rsc.addPod,
        // This invokes the ReplicaSet for every pod change, eg: host assignment. Though this might seem like
        // overkill the most frequent pod update is status, and the associated ReplicaSet will only list from
        // local storage, so it should be ok.
        UpdateFunc: rsc.updatePod,
        DeleteFunc: rsc.deletePod,
    })
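
Putting these pieces together: every handler above ultimately just enqueues a namespace/name key, and a worker goroutine drains the queue and hands each key to syncHandler, which was pointed at rsc.syncReplicaSet when the controller was constructed. A condensed sketch of that pattern follows (method names are simplified; error handling and re-queueing are omitted):

// The event handlers only compute a namespace/name key and put it on the queue.
func (rsc *ReplicaSetController) enqueue(obj interface{}) {
    key, err := controller.KeyFunc(obj)
    if err != nil {
        return
    }
    rsc.queue.Add(key)
}

// A worker goroutine drains the queue and delegates to syncHandler, which was
// set to rsc.syncReplicaSet when the controller was built.
func (rsc *ReplicaSetController) runWorker() {
    for {
        key, quit := rsc.queue.Get()
        if quit {
            return
        }
        rsc.syncHandler(key.(string))
        rsc.queue.Done(key)
    }
}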

That covers the basics of the RS. Above the RS sits the Deployment, which is also a controller.

// DeploymentController is responsible for synchronizing Deployment objects stored
// in the system with actual running replica sets and pods.
type DeploymentController struct {
    client        clientset.Interface
    eventRecorder record.EventRecorder

    // To allow injection of syncDeployment for testing.
    syncHandler func(dKey string) error

    // A store of deployments, populated by the dController
    dStore cache.StoreToDeploymentLister
    // Watches changes to all deployments
    dController *framework.Controller
    // A store of ReplicaSets, populated by the rsController
    rsStore cache.StoreToReplicaSetLister
    // Watches changes to all ReplicaSets
    rsController *framework.Controller
    // A store of pods, populated by the podController
    podStore cache.StoreToPodLister
    // Watches changes to all pods
    podController *framework.Controller

    // dStoreSynced returns true if the Deployment store has been synced at least once.
    // Added as a member to the struct to allow injection for testing.
    dStoreSynced func() bool
    // rsStoreSynced returns true if the ReplicaSet store has been synced at least once.
    // Added as a member to the struct to allow injection for testing.
    rsStoreSynced func() bool
    // podStoreSynced returns true if the pod store has been synced at least once.
    // Added as a member to the struct to allow injection for testing.
    podStoreSynced func() bool

    // Deployments that need to be synced
    queue workqueue.RateLimitingInterface
}

A DeploymentController needs to watch Deployments, ReplicaSets and Pods, as the construction of the controller makes clear.

dc.dStore.Store, dc.dController = framework.NewInformer(
        &cache.ListWatch{
            ListFunc: func(options api.ListOptions) (runtime.Object, error) {
                return dc.client.Extensions().Deployments(api.NamespaceAll).List(options)
            },
            WatchFunc: func(options api.ListOptions) (watch.Interface, error) {
                return dc.client.Extensions().Deployments(api.NamespaceAll).Watch(options)
            },
        },
        &extensions.Deployment{},
        FullDeploymentResyncPeriod,
        framework.ResourceEventHandlerFuncs{
            AddFunc:    dc.addDeploymentNotification,
            UpdateFunc: dc.updateDeploymentNotification,
            // This will enter the sync loop and no-op, because the deployment has been deleted from the store.
            DeleteFunc: dc.deleteDeploymentNotification,
        },
    )

    dc.rsStore.Store, dc.rsController = framework.NewInformer(
        &cache.ListWatch{
            ListFunc: func(options api.ListOptions) (runtime.Object, error) {
                return dc.client.Extensions().ReplicaSets(api.NamespaceAll).List(options)
            },
            WatchFunc: func(options api.ListOptions) (watch.Interface, error) {
                return dc.client.Extensions().ReplicaSets(api.NamespaceAll).Watch(options)
            },
        },
        &extensions.ReplicaSet{},
        resyncPeriod(),
        framework.ResourceEventHandlerFuncs{
            AddFunc:    dc.addReplicaSet,
            UpdateFunc: dc.updateReplicaSet,
            DeleteFunc: dc.deleteReplicaSet,
        },
    )

    dc.podStore.Indexer, dc.podController = framework.NewIndexerInformer(
        &cache.ListWatch{
            ListFunc: func(options api.ListOptions) (runtime.Object, error) {
                return dc.client.Core().Pods(api.NamespaceAll).List(options)
            },
            WatchFunc: func(options api.ListOptions) (watch.Interface, error) {
                return dc.client.Core().Pods(api.NamespaceAll).Watch(options)
            },
        },
        &api.Pod{},
        resyncPeriod(),
        framework.ResourceEventHandlerFuncs{
            AddFunc:    dc.addPod,
            UpdateFunc: dc.updatePod,
            DeleteFunc: dc.deletePod,
        },
        cache.Indexers{cache.NamespaceIndex: cache.MetaNamespaceIndexFunc},
    )

    dc.syncHandler = dc.syncDeployment
    dc.dStoreSynced = dc.dController.HasSynced
    dc.rsStoreSynced = dc.rsController.HasSynced
    dc.podStoreSynced = dc.podController.HasSynced

The core of all this is syncDeployment, which contains the implementations of both rollingUpdate and rollback. If the RollbackTo field of a watched Deployment object is not nil, a rollback is performed; RollbackTo.Revision names the target version. Note that even though this is a rollback, the revision number recorded internally by Kubernetes always keeps increasing.
Some may wonder how rollback is achieved. The principle is simple: Kubernetes keeps the PodTemplate of every revision, so rolling back just means overwriting the current template with the old one.
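
A condensed sketch of that idea (the real code is in the deployment controller's rollback and rollbackToTemplate; rollbackToRevision and revisionOf below are hypothetical names, and parsing and error handling are omitted):

// revisionOf reads the revision annotation the deployment controller stamps
// onto each ReplicaSet (hypothetical helper; parsing errors ignored).
func revisionOf(rs *extensions.ReplicaSet) int64 {
    v, _ := strconv.ParseInt(rs.Annotations["deployment.kubernetes.io/revision"], 10, 64)
    return v
}

// rollbackToRevision is a simplified sketch: find the ReplicaSet whose recorded
// revision matches RollbackTo.Revision and copy its Pod template back into the
// Deployment spec. The normal rolling-update machinery then rolls "forward" to
// it, which is why the internally recorded revision keeps increasing.
func rollbackToRevision(d *extensions.Deployment, allRSs []*extensions.ReplicaSet) {
    if d.Spec.RollbackTo == nil {
        return
    }
    target := d.Spec.RollbackTo.Revision
    for _, rs := range allRSs {
        if revisionOf(rs) != target {
            continue
        }
        d.Spec.Template.ObjectMeta = rs.Spec.Template.ObjectMeta
        d.Spec.Template.Spec = rs.Spec.Template.Spec
        return
    }
}

The real code then clears RollbackTo so the rollback is performed only once.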
For Kubernetes there are two upgrade strategies: recreate everything (Recreate) or perform a rolling upgrade (RollingUpdate).

switch d.Spec.Strategy.Type {
    case extensions.RecreateDeploymentStrategyType:
        return dc.rolloutRecreate(d)
    case extensions.RollingUpdateDeploymentStrategyType:
        return dc.rolloutRolling(d)
}

rolloutRolling holds all the secrets; here is what we can see inside it.

func (dc *DeploymentController) rolloutRolling(deployment *extensions.Deployment) error {
    newRS, oldRSs, err := dc.getAllReplicaSetsAndSyncRevision(deployment, true)
    if err != nil {
        return err
    }
    allRSs := append(oldRSs, newRS)

    // Scale up, if we can.
    scaledUp, err := dc.reconcileNewReplicaSet(allRSs, newRS, deployment)
    if err != nil {
        return err
    }
    if scaledUp {
        // Update DeploymentStatus
        return dc.updateDeploymentStatus(allRSs, newRS, deployment)
    }

    // Scale down, if we can.
    scaledDown, err := dc.reconcileOldReplicaSets(allRSs, controller.FilterActiveReplicaSets(oldRSs), newRS, deployment)
    if err != nil {
        return err
    }
    if scaledDown {
        // Update DeploymentStatus
        return dc.updateDeploymentStatus(allRSs, newRS, deployment)
    }

    dc.cleanupDeployment(oldRSs, deployment)

    // Sync deployment status
    return dc.syncDeploymentStatus(allRSs, newRS, deployment)
}

This does the following:
1. Look up the new RS and the old RSs, and compute the new revision (the maximum of the existing revisions);
2. Scale up the new RS if possible (a sketch of the scale-up bound follows this list);
3. Scale down the old RSs if possible;
4. Once finished, clean up (delete) the old RSs;
5. Sync the Deployment status to etcd.
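
For step 2 the amount to scale up is bounded by maxSurge: the total number of Pods across old and new ReplicaSets may exceed the desired replica count by at most maxSurge. Here is a plain-integer approximation of that bound (the real computation lives in the deployment utilities and also resolves percentage values and handles error cases):

// newRSTargetReplicas returns the replica count the new ReplicaSet should be
// scaled to, given the desired count, the surge allowance, the current total
// across all ReplicaSets, and the new ReplicaSet's current count.
func newRSTargetReplicas(desired, maxSurge, currentTotal, currentNew int32) int32 {
    maxTotal := desired + maxSurge
    if currentTotal >= maxTotal {
        // No headroom left: wait for the old ReplicaSets to be scaled down first.
        return currentNew
    }
    scaleUp := maxTotal - currentTotal
    if scaleUp > desired-currentNew {
        scaleUp = desired - currentNew
    }
    return currentNew + scaleUp
}

When scaling the old ReplicaSets down (step 3), the mirror-image bound is applied with maxUnavailable so that availability never drops below the configured floor.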

At this point we know how rolling upgrades work inside Kubernetes. Rolling upgrades behind a traditional load balancer work in a very similar way, but in a container environment the ReplicaSet makes the approach much more convenient.