Kubernetes Pod Affinity

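Kubernetes provides three ways to control which nodes a pod is scheduled onto. The descriptions below mainly follow the official documentation:

nodeSelector (directed scheduling), nodeAffinity (node affinity), and podAffinity (pod affinity).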

1.      nodeSelector

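nodeSelector is the simplest placement constraint: the pod names one or more node labels and is scheduled only onto nodes that carry all of them.

①   Add a label to the node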

kubectl label nodes <node-name> <label-key>=<label-value>

②   Add a nodeSelector to the pod spec

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  nodeSelector:
    disktype: ssd

③   Deploy the pod (for example, with kubectl apply -f on the manifest above)

2.      nodeAffinity

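This feature is an improvement over nodeSelector and was in beta at the time of writing.

The main improvements are:

-       A more expressive syntax (not just "AND" of exact matches)

-       Rules can be soft preferences as well as hard requirements

-       Support for pod affinity (constraints based on the labels of pods already running, not only on node labels)

According to the documentation at the time, nodeSelector was expected to be deprecated once nodeAffinity matured.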

requiredDuringSchedulingIgnoredDuringExecution   # hard requirement (must be met)

preferredDuringSchedulingIgnoredDuringExecution  # soft preference (met if possible)

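IgnoredDuringExecution means that if the labels on a node change while a pod is running there, so that the node no longer satisfies the pod's node affinity rules, the system ignores the label change and the pod keeps running on that node.

If both nodeSelector and nodeAffinity are specified, a node must satisfy both to be a candidate.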
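
As a minimal sketch of combining the two mechanisms, the manifest below sets both fields on one pod. The pod name is hypothetical, and the disktype and kubernetes.io/e2e-az-name labels are simply reused from the other examples in this article. A node is a candidate only if it carries disktype=ssd and also has kubernetes.io/e2e-az-name set to e2e-az1 or e2e-az2.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: selector-and-affinity   # hypothetical name
spec:
  nodeSelector:
    disktype: ssd               # condition 1: exact label match
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/e2e-az-name   # condition 2: checked in addition to nodeSelector
            operator: In
            values:
            - e2e-az1
            - e2e-az2
  containers:
  - name: nginx
    image: nginx
```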

 

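Consider the following example:

①     The pod can only be scheduled onto nodes whose label kubernetes.io/e2e-az-name has the value e2e-az1 or e2e-az2. Among those nodes, the ones labeled another-node-label-key=another-node-label-value are preferred; if no node carries that label, the pod is still scheduled onto another node that meets the hard requirement.

②     The supported operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt. NotIn and DoesNotExist can be used to express node anti-affinity; alternatively, node taints combined with pod tolerations can keep pods away from particular nodes.

③     If multiple nodeSelectorTerms are specified, a node needs to match only one of them to be a candidate.

④     If a term contains multiple matchExpressions, a node must match all of them to be a candidate.

⑤     weight ranges from 1 to 100. When there are several soft conditions, the scheduler adds up the weights of the conditions that each node satisfies, and the node with the highest total is the most preferred candidate.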

pods/pod-with-node-affinity.yaml

apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution: # hard condition: must match
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/e2e-az-name
            operator: In # supported: In, NotIn, Exists, DoesNotExist, Gt, Lt
            values:
            - e2e-az1
            - e2e-az2
      preferredDuringSchedulingIgnoredDuringExecution: # soft condition: matched preferentially
      - weight: 1  # valid range: 1-100
        preference:
          matchExpressions:
          - key: another-node-label-key
            operator: In
            values:
            - another-node-label-value
  containers:
  - name: with-node-affinity
    image: k8s.gcr.io/pause:2.0
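
The matching rules for multiple nodeSelectorTerms, multiple matchExpressions, and weights can be sketched in a single manifest. The pod name and all label keys and values here (disktype, dedicated, gpu, zone) are hypothetical, chosen only to illustrate the semantics.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: affinity-semantics   # hypothetical name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        # Term A: both expressions must match (AND)
        - matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
          - key: dedicated
            operator: DoesNotExist   # node anti-affinity: skip nodes that carry this label
        # Term B: an alternative; matching either term A or term B is enough (OR)
        - matchExpressions:
          - key: gpu
            operator: Exists
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80
        preference:
          matchExpressions:
          - key: zone
            operator: In
            values:
            - zone-a
      - weight: 20
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
      # Affinity score per eligible node = sum of the matched weights:
      #   zone=zone-a and disktype=ssd -> 80 + 20 = 100
      #   zone=zone-a only             -> 80
      #   disktype=ssd only            -> 20
      #   neither                      -> 0
  containers:
  - name: pause
    image: k8s.gcr.io/pause:2.0
```

The affinity score is only one input to the scheduler, which combines it with its other scoring functions, so the node with the highest sum is favored rather than guaranteed.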

3.      Inter-pod affinity and anti-affinity (beta feature)

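Pod affinity and anti-affinity let the scheduler choose candidate nodes based on the labels of pods already running on those nodes, rather than on the labels of the nodes themselves.

Because pods are namespaced, a pod affinity rule implicitly applies within a namespace: by default, the label selector matches only pods in the namespace of the pod being scheduled.

topologyKey defines the scope (the topology domain) of the rule and is given as the key of a node label; nodes that share the same value for that label belong to the same domain.

A namespaces list can also be used to restrict which pods the scheduler considers when matching. namespaces sits at the same level as labelSelector and topologyKey, for example: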

        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: appname
                operator: In
                values:
                - dbpool-server
            topologyKey: kubernetes.io/hostname
            namespaces:  # only pods in the poa-ea and pletest namespaces are considered
            - poa-ea
            - pletest
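
To illustrate how topologyKey sets the scope of a rule, the fragment below keeps the same label selector but uses a zone label as the key and expresses the rule as a soft preference. With kubernetes.io/hostname the rule reads "avoid the node that already runs a matching pod"; with a zone key it reads "avoid the whole zone that already runs one". The key topology.kubernetes.io/zone is the current well-known zone label and is an assumption here, since the examples in this article use the older failure-domain.beta.kubernetes.io/zone.

```yaml
podAntiAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
  - weight: 100
    podAffinityTerm:
      labelSelector:
        matchExpressions:
        - key: appname
          operator: In
          values:
          - dbpool-server
      # All nodes with the same value for this label form one domain,
      # so the scheduler tries to avoid every node in a zone that
      # already runs a pod labeled appname=dbpool-server.
      topologyKey: topology.kubernetes.io/zone
```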

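Note: inter-pod affinity and anti-affinity require a substantial amount of processing and can noticeably slow down scheduling. They are not recommended for clusters with more than a few hundred nodes.

Note: pod anti-affinity requires a topologyKey to be specified.

Consider the following example, keeping in mind these constraints on topologyKey:

①   For security reasons, topologyKey must not be empty for requiredDuringSchedulingIgnoredDuringExecution pod anti-affinity.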

②   For requiredDuringSchedulingIgnoredDuringExecution pod anti-affinity, the admission controller LimitPodHardAntiAffinityTopology was introduced to limit topologyKey to kubernetes.io/hostname. If you want to make it available for custom topologies, you may modify the admission controller, or simply disable it.

③   For preferredDuringSchedulingIgnoredDuringExecution pod anti-affinity, empty topologyKey is interpreted as “all topologies” (“all topologies” here is now limited to the combination of kubernetes.io/hostname, failure-domain.beta.kubernetes.io/zone and failure-domain.beta.kubernetes.io/region).

pods/pod-with-pod-affinity.yaml  

apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: security
            operator: In
            values:
            - S1
        topologyKey: failure-domain.beta.kubernetes.io/zone
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: security
              operator: In
              values:
              - S2
          topologyKey: kubernetes.io/hostname
  containers:
  - name: with-pod-affinity
    image: k8s.gcr.io/pause:2.0

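4.      Use case

Requirement: a web-server application runs three instances and uses redis as its cache. Each web-server instance should be scheduled onto the same node as a redis instance.

①   Deploy redis with the label app=store, which the web-server's affinity rule later uses to land on the same nodes. The podAntiAffinity rule keeps the redis replicas on separate nodes.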

apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-cache
spec:
  selector:
    matchLabels:
      app: store
  replicas: 3
  template:
    metadata:
      labels:
        app: store
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - store
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: redis-server
        image: redis:3.2-alpine

②   Deploy web-server so that each replica is placed on a node that runs a redis pod, while no two web-server replicas share a node.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
spec:
  selector:
    matchLabels:
      app: web-store
  replicas: 3
  template:
    metadata:
      labels:
        app: web-store
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - web-store
            topologyKey: "kubernetes.io/hostname"
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - store
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: web-app
        image: nginx:1.12-alpine
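
With the required rules above, a replica stays Pending whenever no node satisfies them, for example when the cluster has fewer eligible nodes than replicas. A softer variant, sketched here as an assumption rather than as part of the original use case, replaces the web-server's podAntiAffinity with a preferred rule: replicas spread across nodes when possible but may share a node otherwise. The podAffinity rule toward app=store would stay unchanged.

```yaml
# Drop-in replacement for the podAntiAffinity section of the web-server Deployment
podAntiAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
  - weight: 100               # strongest preference allowed (range 1-100)
    podAffinityTerm:
      labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - web-store
      topologyKey: "kubernetes.io/hostname"
```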

5.      References

http://blog.51cto.com/newfly/2066630

https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity