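Kubernetes Pod Probes Explained

阿新 • Published: 2021-07-01

Introduction

Probes mark the key points in a Pod's lifecycle: when a container counts as successfully started, when it can receive traffic, and when it needs to be restarted. These three functions correspond to the three probe types (startup, readiness, and liveness).

Pod lifecycle phases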

Value	Description
Pending	The Pod has been accepted by the Kubernetes cluster, but one or more of the containers has not been set up and made ready to run. This includes time a Pod spends waiting to be scheduled as well as the time spent downloading container images over the network.
Running	The Pod has been bound to a node, and all of the containers have been created. At least one container is still running, or is in the process of starting or restarting.
Succeeded	All containers in the Pod have terminated in success, and will not be restarted.
Failed	All containers in the Pod have terminated, and at least one container has terminated in failure. That is, the container either exited with non-zero status or was terminated by the system.
Unknown	For some reason the state of the Pod could not be obtained. This phase typically occurs due to an error in communicating with the node where the Pod should be running.

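Container states

Waiting: A container that is in neither the Running nor the Terminated state.
Running: The container is executing without errors. If a postStart hook is configured, it has already run and finished successfully.
Terminated: The container ran to completion or failed. If a preStop hook is configured, it has already run before the container enters this state.

Container restart policy

spec.restartPolicy has three possible values: Always, OnFailure, and Never. The default is Always. The kubelet restarts containers with an exponentially growing back-off delay (10s, 20s, 40s, ...), capped at 5 minutes. Once a container has run for 10 minutes without any problems, the kubelet resets that container's restart back-off timer.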
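
The back-off schedule described above can be sketched in a few lines of Python. This is only an illustration, not kubelet code: the function name restart_delays and its parameters are hypothetical, and the sketch models just the doubling and the 5-minute cap, not the 10-minute reset.

```python
def restart_delays(restarts, base=10, cap=300):
    """Return the delay (seconds) before each of the first `restarts` restarts.

    Hypothetical helper: doubles the delay each time and caps it at `cap`.
    """
    delays = []
    delay = base
    for _ in range(restarts):
        delays.append(min(delay, cap))
        delay *= 2
    return delays


print(restart_delays(7))  # [10, 20, 40, 80, 160, 300, 300]
```

After the fifth restart the computed delay (320s) exceeds the cap, so every later delay stays at 300 seconds.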

readinessGates

FEATURE STATE: Kubernetes v1.14 [stable]
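You can add extra Pod conditions that are evaluated as part of Pod readiness. Every condition listed in readinessGates must also be set in the Pod's status; if a condition is missing from status.conditions, its status defaults to False.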

kind: Pod
...
spec:
  readinessGates:
    - conditionType: "www.example.com/feature-1"
status:
  conditions:
    - type: Ready                              # a built in PodCondition
      status: "False"
      lastProbeTime: null
      lastTransitionTime: 2018-01-01T00:00:00Z
    - type: "www.example.com/feature-1"        # an extra PodCondition
      status: "False"
      lastProbeTime: null
      lastTransitionTime: 2018-01-01T00:00:00Z
  containerStatuses:
    - containerID: docker://abcd...
      ready: true
...
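
The readiness rule that the manifest above illustrates can be sketched as follows: the Pod is Ready only when all of its containers are ready and every condition named in readinessGates is "True". The function pod_is_ready and its argument shapes are assumptions made for this sketch, not a Kubernetes API.

```python
def pod_is_ready(container_ready_flags, readiness_gates, conditions):
    """Simplified readiness rule (hypothetical helper).

    container_ready_flags: list of booleans, one per container.
    readiness_gates: list of conditionType strings from spec.readinessGates.
    conditions: dict mapping condition type -> status string ("True"/"False").
    A gate that is missing from `conditions` counts as "False".
    """
    gates_ok = all(conditions.get(gate) == "True" for gate in readiness_gates)
    return all(container_ready_flags) and gates_ok


gate = "www.example.com/feature-1"
# Same situation as the manifest above: the container is ready, the gate is "False".
print(pod_is_ready([True], [gate], {gate: "False"}))  # False
print(pod_is_ready([True], [gate], {gate: "True"}))   # True
```

This is why the Ready condition in the example is "False" even though the container itself reports ready: true.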

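Probe handler types

ExecAction

Runs a specified command inside the container. The diagnostic succeeds if the command exits with status code 0.

TCPSocketAction

Performs a TCP check against the Pod's IP address on the specified port. The diagnostic succeeds if the port is open.

HTTPGetAction

Performs an HTTP GET request. The diagnostic succeeds if the response status code is at least 200 and less than 400.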
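
The success criteria of the three handlers can be approximated with the Python standard library. This is a rough sketch under stated assumptions: exec_check, tcp_check, and http_status_ok are hypothetical names, the exec check runs the command locally rather than inside a container, and none of this is how the kubelet is actually implemented.

```python
import socket
import subprocess
import sys


def exec_check(command):
    """ExecAction analogue: success means the command exits with status 0."""
    return subprocess.run(command, capture_output=True).returncode == 0


def tcp_check(host, port, timeout=1.0):
    """TCPSocketAction analogue: success means the TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_status_ok(status_code):
    """HTTPGetAction rule: success means 200 <= status code < 400."""
    return 200 <= status_code < 400


print(exec_check([sys.executable, "-c", "raise SystemExit(0)"]))  # True
print(http_status_ok(302), http_status_ok(404))  # True False
```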

Probe outcomes

Each probe has one of the following three results:
Success: The container passed the diagnostic.
Failure: The container failed the diagnostic.
Unknown: The diagnostic failed, so no action should be taken.

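Probe types

livenessProbe

FEATURE STATE: Kubernetes v1.0 [stable]
Indicates whether the container is alive. If the probe fails, the kubelet kills the container, and the container is then restarted according to its restartPolicy. If no liveness probe is configured, the default state is Success.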

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-http
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/liveness
    args:
    - /server
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
        httpHeaders:
        - name: Custom-Header
          value: Awesome
      initialDelaySeconds: 3
      periodSeconds: 3

apiVersion: v1
kind: Pod
metadata:
  name: goproxy
  labels:
    app: goproxy
spec:
  containers:
  - name: goproxy
    image: k8s.gcr.io/goproxy:0.1
    ports:
    - containerPort: 8080
    readinessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 20

readinessProbe

FEATURE STATE: Kubernetes v1.0 [stable]
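Indicates whether the container is ready to receive requests. If the probe fails, the Pod's IP address is removed from the endpoints of every Service that matches the Pod. If a readiness probe is configured, the state before its first run defaults to Failure. If no readiness probe is configured, the default state is Success.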

readinessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy
  initialDelaySeconds: 5
  periodSeconds: 5

startupProbe

FEATURE STATE: Kubernetes v1.20 [stable]
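Indicates whether the application in the container has started. Until this probe succeeds, all other probes are disabled. If the probe fails, the kubelet kills the container, which is then restarted according to its restartPolicy. If no startup probe is configured, the default state is Success.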

ports:
- name: liveness-port
  containerPort: 8080
  hostPort: 8080

livenessProbe:
  httpGet:
    path: /healthz
    port: liveness-port
  failureThreshold: 1
  periodSeconds: 10

startupProbe:
  httpGet:
    path: /healthz
    port: liveness-port
  failureThreshold: 30
  periodSeconds: 10
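
With the startupProbe values above, the application has at most failureThreshold * periodSeconds seconds to finish starting before the kubelet gives up and kills the container. The arithmetic is small enough to check directly; max_startup_seconds is a hypothetical helper name, and the sketch assumes initialDelaySeconds is 0, as in the example.

```python
def max_startup_seconds(failure_threshold, period_seconds):
    """Upper bound on startup time allowed by a startup probe (sketch)."""
    return failure_threshold * period_seconds


print(max_startup_seconds(30, 10))  # 300
```

Once the startup probe has succeeded, the liveness probe takes over; with failureThreshold: 1 and periodSeconds: 10 it reacts to a deadlock after a single failed check.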

Probe parameters

Exec probe parameters (these fields apply to all probe types)

initialDelaySeconds: Number of seconds after the container has started before liveness or readiness probes are initiated. Defaults to 0 seconds. Minimum value is 0.
periodSeconds: How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1.
timeoutSeconds: Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1.
successThreshold: Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup Probes. Minimum value is 1.
failureThreshold: When a probe fails, Kubernetes will try failureThreshold times before giving up. Giving up in case of liveness probe means restarting the container. In case of readiness probe the Pod will be marked Unready. Defaults to 3. Minimum value is 1.
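
How successThreshold and failureThreshold interact is easier to see by replaying a sequence of probe results. The sketch below is a simplified model, not kubelet code: the function replay and its parameters are hypothetical, and the reported state flips only after enough consecutive identical results.

```python
def replay(results, success_threshold=1, failure_threshold=3, initial_state=True):
    """Replay probe results (True = success) and return the state after each one.

    The state changes to a result only once that result has been observed
    `success_threshold` or `failure_threshold` times in a row.
    """
    state = initial_state
    last = None
    run = 0
    states = []
    for ok in results:
        if ok == last:
            run += 1
        else:
            last, run = ok, 1
        threshold = success_threshold if ok else failure_threshold
        if run >= threshold:
            state = ok
        states.append(state)
    return states


# Two failures are tolerated; the third consecutive failure flips the state.
print(replay([True, False, False, True, False, False, False]))
# [True, True, True, True, True, True, False]
```

For a liveness probe the final False is the point at which the container would be restarted; for a readiness probe it is the point at which the Pod would be marked Unready.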

Note:
Before Kubernetes 1.20, the field timeoutSeconds was not respected for exec probes: probes continued running indefinitely, even past their configured deadline, until a result was returned.

This defect was corrected in Kubernetes v1.20. You may have been relying on the previous behavior, even without realizing it, as the default timeout is 1 second. As a cluster administrator, you can disable the feature gate ExecProbeTimeout (set it to false) on each kubelet to restore the behavior from older versions, then remove that override once all the exec probes in the cluster have a timeoutSeconds value set.
If you have pods that are impacted from the default 1 second timeout, you should update their probe timeout so that you're ready for the eventual removal of that feature gate.

With the fix of the defect, for exec probes, on Kubernetes 1.20+ with the dockershim container runtime, the process inside the container may keep running even after probe returned failure because of the timeout.

Caution: Incorrect implementation of readiness probes may result in an ever growing number of processes in the container, and resource starvation if this is left unchecked.

HTTP probe parameters

host: Host name to connect to, defaults to the pod IP. You probably want to set "Host" in httpHeaders instead.
scheme: Scheme to use for connecting to the host (HTTP or HTTPS). Defaults to HTTP.
path: Path to access on the HTTP server. Defaults to /.
httpHeaders: Custom headers to set in the request. HTTP allows repeated headers.
port: Name or number of the port to access on the container. Number must be in the range 1 to 65535.

For an HTTP probe, the kubelet sends an HTTP request to the specified path and port to perform the check. The kubelet sends the probe to the pod's IP address, unless the address is overridden by the optional host field in httpGet. If scheme field is set to HTTPS, the kubelet sends an HTTPS request skipping the certificate verification. In most scenarios, you do not want to set the host field. Here's one scenario where you would set it. Suppose the container listens on 127.0.0.1 and the Pod's hostNetwork field is true. Then host, under httpGet, should be set to 127.0.0.1. If your pod relies on virtual hosts, which is probably the more common case, you should not use host, but rather set the Host header in httpHeaders.

For an HTTP probe, the kubelet sends two request headers in addition to the mandatory Host header: User-Agent and Accept. The default values for these headers are kube-probe/1.21 (where 1.21 is the version of the kubelet) and */* respectively.

You can override the default headers by defining .httpHeaders for the probe; for example:

livenessProbe:
  httpGet:
    httpHeaders:
      - name: Accept
        value: application/json

startupProbe:
  httpGet:
    httpHeaders:
      - name: User-Agent
        value: MyUserAgent

You can also remove these two headers by defining them with an empty value.

livenessProbe:
  httpGet:
    httpHeaders:
      - name: Accept
        value: ""

startupProbe:
  httpGet:
    httpHeaders:
      - name: User-Agent
        value: ""
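
The override and removal rules for the two default headers can be modeled as a small merge function. This is a sketch under assumptions: effective_headers is a hypothetical name, header names are matched case-sensitively, and repeated headers are not modeled, although HTTP allows them.

```python
def effective_headers(http_headers, kubelet_version="1.21"):
    """Merge probe httpHeaders over the two defaults (simplified model).

    http_headers: list of {"name": ..., "value": ...} dicts, as in the YAML.
    An empty value for a default header removes that header.
    """
    defaults = {"User-Agent": f"kube-probe/{kubelet_version}", "Accept": "*/*"}
    headers = dict(defaults)
    for header in http_headers:
        name, value = header["name"], header["value"]
        if name in defaults and value == "":
            headers.pop(name, None)
        else:
            headers[name] = value
    return headers


print(effective_headers([{"name": "Accept", "value": "application/json"}]))
# {'User-Agent': 'kube-probe/1.21', 'Accept': 'application/json'}
print(effective_headers([{"name": "User-Agent", "value": ""}]))
# {'Accept': '*/*'}
```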

TCP probe parameters

For a TCP probe, the kubelet makes the probe connection at the node, not in the pod, which means that you cannot use a service name in the host parameter since the kubelet is unable to resolve it.

Probe-level terminationGracePeriodSeconds

FEATURE STATE: Kubernetes v1.21 [alpha]
Prior to release 1.21, the pod-level terminationGracePeriodSeconds was used for terminating a container that failed its liveness or startup probe. This coupling was unintended and may have resulted in failed containers taking an unusually long time to restart when a pod-level terminationGracePeriodSeconds was set.

In 1.21, when the feature flag ProbeTerminationGracePeriod is enabled, users can specify a probe-level terminationGracePeriodSeconds as part of the probe specification. When the feature flag is enabled, and both a pod- and probe-level terminationGracePeriodSeconds are set, the kubelet will use the probe-level value.

For example,

spec:
  terminationGracePeriodSeconds: 3600  # pod-level
  containers:
  - name: test
    image: ...

    ports:
    - name: liveness-port
      containerPort: 8080
      hostPort: 8080

    livenessProbe:
      httpGet:
        path: /healthz
        port: liveness-port
      failureThreshold: 1
      periodSeconds: 60
      # Override pod-level terminationGracePeriodSeconds #
      terminationGracePeriodSeconds: 60

Probe-level terminationGracePeriodSeconds cannot be set for readiness probes. It will be rejected by the API server.
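
The precedence rule described in this section can be stated as a tiny function. It is only an illustration: effective_grace_period and its parameters are hypothetical names, and gate_enabled stands for the ProbeTerminationGracePeriod feature flag.

```python
def effective_grace_period(pod_level, probe_level=None, gate_enabled=True):
    """Grace period used when a liveness or startup probe failure kills a container.

    The probe-level value wins only if the feature flag is enabled and the
    probe actually sets terminationGracePeriodSeconds.
    """
    if gate_enabled and probe_level is not None:
        return probe_level
    return pod_level


# Values from the example above: pod-level 3600, probe-level 60.
print(effective_grace_period(3600, 60))                      # 60
print(effective_grace_period(3600, 60, gate_enabled=False))  # 3600
```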

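Termination of Pods: the flow

You delete a Pod with kubectl (the default grace period is 30 seconds).
The API server marks the Pod as "Terminating". Once the kubelet sees this state, it begins shutting the Pod down locally:
- If a preStop hook is defined, the kubelet runs it. If the preStop hook is still running when the grace period expires, the kubelet grants a one-off 2-second extension. If the preStop hook is known to need longer, raise terminationGracePeriodSeconds.
- The kubelet triggers the container runtime to send a TERM signal to process 1 in each container. Note: the order in which containers receive TERM is not guaranteed; if ordering matters, consider handling it in a preStop hook.
At the same time, the control plane removes the Pod from the endpoints of matching Services.
When the grace period expires, the kubelet triggers the container runtime to send SIGKILL to every process still running in the Pod's containers. The kubelet also cleans up the pause container.
The kubelet triggers forcible removal of the Pod object from the API server by setting the grace period to 0 (immediate deletion).
The API server deletes the Pod object, and the Pod is no longer visible through kubectl.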

Pod GC

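A controller in the control plane cleans up terminated Pods (phase Succeeded or Failed) once their number exceeds the threshold set by the kube-controller-manager flag --terminated-pod-gc-threshold.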

Hooks

apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-demo
spec:
  containers:
  - name: lifecycle-demo-container
    image: nginx
    lifecycle:
      postStart:
        exec:
          command: ["/bin/sh", "-c", "echo Hello from the postStart handler > /usr/share/message"]
      preStop:
        exec:
          command: ["/bin/sh","-c","nginx -s quit; while killall -0 nginx; do sleep 1; done"]

References

https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/
https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/
https://kubernetes.io/docs/tasks/run-application/force-delete-stateful-set-pod/
https://kubernetes.io/docs/tasks/configure-pod-container/attach-handler-lifecycle-event/
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
