Adding Health Checks to Docker Containers and K8s

阿新 • Published: 2020-10-29

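After a Docker container starts, how do you confirm that it is running properly and is ready to serve traffic? That is what health checks are for. I used to pay little attention to them, on the assumption that an image that starts is healthy and that anything wrong would show up as a failed run. After receiving two startup-failure issues in a row, I decided to fix this.

The problem: a web service depends on a mongo container, and both are started through docker-compose. Even with depends_on set, the web service sometimes starts and tries to connect before the db instance inside the mongo container has finished initializing, and the connection fails.

```
version: '3.1'
services:
  mongo:
    image: mongo:4
    restart: always
    environment:
      MONGO_INITDB_ROOT_USERNAME: root
      MONGO_INITDB_ROOT_PASSWORD: example
      MONGO_INITDB_DATABASE: yapi
    volumes:
      - ./mongo-conf:/docker-entrypoint-initdb.d
      - ./mongo/etc:/etc/mongo
      - ./mongo/data/db:/data/db
  yapi:
    build:
      context: ./
      dockerfile: Dockerfile
    image: yapi
    # Use this on the first start
    # command: "yapi server"
    # Afterwards, use the command below
    command: "node /my-yapi/vendors/server/app.js"
    depends_on:
      - mongo
```

In theory, the yapi service starts only after the mongo service has started and its status is up. But people really did hit this problem, so let's look at the fix. The official documentation says depends_on does not wait for the db to be ready, and, hmm, it does not spell out what the criterion for depends_on actually is. Is it the dependency's status being up?

![](https://img2020.cnblogs.com/blog/686418/202010/686418-20201027222621735-1749783539.png)

The docs say depends_on waits for the dependency to reach the running state. If a container that is still starting up counts as running, the db may well not be ready yet. The official position is that service and database dependencies are a distributed-systems concern: the service should handle network problems itself, since the db can drop at any time, and should configure its own reconnection strategy. The recommendation is to check whether the db is up before starting the service, waiting in a ping-style loop, with a script such as `wait-for-it.sh` as a pre-start dependency check.

docker-compose.yml

```
version: "2"
services:
  web:
    build: .
    ports:
      - "80:8000"
    depends_on:
      - "db"
    command: ["./wait-for-it.sh", "db:5432", "--", "python", "app.py"]
  db:
    image: postgres
```

A hand-written equivalent for Postgres is shown below. Note that it takes different arguments from `wait-for-it.sh`: the host comes first, followed by the command to run, for example `./wait-for-postgres.sh db python app.py`.

wait-for-postgres.sh

```
#!/bin/sh
# wait-for-postgres.sh

set -e

# First argument is the database host; the rest is the command to run.
host="$1"
shift

# Retry once per second until psql can connect and quit cleanly.
until PGPASSWORD=$POSTGRES_PASSWORD psql -h "$host" -U "postgres" -c '\q'; do
  >&2 echo "Postgres is unavailable - sleeping"
  sleep 1
done

>&2 echo "Postgres is up - executing command"
# "$@" keeps each argument intact, including arguments that contain spaces.
exec "$@"
```

## Adding a Health Check in a Dockerfile

Back to the title: the problem above reminded me of health checks, hence this write-up. Here is how to do health checks when working with container images.

![](https://img2020.cnblogs.com/blog/686418/202010/686418-20201028212425555-1972730675.png)

A Dockerfile can include a HEALTHCHECK instruction, which runs the command that follows it. If the command succeeds, the container is considered healthy.

```
# Run command inside the container: exit code 0 means healthy, 1 means unhealthy
HEALTHCHECK [OPTIONS] CMD command

# Disable any health check inherited from the base image
HEALTHCHECK NONE
```

options

- `--interval=DURATION` (default: 30s) time between health checks
- `--timeout=DURATION` (default: 30s) how long the command may run before the check counts as failed
- `--start-period=DURATION` (default: 0s) grace period after the container starts; failed checks during this period do not count toward the retry limit
- `--retries=N` (default: 3) the container is marked unhealthy after N consecutive failures

An example that checks port 80:

```
HEALTHCHECK --interval=5m --timeout=3s \
  CMD curl -f http://localhost/ || exit 1
```

## Configuring a Health Check in docker-compose.yml

Add a healthcheck node to docker-compose.yml. Its contents mirror the Dockerfile instruction.

```
version: '3.1'
services:
  mongo:
    image: mongo:4
    healthcheck:
      # CMD-SHELL runs the string through a shell, which the pipe requires;
      # plain CMD would look for an executable with this whole string as its name.
      test: ["CMD-SHELL", "netstat -anp | grep 27017"]
      interval: 2m
      timeout: 10s
      retries: 3
```

## Official Health Check Examples from docker-library

On GitHub I found the healthcheck project under docker-library. For example, a health check for mongo can be done like this:

Dockerfile

```
FROM mongo
COPY docker-healthcheck /usr/local/bin/
HEALTHCHECK CMD ["docker-healthcheck"]
```

docker-healthcheck

```
#!/bin/bash
set -eo pipefail

host="$(hostname --ip-address || echo '127.0.0.1')"

if mongo --quiet "$host/test" --eval 'quit(db.runCommand({ ping: 1 }).ok ? 0 : 2)'; then
    exit 0
fi

exit 1
```

Similarly, for mysql:

```
#!/bin/bash
set -eo pipefail

if [ "$MYSQL_RANDOM_ROOT_PASSWORD" ] && [ -z "$MYSQL_USER" ] && [ -z "$MYSQL_PASSWORD" ]; then
    # there's no way we can guess what the random MySQL password was
    echo >&2 'healthcheck error: cannot determine random root password (and MYSQL_USER and MYSQL_PASSWORD were not set)'
    exit 0
fi

host="$(hostname --ip-address || echo '127.0.0.1')"
user="${MYSQL_USER:-root}"
export MYSQL_PWD="${MYSQL_PASSWORD:-$MYSQL_ROOT_PASSWORD}"

args=(
    # force mysql to not use the local "mysqld.sock" (test "external" connectibility)
    -h"$host"
    -u"$user"
    --silent
)

if command -v mysqladmin &> /dev/null; then
    if mysqladmin "${args[@]}" ping > /dev/null; then
        exit 0
    fi
else
    if select="$(echo 'SELECT 1' | mysql "${args[@]}")" && [ "$select" = '1' ]; then
        exit 0
    fi
fi

exit 1
```

redis

```
#!/bin/bash
set -eo pipefail

host="$(hostname -i || echo '127.0.0.1')"

if ping="$(redis-cli -h "$host" ping)" && [ "$ping" = 'PONG' ]; then
    exit 0
fi

exit 1
```

## Health Checks in K8s

In practice, we more often rely on k8s health checks to mark whether a container is healthy. k8s uses **Liveness** and **Readiness** probes to set up finer-grained health checks, which makes the following possible:

- Zero-downtime deployments.
- Avoiding the rollout of broken images.
- Safer rolling upgrades.

Each container runs a process when it starts, specified by CMD or ENTRYPOINT in the Dockerfile. If the process exits with a non-zero return code, the container is considered to have failed, and Kubernetes restarts it according to restartPolicy.

When creating a Pod, you can probe the containers inside it in two ways: liveness and readiness. A liveness probe checks whether the application in the container is still alive. If the check fails, the container process is killed, and whether the container is restarted depends on the Pod's restart policy. A readiness probe checks whether the application in the container can serve requests. If the probe fails, the Endpoint Controller removes the Pod's IP from the Service.

A probe can check in three ways:

- exec: run a command
- HTTPGet: send an HTTP request and inspect the returned status code
- tcpSocket: test whether a port accepts connections

Each check can return one of three states:

- Success: the health check passed
- Failure: the health check did not pass
- Unknown: the check itself could not be carried out

### Container Exec

nginx_pod_exec.yaml:

```
apiVersion: v1
kind: Pod
metadata:
  name: test-exec
  labels:
    app: web
spec:
  containers:
    - name: nginx
      image: 192.168.56.201:5000/nginx:1.13
      ports:
        - containerPort: 80
      args:
        - /bin/sh
        - -c
        - touch /tmp/healthy;sleep 30;rm -rf /tmp/healthy;sleep 600
      livenessProbe:
        exec:
          command:
            - cat
            - /tmp/healthy
        initialDelaySeconds: 5
        periodSeconds: 5
```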
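To get a feel for when the liveness probe in nginx_pod_exec.yaml would actually trigger a restart, the timing can be worked through with a small shell sketch. It is an idealized model, not how the kubelet is implemented: it ignores probe jitter and latency, it assumes a failureThreshold of 3 (the Kubernetes default when the field is omitted, as it is in the manifest), and it treats a probe that lands on the exact second the file is deleted as still passing. The helper name `restart_time` is made up for this illustration.

```shell
#!/bin/sh
# Idealized liveness-probe timeline (assumption: no jitter, no probe latency).
# Arguments: initialDelaySeconds, periodSeconds, failureThreshold, and the
# second at which the probed file is removed. period and threshold must be >= 1.
# Prints the second of the probe that reaches the failure threshold.
restart_time() {
    delay=$1 period=$2 threshold=$3 removed_at=$4
    t=$delay
    failures=0
    while :; do
        if [ "$t" -gt "$removed_at" ]; then
            # File is gone: `cat /tmp/healthy` would exit non-zero.
            failures=$((failures + 1))
        else
            # File still exists: the probe passes and the streak resets.
            failures=0
        fi
        if [ "$failures" -ge "$threshold" ]; then
            echo "$t"
            return 0
        fi
        t=$((t + period))
    done
}

# Values from the manifest: delay 5s, period 5s, file removed after 30s.
restart_time 5 5 3 30   # prints 45
```

Under these assumptions the probes at 35s, 40s, and 45s fail, so the container is killed at roughly 45 seconds rather than immediately at 30 seconds. Passing a threshold of 1 instead would give 35.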
The nginx_pod_exec.yaml example creates a container and judges whether it is healthy by checking that a file exists. After running for 30 seconds, the container deletes the file, so the liveness check fails and the container is restarted.

### HTTP Health Check

```
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
    app: httpd
  name: liveness-http
spec:
  containers:
    - name: liveness
      image: docker.io/httpd
      ports:
        - containerPort: 80
      livenessProbe:
        httpGet:
          path: /index.html
          port: 80
          httpHeaders:
            - name: X-Custom-Header
              value: Awesome
        initialDelaySeconds: 5
        periodSeconds: 5
```

This example starts a web server and checks liveness by requesting index.html. Deleting that file by hand makes the check fail, which restarts the container.

```
[root@devops-101 ~]# kubectl exec -it liveness-http /bin/sh
#
# ls
bin  build  cgi-bin  conf  error  htdocs  icons  include  logs  modules
# ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 11:39 ?        00:00:00 httpd -DFOREGROUND
daemon       6     1  0 11:39 ?        00:00:00 httpd -DFOREGROUND
daemon       7     1  0 11:39 ?        00:00:00 httpd -DFOREGROUND
daemon       8     1  0 11:39 ?        00:00:00 httpd -DFOREGROUND
root        90     0  0 11:39 ?        00:00:00 /bin/sh
root        94    90  0 11:39 ?        00:00:00 ps -ef
#
# cd /usr/local/apache2
# ls
bin  build  cgi-bin  conf  error  htdocs  icons  include  logs  modules
# cd htdocs
# ls
index.html
# rm index.html
# command terminated with exit code 137
[root@devops-101 ~]# kubectl describe pod liveness-http
Events:
  Type     Reason     Age               From                 Message
  ----     ------     ----              ----                 -------
  Normal   Scheduled  1m                default-scheduler    Successfully assigned default/liveness-http to devops-102
  Warning  Unhealthy  8s (x3 over 18s)  kubelet, devops-102  Liveness probe failed: HTTP probe failed with statuscode: 404
  Normal   Pulling    7s (x2 over 1m)   kubelet, devops-102  pulling image "docker.io/httpd"
  Normal   Killing    7s                kubelet, devops-102  Killing container with id docker://liveness:Container failed liveness probe.. Container will be killed and recreated.
  Normal   Pulled     1s (x2 over 1m)   kubelet, devops-102  Successfully pulled image "docker.io/httpd"
  Normal   Created    1s (x2 over 1m)   kubelet, devops-102  Created container
  Normal   Started    1s (x2 over 1m)   kubelet, devops-102  Started container
```

### TCP Socket

This approach judges liveness by opening a TCP connection. An example Pod manifest:

```
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
    app: node
  name: liveness-tcp
spec:
  containers:
    - name: goproxy
      image: docker.io/googlecontainer/goproxy:0.1
      ports:
        - containerPort: 8080
      readinessProbe:
        tcpSocket:
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 10
      livenessProbe:
        tcpSocket:
          port: 8080
        initialDelaySeconds: 15
        periodSeconds: 20
```

### Readiness Check Example

A readiness probe is configured the same way as a liveness probe: just change livenessProbe to readinessProbe.

Some of the parameters explained:

- initialDelaySeconds: how long to wait before the first check, counted from when the container has finished starting
- periodSeconds: how often the check runs; defaults to 10 seconds, minimum 1 second
- timeoutSeconds: how long before a check times out; defaults to 1 second, minimum 1 second
- successThreshold: after a failure, the number of consecutive successful checks needed before the probe is considered successful again; defaults to 1
- failureThreshold: after a success, the number of consecutive failed checks needed before the probe is considered failed; defaults to 3
- httpGet properties
  - host: hostname or IP
  - scheme: connection type, HTTP or HTTPS; defaults to HTTP
  - path: request path
  - httpHeaders: custom request headers
  - port: request port

## References

- https://docs.docker.com/compose/startup-order/
- https://docs.docker.com/compose/compose-file/#depends_on
- https://docs.docker.com/engine/reference/builder/#healthcheck
- https://github.com/docker-library/healthcheck
- https://www.cnblogs.com/cocowool/p/kubernetes_container_pr
