Images get deleted after kubelet starts « Terrence的宅宅幻想

Over the past couple of days, something odd happened in my test K8S cluster.

One node's status was NotReady, and kubectl describe node showed the following:

Conditions:
  Type             Status    LastHeartbeatTime                 LastTransitionTime                Reason                    Message
  ----             ------    -----------------                 ------------------                ------                    -------
  OutOfDisk        Unknown   Thu, 08 Nov 2018 19:21:51 +0900   Thu, 08 Nov 2018 19:22:35 +0900   NodeStatusUnknown         Kubelet stopped posting node status.
  MemoryPressure   Unknown   Thu, 08 Nov 2018 19:21:51 +0900   Thu, 08 Nov 2018 19:22:35 +0900   NodeStatusUnknown         Kubelet stopped posting node status.
  DiskPressure     Unknown   Thu, 08 Nov 2018 19:21:51 +0900   Thu, 08 Nov 2018 19:22:35 +0900   NodeStatusUnknown         Kubelet stopped posting node status.
  PIDPressure      False     Thu, 08 Nov 2018 19:21:51 +0900   Wed, 17 Oct 2018 09:41:55 +0900   KubeletHasSufficientPID   kubelet has sufficient PID available
  Ready            Unknown   Thu, 08 Nov 2018 19:21:51 +0900   Thu, 08 Nov 2018 19:22:35 +0900   NodeStatusUnknown         Kubelet stopped posting node status.

Kubelet had stopped posting node status. When I logged in to that machine, I found that its docker images and containers had been wiped:

$ sudo docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES

$ sudo docker images
REPOSITORY                             TAG                 IMAGE ID            CREATED             SIZE
gcr.io/google_containers/pause-amd64   3.1                 da86e6ba6ca1        10 months ago       742 kB

Huh. The pile of images and containers that should have been there was gone, and restarting kubelet did not help.

I then simply rebooted the machine. It recovered briefly, but before long the docker images were wiped again.

Searching /var/log/messages showed that kubelet kept trying to delete images:

...
Nov  9 10:16:39 dev-k8s-node202 kubelet: I1109 10:16:39.224997    3826 image_gc_manager.go:317] attempting to delete unused images
Nov  9 10:16:49 dev-k8s-node202 kubelet: I1109 10:16:49.247641    3826 container_gc.go:85] attempting to delete unused containers
Nov  9 10:16:49 dev-k8s-node202 kubelet: I1109 10:16:49.249513    3826 image_gc_manager.go:317] attempting to delete unused images
Nov  9 10:16:59 dev-k8s-node202 kubelet: I1109 10:16:59.272075    3826 container_gc.go:85] attempting to delete unused containers
Nov  9 10:16:59 dev-k8s-node202 kubelet: I1109 10:16:59.273809    3826 image_gc_manager.go:317] attempting to delete unused images

Was the disk running out of space, or was it something else? I did a quick check:

$ df -lh
/dev/sda1  291G  239G   41G  86% /

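When disk usage gets too high, kubelet starts frantically trying to delete images. Here the root filesystem is at 86%, which is just over kubelet's default image garbage collection high threshold of 85% (--image-gc-high-threshold), assuming that default had not been changed on this node.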
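To make the threshold behaviour concrete, here is a minimal sketch of the kind of decision kubelet's image garbage collector makes. It is a simplified illustration, not kubelet's actual code: the names `GCPlan` and `plan_image_gc` are hypothetical, and the 85% / 80% high and low thresholds are assumed defaults that this post does not state explicitly. The capacity and available figures come from the df output above.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class GCPlan:
    usage_percent: int   # used share of the filesystem, in whole percent
    triggered: bool      # True if image garbage collection would start
    bytes_to_free: int   # how much the collector would try to reclaim


def plan_image_gc(capacity, available, high_threshold=85, low_threshold=80):
    """Decide whether image GC starts and how many bytes it aims to free.

    high_threshold / low_threshold are ASSUMED defaults (85 / 80), not
    values read from the node in this post.
    """
    usage_percent = 100 - (available * 100 // capacity)
    if usage_percent < high_threshold:
        return GCPlan(usage_percent, False, 0)
    # Once triggered, aim to bring usage back down to the low threshold.
    target_available = capacity * (100 - low_threshold) // 100
    return GCPlan(usage_percent, True, max(target_available - available, 0))


# Figures from the df output: 291G total, 41G available.
plan = plan_image_gc(291 * GIB, 41 * GIB)
print(plan.usage_percent, plan.triggered, round(plan.bytes_to_free / GIB, 1))
```

Under these assumptions the collector does not stop at the first deleted image. It keeps trying until usage is back under the low threshold, which would be consistent with the repeated "attempting to delete unused images" lines in the log.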

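My kube-proxy, nginx, and other components are all started with docker.

Those components were removed along with everything else, so the whole node stopped working.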

After freeing up disk space and rebooting, the node went back to normal.
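One way to notice this situation earlier is to check disk usage before it reaches the point where kubelet starts cleaning up. The following is only a sketch: the helper names `usage_percent` and `needs_cleanup` and the 80% warning level are my own assumptions for illustration, not something kubelet provides.

```python
import shutil


def usage_percent(path="/"):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    # Count everything that is not available to us as "used".
    return 100.0 * (usage.total - usage.free) / usage.total


def needs_cleanup(path="/", warn_at=80.0):
    """True if usage is at or above the (assumed) warning level."""
    return usage_percent(path) >= warn_at


if __name__ == "__main__":
    pct = usage_percent("/")
    print(f"/ is {pct:.1f}% full")
    if needs_cleanup("/"):
        print("disk usage is high; free some space before kubelet starts deleting images")
```

Running something like this from cron on each node would give a warning a few percent before the 86% seen in this incident.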