k8s pod stuck in ContainerCreating because docker pull image fails

Some public cloud servers in China have very unstable access to docker.io, so image pulls keep failing.
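
One way to check from the node itself whether the registry is reachable is to try the pull by hand a few times. The following is only a minimal sketch: the function name `pull_with_retry` and its default attempt count and delay are made up for illustration, and it assumes the `docker` CLI is available on the node.

```shell
# Try `docker pull` up to N times, pausing between attempts.
# Usage: pull_with_retry IMAGE [ATTEMPTS] [DELAY_SECONDS]
# (hypothetical helper; the defaults of 3 attempts and 5 seconds are arbitrary)
pull_with_retry() {
    local image attempts delay i
    image="$1"
    attempts="${2:-3}"
    delay="${3:-5}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        if docker pull "$image"; then
            return 0
        fi
        echo "pull of $image failed (attempt $i of $attempts)" >&2
        i=$((i + 1))
        # Only sleep if another attempt is coming.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example, run on the node:
#   pull_with_retry nginx 5 10
```

If every attempt fails here, the problem is most likely between the node and the registry rather than in Kubernetes.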

nginx.yaml

# cat nginx.yaml 
apiVersion: v1
kind: ReplicationController
metadata:
  name: myweb
spec:
  replicas: 2        
  selector:
    app: myweb
  template:
    metadata:
      labels:
        app: myweb
    spec:
      containers:
        - name: myweb
          #image: registry.cn-shenzhen.aliyuncs.com/yansongda/nginx:latest  # stable to access
          image: nginx  # image with unstable access
          ports:
          - containerPort: 80

Check the pod's events. The last line keeps showing pulling image "nginx":

# kubectl describe pod myweb-fq
Name:           myweb-fqhxm
Namespace:      default
Node:           test.novalocal/172.16.0.138
Start Time:     Thu, 09 Aug 2018 17:10:58 +0800
Labels:         app=myweb
Annotations:    <none>
Status:         Pending
IP:             
Controlled By:  ReplicationController/myweb
Containers:
  myweb:
    Container ID:   
    Image:          nginx
    Image ID:       
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-s7722 (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          False 
  PodScheduled   True 
Volumes:
  default-token-s7722:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-s7722
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                 Age              From                     Message
  ----     ------                 ----             ----                     -------
  Normal   Scheduled              10s              default-scheduler        Successfully assigned myweb-fqhxm to test.novalocal
  Normal   SuccessfulMountVolume  9s               kubelet, test.novalocal  MountVolume.SetUp succeeded for volume "default-token-s7722"
  Warning  MissingClusterDNS      9s (x2 over 9s)  kubelet, test.novalocal  pod: "myweb-fqhxm_default(21410469-9bb4-11e8-bdef-fa163e875cc8)". kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to "Default" policy.
  Normal   Pulling                9s               kubelet, test.novalocal  pulling image "nginx"
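
Since only the Events section matters here, the rest of the describe output can be filtered away. A minimal sketch, where `pod_events` is a hypothetical helper name and the only assumption is that the section starts with a line beginning with `Events:`, as in the output above:

```shell
# Print only the Events section of `kubectl describe pod` output read from stdin.
# Everything from the first line starting with "Events:" to the end is kept.
pod_events() {
    awk '/^Events:/ { show = 1 } show'
}

# Example:
#   kubectl describe pod myweb-fqhxm | pod_events
```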

Until the nginx image has been pulled, the pod stays in ContainerCreating:

# kubectl get pods,svc
NAME              READY     STATUS              RESTARTS   AGE
pod/myweb-fqhxm   0/1       ContainerCreating   0          4m
pod/myweb-xn89g   0/1       ContainerCreating   0          4m

NAME                 TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
service/kubernetes   ClusterIP   10.222.0.1     <none>        443/TCP        1h
service/myweb        NodePort    10.222.9.108   <none>        80:30001/TCP   4m
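
With more than a handful of pods, it can help to list just the ones that are stuck. A minimal sketch, assuming the default column layout shown above, where STATUS is the third column; `stuck_pods` is a hypothetical name:

```shell
# Read `kubectl get pods` output on stdin and print the names of pods
# whose STATUS column (third field) is ContainerCreating.
# Header, blank, and service lines never match, so they are skipped.
stuck_pods() {
    awk '$3 == "ContainerCreating" { print $1 }'
}

# Example:
#   kubectl get pods | stuck_pods
```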

If you switch to an image hosted on a registry in China

image: registry.cn-shenzhen.aliyuncs.com/yansongda/nginx:latest  # stable to access

and the pod still shows ContainerCreating, check the kubelet's status and logs:

# systemctl status kubelet.service
● kubelet.service - Kubernetes API Server
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2018-08-09 17:05:51 CST; 14min ago
     Docs: https://kubernetes.io/doc
 Main PID: 21582 (kubelet)
    Tasks: 0
   Memory: 8.3M
   CGroup: /system.slice/kubelet.service
           ‣ 21582 /usr/bin/kubelet --kubeconfig=/etc/kubernetes/kubeconfig.yaml --logtostderr=false --log-dir=/var/log/kubernetes --v=2 --cgroup-driver=systemd --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice

Aug 09 17:10:54 test.novalocal kubelet[21582]: E0809 17:10:54.014150   21582 fsHandler.go:121] failed to collect filesystem stats - rootDiskErr: du command failed on /var/lib/docker/overlay2/a09ba0411b8d5460276094e64a6e70ce92182d027ea72c9031c20eca8096d9e2/diff with output stdout: , stderr: du: cannot access ‘/var/lib/docker/overlay2/a09ba0411b8d5460276094e64a6e70ce92182d027ea72c9031c20eca8096d9e2/diff’: No such file or directory
Aug 09 17:10:54 test.novalocal kubelet[21582]: - exit status 1, rootInodeErr: cmd [find /var/lib/docker/overlay2/a09ba0411b8d5460276094e64a6e70ce92182d027ea72c9031c20eca8096d9e2/diff -xdev -printf .] failed. stderr: find: ‘/var/lib/docker/overlay2/a09ba0411b8d5460276094e64a6e70ce92182d027ea72c9031c20eca8096d9e2/diff’: No such file or directory
Aug 09 17:10:54 test.novalocal kubelet[21582]: ; err: exit status 1, extraDiskErr: du command failed on /var/lib/docker/containers/49913bfa94a2725fad871fd7c389443a7d1bed26881f96097dcaa8f183ebc57e with output stdout: , stderr: du: cannot access ‘/var/lib/docker/containers/49913bfa94a2725fad871fd7c389443a7d1bed26881f96097dcaa8f183ebc57e’: No such file or directory
Aug 09 17:10:54 test.novalocal kubelet[21582]: - exit status 1
Aug 09 17:17:43 test.novalocal kubelet[21582]: E0809 17:17:43.632628   21582 kubelet_pods.go:163] Mount cannot be satisfied for container "myweb", because the volume is missing or the volume mounter is nil: {Name:default-token-s7722 ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath: MountPropagation:<nil>}
Aug 09 17:17:43 test.novalocal kubelet[21582]: E0809 17:17:43.632730   21582 kuberuntime_manager.go:733] container start failed: CreateContainerConfigError: cannot find volume "default-token-s7722" to mount into container "myweb"
Aug 09 17:17:43 test.novalocal kubelet[21582]: E0809 17:17:43.632864   21582 pod_workers.go:186] Error syncing pod 213ff9c6-9bb4-11e8-bdef-fa163e875cc8 ("myweb-xn89g_default(213ff9c6-9bb4-11e8-bdef-fa163e875cc8)"), skipping: failed to "StartContainer" for "myweb" with CreateContainerConfigError: "cannot find volume \"default-token-s7722\" to mount into container \"myweb\""
Aug 09 17:17:48 test.novalocal kubelet[21582]: E0809 17:17:48.819683   21582 kubelet_pods.go:163] Mount cannot be satisfied for container "myweb", because the volume is missing or the volume mounter is nil: {Name:default-token-s7722 ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath: MountPropagation:<nil>}
Aug 09 17:17:48 test.novalocal kubelet[21582]: E0809 17:17:48.819880   21582 kuberuntime_manager.go:733] container start failed: CreateContainerConfigError: cannot find volume "default-token-s7722" to mount into container "myweb"
Aug 09 17:17:48 test.novalocal kubelet[21582]: E0809 17:17:48.819968   21582 pod_workers.go:186] Error syncing pod 21410469-9bb4-11e8-bdef-fa163e875cc8 ("myweb-fqhxm_default(21410469-9bb4-11e8-bdef-fa163e875cc8)"), skipping: failed to "StartContainer" for "myweb" with CreateContainerConfigError: "cannot find volume \"default-token-s7722\" to mount into container \"myweb\""
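
The same few errors repeat every few seconds, so a summary by source location is easier to read than the raw log. A rough sketch, assuming error lines follow the `E<mmdd> <time> <pid> <file>.go:<line>]` shape seen above; `kubelet_error_sources` is a hypothetical helper name:

```shell
# Read kubelet log lines on stdin and count error-level entries (E0809 ...)
# by the source location that logged them, e.g. "kubelet_pods.go:163".
# Output: one "<count> <file>.go:<line>" pair per line, sorted by location.
kubelet_error_sources() {
    grep -oE 'E[0-9]{4} [0-9:.]+ +[0-9]+ [A-Za-z_]+\.go:[0-9]+' |
        awk '{ print $NF }' |
        sort | uniq -c |
        awk '{ print $1, $2 }'
}

# Example:
#   journalctl -u kubelet --no-pager | kubelet_error_sources
```

In the log above, this would group the repeated `kubelet_pods.go:163`, `kuberuntime_manager.go:733`, and `pod_workers.go:186` entries, all of which point at the missing `default-token-s7722` volume rather than at the image pull.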

Under normal circumstances, though, the kubelet status looks like this:

# systemctl status kubelet.service
● kubelet.service - Kubernetes API Server
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2018-08-09 17:22:19 CST; 20s ago
     Docs: https://kubernetes.io/doc
 Main PID: 25756 (kubelet)
    Tasks: 0
   Memory: 26.4M
   CGroup: /system.slice/kubelet.service
           ‣ 25756 /usr/bin/kubelet --kubeconfig=/etc/kubernetes/kubeconfig.yaml --logtostderr=false --log-dir=/var/log/kubernetes --v=2 --cgroup-driver=systemd --runtime-c...

Aug 09 17:22:19 test.novalocal systemd[1]: kubelet.service: main process exited, code=exited, status=1/FAILURE
Aug 09 17:22:19 test.novalocal systemd[1]: Unit kubelet.service entered failed state.
Aug 09 17:22:19 test.novalocal systemd[1]: kubelet.service failed.
Aug 09 17:22:19 test.novalocal systemd[1]: Started Kubernetes API Server.
Aug 09 17:22:19 test.novalocal systemd[1]: Starting Kubernetes API Server...
Aug 09 17:22:19 test.novalocal kubelet[25756]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet...formation.
Aug 09 17:22:19 test.novalocal kubelet[25756]: Flag --kubelet-cgroups has been deprecated, This parameter should be set via the config file specified by the Kubel...formation.
Aug 09 17:22:19 test.novalocal kubelet[25756]: E0809 17:22:19.782359   25756 kubelet.go:1282] Image garbage collection failed once. Stats initialization may not h...ontainer /
Aug 09 17:22:22 test.novalocal kubelet[25756]: Starting Device Plugin manager
Hint: Some lines were ellipsized, use -l to show in full.

I recently watched the movie Skyscraper, starring Dwayne Johnson, and it has a classic line: "try rebooting".

systemctl restart kubelet.service
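
After the restart, a small polling loop can confirm that the pods actually leave ContainerCreating. This is only a sketch: `wait_until_created` is a hypothetical helper, the default of 30 attempts at 2-second intervals is arbitrary, and it assumes `kubectl` is configured for the cluster.

```shell
# Poll `kubectl get pods` until no pod reports ContainerCreating.
# Returns 0 once that is the case, 1 if it never happens within the attempts.
# Usage: wait_until_created [ATTEMPTS] [SECONDS_BETWEEN_ATTEMPTS]
wait_until_created() {
    local attempts interval i pods
    attempts="${1:-30}"
    interval="${2:-2}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # A failing kubectl call counts as "not ready yet", not as success.
        if pods=$(kubectl get pods --no-headers 2>/dev/null) &&
           ! printf '%s\n' "$pods" | grep -q 'ContainerCreating'; then
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    return 1
}

# Example:
#   systemctl restart kubelet.service && wait_until_created 60 5
```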

I'm not much of a writer, so this is just a quick record :-)