Troubleshooting Kubernetes pods that cannot be deleted
The cluster originally consisted of two Kubernetes machines, but today only one of them was powered on, leaving a single node; that is what made the pod instances impossible to delete.
First, check the current state of the pods:
```
[root@k8s ~]# kubectl get pods
NAME                      READY     STATUS             RESTARTS   AGE
nginx-controller-lv8md    1/1       Unknown            0          16h
nginx-controller-sb3fx    1/1       Unknown            2          16h
nginx2-1216651254-4b2dw   0/1       ImagePullBackOff   0          8m
nginx2-1216651254-dbtms   0/1       ImagePullBackOff   0          8m
nginx2-1216651254-fhb4r   0/1       ImagePullBackOff   0          8m
```
List the ReplicationControllers (abbreviated `rc`):
```
[root@k8s ~]# kubectl get rc
No resources found.
```
List the Services:
```
[root@k8s ~]# kubectl get svc
NAME         CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   10.254.0.1   <none>        443/TCP   2d
```
With no rc and no Services present, try deleting all the pods:
```
[root@k8s ~]# kubectl delete pods --all
pod "nginx-controller-lv8md" deleted
pod "nginx-controller-sb3fx" deleted
pod "nginx2-1216651254-4b2dw" deleted
pod "nginx2-1216651254-dbtms" deleted
pod "nginx2-1216651254-fhb4r" deleted
```
They still cannot be deleted, however. Check the Deployments next:
```
[root@k8s ~]# kubectl get deployment
NAME      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
nginx2    3         3         3            0           16h
[root@k8s ~]# kubectl delete deployment nginx2
deployment "nginx2" deleted
```
Why did those three nginx2 pods have no rc or Service? Because they were created with `kubectl run`, which backs them with a Deployment rather than an rc.
But how do we delete the remaining two instances?
```
[root@k8s ~]# kubectl get pods
NAME                     READY     STATUS    RESTARTS   AGE
nginx-controller-lv8md   1/1       Unknown   0          20h
nginx-controller-sb3fx   1/1       Unknown   2          20h
```
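One alternative that avoids touching the node object: pods stuck on an unreachable kubelet can be force-deleted from the API server, skipping the graceful-termination handshake. The `--grace-period=0 --force` flags are standard kubectl; the sketch below only *prints* the delete commands (the here-doc replays the captured output above so the selection logic runs anywhere), and you would pipe the result to `sh` against a real cluster:

```shell
# Replay the captured `kubectl get pods` output so the filter can be tested
# without a cluster.
cat <<'EOF' > /tmp/pods.txt
NAME                     READY     STATUS    RESTARTS   AGE
nginx-controller-lv8md   1/1       Unknown   0          20h
nginx-controller-sb3fx   1/1       Unknown   2          20h
EOF

# Skip the header line, select pods whose STATUS column is "Unknown", and
# print a force-delete command for each one. --grace-period=0 --force removes
# the pod object from the API server without waiting for the lost kubelet
# to confirm shutdown.
awk 'NR > 1 && $3 == "Unknown" {
    print "kubectl delete pod " $1 " --grace-period=0 --force"
}' /tmp/pods.txt
```

Note that force deletion only removes the API object; if the node ever comes back with the containers still running, the kubelet has to clean them up itself.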
Inspect one of the remaining pods:
```
[root@k8s ~]# kubectl describe pod nginx-controller-lv8md
Name:           nginx-controller-lv8md
Namespace:      default
Node:           k8s-node/10.0.10.11   # The key point: both pods were scheduled onto k8s-node, which is currently down.
Start Time:     Tue, 13 Jun 2017 02:01:45 +0800
Labels:         app=nginx
Status:         Terminating (expires Mon, 12 Jun 2017 21:46:21 +0800)
Termination Grace Period:  30s
Reason:         NodeLost
Message:        Node k8s-node which was running pod nginx-controller-lv8md is unresponsive
IP:             172.21.42.3
Controllers:    ReplicationController/nginx-controller
Containers:
  nginx:
    Container ID:   docker://03fa59f9efc06e43ed8c9acc7d4c7533983d5733223dbb2efa5f65928d965b5b
    Image:          reg.docker.lc/share/nginx:latest
    Image ID:       docker-pullable://reg.docker.lc/share/nginx@sha256:e5c82328a509aeb7c18c1d7fb36633dc638fcf433f651bdcda59c1cc04d3ee55
    Port:           80/TCP
    State:          Running
      Started:      Tue, 13 Jun 2017 02:01:47 +0800
    Ready:          True
    Restart Count:  0
    Volume Mounts:  <none>
    Environment Variables:  <none>
Conditions:
  Type          Status
  Initialized   True
  Ready         False
  PodScheduled  True
No volumes.
QoS Class:      BestEffort
Tolerations:    <none>
No events.
```
The rc and Service for these two pods were already deleted; they remain in the Unknown state because the target host cannot respond and report their status. Since that host is down anyway, just remove the node itself:
```
[root@k8s ~]# kubectl delete node k8s-node
node "k8s-node" deleted
[root@k8s ~]# kubectl get node
NAME      STATUS    AGE
k8s       Ready     2d
```
With the node gone, no stale container state is left behind:
```
[root@k8s ~]# kubectl get pods
No resources found.
```
This does clear out the broken pod instances, but at the cost of deleting the node. So if the node recovers now, will it rejoin the cluster automatically?
The master's logs picked up the k8s-node node almost immediately:
```
Jun 13 15:06:13 k8s kube-controller-manager[34050]: E0613 15:06:13.917313 34050 actual_state_of_world.go:475] Failed to set statusUpdateNeeded to needed true because nodeName="k8s-node" does not exist
Jun 13 15:06:14 k8s kube-controller-manager[34050]: I0613 15:06:14.618864 34050 event.go:217] Event(api.ObjectReference{Kind:"Node", Namespace:"", Name:"k8s-node", UID:"c9864434-5006-11e7-ab16-000c29e9277a", APIVersion:"", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'RegisteredNode' Node k8s-node event: Registered Node k8s-node in NodeController
```
The node came back on its own: with the configuration already in place, a node rejoins the cluster as soon as it comes online.
```
[root@k8s ~]# kubectl get nodes
NAME       STATUS    AGE
k8s        Ready     2d
k8s-node   Ready     1m
```
Create an Nginx instance:
```
[root@k8s ~]# kubectl create -f Nginx.yaml
replicationcontroller "nginx-controller" created
service "nginx-service" created
```
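The Nginx.yaml manifest itself is not shown in the post. Based on the objects it creates and the details visible later (a ReplicationController `nginx-controller` selecting `app=nginx`, the image `reg.docker.lc/share/nginx:latest` on container port 80, and a Service `nginx-service` on port 8000 with external IP 10.0.10.10), it likely looked roughly like this reconstruction (not the original file; `replicas: 1` matches the single pod created):

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx-controller
spec:
  replicas: 1
  selector:
    app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: reg.docker.lc/share/nginx:latest
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - port: 8000        # Service port shown in `kubectl describe svc`
    targetPort: 80    # container port from the pod spec
  externalIPs:
  - 10.0.10.10
```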
Kubernetes scheduled the new instance onto the freshly recovered k8s-node node:
```
[root@k8s ~]# kubectl get pods
NAME                     READY     STATUS    RESTARTS   AGE
nginx-controller-zsx2q   1/1       Running   0          6s
[root@k8s ~]# kubectl describe pod nginx-controller-zsx2q | grep "Node"
Node:           k8s-node/10.0.10.11
```
Scale out dynamically, adding one more replica:
```
[root@k8s ~]# kubectl scale replicationcontroller --replicas=2 nginx-controller
replicationcontroller "nginx-controller" scaled
```
Check the cluster's current state:
```
[root@k8s ~]# kubectl describe svc nginx-service
Name:              nginx-service
Namespace:         default
Labels:            <none>
Selector:          app=nginx
Type:              ClusterIP
IP:                10.254.132.82
External IPs:      10.0.10.10
Port:              <unset> 8000/TCP
Endpoints:         172.21.42.2:80,172.21.93.2:80
Session Affinity:  None
No events.
```
What happens if we delete the k8s-node node again at this point?
```
[root@k8s ~]# kubectl delete node k8s-node
node "k8s-node" deleted
```
One node is now gone:
```
[root@k8s ~]# kubectl get nodes
NAME      STATUS    AGE
k8s       Ready     2d
```
The pod count has not dropped; there are still two:
```
[root@k8s ~]# kubectl get pods
NAME                     READY     STATUS    RESTARTS   AGE
nginx-controller-43qpx   1/1       Running   0          14m
nginx-controller-zsx2q   1/1       Running   0          19m
```
Checking the Service again shows that both pod IPs are now in the 172.21.93 range, meaning they run on the same node: the rc kept the replica count at two by rescheduling onto the surviving node.
```
[root@k8s ~]# kubectl describe svc nginx-service
Name:              nginx-service
Namespace:         default
Labels:            <none>
Selector:          app=nginx
Type:              ClusterIP
IP:                10.254.132.82
External IPs:      10.0.10.10
Port:              <unset> 8000/TCP
Endpoints:         172.21.93.2:80,172.21.93.3:80
Session Affinity:  None
No events.
```
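The "same node" claim can be checked mechanically: overlay networks of the flannel style typically give each node its own /24, so grouping the endpoint IPs by their first three octets shows the per-node distribution. A small sketch using the endpoint list from the output above (the /24-per-node assumption comes from this cluster's addressing, not from anything Kubernetes guarantees):

```shell
# Endpoint list copied from `kubectl describe svc nginx-service` above.
endpoints="172.21.93.2:80,172.21.93.3:80"

# Split the comma-separated list, strip the port, keep the /24 prefix,
# and count how many endpoints fall in each prefix (one prefix per node).
echo "$endpoints" | tr ',' '\n' | cut -d: -f1 | cut -d. -f1-3 | sort | uniq -c
```

A single output line with count 2 confirms both endpoints share one node's subnet.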
To bring the k8s-node node back into the cluster, restart the kubelet service on k8s-node:
```
systemctl restart kubelet
```
---------------------
This article is from yao不ke及's CSDN blog; full text at: https://blog.csdn.net/qq_19674905/article/details/80887461