Common kubectl Commands in Kubernetes: Troubleshooting and Problem Resolution

阿新 • Published: 2019-02-07

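Common kubectl commands for troubleshooting

No.	Command	Description
1	version	Show client and server version information
2	api-versions	Show the API versions supported by the server, in group/version format
3	explain	Show documentation for a resource
4	get	List information about objects
5	describe	Show detailed information about an object
6	logs	Fetch the logs of a container in a pod
7	exec	Execute a command in a container
8	cp	Copy files out of or into a container
9	attach	Attach to a running container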

kubectl version

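The version command shows the client and server version information. Behavior can differ considerably between versions, so the first step in troubleshooting is to confirm which versions the environment is running. As the output below shows, this article was verified against version 1.11.2.

# kubectl version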
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", BuildDate:"2018-08-07T23:17:28Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", BuildDate:"2018-08-07T23:08:19Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
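
When comparing environments it can help to pull only the version strings out of this output. The following is a minimal sketch that assumes the version.Info{...} output format shown above (other kubectl releases may print a different format). The names git_versions and versions_match are hypothetical helpers, not part of kubectl.

```shell
# Hypothetical helper: read `kubectl version` output on stdin and print
# every GitVersion value it contains, one per line.
git_versions() {
  grep -o 'GitVersion:"[^"]*"' | cut -d'"' -f2
}

# Hypothetical check: succeed only if at least one version was found and
# all versions found on stdin are identical (client and server match).
versions_match() {
  [ "$(git_versions | sort -u | wc -l | tr -d ' ')" -eq 1 ]
}
```

One way to use it: kubectl version | versions_match && echo "client and server match"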

kubectl api-versions

The api-versions command lists the API versions supported by the server side of the current Kubernetes installation.

# kubectl api-versions
admissionregistration.k8s.io/v1beta1
apiextensions.k8s.io/v1beta1
apiregistration.k8s.io/v1
apiregistration.k8s.io/v1beta1
apps/v1
apps/v1beta1
apps/v1beta2
authentication.k8s.io/v1
authentication.k8s.io/v1beta1
authorization.k8s.io/v1
authorization.k8s.io/v1beta1
autoscaling/v1
autoscaling/v2beta1
batch/v1
batch/v1beta1
certificates.k8s.io/v1beta1
events.k8s.io/v1beta1
extensions/v1beta1
networking.k8s.io/v1
policy/v1beta1
rbac.authorization.k8s.io/v1
rbac.authorization.k8s.io/v1beta1
scheduling.k8s.io/v1beta1
storage.k8s.io/v1
storage.k8s.io/v1beta1
v1
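
A manifest fails to load when its apiVersion is not in this list, so a quick membership check can be useful. This is a small sketch; has_api_version is a hypothetical helper name, and it only filters the text that kubectl api-versions prints.

```shell
# Hypothetical helper: read `kubectl api-versions` output on stdin and
# succeed only if the exact group/version given as $1 is listed.
has_api_version() {
  grep -Fxq -- "$1"
}
```

One way to use it: kubectl api-versions | has_api_version apps/v1 && echo "apps/v1 is supported"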

kubectl explain

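Like kubectl help, kubectl explain is an aid for looking things up: it documents each resource and the fields it is made of. The example below shows the description of rc (ReplicationController). It is of limited use during troubleshooting itself, but reading through it is a good way to deepen your understanding of each resource.

# kubectl explain rc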
DESCRIPTION:
ReplicationController represents the configuration of a replication controller.

FIELDS:
   apiVersion   <string>
     APIVersion defines the versioned schema of this representation of an
     object. Servers should convert recognized schemas to the latest internal
     value, and may reject unrecognized values. More info:
     http://releases.k8s.io/HEAD/docs/devel/api-conventions.md#resources

   kind <string>
     Kind is a string value representing the REST resource this object
     represents. Servers may infer this from the endpoint the client submits
     requests to. Cannot be updated. In CamelCase. More info:
     http://releases.k8s.io/HEAD/docs/devel/api-conventions.md#types-kinds

   metadata <Object>
     If the Labels of a ReplicationController are empty, they are defaulted to
     be the same as the Pod(s) that the replication controller manages. Standard
     object's metadata. More info:
     http://releases.k8s.io/HEAD/docs/devel/api-conventions.md#metadata

   spec <Object>
     Spec defines the specification of the desired behavior of the replication
     controller. More info:
     http://releases.k8s.io/HEAD/docs/devel/api-conventions.md#spec-and-status

   status   <Object>
     Status is the most recently observed status of the replication controller.
     This data may be out of date by some window of time. Populated by the
     system. Read-only. More info:
     http://releases.k8s.io/HEAD/docs/devel/api-conventions.md#spec-and-status

Resource types that explain can describe

The supported resource types are:

Type
clusters (valid only for federation apiservers)
componentstatuses (short name: cs)
configmaps (short name: cm)
daemonsets (short name: ds)
deployments (short name: deploy)
endpoints (short name: ep)
events (short name: ev)
horizontalpodautoscalers (short name: hpa)
ingresses (short name: ing)
jobs
limitranges (short name: limits)
namespaces (short name: ns)
networkpolicies
nodes (short name: no)
persistentvolumeclaims (short name: pvc)
persistentvolumes (short name: pv)
pods (short name: po)
podsecuritypolicies (short name: psp)
podtemplates
replicasets (short name: rs)
replicationcontrollers (short name: rc)
resourcequotas (short name: quota)
secrets
serviceaccounts (short name: sa)
services (short name: svc)
statefulsets
storageclasses
thirdpartyresources

kubectl get

Use the get command to check the pods and deployments that have been created.

Checking pods

This lists all pods that have been created. kubectl get po gives the same result.

# kubectl get pods

Checking deployments

This lists all deployments that have been created.

# kubectl get deployment

For somewhat more detail, add the -o wide option. For pods, the output then also shows which node each pod runs on and the pod's IP address.

# kubectl get pods -o wide
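
When only one field is needed, for example in a script, the -o jsonpath output format can extract it directly instead of reading the wide table. A minimal sketch follows; pod_node is a hypothetical helper name, and the default namespace is an assumption.

```shell
# Hypothetical helper: print the name of the node a pod is scheduled on.
# $1 = pod name, $2 = namespace (assumed to be "default" when omitted).
pod_node() {
  kubectl get pod "$1" -n "${2:-default}" -o jsonpath='{.spec.nodeName}'
}
```

One way to use it: pod_node mysql-478535978-1dnm2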

Checking node information

Show information about the nodes.

# kubectl get nodes -o wide

Checking namespace information

List all namespaces.

# kubectl get namespaces

Resource types that get can list

Combining nodes, pods, events, namespaces and so on gives a basic picture of the cluster and its state. The supported resource types are:

Type
clusters (valid only for federation apiservers)
componentstatuses (short name: cs)
configmaps (short name: cm)
daemonsets (short name: ds)
deployments (short name: deploy)
endpoints (short name: ep)
events (short name: ev)
horizontalpodautoscalers (short name: hpa)
ingresses (short name: ing)
jobs
limitranges (short name: limits)
namespaces (short name: ns)
networkpolicies
nodes (short name: no)
persistentvolumeclaims (short name: pvc)
persistentvolumes (short name: pv)
pods (short name: po)
podsecuritypolicies (short name: psp)
podtemplates
replicasets (short name: rs)
replicationcontrollers (short name: rc)
resourcequotas (short name: quota)
secrets
serviceaccounts (short name: sa)
services (short name: svc)
statefulsets
storageclasses
thirdpartyresources

kubectl describe

Checking node details

Typically you use the get command to list the nodes and then describe to look at one of them in detail.
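
The get-then-describe sequence can be sketched as a small loop. This is an illustration only; describe_all_nodes is a hypothetical helper name. It relies on kubectl get nodes -o name printing entries in TYPE/NAME form, such as node/192.168.32.133, which describe accepts as an argument.

```shell
# Hypothetical helper: run `kubectl describe` for every node in the cluster.
describe_all_nodes() {
  kubectl get nodes -o name | while read -r node; do
    # stdin is redirected so the inner command cannot consume the node list
    kubectl describe "$node" </dev/null
  done
}
```

For a single node, kubectl describe node 192.168.32.133 does the same thing directly.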

Checking the details of a pod

[root@ku8-1 tmp]# kubectl describe pod mysql-478535978-1dnm2
Name:       mysql-478535978-1dnm2
Namespace:  default
Node:       192.168.32.133/192.168.32.133
Start Time: Thu, 29 Jun 2017 05:04:21 -0400
Labels:     name=mysql
        pod-template-hash=478535978
Status:     Running
IP:     172.200.44.2
Controllers:    ReplicaSet/mysql-478535978
Containers:
  mysql:
    Container ID:   docker://47ef1495e86f4b69414789e81081fa55b837dafe9e47944894e7cb3733700410
    Image:      192.168.32.131:5000/mysql:5.7.16
    Image ID:       docker-pullable://192.168.32.131:5000/mysql@sha256:410b279f6827492da7a355135e6e9125849f62eeca76429974a534f021852b58
    Port:       3306/TCP
    State:      Running
      Started:      Thu, 29 Jun 2017 05:04:22 -0400
    Ready:      True
    Restart Count:  0
    Volume Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-dzs1w (ro)
    Environment Variables:
      MYSQL_ROOT_PASSWORD:  hello123
Conditions:
  Type      Status
  Initialized   True 
  Ready     True 
  PodScheduled  True 
Volumes:
  default-token-dzs1w:
    Type:   Secret (a volume populated by a Secret)
    SecretName: default-token-dzs1w
QoS Class:  BestEffort
Tolerations:    <none>
No events.
[root@ku8-1 tmp]#

Checking deployment details

Show the details of a particular deployment.

[root@ku8-1 tmp]# kubectl get deployment
NAME        DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
mysql       1         1         1            1           1h
sonarqube   1         1         1            1           1h
[root@ku8-1 tmp]# kubectl describe deployment mysql
Name:           mysql
Namespace:      default
CreationTimestamp:  Thu, 29 Jun 2017 05:04:21 -0400
Labels:         name=mysql
Selector:       name=mysql
Replicas:       1 updated | 1 total | 1 available | 0 unavailable
StrategyType:       RollingUpdate
MinReadySeconds:    0
RollingUpdateStrategy:  1 max unavailable, 1 max surge
Conditions:
  Type      Status  Reason
  ----      ------  ------
  Available     True    MinimumReplicasAvailable
OldReplicaSets: <none>
NewReplicaSet:  mysql-478535978 (1/1 replicas created)
No events.
[root@ku8-1 tmp]#

Resource types that describe can show

The describe command supports the following resource types:

Type
clusters (valid only for federation apiservers)
componentstatuses (short name: cs)
configmaps (short name: cm)
daemonsets (short name: ds)
deployments (short name: deploy)
endpoints (short name: ep)
events (short name: ev)
horizontalpodautoscalers (short name: hpa)
ingresses (short name: ing)
jobs
limitranges (short name: limits)
namespaces (short name: ns)
networkpolicies
nodes (short name: no)
persistentvolumeclaims (short name: pvc)
persistentvolumes (short name: pv)
pods (short name: po)
podsecuritypolicies (short name: psp)
podtemplates
replicasets (short name: rs)
replicationcontrollers (short name: rc)
resourcequotas (short name: quota)
secrets
serviceaccounts (short name: sa)
services (short name: svc)
statefulsets
storageclasses
thirdpartyresources

kubectl logs

Similar to docker logs, kubectl logs retrieves the logs of a container in a pod, which is important information when troubleshooting.

[root@ku8-1 tmp]# kubectl logs mysql-478535978-1dnm2
Initializing database
...
2017-06-29T09:04:37.081939Z 0 [Note] Event Scheduler: Loaded 0 events
2017-06-29T09:04:37.082097Z 0 [Note] mysqld: ready for connections.
Version: '5.7.16'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  MySQL Community Server (GPL)
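
For long-running pods the full log can be large, and after a crash the interesting lines are often in the previous container instance. The sketch below wraps two standard kubectl logs options, --tail and --previous. The helper names and the default of 50 lines are assumptions made for illustration.

```shell
# Hypothetical helper: show only the last N lines of a pod's log.
# $1 = pod name, $2 = number of lines (assumed default: 50).
tail_pod_log() {
  kubectl logs --tail="${2:-50}" "$1"
}

# Hypothetical helper: show the log of the previous instance of the
# container, which helps after a crash and restart.
previous_pod_log() {
  kubectl logs --previous "$1"
}
```

To keep following new log lines as they arrive, kubectl logs -f with the pod name can be used instead.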

kubectl exec

The exec command runs a command inside a container. For example, the following runs hostname in the mysql container.

[root@ku8-1 tmp]# kubectl get pods
NAME                         READY     STATUS    RESTARTS   AGE
mysql-478535978-1dnm2        1/1       Running   0          1h
sonarqube-3574384362-m7mdq   1/1       Running   0          1h
[root@ku8-1 tmp]# kubectl exec mysql-478535978-1dnm2 hostname
mysql-478535978-1dnm2
[root@ku8-1 tmp]#

A more common approach is to open a shell in the pod and, where conditions allow, inspect the scene directly while the fault is occurring. This is the most direct, effective and fastest method, but it also requires more permissions.

[root@ku8-1 tmp]# kubectl exec -it mysql-478535978-1dnm2 sh
# hostname
mysql-478535978-1dnm2
#
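
The transcripts above pass the command straight after the pod name. Later kubectl releases deprecated that form in favor of a -- separator between kubectl's own options and the command to run. A minimal sketch of the separator form follows; pod_exec is a hypothetical helper name.

```shell
# Hypothetical helper: run a command in a pod, using "--" to separate
# kubectl's own options from the command executed in the container.
# $1 = pod name, remaining arguments = the command and its arguments.
pod_exec() {
  pod=$1   # note: plain sh has no local variables, so this is global
  shift
  kubectl exec "$pod" -- "$@"
}
```

One way to use it: pod_exec mysql-478535978-1dnm2 hostname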

kubectl cp

The cp command exchanges files between a pod and the outside. The example below shows a round trip in both directions.

Create a file message.log in the pod

[root@ku8-1 tmp]# kubectl exec -it mysql-478535978-1dnm2 sh
# pwd
/
# cd /tmp
# echo "this is a message from `hostname`" >message.log
# cat message.log
this is a message from mysql-478535978-1dnm2
# exit
[root@ku8-1 tmp]#

Copy it out and check it

[root@ku8-1 tmp]# kubectl cp mysql-478535978-1dnm2:/tmp/message.log message.log
tar: Removing leading `/' from member names
[root@ku8-1 tmp]# cat message.log
this is a message from mysql-478535978-1dnm2
[root@ku8-1 tmp]#

Modify message.log and copy it back into the pod

[root@ku8-1 tmp]# echo "information added in `hostname`" >>message.log
[root@ku8-1 tmp]# cat message.log
this is a message from mysql-478535978-1dnm2
information added in ku8-1
[root@ku8-1 tmp]# kubectl cp message.log mysql-478535978-1dnm2:/tmp/message.log
[root@ku8-1 tmp]#

Check the modified content

[root@ku8-1 tmp]# kubectl exec mysql-478535978-1dnm2 cat /tmp/message.log
this is a message from mysql-478535978-1dnm2
information added in ku8-1
[root@ku8-1 tmp]#

kubectl attach

Similar to docker attach, this attaches to a running container and shows its output in real time, much like a live view of kubectl logs.

[root@ku8-1 tmp]# kubectl get pods
NAME                         READY     STATUS    RESTARTS   AGE
mysql-478535978-1dnm2        1/1       Running   0          1h
sonarqube-3574384362-m7mdq   1/1       Running   0          1h
[root@ku8-1 tmp]# kubectl attach sonarqube-3574384362-m7mdq
If you don't see a command prompt, try pressing enter.

kubectl cluster-info

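cluster-info and cluster-info dump also provide useful information. In particular, when you need an overall picture of everything, running kubectl cluster-info dump is quicker than running commands one at a time.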
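
The dump is large, so writing it to files is often more practical than reading it on the terminal. The sketch below uses the --output-directory option of cluster-info dump. The helper name dump_cluster and the default directory are assumptions made for illustration.

```shell
# Hypothetical helper: write the cluster dump into a directory instead of
# printing it to stdout.
# $1 = target directory (assumed default: ./cluster-dump).
dump_cluster() {
  kubectl cluster-info dump --output-directory="${1:-./cluster-dump}"
}
```

One way to use it: dump_cluster /tmp/cluster-dump, then search the resulting files with grep.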

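Common kubectl commands for resolving problems

No.	Command	Description
1	edit	Edit a resource on the server
2	replace	Replace a resource using a file name or standard input
3	patch	Update part of a resource
4	apply	Apply a configuration change from a file or standard input
5	scale	Set a new size for a Deployment/ReplicaSet/RC/Job
6	autoscale	Configure autoscaling for a Deployment/ReplicaSet/RC
7	cordon	Mark a node as unschedulable
8	uncordon	Mark a node as schedulable
9	drain	Put a node into maintenance mode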

kubectl edit

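The edit command edits a resource on the server. The following walkthrough shows what that means in practice.

Checking the object to edit

Use the -o option with yaml as the output format to check the current settings of the nginx service. This captures the live state, which is what you can do when you have the environment but not the original yaml file.

[root@ku8-1 tmp]# kubectl get service |grep nginx
nginx        172.200.229.212   <nodes>       80:31001/TCP   2m
[root@ku8-1 tmp]# kubectl get service nginx -o yaml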
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: 2017-06-30T04:50:44Z
  labels:
    name: nginx
  name: nginx
  namespace: default
  resourceVersion: "77068"
  selfLink: /api/v1/namespaces/default/services/nginx
  uid: ad45612a-5d4f-11e7-91ef-000c2933b773
spec:
  clusterIP: 172.200.229.212
  ports:
  - nodePort: 31001
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    name: nginx
  sessionAffinity: None
  type: NodePort
status:
  loadBalancer: {}
[root@ku8-1 tmp]#

Edit the settings of the nginx service with the edit command.

The current nodePort is 31001; in the editor, change it to 31002.

[root@ku8-1 tmp]# kubectl edit service nginx
service "nginx" edited
[root@ku8-1 tmp]#

Checking after the edit shows that the service port has changed.

[root@ku8-1 tmp]# kubectl get service
NAME         CLUSTER-IP        EXTERNAL-IP   PORT(S)        AGE
kubernetes   172.200.0.1       <none>        443/TCP        1d
nginx        172.200.229.212   <nodes>       80:31002/TCP   8m
[root@ku8-1 tmp]#

One typical use: edit changes the settings of the running environment without stopping the service.

kubectl replace

Having seen what edit does, replace is easy to guess: it replaces a resource. Using the service from the previous example, we change its port back to 31001.

Checking beforehand

Confirm that the port is currently 31002.

[root@ku8-1 tmp]# kubectl get service
NAME         CLUSTER-IP        EXTERNAL-IP   PORT(S)        AGE
kubernetes   172.200.0.1       <none>        443/TCP        1d
nginx        172.200.229.212   <nodes>       80:31002/TCP   17m
[root@ku8-1 tmp]#

Save the current settings of the nginx service to a file, then modify the port

[root@ku8-1 tmp]# kubectl get service nginx -o yaml >nginx_forreplace.yaml
[root@ku8-1 tmp]# cp -p nginx_forreplace.yaml nginx_forreplace.yaml.org
[root@ku8-1 tmp]# vi nginx_forreplace.yaml
[root@ku8-1 tmp]# diff nginx_forreplace.yaml nginx_forreplace.yaml.org
15c15
<   - nodePort: 31001
---
>   - nodePort: 31002
[root@ku8-1 tmp]#

Run the replace command

The output reports that the service was replaced.

[root@ku8-1 tmp]# kubectl replace -f nginx_forreplace.yaml
service "nginx" replaced
[root@ku8-1 tmp]#

Checking the result

The port has indeed changed back to 31001.

[root@ku8-1 tmp]# kubectl get service
NAME         CLUSTER-IP        EXTERNAL-IP   PORT(S)        AGE
kubernetes   172.200.0.1       <none>        443/TCP        1d
nginx        172.200.229.212   <nodes>       80:31001/TCP   20m
[root@ku8-1 tmp]#

kubectl patch

patch is very useful for changing just part of a resource's settings, especially in versions before 1.2. Changing the port back and forth is getting dull, so this time we change the image.

Checking beforehand

The nginx currently running in the pod is the alpine build of version 1.12.

[root@ku8-1 tmp]# kubectl exec nginx-2476590065-1vtsp  -it sh
/ # nginx -v
nginx version: nginx/1.12.0
/ #

Run patch to replace the image

[root@ku8-1 tmp]# kubectl patch pod nginx-2476590065-1vtsp -p '{"spec":{"containers":[{"name":"nginx","image":"192.168.32.131:5000/nginx:1.13-alpine"}]}}'
"nginx-2476590065-1vtsp" patched
[root@ku8-1 tmp]#

Checking the result

Confirm that the image in the pod has been patched to 1.13.

[root@ku8-1 tmp]# kubectl exec nginx-2476590065-1vtsp  -it sh
/ # nginx -v
nginx version: nginx/1.13.1
/ #
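
The transcript patches the pod itself. A pod that belongs to a Deployment is recreated from the Deployment's pod template, so a change intended to last is usually made on the Deployment instead. The following is a sketch under that assumption; image_patch and patch_deployment_image are hypothetical helper names, and the patch body mirrors the one above with the extra spec.template level that a Deployment has.

```shell
# Hypothetical helper: build a patch that changes the image of one
# container in a Deployment's pod template.
# $1 = container name, $2 = image reference.
image_patch() {
  printf '{"spec":{"template":{"spec":{"containers":[{"name":"%s","image":"%s"}]}}}}' "$1" "$2"
}

# Hypothetical wrapper: apply that patch to a Deployment.
# $1 = deployment name, $2 = container name, $3 = image reference.
patch_deployment_image() {
  kubectl patch deployment "$1" -p "$(image_patch "$2" "$3")"
}
```

One way to use it: patch_deployment_image nginx nginx 192.168.32.131:5000/nginx:1.13-alpine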

kubectl apply

Likewise, the apply command changes configuration using a file or standard input.

Preparation

[root@ku8-1 tmp]# kubectl delete -f nginx/
deployment "nginx" deleted
service "nginx" deleted
[root@ku8-1 tmp]# kubectl create -f nginx/
deployment "nginx" created
service "nginx" created
[root@ku8-1 tmp]#

Checking the result

The nodePort of the Service is set to 31001.

[root@ku8-1 tmp]# kubectl get service
NAME         CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
kubernetes   172.200.0.1      <none>        443/TCP        1d
nginx        172.200.68.154   <nodes>       80:31001/TCP   11s
[root@ku8-1 tmp]#

Modify the configuration file

Change the port to 31002.

[root@ku8-1 tmp]# vi nginx/nginx.yaml
[root@ku8-1 tmp]# grep 31002 nginx/nginx.yaml
    nodePort: 31002
[root@ku8-1 tmp]#

Run the apply command

Applying the configuration file changes the port while the service keeps running.

[root@ku8-1 tmp]# kubectl apply -f nginx/nginx.yaml
deployment "nginx" configured
service "nginx" configured
[root@ku8-1 tmp]#

Checking the result

The port has indeed been changed to 31002.

[root@ku8-1 tmp]# kubectl get service
NAME         CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
kubernetes   172.200.0.1      <none>        443/TCP        1d
nginx        172.200.68.154   <nodes>       80:31002/TCP   1m
[root@ku8-1 tmp]#

kubectl scale

The scale command performs horizontal scaling, one of the key features of container orchestration platforms such as Kubernetes or Swarm. Let's see how it is used.

Preparation

Beforehand, the nginx replica count is set to one, and the pod is confirmed to be running on 192.168.32.132.

[root@ku8-1 tmp]# kubectl delete -f nginx/
deployment "nginx" deleted
service "nginx" deleted
[root@ku8-1 tmp]# kubectl create -f nginx/
deployment "nginx" created
service "nginx" created
[root@ku8-1 tmp]#
[root@ku8-1 tmp]# kubectl get pods -o wide
NAME                     READY     STATUS    RESTARTS   AGE       IP             NODE
nginx-2476590065-74tpk   1/1       Running   0          17s       172.200.26.2   192.168.32.132
[root@ku8-1 tmp]# kubectl get deployments -o wide
NAME      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
nginx     1         1         1            1           27s
[root@ku8-1 tmp]#

Run the scale command

Use scale to scale out horizontally, raising the replica count from 1 to 3.

[root@ku8-1 tmp]# kubectl scale --current-replicas=1 --replicas=3 deployment/nginx
deployment "nginx" scaled
[root@ku8-1 tmp]#

Checking confirms that the deployment has scaled out: in addition to 192.168.32.132, the machines 133 and 134 now each run one pod, which is exactly the result of the scale command.

[root@ku8-1 tmp]# kubectl get deployment
NAME      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
nginx     3         3         3            3           2m
[root@ku8-1 tmp]# kubectl get pod -o wide
NAME                     READY     STATUS    RESTARTS   AGE       IP             NODE
nginx-2476590065-74tpk   1/1       Running   0          2m        172.200.26.2   192.168.32.132
nginx-2476590065-cm5d9   1/1       Running   0          16s       172.200.44.2   192.168.32.133
nginx-2476590065-hmn9j   1/1       Running   0          16s       172.200.59.2   192.168.32.134
[root@ku8-1 tmp]#
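
In a script, checking the pod list by eye is not an option, so it can help to block until the new replicas are ready. The sketch below combines scale with kubectl rollout status, which waits for a deployment to finish rolling out. scale_and_wait is a hypothetical helper name.

```shell
# Hypothetical helper: scale a deployment, then wait until the rollout
# has completed.
# $1 = deployment name, $2 = desired replica count.
scale_and_wait() {
  kubectl scale --replicas="$2" "deployment/$1" &&
    kubectl rollout status "deployment/$1"
}
```

One way to use it: scale_and_wait nginx 3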

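kubectl autoscale

The autoscale command configures automatic scaling. The difference from scale is that scale has to be run manually each time, whereas autoscale adjusts the replica count according to load. It can be set on a Deployment/ReplicaSet/RC by specifying a minimum and a maximum. Only the result of running the command is shown here; the scaling behavior itself is not verified.

[root@ku8-1 tmp]# kubectl autoscale deployment nginx --min=2 --max=5
deployment "nginx" autoscaled
[root@ku8-1 tmp]#

There are some restrictions in use. For example, with 3 replicas currently running, what happens if we run autoscale again with both the minimum and the maximum set to 2? The command is rejected, because a HorizontalPodAutoscaler named nginx already exists from the previous command.

[root@ku8-1 tmp]# kubectl get pods -o wide
NAME                     READY     STATUS    RESTARTS   AGE       IP             NODE
nginx-2476590065-74tpk   1/1       Running   0          5m        172.200.26.2   192.168.32.132
nginx-2476590065-cm5d9   1/1       Running   0          2m        172.200.44.2   192.168.32.133
nginx-2476590065-hmn9j   1/1       Running   0          2m        172.200.59.2   192.168.32.134
[root@ku8-1 tmp]#
[root@ku8-1 tmp]# kubectl autoscale deployment nginx --min=2 --max=2
Error from server (AlreadyExists): horizontalpodautoscalers.autoscaling "nginx" already exists
[root@ku8-1 tmp]#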
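
The AlreadyExists error refers to the HorizontalPodAutoscaler object that the first autoscale call created. One way to change the bounds is to delete that object and create it again. The sketch below assumes the autoscaler has the same name as the deployment, as in the error message above; reset_autoscale is a hypothetical helper name.

```shell
# Hypothetical helper: replace an existing autoscaler with new bounds.
# Assumes the HorizontalPodAutoscaler is named after the deployment.
# $1 = deployment name, $2 = minimum replicas, $3 = maximum replicas.
reset_autoscale() {
  kubectl delete hpa "$1" &&
    kubectl autoscale deployment "$1" --min="$2" --max="$3"
}
```

The existing autoscalers can be listed first with kubectl get hpa.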

kubectl cordon and uncordon

During real maintenance, a node may fail or need some work done, so newly created pods must temporarily not run on it. Kubernetes has to be told not to schedule pods there; the command for that is cordon, and uncordon cancels the request. An example follows.

Preparation

An nginx pod has been created and is running on 192.168.32.133.

[root@ku8-1 tmp]# kubectl create -f nginx/
deployment "nginx" created
service "nginx" created
[root@ku8-1 tmp]# kubectl get pods -o wide
NAME                     READY     STATUS    RESTARTS   AGE       IP             NODE
nginx-2476590065-dnsmw   1/1       Running   0          6s        172.200.44.2   192.168.32.133
[root@ku8-1 tmp]#

Run the scale command

Scale out to 3 replicas. The pods are spread so that each node runs one (the placement looks round-robin in this run), and the 134 machine has one too.

[root@ku8-1 tmp]# kubectl scale --replicas=3 deployment/nginx
deployment "nginx" scaled
[root@ku8-1 tmp]# kubectl get pods -o wide
NAME                     READY     STATUS    RESTARTS   AGE       IP             NODE
nginx-2476590065-550sm   1/1       Running   0          5s        172.200.26.2   192.168.32.132
nginx-2476590065-bt3bc   1/1       Running   0          5s        172.200.59.2   192.168.32.134
nginx-2476590065-dnsmw   1/1       Running   0          17s       172.200.44.2   192.168.32.133
[root@ku8-1 tmp]# kubectl get pods -o wide |grep 192.168.32.134
nginx-2476590065-bt3bc   1/1       Running   0          12s       172.200.59.2   192.168.32.134
[root@ku8-1 tmp]#

Run the cordon command

Cordon 134 so that it cannot be used for scheduling. Checking with get nodes shows its status as Ready,SchedulingDisabled.

[root@ku8-1 tmp]# kubectl cordon 192.168.32.134
node "192.168.32.134" cordoned
[root@ku8-1 tmp]# kubectl get nodes -o wide
NAME             STATUS                     AGE       EXTERNAL-IP
192.168.32.132   Ready                      1d        <none>
192.168.32.133   Ready                      1d        <none>
192.168.32.134   Ready,SchedulingDisabled   1d        <none>
[root@ku8-1 tmp]#

Run the scale command

Scale out again to see whether any pod lands on the 134 machine. Only the one earlier pod is there; no new pod has been placed on it.

[root@ku8-1 tmp]# kubectl scale --replicas=6 deployment/nginx
deployment "nginx" scaled
[root@ku8-1 tmp]# kubectl get pods -o wide
NAME                     READY     STATUS    RESTARTS   AGE       IP             NODE
nginx-2476590065-550sm   1/1       Running   0          32s       172.200.26.2   192.168.32.132
nginx-2476590065-7vxvx   1/1       Running   0          3s        172.200.44.3   192.168.32.133
nginx-2476590065-bt3bc   1/1       Running   0          32s       172.200.59.2   192.168.32.134
nginx-2476590065-dnsmw   1/1       Running   0          44s       172.200.44.2   192.168.32.133
nginx-2476590065-fclhj   1/1       Running   0          3s        172.200.44.4   192.168.32.133
nginx-2476590065-fl9fn   1/1       Running   0          3s        172.200.26.3   192.168.32.132
[root@ku8-1 tmp]# kubectl get pods -o wide |grep 192.168.32.134
nginx-2476590065-bt3bc   1/1       Running   0          37s       172.200.59.2   192.168.32.134
[root@ku8-1 tmp]#

Run the uncordon command

Use uncordon to lift the restriction on the 134 machine. Checking with get nodes confirms that its status is back to normal.

[root@ku8-1 tmp]# kubectl uncordon 192.168.32.134
node "192.168.32.134" uncordoned
[root@ku8-1 tmp]#
[root@ku8-1 tmp]# kubectl get nodes -o wide
NAME             STATUS    AGE       EXTERNAL-IP
192.168.32.132   Ready     1d        <none>
192.168.32.133   Ready     1d        <none>
192.168.32.134   Ready     1d        <none>
[root@ku8-1 tmp]#

Run the scale command

Run scale once more: new pods can now be created on node 134 again.

[root@ku8-1 tmp]# kubectl scale --replicas=10 deployment/nginx
deployment "nginx" scaled
[root@ku8-1 tmp]# kubectl get pods -o wide
NAME                     READY     STATUS    RESTARTS   AGE       IP             NODE
nginx-2476590065-550sm   1/1       Running   0          1m        172.200.26.2   192.168.32.132
nginx-2476590065-7vn6z   1/1       Running   0          3s        172.200.44.4   192.168.32.133
nginx-2476590065-7vxvx   1/1       Running   0          35s       172.200.44.3   192.168.32.133
nginx-2476590065-bt3bc   1/1       Running   0          1m        172.200.59.2   192.168.32.134
nginx-2476590065-dnsmw   1/1       Running   0          1m        172.200.44.2   192.168.32.133
nginx-2476590065-fl9fn   1/1       Running   0          35s       172.200.26.3   192.168.32.132
nginx-2476590065-pdx91   1/1       Running   0          3s        172.200.59.3   192.168.32.134
nginx-2476590065-swvwf   1/1       Running   0          3s        172.200.26.5   192.168.32.132
nginx-2476590065-vdq2k   1/1       Running   0          3s        172.200.26.4   192.168.32.132
nginx-2476590065-wdv52   1/1       Running   0          3s        172.200.59.4   192.168.32.134
[root@ku8-1 tmp]#

kubectl drain

The drain command prepares a node for maintenance. In English, to drain means to empty out water: a sewer has to be drained before it can be serviced. Let's see what kubectl does when it "drains" a node.

Preparation

Set the nginx replica count to 4 and confirm that two pods have started on 134.

[root@ku8-1 tmp]# kubectl create -f nginx/
deployment "nginx" created
service "nginx" created
[root@ku8-1 tmp]# kubectl get pod -o wide
NAME                     READY     STATUS    RESTARTS   AGE       IP             NODE
nginx-2476590065-d6h8f   1/1       Running   0          8s        172.200.59.2   192.168.32.134
[root@ku8-1 tmp]#
[root@ku8-1 tmp]# kubectl get nodes -o wide
NAME             STATUS    AGE       EXTERNAL-IP
192.168.32.132   Ready     1d        <none>
192.168.32.133   Ready     1d        <none>
192.168.32.134   Ready     1d        <none>
[root@ku8-1 tmp]#
[root@ku8-1 tmp]# kubectl scale --replicas=4 deployment/nginx
deployment "nginx" scaled
[root@ku8-1 tmp]#
[root@ku8-1 tmp]# kubectl get pods -o wide
NAME                     READY     STATUS    RESTARTS   AGE       IP             NODE
nginx-2476590065-9lfzh   1/1       Running   0          12s       172.200.59.3   192.168.32.134
nginx-2476590065-d6h8f   1/1       Running   0          1m        172.200.59.2   192.168.32.134
nginx-2476590065-v8xvf   1/1       Running   0          43s       172.200.26.2   192.168.32.132
nginx-2476590065-z94cq   1/1       Running   0          12s       172.200.44.2   192.168.32.133
[root@ku8-1 tmp]#

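Run the drain command

Running drain shows that the command does two things:
1. It marks the node as unschedulable (cordon)
2. It evicts the two pods running on it

[root@ku8-1 tmp]# kubectl drain 192.168.32.134
node "192.168.32.134" cordoned
pod "nginx-2476590065-d6h8f" evicted
pod "nginx-2476590065-9lfzh" evicted
node "192.168.32.134" drained
[root@ku8-1 tmp]#

Checking the result

Evict means to expel or reclaim, so let's look at what the evict action actually results in.
The result is that 134 no longer runs any pods, while two new pods are created on 132 and 133 to replace the ones evicted from 134. This replacement is presumably guaranteed by the replica mechanism (the ReplicaSet behind the deployment). So the outcome of drain is that the pods are evicted and the node is marked unschedulable, which leaves it in a state where maintenance can be carried out. When the work is finished, run uncordon to put the node back into service.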
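
The drain and uncordon pair can be sketched as two small helpers around a maintenance window. The names start_maintenance and end_maintenance are hypothetical. The --ignore-daemonsets option is added on the assumption that the cluster may run DaemonSet-managed pods, which drain otherwise refuses to skip; it was not needed in the transcript above.

```shell
# Hypothetical helper: drain a node ahead of maintenance.
# --ignore-daemonsets lets the drain proceed when DaemonSet-managed pods
# are present on the node (an assumption about the cluster).
# $1 = node name.
start_maintenance() {
  kubectl drain "$1" --ignore-daemonsets
}

# Hypothetical helper: make the node schedulable again afterwards.
# $1 = node name.
end_maintenance() {
  kubectl uncordon "$1"
}
```

One way to use it: start_maintenance 192.168.32.134, do the work on the node, then end_maintenance 192.168.32.134.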
