
Kubernetes Binary Deployment

Deploying a Kubernetes cluster from binaries, from zero to one

Introduction: the components of a Kubernetes cluster use TLS certificates to encrypt their communication. This document uses CloudFlare's PKI toolkit, cfssl, to generate the Certificate Authority (CA) and all other certificates.

Managing TLS in the Cluster

Preface

Every Kubernetes cluster has a cluster root Certificate Authority (CA). Components in the cluster typically use the CA to validate the API server's certificate, the API server uses it to validate kubelet client certificates, and so on. To support this, the CA certificate bundle is distributed to every node in the cluster and is also distributed, as a secret, alongside the default service account. Optionally, your workloads can use this CA to establish trust. Applications can request certificate signing through the certificates.k8s.io API, using a protocol similar to the ACME draft.

TLS trust inside the cluster

Letting an application running in a Pod trust the cluster root CA usually requires some extra application configuration: the CA certificate bundle must be added to the list of CA certificates trusted by your TLS client or server. For example, with Go's TLS configuration you can parse the certificate chain and add the parsed certificates to the certificate pool used by the tls.Config struct. The CA bundle is automatically mounted into pods that use the default service account, at /var/run/secrets/kubernetes.io/serviceaccount/ca.crt. If you are not using the default service account, ask your cluster administrator to build a ConfigMap containing the certificate bundle you are entitled to use.
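As a minimal sketch of what this looks like from inside a pod (assumes the default service account is mounted and that curl is available in the container image):

# read the auto-mounted service account token
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
# call the API server, trusting it via the auto-mounted cluster CA bundle
curl --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
     -H "Authorization: Bearer ${TOKEN}" \
     https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}/api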

Cluster Deployment

Environment planning

Software            Version
Linux OS            CentOS Linux release 7.6.1810 (Core)
Kubernetes          1.14.3
Docker              18.06.1-ce
Etcd                3.3.13

Role         IP            Components                                                      Recommended spec
k8s-master   172.16.4.12   kube-apiserver, kube-controller-manager, kube-scheduler, etcd   8 cores, 16 GB RAM
k8s-node1    172.16.4.13   kubelet, kube-proxy, docker, flannel, etcd                      size according to the number of containers to run
k8s-node2    172.16.4.14   kubelet, kube-proxy, docker, flannel, etcd                      size according to the number of containers to run

Component                 Certificates used
etcd                      ca.pem, server.pem, server-key.pem
kube-apiserver            ca.pem, server.pem, server-key.pem
kubelet                   ca.pem, ca-key.pem
kube-proxy                ca.pem, kube-proxy.pem, kube-proxy-key.pem
kubectl                   ca.pem, admin.pem, admin-key.pem
kube-controller-manager   ca.pem, ca-key.pem
flannel                   ca.pem, server.pem, server-key.pem

Environment preparation

The following steps must be performed on the master node and on every Node machine.

  • Install some useful packages (optional)
# net-tools provides ping, ifconfig, etc.
yum install -y net-tools

# curl and telnet
yum install -y curl telnet

# the vim editor
yum install -y vim

# wget for downloading files
yum install -y wget

# lrzsz lets you drag-and-drop files in Xshell to upload to / download from the server
yum -y install lrzsz
  • Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
  • Disable SELinux
sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
setenforce 0
# Alternatively, edit /etc/selinux/config, set the field below, and reboot for it to take effect:
SELINUX=disabled
  • Disable swap
swapoff -a      # temporary
vim /etc/fstab  # permanent: comment out the swap entry
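A non-interactive way to make the change permanent is to comment out the swap entry directly (a sketch, assuming a standard fstab in which the swap line contains the word "swap"):

sed -i '/\sswap\s/s/^/#/' /etc/fstab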
  • Make sure net.bridge.bridge-nf-call-iptables is set to 1 in sysctl:
$ cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
$ sysctl --system
  • Add hostname-to-IP mappings (required on both master and node machines)
$ vim /etc/hosts
172.16.4.12  k8s-master
172.16.4.13  k8s-node1
172.16.4.14  k8s-node2
  • Synchronize time
# yum install ntpdate -y
# ntpdate ntp.api.bz 

Kubernetes requires a container runtime (via the Container Runtime Interface, CRI). The officially supported runtimes currently include Docker, containerd, CRI-O and frakti. This document uses Docker as the container runtime; Docker CE 18.06 or 18.09 is recommended.

  • Install Docker
# Configure the Aliyun yum repo for Docker; note that the command below is run from /etc/yum.repos.d.
[root@k8s-master yum.repos.d]# wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Refresh the cache and list the repos; the docker-ce repo should now show up.
[root@k8s-master yum.repos.d]# yum update && yum repolist
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirror.lzu.edu.cn
 * extras: mirrors.nwsuaf.edu.cn
 * updates: mirror.lzu.edu.cn
docker-ce-stable                                                                                                  | 3.5 kB  00:00:00     
(1/2): docker-ce-stable/x86_64/updateinfo                                                                         |   55 B  00:00:00     
(2/2): docker-ce-stable/x86_64/primary_db                                                                         |  28 kB  00:00:00     
No packages marked for update
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirror.lzu.edu.cn
 * extras: mirrors.nwsuaf.edu.cn
 * updates: mirror.lzu.edu.cn
repo id                                                         repo name                                                          status
base/7/x86_64                                                   CentOS-7 - Base                                                    10,019
docker-ce-stable/x86_64                                         Docker CE Stable - x86_64                                              43
extras/7/x86_64                                                 CentOS-7 - Extras                                                     409
updates/7/x86_64                                                CentOS-7 - Updates                                                  2,076
repolist: 12,547

# List the available docker-ce versions; a stable 18.06 or 18.09 release is recommended.
yum list docker-ce.x86_64 --showduplicates | sort -r
# Install Docker; docker-ce-18.06.3.ce-3.el7 is used here as the example.
yum -y install docker-ce-18.06.3.ce-3.el7
# You may hit the error "Delta RPMs disabled because /usr/bin/applydeltarpm not installed"; fix it as follows:
yum provides '*/applydeltarpm'
yum install deltarpm -y
# then re-run the install command
yum -y install docker-ce-18.06.3.ce-3.el7
# once installed, enable Docker to start on boot
systemctl enable docker
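Before moving on it is worth starting Docker and confirming the installed version (should report 18.06.x):

systemctl start docker
docker version --format '{{.Server.Version}}'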

Note: all of the following steps are performed on the master node, i.e. 172.16.4.12. The certificates only need to be created once; when you later add a new node to the cluster, simply copy the certificates under /etc/kubernetes/ to the new node.

Creating TLS Certificates and Keys

  • Install CFSSL from the binary release
# first create a directory to hold the certificates
$ mkdir ssl && cd ssl
# cfssl: generates the certificates
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
# cfssljson: turns cfssl's JSON output into certificate files
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
# cfssl-certinfo: displays certificate information
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
# make the binaries executable
chmod +x cfssl_linux-amd64 cfssljson_linux-amd64 cfssl-certinfo_linux-amd64
# move them into /usr/local/bin
mv cfssl_linux-amd64 /usr/local/bin/cfssl
mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
mv cfssl-certinfo_linux-amd64 /usr/local/bin/cfssl-certinfo
# as a non-root user you may need to extend PATH
export PATH=/usr/local/bin:$PATH


Creating the CA (Certificate Authority)

Note: the commands below are still executed from the /root/ssl directory.

  1. Create the CA config file
# generate a default config
$ cfssl print-defaults config > config.json
# generate a default certificate signing request
$ cfssl print-defaults csr > csr.json
# based on the format of config.json, create the ca-config.json below; the expiry is set to 87600h (10 years)
cat > ca-config.json <<EOF
{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "kubernetes": {
         "expiry": "87600h",
         "usages": [
            "signing",
            "key encipherment",
            "server auth",
            "client auth"
        ]
      }
    }
  }
}
EOF

Field descriptions

  • ca-config.json: you can define multiple profiles with different expiry times, usage scenarios and other parameters; a specific profile is selected later when signing certificates;
  • signing: the certificate can be used to sign other certificates; the generated ca.pem has CA=TRUE;
  • server auth: a client may use this CA to verify certificates presented by servers;
  • client auth: a server may use this CA to verify certificates presented by clients.
  2. Create the CA certificate signing request
# create ca-csr.json with the following content
cat > ca-csr.json <<EOF
{
    "CN": "kubernetes",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "Beijing",
            "ST": "Beijing",
            "O": "k8s",
            "OU": "System"
        }
    ],
    "ca": {
        "expiry": "87600h"
    }
}
EOF

  • "CN": Common Name. kube-apiserver extracts this field from the certificate and uses it as the requesting user name (User Name); browsers use it to check whether a website is legitimate;
  • "O": Organization. kube-apiserver extracts this field and uses it as the group (Group) the requesting user belongs to.
  3. Generate the CA certificate and private key
[root@k8s-master ~]# cfssl gencert -initca ca-csr.json | cfssljson -bare ca
2019/06/12 11:08:53 [INFO] generating a new CA key and certificate from CSR
2019/06/12 11:08:53 [INFO] generate received request
2019/06/12 11:08:53 [INFO] received CSR
2019/06/12 11:08:53 [INFO] generating key: rsa-2048
2019/06/12 11:08:53 [INFO] encoded CSR
2019/06/12 11:08:53 [INFO] signed certificate with serial number 708489059891717538616716772053407287945320812263
# /root/ssl should now contain the following files:
[root@k8s-master ssl]# ls
ca-config.json  ca.csr  ca-csr.json  ca-key.pem  ca.pem

  4. Create the Kubernetes certificate

Create the Kubernetes certificate signing request file server-csr.json (sometimes named kubernetes-csr.json) and add the trusted IPs to the hosts list. In this setup the three node IPs are 172.16.4.12, 172.16.4.13 and 172.16.4.14.

$ cat > server-csr.json <<EOF
{
    "CN": "kubernetes",
    "hosts": [
      "127.0.0.1",
      "172.16.4.12",
      "172.16.4.13",
      "172.16.4.14",
      "10.10.10.1",
      "kubernetes",
      "kubernetes.default",
      "kubernetes.default.svc",
      "kubernetes.default.svc.cluster",
      "kubernetes.default.svc.cluster.local"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "BeiJing",
            "ST": "BeiJing",
            "O": "k8s",
            "OU": "System"
        }
    ]
}
EOF
# now generate the Kubernetes certificate and private key
[root@k8s-master ssl]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes server-csr.json | cfssljson -bare server
2019/06/12 12:00:45 [INFO] generate received request
2019/06/12 12:00:45 [INFO] received CSR
2019/06/12 12:00:45 [INFO] generating key: rsa-2048
2019/06/12 12:00:45 [INFO] encoded CSR
2019/06/12 12:00:45 [INFO] signed certificate with serial number 276381852717263457656057670704331293435930586226
2019/06/12 12:00:45 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
# check the generated server.pem and server-key.pem
[root@k8s-master ssl]# ls server*
server.csr  server-csr.json  server-key.pem  server.pem

  • If the hosts field is not empty, it must list every IP or domain name authorized to use the certificate. Since this certificate is later used by both the etcd cluster and the kubernetes master, the list above includes the etcd cluster hosts, the kubernetes master hosts, and the kubernetes service IP (normally the first IP of the service-cluster-ip-range passed to kube-apiserver, here 10.10.10.1).
  • This is a minimal kubernetes installation: a private image registry plus a three-node cluster. The physical IPs above could also be replaced with hostnames.
  5. Create the admin certificate

Create the admin certificate signing request file, admin-csr.json:

cat > admin-csr.json <<EOF
{
  "CN": "admin",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "BeiJing",
      "ST": "BeiJing",
      "O": "system:masters",
      "OU": "System"
    }
  ]
}
EOF

  • kube-apiserver later uses RBAC to authorize client requests (for example from kubelet, kube-proxy, or Pods);
  • kube-apiserver predefines some RoleBindings used by RBAC; for example cluster-admin binds the Group system:masters to the Role cluster-admin, which grants permission to call every kube-apiserver API;
  • O sets this certificate's Group to system:masters. When a client (e.g. kubectl) presents this certificate to kube-apiserver, authentication succeeds because it is signed by the CA, and since the certificate's group is the pre-authorized system:masters, it is granted access to all APIs.

Note: this admin certificate is what is later used to generate the administrator's kubeconfig file. The general recommendation nowadays is to control access to kubernetes with RBAC; kubernetes treats the certificate's CN field as the User and its O field as the Group (see the "X509 Client Certs" section of the Kubernetes users and authentication documentation).

Generate the admin certificate and private key

[root@k8s-master ssl]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin
2019/06/12 14:52:32 [INFO] generate received request
2019/06/12 14:52:32 [INFO] received CSR
2019/06/12 14:52:32 [INFO] generating key: rsa-2048
2019/06/12 14:52:33 [INFO] encoded CSR
2019/06/12 14:52:33 [INFO] signed certificate with serial number 491769057064087302830652582150890184354925110925
2019/06/12 14:52:33 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
# check the generated certificate and private key
[root@k8s-master ssl]# ls admin*
admin.csr  admin-csr.json  admin-key.pem  admin.pem

  6. Create the kube-proxy certificate

Create the kube-proxy certificate signing request file kube-proxy-csr.json, so kube-proxy can present a certificate when talking to the cluster:

cat > kube-proxy-csr.json <<EOF
{
  "CN": "system:kube-proxy",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "BeiJing",
      "ST": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF

  • CN sets the certificate's User to system:kube-proxy;
  • the predefined RoleBinding system:node-proxier binds the User system:kube-proxy to the Role system:node-proxier, which grants permission to call the proxy-related kube-apiserver APIs.

Generate the kube-proxy client certificate and private key

[root@k8s-master ssl]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy && ls kube-proxy*
2019/06/12 14:58:09 [INFO] generate received request
2019/06/12 14:58:09 [INFO] received CSR
2019/06/12 14:58:09 [INFO] generating key: rsa-2048
2019/06/12 14:58:09 [INFO] encoded CSR
2019/06/12 14:58:09 [INFO] signed certificate with serial number 175491367066700423717230199623384101585104107636
2019/06/12 14:58:09 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
kube-proxy.csr  kube-proxy-csr.json  kube-proxy-key.pem  kube-proxy.pem

  7. Verify the certificates

Using the server certificate as the example.

With the openssl command:

[root@k8s-master ssl]# openssl x509  -noout -text -in  server.pem
......
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: C=CN, ST=Beijing, L=Beijing, O=k8s, OU=System, CN=kubernetes
        Validity
            Not Before: Jun 12 03:56:00 2019 GMT
            Not After : Jun  9 03:56:00 2029 GMT
        Subject: C=CN, ST=BeiJing, L=BeiJing, O=k8s, OU=System, CN=kubernetes
        ......
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage: 
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Subject Key Identifier: 
                E9:99:37:41:CC:E9:BA:9A:9F:E6:DE:4A:3E:9F:8B:26:F7:4E:8F:4F
            X509v3 Authority Key Identifier: 
                keyid:CB:97:D5:C3:5F:8A:EB:B5:A8:9D:39:DE:5F:4F:E0:10:8E:4C:DE:A2

            X509v3 Subject Alternative Name: 
                DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster, DNS:kubernetes.default.svc.cluster.local, IP Address:127.0.0.1, IP Address:172.16.4.12, IP Address:172.16.4.13, IP Address:172.16.4.14, IP Address:10.10.10.1
    ......

  • confirm the Issuer fields match ca-csr.json;
  • confirm the Subject fields match server-csr.json;
  • confirm X509v3 Subject Alternative Name matches server-csr.json;
  • confirm X509v3 Key Usage and Extended Key Usage match the kubernetes profile in ca-config.json.

With the cfssl-certinfo command:

[root@k8s-master ssl]# cfssl-certinfo -cert server.pem
{
  "subject": {
    "common_name": "kubernetes",
    "country": "CN",
    "organization": "k8s",
    "organizational_unit": "System",
    "locality": "BeiJing",
    "province": "BeiJing",
    "names": [
      "CN",
      "BeiJing",
      "BeiJing",
      "k8s",
      "System",
      "kubernetes"
    ]
  },
  "issuer": {
    "common_name": "kubernetes",
    "country": "CN",
    "organization": "k8s",
    "organizational_unit": "System",
    "locality": "Beijing",
    "province": "Beijing",
    "names": [
      "CN",
      "Beijing",
      "Beijing",
      "k8s",
      "System",
      "kubernetes"
    ]
  },
  "serial_number": "276381852717263457656057670704331293435930586226",
  "sans": [
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local",
    "127.0.0.1",
    "172.16.4.12",
    "172.16.4.13",
    "172.16.4.14",
    "10.10.10.1"
  ],
  "not_before": "2019-06-12T03:56:00Z",
  "not_after": "2029-06-09T03:56:00Z",
  "sigalg": "SHA256WithRSA",
  ......
}
  8. Distribute the certificates

Copy the generated certificate and key files (the .pem files) to /etc/kubernetes/ssl on every machine for later use.

[root@k8s-master ssl]# mkdir -p /etc/kubernetes/ssl
[root@k8s-master ssl]# cp *.pem /etc/kubernetes/ssl
[root@k8s-master ssl]# ls /etc/kubernetes/ssl/
admin-key.pem  admin.pem  ca-key.pem  ca.pem  kube-proxy-key.pem  kube-proxy.pem  server-key.pem  server.pem
# keep only the pem files and delete the rest (optional)
ls | grep -v pem |xargs -i rm {}


Creating kubeconfig Files

The commands below are run on the master node. When no working directory is given they run from the user's home directory, i.e. /root for the root user.

Download kubectl

Make sure to download the package that matches your Kubernetes version.

# If the site below is unreachable, download the package from Baidu Cloud instead:
wget https://dl.k8s.io/v1.14.3/kubernetes-client-linux-amd64.tar.gz
tar -xzvf kubernetes-client-linux-amd64.tar.gz
cp kubernetes/client/bin/kube* /usr/bin/
chmod a+x /usr/bin/kube*

Create the kubectl kubeconfig file

# 172.16.4.12 is the master node's IP; change it for your environment.
# The kubeconfig must point at the HTTPS endpoint of the Kubernetes API.
export KUBE_APISERVER="https://172.16.4.12:6443"
# set cluster parameters
kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER}
# set client credentials
kubectl config set-credentials admin \
  --client-certificate=/etc/kubernetes/ssl/admin.pem \
  --embed-certs=true \
  --client-key=/etc/kubernetes/ssl/admin-key.pem
# set the context
kubectl config set-context kubernetes \
  --cluster=kubernetes \
  --user=admin
# use it as the default context
kubectl config use-context kubernetes
  • The O field of admin.pem is system:masters; the predefined RoleBinding cluster-admin binds the Group system:masters to the Role cluster-admin, which grants permission to call the kube-apiserver APIs;
  • the generated kubeconfig is saved to ~/.kube/config.

Note: ~/.kube/config carries the highest level of access to this cluster; keep it safe.
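A simple (optional) precaution is to restrict the file's permissions to the owner:

chmod 600 ~/.kube/config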

Processes running on the Node machines, such as kubelet and kube-proxy, must authenticate and be authorized when talking to kube-apiserver on the master.

The following steps only need to be performed on the master node; the generated *.kubeconfig files can then be copied straight into /etc/kubernetes on the node machines.

Creating the TLS bootstrapping token

Token auth file

The token can be any string containing 128 bits of entropy, generated with a cryptographically secure random number generator.

export BOOTSTRAP_TOKEN=$(head -c 16 /dev/urandom | od -An -t x | tr -d ' ')
cat > token.csv <<EOF
${BOOTSTRAP_TOKEN},kubelet-bootstrap,10001,"system:kubelet-bootstrap"
EOF

The last three lines above form a single heredoc command; copy and run the snippet as-is.

Note: before continuing, check token.csv and confirm that the ${BOOTSTRAP_TOKEN} variable was actually replaced with its real value.
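A quick way to check (assuming you are still in the directory where token.csv was created):

cat token.csv
# the first field should be a 32-character hex string, not the literal text ${BOOTSTRAP_TOKEN}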

BOOTSTRAP_TOKEN is written both to the token.csv consumed by kube-apiserver and to the bootstrap.kubeconfig consumed by kubelet. If BOOTSTRAP_TOKEN is regenerated later, you must:

  1. Update token.csv and distribute it to /etc/kubernetes/ on all machines (master and nodes); distributing it to the nodes is not strictly required;
  2. Regenerate bootstrap.kubeconfig and distribute it to /etc/kubernetes/ on all node machines;
  3. Restart the kube-apiserver and kubelet processes;
  4. Re-approve the kubelet CSR requests.
cp token.csv /etc/kubernetes/


Create the kubelet bootstrapping kubeconfig file

kubectl must already be installed before running the commands below.

# Optionally, install kubectl bash completion first.
yum install -y bash-completion
source /usr/share/bash-completion/bash_completion
source <(kubectl completion bash)

# switch to the working directory /etc/kubernetes
cd /etc/kubernetes
export KUBE_APISERVER="https://172.16.4.12:6443"

# set cluster parameters
kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=bootstrap.kubeconfig

# set client credentials
kubectl config set-credentials kubelet-bootstrap \
  --token=${BOOTSTRAP_TOKEN} \
  --kubeconfig=bootstrap.kubeconfig

# set the context
kubectl config set-context default \
  --cluster=kubernetes \
  --user=kubelet-bootstrap \
  --kubeconfig=bootstrap.kubeconfig

# use it as the default context
kubectl config use-context default --kubeconfig=bootstrap.kubeconfig

  • with --embed-certs=true, the certificate-authority certificate is embedded into the generated bootstrap.kubeconfig;
  • no key or certificate is specified in the client credentials here; they are issued later via kube-apiserver (the TLS bootstrap flow).
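The result can be sanity-checked without modifying it (a read-only inspection of what was just written):

kubectl config view --kubeconfig=bootstrap.kubeconfig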

Create the kube-proxy kubeconfig file

export KUBE_APISERVER="https://172.16.4.12:6443"
# set cluster parameters
kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=kube-proxy.kubeconfig
# set client credentials
kubectl config set-credentials kube-proxy \
  --client-certificate=/etc/kubernetes/ssl/kube-proxy.pem \
  --client-key=/etc/kubernetes/ssl/kube-proxy-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-proxy.kubeconfig
# set the context
kubectl config set-context default \
  --cluster=kubernetes \
  --user=kube-proxy \
  --kubeconfig=kube-proxy.kubeconfig
# use it as the default context
kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig

  • --embed-certs is true for both the cluster and the client credentials, so the contents of the files referenced by certificate-authority, client-certificate and client-key are embedded into the generated kube-proxy.kubeconfig;
  • the CN of kube-proxy.pem is system:kube-proxy; the predefined RoleBinding system:node-proxier binds the User system:kube-proxy to the Role system:node-proxier, which grants permission to call the proxy-related kube-apiserver APIs.

Distribute the kubeconfig files

Copy the two kubeconfig files to /etc/kubernetes/ on every Node machine:

# Set up passwordless SSH to the other nodes first: generate a key pair (press Enter three times).
ssh-keygen
# check the generated key pair
ls /root/.ssh/
id_rsa  id_rsa.pub
# copy the public key to node1 and node2
ssh-copy-id [email protected]
# enter the node user's password when prompted; repeat for node2.
# copy the kubeconfig files into /etc/kubernetes on the nodes; that directory must be created there beforehand.
scp bootstrap.kubeconfig kube-proxy.kubeconfig [email protected]:/etc/kubernetes
scp bootstrap.kubeconfig kube-proxy.kubeconfig [email protected]:/etc/kubernetes


Creating an etcd HA Cluster

etcd is the primary datastore of the Kubernetes cluster, so it must be installed and started before the other Kubernetes services; Kubernetes stores all of its data in etcd. This section walks through deploying a three-node, highly available etcd cluster. The three nodes reuse the Kubernetes machines and are named k8s-master, k8s-node1 and k8s-node2.

Role         IP
k8s-master   172.16.4.12
k8s-node1    172.16.4.13
k8s-node2    172.16.4.14

TLS certificate files

TLS certificates are needed to encrypt communication within the etcd cluster; here we reuse the kubernetes certificates created earlier:

# copy ca.pem, server-key.pem and server.pem from /root/ssl to /etc/kubernetes/ssl
cp ca.pem server-key.pem server.pem /etc/kubernetes/ssl

  • the hosts field of the kubernetes certificate contains the IPs of the three machines above; otherwise certificate validation will fail later.

Download the binaries

At the time of writing the latest release is etcd-v3.3.13; the latest binaries can be downloaded from https://github.com/coreos/etcd/releases.

wget https://github.com/etcd-io/etcd/releases/download/v3.3.13/etcd-v3.3.13-linux-amd64.tar.gz
tar zxvf etcd-v3.3.13-linux-amd64.tar.gz
mv etcd-v3.3.13-linux-amd64/etcd* /usr/local/bin


Alternatively, install it with yum:

yum install etcd

Note: when installed via yum, the etcd binary lives in /usr/bin by default; remember to change the ExecStart path in the etcd.service file below to /usr/bin/etcd.
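If you went the yum route, a one-liner such as the following (a sketch, run after creating the unit file shown below) makes that adjustment:

sed -i 's#/usr/local/bin/etcd#/usr/bin/etcd#' /usr/lib/systemd/system/etcd.service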

Create the etcd data directory

mkdir -p /var/lib/etcd/default.etcd

Create the etcd systemd unit file

Create etcd.service under /usr/lib/systemd/system/ with the content below, replacing the IP addresses with those of your own etcd cluster.

[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd
EnvironmentFile=-/etc/etcd/etcd.conf
ExecStart=/usr/local/bin/etcd \
  --name ${ETCD_NAME} \
  --cert-file=/etc/kubernetes/ssl/server.pem \
  --key-file=/etc/kubernetes/ssl/server-key.pem \
  --peer-cert-file=/etc/kubernetes/ssl/server.pem \
  --peer-key-file=/etc/kubernetes/ssl/server-key.pem \
  --trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
  --peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
  --initial-advertise-peer-urls ${ETCD_INITIAL_ADVERTISE_PEER_URLS} \
  --listen-peer-urls=${ETCD_LISTEN_PEER_URLS} \
  --listen-client-urls=${ETCD_LISTEN_CLIENT_URLS},http://127.0.0.1:2379 \
  --advertise-client-urls=${ETCD_ADVERTISE_CLIENT_URLS} \
  --initial-cluster-token=${ETCD_INITIAL_CLUSTER_TOKEN} \
  --initial-cluster etcd-master=https://172.16.4.12:2380,etcd-node1=https://172.16.4.13:2380,etcd-node2=https://172.16.4.14:2380 \
  --initial-cluster-state=new \
  --data-dir=${ETCD_DATA_DIR}
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
  • etcd's working directory and data directory are under /var/lib/etcd; the directory must exist before starting the service, otherwise it fails with "Failed at step CHDIR spawning /usr/bin/etcd: No such file or directory";
  • for secure communication, the unit specifies etcd's own key pair (cert-file and key-file), the peer key pair and CA certificate (peer-cert-file, peer-key-file, peer-trusted-ca-file), and the client CA certificate (trusted-ca-file);
  • the hosts field of the server-csr.json used to create server.pem contains the IPs of all etcd nodes, otherwise certificate validation fails;
  • when --initial-cluster-state is new, the value of --name must appear in the --initial-cluster list.

Create the etcd environment file /etc/etcd/etcd.conf

mkdir -p /etc/etcd
touch /etc/etcd/etcd.conf

with the following content:

# [member]
ETCD_NAME=etcd-master
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://172.16.4.12:2380"
ETCD_LISTEN_CLIENT_URLS="https://172.16.4.12:2379"

#[cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://172.16.4.12:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_ADVERTISE_CLIENT_URLS="https://172.16.4.12:2379"



This is the configuration for the 172.16.4.12 node. For the other two etcd nodes, simply change the IP addresses to the node's own IP and set ETCD_NAME to etcd-node1 or etcd-node2 respectively; a scripted example is shown below.
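For instance, after copying the master's file to k8s-node1 the per-node edits can be scripted (a sketch; adjust the names and IPs for your environment):

sed -i -e 's/^ETCD_NAME=.*/ETCD_NAME=etcd-node1/' -e 's/172.16.4.12/172.16.4.13/g' /etc/etcd/etcd.conf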

Deploy etcd on the node machines

# 1. Copy the TLS files from the master to each node. Create /etc/kubernetes/ssl on the nodes beforehand.
scp /etc/kubernetes/ssl/*.pem [email protected]:/etc/kubernetes/ssl/
scp /etc/kubernetes/ssl/*.pem [email protected]:/etc/kubernetes/ssl/

# 2. Copy the etcd and etcdctl binaries from the master to each node.
scp /usr/local/bin/etcd* [email protected]:/usr/local/bin/
scp /usr/local/bin/etcd* [email protected]:/usr/local/bin/

# 3. Copy the etcd config file to each node. Create /etc/etcd on the nodes beforehand.
scp /etc/etcd/etcd.conf [email protected]:/etc/etcd/
scp /etc/etcd/etcd.conf [email protected]:/etc/etcd/
# 4. Adjust /etc/etcd/etcd.conf on each node. Using k8s-node1 (172.16.4.13) as the example:
# [member]
ETCD_NAME=etcd-node1
ETCD_DATA_DIR="/var/lib/etcd"
ETCD_LISTEN_PEER_URLS="https://172.16.4.13:2380"
ETCD_LISTEN_CLIENT_URLS="https://172.16.4.13:2379"

#[cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://172.16.4.13:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_ADVERTISE_CLIENT_URLS="https://172.16.4.13:2379"
# Only ETCD_NAME and the IPs need to change. Adjust node2's file the same way.

# 5. Copy the unit file /usr/lib/systemd/system/etcd.service to each node.
scp /usr/lib/systemd/system/etcd.service [email protected]:/usr/lib/systemd/system/
scp /usr/lib/systemd/system/etcd.service [email protected]:/usr/lib/systemd/system/


Start the service

systemctl daemon-reload
systemctl enable etcd
systemctl start etcd
systemctl status etcd


Repeat the steps above on all of the etcd machines until the etcd service is running on every one of them.

Note: if the logs show connection errors, check that the firewall on every node allows ports 2379 and 2380. On CentOS 7:

firewall-cmd --zone=public --add-port=2380/tcp --permanent
firewall-cmd --zone=public --add-port=2379/tcp --permanent
firewall-cmd --reload


Verify the service

Run the following on any of the etcd machines:

[root@k8s-master ~]# etcdctl \
> --ca-file=/etc/kubernetes/ssl/ca.pem \
> --cert-file=/etc/kubernetes/ssl/server.pem \
> --key-file=/etc/kubernetes/ssl/server-key.pem \
> cluster-health
member 287080ba42f94faf is healthy: got healthy result from https://172.16.4.13:2379
member 47e558f4adb3f7b4 is healthy: got healthy result from https://172.16.4.12:2379
member e531bd3c75e44025 is healthy: got healthy result from https://172.16.4.14:2379
cluster is healthy


If the last line reads cluster is healthy, the cluster is working correctly.
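The membership can be listed the same way (same TLS flags, etcd v2 API):

etcdctl --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/kubernetes/ssl/server.pem \
  --key-file=/etc/kubernetes/ssl/server-key.pem \
  member list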

Deploying the Master Node

The kubernetes master runs the following components:

  • kube-apiserver
  • kube-scheduler
  • kube-controller-manager

For now these three components are deployed on the same machine.

  • kube-scheduler, kube-controller-manager and kube-apiserver are tightly coupled;
  • only one kube-scheduler and one kube-controller-manager process can be active at a time; when several run, a leader is chosen by election.

TLS certificate files

The .pem files below were created in "Creating TLS Certificates and Keys", and token.csv was created in "Creating kubeconfig Files". Double-check that they are in place:

[root@k8s-master ~]# ls /etc/kubernetes/ssl/
admin-key.pem  admin.pem  ca-key.pem  ca.pem  kube-proxy-key.pem  kube-proxy.pem  server-key.pem  server.pem


Download the latest binaries

Download the client and server tarballs from https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG.md. The server tarball kubernetes-server-linux-amd64.tar.gz already contains the client (kubectl) binary, so there is no need to download kubernetes-client-linux-amd64.tar.gz separately.

wget https://dl.k8s.io/v1.14.3/kubernetes-server-linux-amd64.tar.gz
# If the official site is unreachable, download from Baidu Cloud instead: https://pan.baidu.com/s/1G6e981Q48mMVWD9Ho_j-7Q (code: uvc1)
tar -xzvf kubernetes-server-linux-amd64.tar.gz
cd kubernetes
tar -xzvf  kubernetes-src.tar.gz


Copy the binaries into place

[root@k8s-master kubernetes]# cp -r server/bin/{kube-apiserver,kube-controller-manager,kube-scheduler,kubectl,kube-proxy,kubelet} /usr/local/bin/


Configure and start kube-apiserver

(1) Create the kube-apiserver service unit

Content of /usr/lib/systemd/system/kube-apiserver.service:

[Unit]
Description=Kubernetes API Service
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target
After=etcd.service

[Service]
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/apiserver
ExecStart=/usr/local/bin/kube-apiserver \
        $KUBE_LOGTOSTDERR \
        $KUBE_LOG_LEVEL \
        $KUBE_ETCD_SERVERS \
        $KUBE_API_ADDRESS \
        $KUBE_API_PORT \
        $KUBELET_PORT \
        $KUBE_ALLOW_PRIV \
        $KUBE_SERVICE_ADDRESSES \
        $KUBE_ADMISSION_CONTROL \
        $KUBE_API_ARGS
Restart=on-failure
Type=notify
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target


(2) Create /etc/kubernetes/config with the following content:

###
# kubernetes system config
#
# The following values are used to configure various aspects of all
# kubernetes services, including
#
#   kube-apiserver.service
#   kube-controller-manager.service
#   kube-scheduler.service
#   kubelet.service
#   kube-proxy.service
# logging to stderr means we get it in the systemd journal
KUBE_LOGTOSTDERR="--logtostderr=true"

# journal message level, 0 is debug
KUBE_LOG_LEVEL="--v=0"

# Should this cluster be allowed to run privileged docker containers
KUBE_ALLOW_PRIV="--allow-privileged=true"

# How the controller-manager, scheduler, and proxy find the apiserver
KUBE_MASTER="--master=http://172.16.4.12:8080"


This file is shared by kube-apiserver, kube-controller-manager, kube-scheduler, kubelet and kube-proxy.

The apiserver configuration file /etc/kubernetes/apiserver:

###
### kubernetes system config
###
### The following values are used to configure the kube-apiserver
###
##
### The address on the local server to listen to.
KUBE_API_ADDRESS="--advertise-address=172.16.4.12 --bind-address=172.16.4.12 --insecure-bind-address=172.16.4.12"
##
### The port on the local server to listen on.
##KUBE_API_PORT="--port=8080"
##
### Port minions listen on
##KUBELET_PORT="--kubelet-port=10250"
##
### Comma separated list of nodes in the etcd cluster
KUBE_ETCD_SERVERS="--etcd-servers=https://172.16.4.12:2379,https://172.16.4.13:2379,https://172.16.4.14:2379"
##
### Address range to use for services
KUBE_SERVICE_ADDRESSES="--service-cluster-ip-range=10.10.10.0/24"
##
### default admission control policies
KUBE_ADMISSION_CONTROL="--admission-control=ServiceAccount,NamespaceLifecycle,NamespaceExists,LimitRanger,ResourceQuota"
##
### Add your own!
KUBE_API_ARGS="--authorization-mode=RBAC \
--runtime-config=rbac.authorization.k8s.io/v1beta1 \
--kubelet-https=true \
--enable-bootstrap-token-auth \
--token-auth-file=/etc/kubernetes/token.csv \
--service-node-port-range=30000-50000 \
--tls-cert-file=/etc/kubernetes/ssl/server.pem \
--tls-private-key-file=/etc/kubernetes/ssl/server-key.pem \
--client-ca-file=/etc/kubernetes/ssl/ca.pem \
--service-account-key-file=/etc/kubernetes/ssl/ca-key.pem \
--etcd-cafile=/etc/kubernetes/ssl/ca.pem \
--etcd-certfile=/etc/kubernetes/ssl/server.pem \
--etcd-keyfile=/etc/kubernetes/ssl/server-key.pem \
--enable-swagger-ui=true \
--apiserver-count=3 \
--audit-log-maxage=30 \
--audit-log-maxbackup=3 \
--audit-log-maxsize=100 \
--audit-log-path=/var/lib/audit.log \
--event-ttl=1h"

  • If you later change --service-cluster-ip-range, you must delete the kubernetes service in the default namespace (kubectl delete service kubernetes); the system then recreates it with the new IP. Otherwise the apiserver log keeps reporting "the cluster IP x.x.x.x for service kubernetes/default is not within the service CIDR x.x.x.x/24; please recreate";
  • --authorization-mode=RBAC enables RBAC authorization on the secure port and rejects unauthorized requests;
  • kube-scheduler and kube-controller-manager normally run on the same machine as kube-apiserver and talk to it over the insecure port;
  • kubelet, kube-proxy and kubectl are deployed on other Node machines; when they access kube-apiserver over the secure port they must first pass TLS authentication and then RBAC authorization;
  • kube-proxy and kubectl obtain RBAC authorization through the User and Group embedded in their certificates;
  • if the kubelet TLS bootstrap mechanism is used, do not also set --kubelet-certificate-authority, --kubelet-client-certificate or --kubelet-client-key, otherwise kube-apiserver later fails to validate the kubelet certificate with "x509: certificate signed by unknown authority";
  • the --admission-control value must include ServiceAccount;
  • --bind-address must not be 127.0.0.1;
  • runtime-config is set to rbac.authorization.k8s.io/v1beta1, the apiVersion used at runtime;
  • --service-cluster-ip-range specifies the Service cluster IP range, which must not be a routable network;
  • by default kubernetes objects are stored under /registry in etcd; this prefix can be changed with --etcd-prefix;
  • to expose an unauthenticated HTTP endpoint, add --insecure-port=8080 --insecure-bind-address=127.0.0.1; in production, never bind it to anything other than 127.0.0.1.

Note: see the complete unit in kube-apiserver.service; adjust the parameters to the needs of your own cluster.

(3) Start kube-apiserver

systemctl daemon-reload
systemctl enable kube-apiserver
systemctl start kube-apiserver
systemctl status kube-apiserver
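A quick liveness check once the service is up (assumes the insecure port 8080 is still enabled on the master address, as in the config above):

curl http://172.16.4.12:8080/healthz
# expected output: ok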


Configure and start kube-controller-manager

(1) Create the kube-controller-manager service unit

File path: /usr/lib/systemd/system/kube-controller-manager.service

[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/controller-manager
ExecStart=/usr/local/bin/kube-controller-manager \
        $KUBE_LOGTOSTDERR \
        $KUBE_LOG_LEVEL \
        $KUBE_MASTER \
        $KUBE_CONTROLLER_MANAGER_ARGS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target


(2) The configuration file /etc/kubernetes/controller-manager

###
# The following values are used to configure the kubernetes controller-manager

# defaults from config and apiserver should be adequate

# Add your own!
KUBE_CONTROLLER_MANAGER_ARGS="--address=127.0.0.1 \
--service-cluster-ip-range=10.10.10.0/24 \
--cluster-name=kubernetes \
--cluster-signing-cert-file=/etc/kubernetes/ssl/ca.pem \
--cluster-signing-key-file=/etc/kubernetes/ssl/ca-key.pem  \
--service-account-private-key-file=/etc/kubernetes/ssl/ca-key.pem \
--root-ca-file=/etc/kubernetes/ssl/ca.pem \
--leader-elect=true"


  • --service-cluster-ip-range specifies the CIDR of cluster Services; it must not be routable between Nodes and must match the value passed to kube-apiserver;
  • the --cluster-signing-* certificate and key are used to sign the certificates created for TLS bootstrap;
  • --root-ca-file is used to validate the kube-apiserver certificate; only when it is set is this CA certificate placed into the ServiceAccount of Pod containers;
  • --address must be 127.0.0.1, since kube-apiserver expects scheduler and controller-manager to run on the same machine.

(3) Start kube-controller-manager

systemctl daemon-reload
systemctl enable kube-controller-manager
systemctl start kube-controller-manager
systemctl status kube-controller-manager


After starting each component, run kubectl get cs to check its status (kube-scheduler has not been started yet, hence the Unhealthy entry below):

[root@k8s-master ~]# kubectl get cs
NAME                 STATUS      MESSAGE                                                                                     ERROR
scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused   
controller-manager   Healthy     ok                                                                                          
etcd-0               Healthy     {"health":"true"}                                                                           
etcd-2               Healthy     {"health":"true"}                                                                           
etcd-1               Healthy     {"health":"true"}  


Configure and start kube-scheduler

(1) Create the kube-scheduler service unit

File path: /usr/lib/systemd/system/kube-scheduler.service

[Unit]
Description=Kubernetes Scheduler Plugin
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/scheduler
ExecStart=/usr/local/bin/kube-scheduler \
            $KUBE_LOGTOSTDERR \
            $KUBE_LOG_LEVEL \
            $KUBE_MASTER \
            $KUBE_SCHEDULER_ARGS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

(2) The configuration file /etc/kubernetes/scheduler

###
# kubernetes scheduler config

# default config should be adequate

# Add your own!
KUBE_SCHEDULER_ARGS="--leader-elect=true --address=127.0.0.1"
  • --address must be 127.0.0.1, because kube-apiserver expects scheduler and controller-manager to run on the same machine.

Note: see the complete unit in kube-scheduler.service; add parameters as your cluster requires.

(3) Start kube-scheduler

systemctl daemon-reload
systemctl enable kube-scheduler
systemctl start kube-scheduler
systemctl status kube-scheduler


Verify the master components

[root@k8s-master ~]# kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
controller-manager   Healthy   ok                  
scheduler            Healthy   ok                  
etcd-0               Healthy   {"health":"true"}   
etcd-2               Healthy   {"health":"true"}   
etcd-1               Healthy   {"health":"true"} 
# the ERROR column is now empty for every component.

Installing the flannel Network Plugin

Every node needs a network plugin so that all Pods can join the same flat network; this section covers installing flannel.

Installing flanneld via yum is the simplest option unless you need a specific version; yum installs flannel 0.7.1 by default.

(1) Install flannel

# The yum version is 0.7.1, as shown below; installing a newer release is preferable.
[root@k8s-master ~]# yum list flannel --showduplicates | sort -r
 * updates: mirror.lzu.edu.cn
Loading mirror speeds from cached hostfile
Loaded plugins: fastestmirror
flannel.x86_64                        0.7.1-4.el7                         extras
 * extras: mirror.lzu.edu.cn
 * base: mirror.lzu.edu.cn
Available Packages

[root@k8s-master ~]# wget https://github.com/coreos/flannel/releases/download/v0.11.0/flannel-v0.11.0-linux-amd64.tar.gz

# Unpack the tarball; it contains the flanneld binary and the mk-docker-opts.sh helper.
[root@k8s-master ~]# tar zxvf flannel-v0.11.0-linux-amd64.tar.gz
flanneld
mk-docker-opts.sh
README.md
# copy the two executables to node1 and node2
[root@k8s-master ~]# scp flanneld [email protected]:/usr/bin/ 
flanneld                                                                                                           100%   34MB  62.9MB/s   00:00    
[root@k8s-master ~]# scp flanneld [email protected]:/usr/bin/ 
flanneld                                                                                                           100%   34MB 121.0MB/s   00:00    
[root@k8s-master ~]# scp mk-docker-opts.sh [email protected]:/usr/libexec/flannel
mk-docker-opts.sh                                                                                                      100% 2139     1.2MB/s   00:00  
[root@k8s-master ~]# scp mk-docker-opts.sh [email protected]:/usr/libexec/flannel
mk-docker-opts.sh                                                                                                      100% 2139     1.1MB/s   00:00  

  • Be sure to create the target directories for flanneld and mk-docker-opts.sh on the node machines beforehand, as shown below.
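For example, run this on each node before the scp above (the path matches the scp target used in this document):

mkdir -p /usr/libexec/flannel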

(2) The /etc/sysconfig/flanneld configuration file:

# Flanneld configuration options  

# etcd url location.  Point this to the server where etcd runs
FLANNEL_ETCD_ENDPOINTS="https://172.16.4.12:2379,https://172.16.4.13:2379,https://172.16.4.14:2379"

# etcd config key.  This is the configuration key that flannel queries
# For address range assignment
FLANNEL_ETCD_PREFIX="/kube-centos/network"

# Any additional options that you want to pass
FLANNEL_OPTIONS="-etcd-cafile=/etc/kubernetes/ssl/ca.pem -etcd-certfile=/etc/kubernetes/ssl/server.pem -etcd-keyfile=/etc/kubernetes/ssl/server-key.pem"


(3) Create the service unit /usr/lib/systemd/system/flanneld.service

[Unit]
Description=Flanneld overlay address etcd agent
After=network.target
After=network-online.target
Wants=network-online.target
After=etcd.service
Before=docker.service

[Service]
Type=notify
EnvironmentFile=/etc/sysconfig/flanneld
EnvironmentFile=-/etc/sysconfig/docker-network
ExecStart=/usr/bin/flanneld --ip-masq \
  -etcd-endpoints=${FLANNEL_ETCD_ENDPOINTS} \
  -etcd-prefix=${FLANNEL_ETCD_PREFIX} \
  $FLANNEL_OPTIONS
ExecStartPost=/usr/libexec/flannel/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
Restart=on-failure

[Install]
WantedBy=multi-user.target
RequiredBy=docker.service

  • On a multi-homed host (for example a vagrant environment), add the NIC that carries the external traffic to FLANNEL_OPTIONS, e.g. -iface=eth1; see the example below.
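For example, the interface can be appended to FLANNEL_OPTIONS in /etc/sysconfig/flanneld (the NIC name here is only an illustration):

FLANNEL_OPTIONS="-etcd-cafile=/etc/kubernetes/ssl/ca.pem -etcd-certfile=/etc/kubernetes/ssl/server.pem -etcd-keyfile=/etc/kubernetes/ssl/server-key.pem -iface=eth1"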

(4) Create the network configuration in etcd

Run the commands below to allocate the IP range that docker will use.

# first create the key prefix in etcd
etcdctl --endpoints=https://172.16.4.12:2379,https://172.16.4.13:2379,https://172.16.4.14:2379 \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/kubernetes/ssl/server.pem \
  --key-file=/etc/kubernetes/ssl/server-key.pem \
  mkdir /kube-centos/network

[root@k8s-master network]# etcdctl --endpoints=https://172.16.4.12:2379,https://172.16.4.13:2379,https://172.16.4.14:2379   --ca-file=/etc/kubernetes/ssl/ca.pem   --cert-file=/etc/kubernetes/ssl/server.pem   --key-file=/etc/kubernetes/ssl/server-key.pem   mk /kube-centos/network/config '{"Network":"172.30.0.0/16","SubnetLen":24,"Backend":{"Type":"vxlan"}}'

[root@k8s-master network]# etcdctl --endpoints=https://172.16.4.12:2379,https://172.16.4.13:2379,https://172.16.4.14:2379   --ca-file=/etc/kubernetes/ssl/ca.pem   --cert-file=/etc/kubernetes/ssl/server.pem   --key-file=/etc/kubernetes/ssl/server-key.pem   set /kube-centos/network/config '{"Network":"172.30.0.0/16","SubnetLen":24,"Backend":{"Type":"vxlan"}}'
{"Network":"172.30.0.0/16","SubnetLen":24,"Backend":{"Type":"vxlan"}}


(5) Start flannel

systemctl daemon-reload
systemctl enable flanneld
systemctl start flanneld
systemctl status flanneld

Querying etcd now shows:

# ETCD_ENDPOINTS is assumed to have been exported earlier, e.g.
# export ETCD_ENDPOINTS="https://172.16.4.12:2379,https://172.16.4.13:2379,https://172.16.4.14:2379"
[root@k8s-master ~]# etcdctl --endpoints=${ETCD_ENDPOINTS} \
   --ca-file=/etc/kubernetes/ssl/ca.pem \
   --cert-file=/etc/kubernetes/ssl/server.pem \
   --key-file=/etc/kubernetes/ssl/server-key.pem \
   ls /kube-centos/network/subnets
/kube-centos/network/subnets/172.30.20.0-24
/kube-centos/network/subnets/172.30.69.0-24
/kube-centos/network/subnets/172.30.53.0-24

[root@k8s-master ~]# etcdctl --endpoints=${ETCD_ENDPOINTS} \
   --ca-file=/etc/kubernetes/ssl/ca.pem \
   --cert-file=/etc/kubernetes/ssl/server.pem \
   --key-file=/etc/kubernetes/ssl/server-key.pem \
   get /kube-centos/network/config
{"Network":"172.30.0.0/16","SubnetLen":24,"Backend":{"Type":"vxlan"}}

[root@k8s-master ~]# etcdctl --endpoints=${ETCD_ENDPOINTS} \
   --ca-file=/etc/kubernetes/ssl/ca.pem \
   --cert-file=/etc/kubernetes/ssl/server.pem \
   --key-file=/etc/kubernetes/ssl/server-key.pem \
   get /kube-centos/network/subnets/172.30.20.0-24
{"PublicIP":"172.16.4.13","BackendType":"vxlan","BackendData":{"VtepMAC":"5e:ef:ff:37:0a:d2"}}

[root@k8s-master ~]# etcdctl --endpoints=${ETCD_ENDPOINTS} \
    --ca-file=/etc/kubernetes/ssl/ca.pem \
    --cert-file=/etc/kubernetes/ssl/server.pem \
    --key-file=/etc/kubernetes/ssl/server-key.pem \
   get /kube-centos/network/subnets/172.30.53.0-24
{"PublicIP":"172.16.4.12","BackendType":"vxlan","BackendData":{"VtepMAC":"e2:e6:b9:23:79:a2"}}

[root@k8s-master ~]# etcdctl --endpoints=${ETCD_ENDPOINTS} \
>     --ca-file=/etc/kubernetes/ssl/ca.pem \
>     --cert-file=/etc/kubernetes/ssl/server.pem \
>     --key-file=/etc/kubernetes/ssl/server-key.pem \
>    get /kube-centos/network/subnets/172.30.69.0-24
{"PublicIP":"172.16.4.14","BackendType":"vxlan","BackendData":{"VtepMAC":"06:0e:58:69:a0:41"}}


Other information is visible as well:

# 1. The flannel interface now shows up:
[root@k8s-master ~]# ifconfig
.......

flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 172.30.53.0  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::e0e6:b9ff:fe23:79a2  prefixlen 64  scopeid 0x20<link>
        ether e2:e6:b9:23:79:a2  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 8 overruns 0  carrier 0  collisions 0

.......
# 2. The subnet allocated to this host is written to a file:
[root@k8s-master ~]# cat /run/flannel/docker
DOCKER_OPT_BIP="--bip=172.30.53.1/24"
DOCKER_OPT_IPMASQ="--ip-masq=false"
DOCKER_OPT_MTU="--mtu=1450"
DOCKER_NETWORK_OPTIONS=" --bip=172.30.53.1/24 --ip-masq=false --mtu=1450"


(6) Wire Docker up to flannel

# Edit the ExecStart line in /usr/lib/systemd/system/docker.service so that it passes $DOCKER_NETWORK_OPTIONS; the complete docker.service is shown below.

[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target

[Service]
Type=notify

# add by gzr
EnvironmentFile=-/run/flannel/docker
EnvironmentFile=-/run/docker_opts.env
EnvironmentFile=-/run/flannel/subnet.env
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/etc/sysconfig/docker-storage
EnvironmentFile=-/etc/sysconfig/docker-network

# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd  $DOCKER_NETWORK_OPTIONS
ExecReload=/bin/kill -s HUP $MAINPID
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
#TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
# restart the docker process if it exits prematurely
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s

[Install]
WantedBy=multi-user.target

# reload and restart docker so the new options take effect
[root@k8s-master ~]# systemctl daemon-reload && systemctl restart docker.service
# checking the interfaces again shows docker0 and flannel.1 on the same subnet
[root@k8s-master ~]# ifconfig 
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.30.53.1  netmask 255.255.255.0  broadcast 172.30.53.255
        ether 02:42:1e:aa:8b:0f  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

......

flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 172.30.53.0  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::e0e6:b9ff:fe23:79a2  prefixlen 64  scopeid 0x20<link>
        ether e2:e6:b9:23:79:a2  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 8 overruns 0  carrier 0  collisions 0

.....
# Apply the same change on the other nodes; node1 is shown as the example.
[root@k8s-node1 ~]# vim /usr/lib/systemd/system/docker.service 
[root@k8s-node1 ~]# systemctl daemon-reload && systemctl restart docker
[root@k8s-node1 ~]# ifconfig
docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 172.30.20.1  netmask 255.255.255.0  broadcast 172.30.20.255
        inet6 fe80::42:23ff:fe7f:6a70  prefixlen 64  scopeid 0x20<link>
        ether 02:42:23:7f:6a:70  txqueuelen 0  (Ethernet)
        RX packets 18  bytes 2244 (2.1 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 48  bytes 3469 (3.3 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

......

flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 172.30.20.0  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::5cef:ffff:fe37:ad2  prefixlen 64  scopeid 0x20<link>
        ether 5e:ef:ff:37:0a:d2  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 8 overruns 0  carrier 0  collisions 0

......

veth82301fa: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet6 fe80::6855:cfff:fe99:5143  prefixlen 64  scopeid 0x20<link>
        ether 6a:55:cf:99:51:43  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 7  bytes 586 (586.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0



Deploying the Node Machines

# distribute the flanneld.service file from the master to each node
scp /usr/lib/systemd/system/flanneld.service [email protected]:/usr/lib/systemd/system
scp /usr/lib/systemd/system/flanneld.service [email protected]:/usr/lib/systemd/system
# then (re)start flanneld on the nodes
systemctl daemon-reload
systemctl enable flanneld
systemctl start flanneld
systemctl status flanneld


Configure Docker

However flannel was installed, adding the lines below to /usr/lib/systemd/system/docker.service is a safe bet.

# lines to add
EnvironmentFile=-/run/flannel/docker
EnvironmentFile=-/run/docker_opts.env
EnvironmentFile=-/run/flannel/subnet.env
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/etc/sysconfig/docker-storage
EnvironmentFile=-/etc/sysconfig/docker-network

# the resulting docker.service looks like this
[root@k8s-master ~]# cat /usr/lib/systemd/system/docker.service 
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target

[Service]
Type=notify

# add by gzr
EnvironmentFile=-/run/flannel/docker
EnvironmentFile=-/run/docker_opts.env
EnvironmentFile=-/run/flannel/subnet.env
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/etc/sysconfig/docker-storage
EnvironmentFile=-/etc/sysconfig/docker-network

# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd $DOCKER_NETWORK_OPTIONS
ExecReload=/bin/kill -s HUP $MAINPID
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
#TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
# restart the docker process if it exits prematurely
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s

[Install]
WantedBy=multi-user.target


(2) Restart Docker so the new settings take effect: systemctl daemon-reload && systemctl restart docker

Installing and Configuring kubelet

(1) Check that swap is disabled

[root@k8s-master ~]# free
              total        used        free      shared  buff/cache   available
Mem:       32753848      730892    27176072      377880     4846884    31116660
Swap:             0           0           0

  • Alternatively, check /etc/fstab and make sure the swap entry is commented out.

When kubelet starts it sends a TLS bootstrapping request to kube-apiserver. Before that can work, the kubelet-bootstrap user from the bootstrap token file must be bound to the system:node-bootstrapper cluster role, so that kubelet has permission to create certificate signing requests:

(2) Copy the kubelet and kube-proxy binaries from /usr/local/bin on the master to each node

[root@k8s-master ~]# scp /usr/local/bin/kubelet [email protected]:/usr/local/bin/ 
[root@k8s-master ~]# scp /usr/local/bin/kubelet [email protected]:/usr/local/bin/ 



(3) Create the role binding on the master

# create the permission binding on the master, then (re)start the kubelet service on the nodes
[root@k8s-master kubernetes]# kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap
clusterrolebinding.rbac.authorization.k8s.io/kubelet-bootstrap created


(4) Create the kubelet service

Option 1: create and run a script on each node

# The kubelet.sh script below creates the kubelet config file and kubelet.service in one go.
#!/bin/bash

NODE_ADDRESS=${1:-"172.16.4.13"}
DNS_SERVER_IP=${2:-"10.10.10.2"}

cat <<EOF >/etc/kubernetes/kubelet

KUBELET_ARGS="--logtostderr=true \\
--v=4 \\
--address=${NODE_ADDRESS} \\
--hostname-override=${NODE_ADDRESS} \\
--kubeconfig=/etc/kubernetes/kubelet.kubeconfig \\
--bootstrap-kubeconfig=/etc/kubernetes/bootstrap.kubeconfig \\
--api-servers=172.16.4.12 \\
--cert-dir=/etc/kubernetes/ssl \\
--allow-privileged=true \\
--cluster-dns=${DNS_SERVER_IP} \\
--cluster-domain=cluster.local \\
--fail-swap-on=false \\
--pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google-containers/pause-amd64:3.0"

EOF

cat <<EOF >/usr/lib/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet
After=docker.service
Requires=docker.service

[Service]
EnvironmentFile=-/etc/kubernetes/kubelet
ExecStart=/usr/local/bin/kubelet \$KUBELET_ARGS
Restart=on-failure
KillMode=process

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable kubelet
systemctl restart kubelet && systemctl status kubelet


2) Run the script

chmod +x kubelet.sh
./kubelet.sh 172.16.4.14 10.10.10.2
# you can also inspect the generated kubelet.service
[root@k8s-node2 ~]# cat /usr/lib/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet
After=docker.service
Requires=docker.service

[Service]
EnvironmentFile=-/etc/kubernetes/kubelet
ExecStart=/usr/local/bin/kubelet $KUBELET_ARGS
Restart=on-failure
KillMode=process

[Install]
WantedBy=multi-user.target

  • Note: pass the node's own IP and the DNS server IP to kubelet.sh, e.g. 172.16.4.13 and 10.10.10.2 on node1; substitute the arguments accordingly when running the script on other nodes.

Option 2

1) Create the kubelet configuration file /etc/kubernetes/kubelet with the following content:

###
## kubernetes kubelet (minion) config
#
## The address for the info server to serve on (set to 0.0.0.0 or "" for all interfaces)
KUBELET_ADDRESS="--address=172.16.4.12"
#
## The port for the info server to serve on
#KUBELET_PORT="--port=10250"
#
## You may leave this blank to use the actual hostname
KUBELET_HOSTNAME="--hostname-override=172.16.4.12"
#
## location of the api-server
## COMMENT THIS ON KUBERNETES 1.8+
KUBELET_API_SERVER="--api-servers=http://172.16.4.12:8080"
#
## pod infrastructure container
KUBELET_POD_INFRA_CONTAINER="--pod-infra-container-image=jimmysong/pause-amd64:3.0"
#
## Add your own!
KUBELET_ARGS="--cgroup-driver=systemd \
--cluster-dns=10.10.10.2 \
--bootstrap-kubeconfig=/etc/kubernetes/bootstrap.kubeconfig \
--kubeconfig=/etc/kubernetes/kubelet.kubeconfig \
--require-kubeconfig \
--cert-dir=/etc/kubernetes/ssl \
--cluster-domain=cluster.local \
--hairpin-mode promiscuous-bridge \
--serialize-image-pulls=false"

  • When starting with the systemd cgroup driver, two extra parameters are needed: --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice;
  • --address must not be 127.0.0.1, otherwise Pods calling kubelet's API later will fail, because 127.0.0.1 inside a Pod points at the Pod itself rather than at kubelet;
  • set --cgroup-driver to systemd here rather than cgroupfs, otherwise kubelet fails to start on this CentOS setup (what actually matters is that docker and kubelet use the same cgroup driver; it does not have to be systemd);
  • --bootstrap-kubeconfig points at the bootstrap kubeconfig file; kubelet uses the user name and token in it to send the TLS bootstrapping request to kube-apiserver;
  • once the administrator approves the CSR, kubelet automatically creates the certificate and key (kubelet-client.crt and kubelet-client.key) under --cert-dir and writes them into the --kubeconfig file;
  • it is recommended to set the kube-apiserver address in the --kubeconfig file. If --api-servers is not given, --require-kubeconfig must be set so the address is read from the kubeconfig, otherwise kubelet cannot find kube-apiserver after startup (the log reports that no API Server was found) and kubectl get nodes returns no corresponding Node; --require-kubeconfig was removed in 1.10, see the corresponding PR;
  • --cluster-dns specifies the kubedns Service IP (it can be allocated now and assigned when the kubedns service is created later) and --cluster-domain the domain suffix; both must be set for either to take effect;
  • --cluster-domain sets the search domain in the Pod's /etc/resolv.conf. We initially set it to cluster.local., which resolved service DNS names fine but broke resolution of FQDN pod names of headless services; changing it to cluster.local (dropping the trailing dot) fixed the problem. See my other article on name/service resolution in kubernetes;
  • the kubelet.kubeconfig file referenced by --kubeconfig=/etc/kubernetes/kubelet.kubeconfig does not exist before kubelet's first start; it is generated automatically once the CSR is approved (see below). If the node already has a ~/.kube/config file, you can copy it to this path and rename it kubelet.kubeconfig. All node machines can share the same kubelet.kubeconfig, so newly added nodes join the cluster without creating a new CSR. Likewise, on any host that can reach the cluster, kubectl --kubeconfig with the ~/.kube/config file passes authentication, since it already carries the admin identity with full cluster permissions;
  • KUBELET_POD_INFRA_CONTAINER is the pause (infrastructure) image; a private registry address is used here, so replace it with your own. You can also use Google's pause image gcr.io/google_containers/pause-amd64:3.0, which is only about 300 KB.

2) Create the kubelet service unit

File path: /usr/lib/systemd/system/kubelet.service

[Unit]
Description=Kubernetes Kubelet Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Requires=docker.service

[Service]
WorkingDirectory=/var/lib/kubelet
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/kubelet
ExecStart=/usr/local/bin/kubelet \
            $KUBE_LOGTOSTDERR \
            $KUBE_LOG_LEVEL \
            $KUBELET_API_SERVER \
            $KUBELET_ADDRESS \
            $KUBELET_PORT \
            $KUBELET_HOSTNAME \
            $KUBE_ALLOW_PRIV \
            $KUBELET_POD_INFRA_CONTAINER \
            $KUBELET_ARGS
Restart=on-failure

[Install]
WantedBy=multi-user.target

Note: either approach produces a working kubelet service; personally I prefer the one-shot script. With the second approach you must also create the working directory /var/lib/kubelet by hand. That is not demonstrated here.

(5) Approve kubelet's TLS certificate request

When kubelet starts for the first time it sends a certificate signing request to kube-apiserver; the Node only joins the cluster after the request is approved.

1) List the pending CSRs on the master

[root@k8s-master ~]# kubectl get csr
NAME                                                   AGE    REQUESTOR           CONDITION
node-csr-4799pnHJjREEcWDGgSFvNaoyfcn4HiOML9cpEI1IbMs   3h6m   kubelet-bootstrap   Pending
node-csr-e3mql7Dm878tLhPUxu2pzg8e8eM17Togc6lHQX-mXZs   3h     kubelet-bootstrap   Pending


2) Approve the CSRs

[root@k8s-master ~]# kubectl certificate approve node-csr-4799pnHJjREEcWDGgSFvNaoyfcn4HiOML9cpEI1IbMs
certificatesigningrequest.certificates.k8s.io/node-csr-4799pnHJjREEcWDGgSFvNaoyfcn4HiOML9cpEI1IbMs approved
[root@k8s-master ~]# kubectl certificate approve node-csr-e3mql7Dm878tLhPUxu2pzg8e8eM17Togc6lHQX-mXZs
certificatesigningrequest.certificates.k8s.io/node-csr-e3mql7Dm878tLhPUxu2pzg8e8eM17Togc6lHQX-mXZs approved
# 授權後發現兩個node節點的csr已經approved.

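如果待批准的節點較多,也可以用下面的寫法一次性批准所有處於 Pending 狀態的 CSR(效果與逐條 approve 相同):

# 批量批准所有 Pending 狀態的 CSR
kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve
# 確認 CONDITION 欄位已變為 Approved,Issued
kubectl get csr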

3)自動生成了 kubelet kubeconfig 檔案和公私鑰

[root@k8s-node1 ~]# ls -l /etc/kubernetes/kubelet.kubeconfig
-rw------- 1 root root 2294 Jun 14 15:19 /etc/kubernetes/kubelet.kubeconfig
[root@k8s-node1 ~]# ls -l /etc/kubernetes/ssl/kubelet*
-rw------- 1 root root 1273 Jun 14 15:19 /etc/kubernetes/ssl/kubelet-client-2019-06-14-15-19-10.pem
lrwxrwxrwx 1 root root   58 Jun 14 15:19 /etc/kubernetes/ssl/kubelet-client-current.pem -> /etc/kubernetes/ssl/kubelet-client-2019-06-14-15-19-10.pem
-rw-r--r-- 1 root root 2177 Jun 14 11:50 /etc/kubernetes/ssl/kubelet.crt
-rw------- 1 root root 1679 Jun 14 11:50 /etc/kubernetes/ssl/kubelet.key



假如你更新kubernetes的證書,只要沒有更新token.csv,當重啟kubelet後,該node就會自動加入到kubernetes叢集中,而不會重新發送certificate request,也不需要在master節點上執行kubectl certificate approve操作。前提是不要刪除node節點上的/etc/kubernetes/ssl/kubelet*和/etc/kubernetes/kubelet.kubeconfig檔案,否則kubelet啟動時會提示找不到證書而失敗。

[root@k8s-master ~]# scp /etc/kubernetes/token.csv root@172.16.4.13:/etc/kubernetes/
[root@k8s-master ~]# scp /etc/kubernetes/token.csv root@172.16.4.14:/etc/kubernetes/


注意:如果啟動kubelet的時候見到證書相關的報錯,有個trick可以解決這個問題:可以將master節點上的~/.kube/config檔案(該檔案在「安裝kubectl命令列工具」這一步中會自動生成)拷貝到node節點的/etc/kubernetes/kubelet.kubeconfig位置,這樣就不需要通過CSR,kubelet啟動後就會自動加入到叢集中。注意同時記得把.kube/config中的內容複製貼上到/etc/kubernetes/kubelet.kubeconfig中,替換原先內容。

[root@k8s-master ~]# cat .kube/config 
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUR2akNDQXFhZ0F3SUJBZ0lVZkJtL2lzNG1EcHdqa0M0aVFFTWF5SVJaVHVjd0RRWUpLb1pJaHZjTkFRRUwKQlFBd1pURUxNQWtHQTFVRUJoTUNRMDR4RURBT0JnTlZCQWdUQjBKbGFXcHBibWN4RURBT0JnTlZCQWNUQjBKbAphV3BwYm1jeEREQUtCZ05WQkFvVEEyczRjekVQTUEwR0ExVUVDeE1HVTNsemRHVnRNUk13RVFZRFZRUURFd3ByCmRXSmxjbTVsZEdWek1CNFhEVEU1TURZeE1qQXpNRFF3TUZvWERUSTVNRFl3T1RBek1EUXdNRm93WlRFTE1Ba0cKQTFVRUJoTUNRMDR4RURBT0JnTlZCQWdUQjBKbGFXcHBibWN4RURBT0JnTlZCQWNUQjBKbGFXcHBibWN4RERBSwpCZ05WQkFvVEEyczRjekVQTUEwR0ExVUVDeE1HVTNsemRHVnRNUk13RVFZRFZRUURFd3ByZFdKbGNtNWxkR1Z6Ck1JSUJJakFOQmdrcWhraUc5dzBCQVFFRkFBT0NBUThBTUlJQkNnS0NBUUVBOEZQK2p0ZUZseUNPVDc0ZzRmd1UKeDl0bDY3dGVabDVwTDg4ZStESzJMclBJZDRXMDRvVDdiWTdKQVlLT3dPTkM4RjA5MzNqSjVBdmxaZmppTkJCaQp2OTlhYU5tSkdxeWozMkZaaDdhTkYrb3Fab3BYdUdvdmNpcHhYTWlXbzNlVHpWVUh3d2FBeUdmTS9BQnE0WUY0ClprSVV5UkJaK29OVXduY0tNaStOR2p6WVJyc2owZEJRR0ROZUJ6OEgzbCtjd1U1WmpZdEdFUFArMmFhZ1k5bG0KbjhyOUFna2owcW9uOEdQTFlRb2RDYzliSWZqQmVNaGIzaHJGMjJqMDhzWTczNzh3MzN5VWRHdjg1YWpuUlp6UgpIYkN6UytYRGJMTTh2aGh6dVZoQmt5NXNrWXB6M0hCNGkrTnJPR1Fmdm4yWkY0ZFh4UVUyek1Dc2NMSVppdGg0Ckt3SURBUUFCbzJZd1pEQU9CZ05WSFE4QkFmOEVCQU1DQVFZd0VnWURWUjBUQVFIL0JBZ3dCZ0VCL3dJQkFqQWQKQmdOVkhRNEVGZ1FVeTVmVncxK0s2N1dvblRuZVgwL2dFSTVNM3FJd0h3WURWUjBqQkJnd0ZvQVV5NWZWdzErSwo2N1dvblRuZVgwL2dFSTVNM3FJd0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFOb3ZXa1ovd3pEWTZSNDlNNnpDCkhoZlZtVGk2dUZwS24wSmtvMVUzcHA5WTlTTDFMaXVvK3VwUjdJOCsvUXd2Wm95VkFWMTl4Y2hRQ25RSWhRMEgKVWtybXljS0crdWtsSUFUS3ZHenpzNW1aY0NQOGswNnBSSHdvWFhRd0ZhSFBpNnFZWDBtaW10YUc4REdzTk01RwpQeHdZZUZncXBLQU9Tb0psNmw5bXErQnhtWEoyZS8raXJMc3N1amlPKzJsdnpGOU5vU29Yd1RqUGZndXhRU3VFCnZlSS9pTXBGV1o0WnlCYWJKYkw5dXBldm53RTA2RXQrM2g2N3JKOU5mZ2N5MVhNSU0xeGo1QXpzRXgwVE5ETGkKWGlOQ0Zram9zWlA3U3dZdE5ncHNuZmhEandHRUJLbXV1S3BXR280ZWNac2lMQXgwOTNaeTdKM2dqVDF6dGlFUwpzQlE9Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
    server: https://172.16.4.12:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: admin
  name: kubernetes
current-context: kubernetes
kind: Config
preferences: {}
users:
- name: admin
  user:
    client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUQzVENDQXNXZ0F3SUJBZ0lVVmlPdjZ6aFlHMzIzdWRZS2RFWEcvRVJENW8wd0RRWUpLb1pJaHZjTkFRRUwKQlFBd1pURUxNQWtHQTFVRUJoTUNRMDR4RURBT0JnTlZCQWdUQjBKbGFXcHBibWN4RURBT0JnTlZCQWNUQjBKbAphV3BwYm1jeEREQUtCZ05WQkFvVEEyczRjekVQTUEwR0ExVUVDeE1HVTNsemRHVnRNUk13RVFZRFZRUURFd3ByCmRXSmxjbTVsZEdWek1CNFhEVEU1TURZeE1qQTJORGd3TUZvWERUSTVNRFl3T1RBMk5EZ3dNRm93YXpFTE1Ba0cKQTFVRUJoTUNRMDR4RURBT0JnTlZCQWdUQjBKbGFVcHBibWN4RURBT0JnTlZCQWNUQjBKbGFVcHBibWN4RnpBVgpCZ05WQkFvVERuTjVjM1JsYlRwdFlYTjBaWEp6TVE4d0RRWURWUVFMRXdaVGVYTjBaVzB4RGpBTUJnTlZCQU1UCkJXRmtiV2x1TUlJQklqQU5CZ2txaGtpRzl3MEJBUUVGQUFPQ0FROEFNSUlCQ2dLQ0FRRUFuL29MQVpCcENUdWUKci95eU15a1NYelBpWk9mVFdZQmEwNjR6c2Y1Y1Z0UEt2cnlCSjVHVlVSUlFUc2F3eWdFdnFBSXI3TUJrb21GOQpBeFVNaFNxdlFjNkFYemQzcjRMNW1CWGQxZ3FoWVNNR2lJL3hEMG5RaEF1azBFbVVONWY5ZENZRmNMMTVBVnZSCituN2wwaVcvVzlBRjRqbXRtYUtLVUdsUU9vNzQ3anNCYWRndU9SVHBMSkwxUGw3SlVLZnFBWktEbFVXZnpwZXcKOE1ETVMzN1FodmVQc24va2RwUVZ0bzlJZWcwSFhBcXlmZHNaZjZKeGdaS1FmUUNyYlJEMkd2L29OVVRlYnpWMwpWVm9ueEpUYmFrZFNuOHR0cCtLWFlzTUYvQy8wR29sL1JkS1Mrc0t4Z2hUUWdJMG5CZXJBM0x0dGp6WVpySWJBClo0RXBRNmc0ZFFJREFRQUJvMzh3ZlRBT0JnTlZIUThCQWY4RUJBTUNCYUF3SFFZRFZSMGxCQll3RkFZSUt3WUIKQlFVSEF3RUdDQ3NHQVFVRkJ3TUNNQXdHQTFVZEV3RUIvd1FDTUFBd0hRWURWUjBPQkJZRUZCQThrdnFaVDhRRApaSnIvTUk2L2ZWalpLdVFkTUI4R0ExVWRJd1FZTUJhQUZNdVgxY05maXV1MXFKMDUzbDlQNEJDT1RONmlNQTBHCkNTcUdTSWIzRFFFQkN3VUFBNElCQVFDMnZzVDUwZVFjRGo3RVUwMmZQZU9DYmJ6cFZWazEzM3NteGI1OW83YUgKRDhONFgvc3dHVlYzU0V1bVNMelJYWDJSYUsyUU04OUg5ZDlpRkV2ZzIvbjY3VThZeVlYczN0TG9Ua29NbzlUZgpaM0FNN0NyM0V5cWx6OGZsM3p4cmtINnd1UFp6VWNXV29vMUJvR1VCbEM1Mi9EbFpQMkZCbHRTcWtVL21EQ3IxCnJJWkFYYjZDbXNNZG1SQzMrYWwxamVUak9MZEcwMUd6dlBZdEdsQ0p2dHRJNzBuVkR3Nkh3QUpkRVN0UUh0cWsKakpCK3NZU2NSWDg1YTlsUXVIU21DY0kyQWxZQXFkK0t2NnNKNUVFZnpwWHNUVXdya0tKbjJ0UTN2UVNLaEgyawpabUx2N0MvcWV6YnJvc3pGeHNZWEtRelZiODVIVkxBbXo2UVhYV1I2Q0ZzMAotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
    client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFb2dJQkFBS0NBUUVBbi9vTEFaQnBDVHVlci95eU15a1NYelBpWk9mVFdZQmEwNjR6c2Y1Y1Z0UEt2cnlCCko1R1ZVUlJRVHNhd3lnRXZxQUlyN01Ca29tRjlBeFVNaFNxdlFjNkFYemQzcjRMNW1CWGQxZ3FoWVNNR2lJL3gKRDBuUWhBdWswRW1VTjVmOWRDWUZjTDE1QVZ2UituN2wwaVcvVzlBRjRqbXRtYUtLVUdsUU9vNzQ3anNCYWRndQpPUlRwTEpMMVBsN0pVS2ZxQVpLRGxVV2Z6cGV3OE1ETVMzN1FodmVQc24va2RwUVZ0bzlJZWcwSFhBcXlmZHNaCmY2SnhnWktRZlFDcmJSRDJHdi9vTlVUZWJ6VjNWVm9ueEpUYmFrZFNuOHR0cCtLWFlzTUYvQy8wR29sL1JkS1MKK3NLeGdoVFFnSTBuQmVyQTNMdHRqellackliQVo0RXBRNmc0ZFFJREFRQUJBb0lCQUE1cXFDZEI3bFZJckNwTAo2WHMyemxNS0IvTHorVlh0ZlVIcVJ2cFpZOVRuVFRRWEpNUitHQ2l3WGZSYmIzOGswRGloeVhlU2R2OHpMZUxqCk9MZWZleC9CRGt5R1lTRE4rdFE3MUR2L3hUOU51cjcveWNlSTdXT1k4UWRjT2lFd2IwVFNVRmN5bS84RldVenIKdHFaVGhJVXZuL2dkSG9uajNmY1ZKb2ZBYnFwNVBrLzVQd2hFSU5Pdm1FTFZFQWl6VnBWVmwxNzRCSGJBRHU1Sgp2Nm9xc0h3SUhwNC9ZbGo2NHhFVUZ1ZFA2Tkp0M1B5Uk14dW5RcWd3SWZ1bktuTklRQmZEVUswSklLK1luZmlJClgrM1lQam5sWFU3UnhYRHRFa3pVWTFSTTdVOHJndHhiNWRQWnhocGgyOFlFVnJBVW5RS2RSTWdCVVNad3hWRUYKeFZqWmVwa0NnWUVBeEtHdXExeElHNTZxL2RHeGxDODZTMlp3SkxGajdydTkrMkxEVlZsL2h1NzBIekJ6dFFyNwpMUGhUZnl2SkVqNTcwQTlDbk4ybndjVEQ2U1dqbkNDbW9ESk10Ti9iZlJaMThkZTU4b0JCRDZ5S0JGbmV1eWkwCk1oVWFmSzN5M091bGkxMjBKS3lQb2hvN1lyWUxNazc1UzVEeVRGMlEyV3JYY0VQaTlVRzNkNzhDZ1lFQTBFY3YKTUhDbE9XZ1hJUVNXNCtreFVEVXRiOFZPVnpwYjd3UWZCQ3RmSTlvTDBnVWdBd1M2U0lub2tET3ozdEl4aXdkQQpWZTVzMklHbVAzNS9qdm5FbThnaE1XbEZ3eHB5ZUxKK0hraTl1dFNPblJGWHYvMk9JdjBYbE01RlY5blBmZ01NCkMxQ09zZklKaVREaXJFOGQrR2cxV010dWxkVGo4Z0JKazRQRXZNc0NnWUJoNHA4aWZVa0VQdU9lZ1hJbWM3QlEKY3NsbTZzdjF2NDVmQTVaNytaYkxwRTd3Njl6ZUJuNXRyNTFaVklHL1RFMjBrTFEzaFB5TE1KbmFpYnM5OE44aQpKb2diRHNta0pyZEdVbjhsNG9VQStZS25rZG1ZVURZTUxJZElCQXcvd0N0a0NweXdHUnRUdGoxVDhZMzNXR3N3CkhCTVN3dzFsdnBOTE52Qlg2WVFjM3dLQmdHOHAvenJJZExjK0lsSWlJL01EREtuMXFBbW04cGhGOHJtUXBvbFEKS05oMjBhWkh5LzB3Y2NpenFxZ0VvSFZHRk9GU2Zua2U1NE5yTjNOZUxmRCt5SHdwQmVaY2ZMcVVqQkoxbWpESgp2RkpTanNld2NQaHMrWWNkTkkvY3hGQU9WZHU0L3Aydlltb0JlQ3Q4SncrMnJwVmQ4Vk15U1JTNWF1eElVUHpsCjhJU2ZBb0dBVituYjJ3UGtwOVJ0NFVpdmR0MEdtRjErQ052YzNzY3JYb3RaZkt0TkhoT0o2UTZtUkluc2tpRWgKVnFQRjZ6U1BnVmdrT1hmU0xVQ3Y2cGdWR2J5d0plRWo1SElQRHFuU25vNFErZFl2TXozcWN5d1hLbFEyUjZpcAo3VE0wWHNJaGFMRDFmWUNjaDhGVHNiZHNrQUNZUHpzeEdBa1l2TnRDcDI5WExCRmZWbkE9Ci0tLS0tRU5EIFJTQSBQUklWQVRFIEtFWS0tLS0tCg==
# 分發.kube/config到各節點。
[root@k8s-master ~]# scp .kube/config root@172.16.4.13:/etc/kubernetes/
[root@k8s-master ~]# scp .kube/config root@172.16.4.14:/etc/kubernetes/
# 比如在node2的/etc/kubernetes/目錄下則出現了config檔案。
[root@k8s-node2 ~]# ls /etc/kubernetes/
bin  bootstrap.kubeconfig  config  kubelet  kubelet.kubeconfig  kube-proxy.kubeconfig  ssl  token.csv

配置kube-proxy

指令碼方式配置

(1)編寫kube-proxy.sh指令碼內容如下(在各node上編寫該指令碼):

#!/bin/bash

NODE_ADDRESS=${1:-"172.16.4.13"}

cat <<EOF >/etc/kubernetes/kube-proxy

KUBE_PROXY_ARGS="--logtostderr=true \
--v=4 \
--hostname-override=${NODE_ADDRESS} \
--kubeconfig=/etc/kubernetes/kube-proxy.kubeconfig"

EOF

cat <<EOF >/usr/lib/systemd/system/kube-proxy.service
[Unit]
Description=Kubernetes Proxy
After=network.target

[Service]
EnvironmentFile=-/etc/kubernetes/kube-proxy
ExecStart=/usr/local/bin/kube-proxy \$KUBE_PROXY_ARGS
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload && systemctl enable kube-proxy
systemctl restart kube-proxy && systemctl status kube-proxy
  • --hostname-override 引數值必須與 kubelet 的值一致,否則 kube-proxy 啟動後會找不到該 Node,從而不會建立任何 iptables 規則;
  • kube-proxy 根據 --cluster-cidr 判斷叢集內部和外部流量,指定 --cluster-cidr--masquerade-all 選項後 kube-proxy 才會對訪問 Service IP 的請求做 SNAT;
  • --kubeconfig 指定的配置檔案嵌入了 kube-apiserver 的地址、使用者名稱、證書、祕鑰等請求和認證資訊;
  • 預定義的 RoleBinding cluster-admin 將User system:kube-proxy 與 Role system:node-proxier 繫結,該 Role 授予了呼叫 kube-apiserver Proxy 相關 API 的許可權;

完整 unit 見 kube-proxy.service

(2)執行指令碼

# 首先將前端master的kube-proxy命令拷貝至各個節點。
[root@k8s-master ~]# scp /usr/local/bin/kube-proxy root@172.16.4.13:/usr/local/bin/
[root@k8s-master ~]# scp /usr/local/bin/kube-proxy root@172.16.4.14:/usr/local/bin/
# 並在各個節點上更改執行許可權。
chmod +x kube-proxy.sh
[root@k8s-node2 ~]# ./kube-proxy.sh 172.16.4.14
Created symlink from /etc/systemd/system/multi-user.target.wants/kube-proxy.service to /usr/lib/systemd/system/kube-proxy.service.
● kube-proxy.service - Kubernetes Proxy
   Loaded: loaded (/usr/lib/systemd/system/kube-proxy.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2019-06-14 16:01:47 CST; 39ms ago
 Main PID: 117068 (kube-proxy)
    Tasks: 10
   Memory: 8.8M
   CGroup: /system.slice/kube-proxy.service
           └─117068 /usr/local/bin/kube-proxy --logtostderr=true --v=4 --hostname-override=172.16.4.14 --kubeconfig=/etc/kubernetes/kube-proxy.kubeconfig

Jun 14 16:01:47 k8s-node2 systemd[1]: Started Kubernetes Proxy.
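kube-proxy 啟動後,可以在 node 節點上做一個簡單的自檢(假設使用預設的 healthz 埠 10256,並工作在 iptables 模式):

# 檢查 kube-proxy 的健康檢查介面,返回 200 說明程序正常
curl -s -o /dev/null -w "%{http_code}\n" http://127.0.0.1:10256/healthz
# 檢視 kube-proxy 為 Service 生成的 iptables 規則(建立 Service 後才會出現對應規則)
iptables-save | grep KUBE-SERVICES | head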

(3) --kubeconfig=/etc/kubernetes/kubelet.kubeconfig中指定的kubelet.kubeconfig檔案在第一次啟動kubelet之前並不存在,如前文所述,通過CSR請求後會自動生成kubelet.kubeconfig檔案。如果你的節點上已經生成了~/.kube/config檔案,你可以將該檔案拷貝到該路徑下,並重命名為kubelet.kubeconfig,所有node節點可以共用同一個kubelet.kubeconfig檔案,這樣新新增的節點就不需要再建立CSR請求就能自動新增到kubernetes叢集中。同樣,在任意能夠訪問到kubernetes叢集的主機上使用kubectl --kubeconfig命令操作叢集時,只要使用~/.kube/config檔案就可以通過許可權認證,因為這裡面已經有認證資訊並認為你是admin使用者,對叢集擁有所有許可權。

[root@k8s-master ~]# scp .kube/config root@172.16.4.13:/etc/kubernetes/
[root@k8s-node1 ~]# mv config kubelet.kubeconfig
[root@k8s-master ~]# scp .kube/config root@172.16.4.14:/etc/kubernetes/
[root@k8s-node2 ~]# mv config kubelet.kubeconfig


驗證測試

# 以下操作在master節點上執行。
[root@k8s-master ~]# kubectl get nodes
NAME          STATUS   ROLES    AGE     VERSION
172.16.4.13   Ready    <none>   66s     v1.14.3
172.16.4.14   Ready    <none>   7m14s   v1.14.3

[root@k8s-master ~]# kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok                  
controller-manager   Healthy   ok                  
etcd-0               Healthy   {"health":"true"}   
etcd-1               Healthy   {"health":"true"}   
etcd-2               Healthy   {"health":"true"} 

# 以nginx服務測試叢集可用性
[root@k8s-master ~]# kubectl run nginx --replicas=3 --labels="run=load-balancer-example" --image=nginx  --port=80
kubectl run --generator=deployment/apps.v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create instead.
deployment.apps/nginx created
[root@k8s-master ~]# kubectl expose deployment nginx --type=NodePort --name=example-service
service/example-service exposed

[root@k8s-master ~]# kubectl describe svc example-service
Name:                     example-service
Namespace:                default
Labels:                   run=load-balancer-example
Annotations:              <none>
Selector:                 run=load-balancer-example
Type:                     NodePort
IP:                       10.10.10.222
Port:                     <unset>  80/TCP
TargetPort:               80/TCP
NodePort:                 <unset>  40905/TCP
Endpoints:                172.17.0.2:80,172.17.0.2:80,172.17.0.3:80
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

# 在node節點上訪問
[root@k8s-node1 ~]# curl "10.10.10.222:80"
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

# 外網測試訪問
[root@k8s-master ~]# kubectl get svc
NAME              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
example-service   NodePort    10.10.10.222   <none>        80:40905/TCP   6m26s
kubernetes        ClusterIP   10.10.10.1     <none>        443/TCP        21h
# 由上可知,服務暴露外網埠為40905.輸入172.16.4.12:40905即可訪問。

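驗證完成後,可以按如下方式清理本次測試建立的 nginx 資源,避免影響後續實驗:

# 刪除測試用的 Service 與 Deployment
kubectl delete service example-service
kubectl delete deployment nginx
# 確認資源已被刪除
kubectl get svc,deployment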

DNS服務搭建與配置

從k8s v1.11版本開始,Kubernetes叢集的DNS服務由CoreDNS提供。它是CNCF基金會的一個專案,使用Go語言實現的高效能、外掛式、易擴充套件的DNS服務端。它解決了KubeDNS的一些問題,如dnsmasq的安全漏洞,externalName不能使用stubDomains設定等。

安裝CoreDNS外掛

官方的yaml檔案目錄:https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/dns/coredns

在部署CoreDNS應用前,至少需要建立一個ConfigMap,一個Deployment和一個Service共3個資源物件。在啟用了RBAC的叢集中,還可以設定ServiceAccount、ClusterRole、ClusterRoleBinding對CoreDNS容器進行許可權限制。

(1)為了起到映象加速的作用,首先將docker的配置源更改為國內阿里雲

cat << EOF > /etc/docker/daemon.json
{
      "registry-mirrors":["https://registry.docker-cn.com","https://h23rao59.mirror.aliyuncs.com"]
}
EOF

# 重新載入配置並重啟docker
[root@k8s-master ~]# systemctl daemon-reload && systemctl restart docker

(2)此處將svc,configmap,ServiceAccount等寫在一個yaml檔案裡,coredns.yaml內容見下。

[root@k8s-master ~]# cat coredns.yaml 
apiVersion: v1
kind: ServiceAccount
metadata:
  name: coredns
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:coredns
rules:
- apiGroups:
  - ""
  resources:
  - endpoints
  - services
  - pods
  - namespaces
  verbs:
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:coredns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:coredns
subjects:
- kind: ServiceAccount
  name: coredns
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           upstream
           fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        proxy . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    k8s-app: kube-dns
  name: coredns
  namespace: kube-system
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kube-dns
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        k8s-app: kube-dns
    spec:
      containers:
      - args:
        - -conf
        - /etc/coredns/Corefile
        image:  docker.io/fengyunpan/coredns:1.2.6
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 5
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        name: coredns
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9153
          name: metrics
          protocol: TCP
        resources:
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - all
          procMount: Default
          readOnlyRootFilesystem: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/coredns
          name: config-volume
          readOnly: true
      dnsPolicy: Default
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: coredns
      serviceAccountName: coredns
      terminationGracePeriodSeconds: 30
      tolerations:
      - key: CriticalAddonsOnly
        operator: Exists
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
      volumes:
      - configMap:
          defaultMode: 420
          items:
          - key: Corefile
            path: Corefile
          name: coredns
        name: config-volume

---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: KubeDNS
  name: kube-dns
  namespace: kube-system
spec:
  clusterIP: 10.10.10.2
  ports:
  - name: dns
    port: 53
    protocol: UDP
    targetPort: 53
  - name: dns-tcp
    port: 53
    protocol: TCP
    targetPort: 53
  selector:
    k8s-app: kube-dns

  • clusterIP: 10.10.10.2是我叢集各節點的DNS伺服器IP,注意修改。並且在各node節點的kubelet的啟動引數中加入以下兩個引數:
    • --cluster-dns=10.10.10.2:為DNS服務的ClusterIP地址。
    • --cluster-domain=cluster.local:為在DNS服務中設定的域名。

然後重啟kubelet服務。
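以 node 節點為例,一個參考的確認與重啟方式如下(假設 kubelet 的引數寫在前文的 /etc/kubernetes/kubelet 檔案中):

# 確認 KUBELET_ARGS 中已包含 DNS 相關的兩個引數
grep -E 'cluster-dns|cluster-domain' /etc/kubernetes/kubelet
# 確認或修改後重啟 kubelet 使其生效
systemctl daemon-reload && systemctl restart kubelet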

(3)通過kubectl create建立CoreDNS服務。

[root@k8s-master ~]# kubectl create -f coredns.yaml 
serviceaccount/coredns created
clusterrole.rbac.authorization.k8s.io/system:coredns created
clusterrolebinding.rbac.authorization.k8s.io/system:coredns created
configmap/coredns created
deployment.extensions/coredns created
service/kube-dns created
[root@k8s-master ~]# kubectl get all -n kube-system
NAME                           READY   STATUS    RESTARTS   AGE
pod/coredns-5fc7b65789-rqk6f   1/1     Running   0          20s

NAME               TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
service/kube-dns   ClusterIP   10.10.10.2   <none>        53/UDP,53/TCP   20s

NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/coredns   1/1     1            1           20s

NAME                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/coredns-5fc7b65789   1         1         1       20s


(4)驗證DNS服務

接下來使用一個帶有nslookup工具的Pod來驗證DNS服務是否能正常工作:

  • 建立busybox.yaml內容如下:

    [root@k8s-master ~]# cat busybox.yaml 
    apiVersion: v1
    kind: Pod
    metadata:
      name: busybox
      namespace: default
    spec:
      containers:
      - name: busybox
        image: registry.cn-hangzhou.aliyuncs.com/google_containers/busybox
        command:
          - sleep
          - "3600"
        imagePullPolicy: IfNotPresent
      restartPolicy: Always
    
    
  • 採用kubectl apply命令建立pod

    [root@k8s-master ~]# kubectl apply -f busybox.yaml
    pod/busybox created
    # 採用kubectl describe命令發現busybox建立成功
    [root@k8s-master ~]# kubectl describe po/busybox
    .......
    Events:
      Type    Reason     Age   From                  Message
      ----    ------     ----  ----                  -------
      Normal  Scheduled  4s    default-scheduler     Successfully assigned default/busybox to 172.16.4.13
      Normal  Pulling    4s    kubelet, 172.16.4.13  Pulling image "registry.cn-hangzhou.aliyuncs.com/google_containers/busybox"
      Normal  Pulled     1s    kubelet, 172.16.4.13  Successfully pulled image "registry.cn-hangzhou.aliyuncs.com/google_containers/busybox"
      Normal  Created    1s    kubelet, 172.16.4.13  Created container busybox
      Normal  Started    1s    kubelet, 172.16.4.13  Started container busybox
    
    
  • 在容器成功啟動後,通過kubectl exec <contaier_name> nslookup進行測試。

[root@k8s-master ~]# kubectl exec busybox -- nslookup kubernetes
Server:    10.10.10.2
Address 1: 10.10.10.2 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes
Address 1: 10.10.10.1 kubernetes.default.svc.cluster.local


注意:如果某個Service屬於不同的名稱空間,那麼在進行Service查詢時,需要補充Namespace的名稱,組合完整的域名,下面以查詢kube-dns服務為例,將其所在的Namespace“kube-system”補充在服務名之後,用“.”連線為”kube-dns.kube-system“,即可查詢成功:

# 錯誤案例,沒有指定namespace
[root@k8s-master ~]# kubectl exec busybox -- nslookup kube-dns
nslookup: can't resolve 'kube-dns'
Server:    10.10.10.2
Address 1: 10.10.10.2 kube-dns.kube-system.svc.cluster.local

command terminated with exit code 1

# 成功案例。
[root@k8s-master ~]# kubectl exec busybox -- nslookup kube-dns.kube-system
Server:    10.10.10.2
Address 1: 10.10.10.2 kube-dns.kube-system.svc.cluster.local

Name:      kube-dns.kube-system
Address 1: 10.10.10.2 kube-dns.kube-system.svc.cluster.local

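也可以直接使用完整的 FQDN(格式為 <service>.<namespace>.svc.cluster.local)進行驗證,結果相同:

# 使用完整域名查詢 kube-dns 服務
kubectl exec busybox -- nslookup kube-dns.kube-system.svc.cluster.local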

安裝dashboard外掛

Kubernetes的Web UI網頁管理工具kubernetes-dashboard可提供部署應用、資源物件管理、容器日誌查詢、系統監控等常用的叢集管理功能。為了在頁面上顯示系統資源的使用情況,要求部署Metrics Server。參考:

dashboard官方檔案目錄:

https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/dashboard

由於 kube-apiserver 啟用了 RBAC 授權,而官方原始碼目錄的 dashboard-controller.yaml 沒有定義授權的 ServiceAccount,所以後續訪問 API server 的 API 時會被拒絕,不過從 k8s v1.8.3 起官方文件提供了 dashboard.rbac.yaml 檔案。

(1)建立部署檔案kubernetes-dashboard.yaml,其內容如下:

# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# ------------------- Dashboard Secret ------------------- #

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-certs
  namespace: kube-system
type: Opaque

---
# ------------------- Dashboard Service Account ------------------- #

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system

---
# ------------------- Dashboard Role & Role Binding ------------------- #

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: kubernetes-dashboard-minimal
  namespace: kube-system
rules:
  # Allow Dashboard to create 'kubernetes-dashboard-key-holder' secret.
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["create"]
  # Allow Dashboard to create 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["create"]
  # Allow Dashboard to get, update and delete Dashboard exclusive secrets.
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs"]
  verbs: ["get", "update", "delete"]
  # Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["kubernetes-dashboard-settings"]
  verbs: ["get", "update"]
  # Allow Dashboard to get metrics from heapster.
- apiGroups: [""]
  resources: ["services"]
  resourceNames: ["heapster"]
  verbs: ["proxy"]
- apiGroups: [""]
  resources: ["services/proxy"]
  resourceNames: ["heapster", "http:heapster:", "https:heapster:"]
  verbs: ["get"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kubernetes-dashboard-minimal
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubernetes-dashboard-minimal
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kube-system

---
# ------------------- Dashboard Deployment ------------------- #

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
    spec:
      containers:
      - name: kubernetes-dashboard
        image: lizhenliang/kubernetes-dashboard-amd64:v1.10.1
        ports:
        - containerPort: 8443
          protocol: TCP
        args:
          - --auto-generate-certificates
          # Uncomment the following line to manually specify Kubernetes API server Host
          # If not specified, Dashboard will attempt to auto discover the API server and connect
          # to it. Uncomment only if the default does not work.
          # - --apiserver-host=http://my-address:port
        volumeMounts:
        - name: kubernetes-dashboard-certs
          mountPath: /certs
          # Create on-disk volume to store exec logs
        - mountPath: /tmp
          name: tmp-volume
        livenessProbe:
          httpGet:
            scheme: HTTPS
            path: /
            port: 8443
          initialDelaySeconds: 30
          timeoutSeconds: 30
      volumes:
      - name: kubernetes-dashboard-certs
        secret:
          secretName: kubernetes-dashboard-certs
      - name: tmp-volume
        emptyDir: {}
      serviceAccountName: kubernetes-dashboard
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule

---
# ------------------- Dashboard Service ------------------- #

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
spec:
  type: NodePort
  ports:
    - port: 443
      targetPort: 8443
  selector:
    k8s-app: kubernetes-dashboard

(2)檢視建立狀態

[root@k8s-master ~]# kubectl get all -n kube-system | grep dashboard

pod/kubernetes-dashboard-7df98d85bd-jbwh2   1/1     Running   0          18m

service/kubernetes-dashboard   NodePort    10.10.10.91   <none>        443:41498/TCP   18m

deployment.apps/kubernetes-dashboard   1/1     1            1           18m
replicaset.apps/kubernetes-dashboard-7df98d85bd   1         1         1       18m


(3)此時可以通過node節點的41498埠(即Service暴露的NodePort)進行訪問。輸入:https://172.16.4.13:41498 或 https://172.16.4.14:41498 即可。

並且通過之前的CoreDNS能夠解析到其服務的IP地址:

[root@k8s-master ~]# kubectl get svc -n kube-system
NAME                   TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)         AGE
kube-dns               ClusterIP   10.10.10.2    <none>        53/UDP,53/TCP   5h38m
kubernetes-dashboard   NodePort    10.10.10.91   <none>        443:41498/TCP   26m
[root@k8s-master ~]# kubectl exec busybox -- nslookup kubernetes-dashboard.kube-system
Server:    10.10.10.2
Address 1: 10.10.10.2 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes-dashboard.kube-system
Address 1: 10.10.10.91 kubernetes-dashboard.kube-system.svc.cluster.local


(4)建立SA並繫結cluster-admin管理員叢集角色

[root@k8s-master ~]# kubectl create serviceaccount dashboard-admin -n kube-system
serviceaccount/dashboard-admin created
[root@k8s-master ~]# kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
clusterrolebinding.rbac.authorization.k8s.io/dashboard-admin created
# 檢視已建立的serviceaccount
[root@k8s-master ~]# kubectl get secret -n kube-system | grep admin
dashboard-admin-token-69zsx        kubernetes.io/service-account-token   3      65s
# 檢視生成的token的具體資訊並將token值複製到瀏覽器中,採用令牌登入。
[root@k8s-master ~]# kubectl describe secret dashboard-admin-token-69zsx -n kube-system
Name:         dashboard-admin-token-69zsx
Namespace:    kube-system
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: dashboard-admin
              kubernetes.io/service-account.uid: dfe59297-8f46-11e9-b92b-e67418705759

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1359 bytes
namespace:  11 bytes
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkYXNoYm9hcmQtYWRtaW4tdG9rZW4tNjl6c3giLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGFzaGJvYXJkLWFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiZGZlNTkyOTctOGY0Ni0xMWU5LWI5MmItZTY3NDE4NzA1NzU5Iiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmUtc3lzdGVtOmRhc2hib2FyZC1hZG1pbiJ9.Wl6WiT6MZ-37ArWhPuhudac5S1Y8v2GxiUdNcy4hIwHQ1EdtzaAlvpx1mLZsQoDYJCeM6swVtNgJwhO5ESZAYQVi9xCrXsQcEDIeBkjyzpu6U4XHmab7SuS0_KEsGXhe57XKq86ogK9bAyNvNWE497V2giJJy5eR6CHKH3GR6mIwTQDSKEf-GfDfs9SHvQxRjchsrYLJLS3B_XfZyNHFXcieMZHy7V7Ehx2jMzwh6WNk6Mqk5N-IlZQRxmTBHTe3i9efN8r7CjvRhZdKc5iF6V4eG0QWkxR95WOzgV2QCCyLh4xEJw895FlHFJ1oTR2sUIRugnzyfqZaPQxdXcrc7Q
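上面查詢 secret 名稱和檢視 token 的兩步,也可以合併成一條命令直接取出 token(寫法僅供參考,依賴 secret 名稱以 dashboard-admin-token- 開頭這一約定):

# 一條命令取出 dashboard-admin 的登入 token
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | awk '/dashboard-admin-token/{print $1}') | awk '/^token:/{print $2}'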

(5)在瀏覽器中選擇token方式登入,即可檢視到叢集的狀態:

  • 注意:訪問dashboard實際上有三種方式,上述過程只演示了第一種方式:
    • kubernetes-dashboard 服務暴露了 NodePort,可以使用 https://NodeIP:NodePort 地址訪問 dashboard。
    • 通過 API server 訪問 dashboard(https 6443埠和http 8080埠方式)。
    • 通過 kubectl proxy 訪問 dashboard。

採用kubectl proxy訪問dashboard

(1)啟動代理

[root@k8s-master ~]# kubectl proxy --address='172.16.4.12' --port=8086 --accept-hosts='^*$'
Starting to serve on 172.16.4.12:8086


(2)訪問dashboard

訪問URL:http://172.16.4.12:8086/ui 自動跳轉到:http://172.16.4.12:8086/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard/#/workload?namespace=default

安裝heapster外掛

準備映象

heapster release 頁面 下載最新版本的 heapster。

wget https://github.com/kubernetes-retired/heapster/archive/v1.5.4.tar.gz -O heapster-1.5.4.tar.gz
tar zxvf heapster-1.5.4.tar.gz
[root@k8s-master ~]# cd heapster-1.5.4/deploy/kube-config/influxdb/ && ls
grafana.yaml  heapster.yaml  influxdb.yaml


(1)我們修改heapster.yaml後內容如下:

# ------------------- Heapster Service Account ------------------- #

apiVersion: v1
kind: ServiceAccount
metadata:
  name: heapster
  namespace: kube-system

---
# ------------------- Heapster Role & Role Binding ------------------- #

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: heapster
subjects:
  - kind: ServiceAccount
    name: heapster
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
---
# ------------------- Heapster Deployment ------------------- #
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: heapster
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: heapster
    spec:
      serviceAccountName: heapster
      containers:
      - name: heapster
        image: registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-amd64:v1.5.3
        imagePullPolicy: IfNotPresent
        command:
        - /heapster
        - --source=kubernetes:https://kubernetes.default
        - --sink=influxdb:http://monitoring-influxdb.kube-system.svc:8086
---
# ------------------- Heapster Service ------------------- #

apiVersion: v1
kind: Service
metadata:
  labels:
    task: monitoring
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: Heapster
  name: heapster
  namespace: kube-system
spec:
  ports:
  - port: 80
    targetPort: 8082
  selector:
    k8s-app: heapster


(2)我們修改influxdb.yaml後內容如下:

[root@k8s-master influxdb]# cat influxdb.yaml 
# ------------------- Influxdb Deployment ------------------- #
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: monitoring-influxdb
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: influxdb
    spec:
      containers:
      - name: influxdb
        image: registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-influxdb-amd64:v1.3.3
        volumeMounts:
        - mountPath: /data
          name: influxdb-storage
      volumes:
      - name: influxdb-storage
        emptyDir: {}
---
# ------------------- Influxdb Service ------------------- #

apiVersion: v1
kind: Service
metadata:
  labels:
    task: monitoring
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-influxdb
  name: monitoring-influxdb
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - port: 8086
    targetPort: 8086
    name: http
  - port: 8083
    targetPort: 8083
    name: admin
  selector:
    k8s-app: influxdb
---
#-------------------Influxdb Cm-----------------#
apiVersion: v1
kind: ConfigMap
metadata:
  name: influxdb-config
  namespace: kube-system
data:
  config.toml: |
    reporting-disabled = true
    bind-address = ":8088"
    [meta]
      dir = "/data/meta"
      retention-autocreate = true
      logging-enabled = true
    [data]
      dir = "/data/data"
      wal-dir = "/data/wal"
      query-log-enabled = true
      cache-max-memory-size = 1073741824
      cache-snapshot-memory-size = 26214400
      cache-snapshot-write-cold-duration = "10m0s"
      compact-full-write-cold-duration = "4h0m0s"
      max-series-per-database = 1000000
      max-values-per-tag = 100000
      trace-logging-enabled = false
    [coordinator]
      write-timeout = "10s"
      max-concurrent-queries = 0
      query-timeout = "0s"
      log-queries-after = "0s"
      max-select-point = 0
      max-select-series = 0
      max-select-buckets = 0
    [retention]
      enabled = true
      check-interval = "30m0s"
    [admin]
      enabled = true
      bind-address = ":8083"
      https-enabled = false
      https-certificate = "/etc/ssl/influxdb.pem"
    [shard-precreation]
      enabled = true
      check-interval = "10m0s"
      advance-period = "30m0s"
    [monitor]
      store-enabled = true
      store-database = "_internal"
      store-interval = "10s"
    [subscriber]
      enabled = true
      http-timeout = "30s"
      insecure-skip-verify = false
      ca-certs = ""
      write-concurrency = 40
      write-buffer-size = 1000
    [http]
      enabled = true
      bind-address = ":8086"
      auth-enabled = false
      log-enabled = true
      write-tracing = false
      pprof-enabled = false
      https-enabled = false
      https-certificate = "/etc/ssl/influxdb.pem"
      https-private-key = ""
      max-row-limit = 10000
      max-connection-limit = 0
      shared-secret = ""
      realm = "InfluxDB"
      unix-socket-enabled = false
      bind-socket = "/var/run/influxdb.sock"
    [[graphite]]
      enabled = false
      bind-address = ":2003"
      database = "graphite"
      retention-policy = ""
      protocol = "tcp"
      batch-size = 5000
      batch-pending = 10
      batch-timeout = "1s"
      consistency-level = "one"
      separator = "."
      udp-read-buffer = 0
    [[collectd]]
      enabled = false
      bind-address = ":25826"
      database = "collectd"
      retention-policy = ""
      batch-size = 5000
      batch-pending = 10
      batch-timeout = "10s"
      read-buffer = 0
      typesdb = "/usr/share/collectd/types.db"
    [[opentsdb]]
      enabled = false
      bind-address = ":4242"
      database = "opentsdb"
      retention-policy = ""
      consistency-level = "one"
      tls-enabled = false
      certificate = "/etc/ssl/influxdb.pem"
      batch-size = 1000
      batch-pending = 5
      batch-timeout = "1s"
      log-point-errors = true
    [[udp]]
      enabled = false
      bind-address = ":8089"
      database = "udp"
      retention-policy = ""
      batch-size = 5000
      batch-pending = 10
      read-buffer = 0
      batch-timeout = "1s"
      precision = ""
    [continuous_queries]
      log-enabled = true
      enabled = true
      run-interval = "1s"


(3)我們修改grafana.yaml後文件內容如下:

[root@k8s-master influxdb]# cat grafana.yaml 
#------------Grafana Deployment----------------#

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: monitoring-grafana
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: grafana
    spec:
      containers:
      - name: grafana
        image: registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-grafana-amd64:v4.4.3
        ports:
        - containerPort: 3000
          protocol: TCP
        volumeMounts:
        #- mountPath: /etc/ssl/certs
        #  name: ca-certificates
        #  readOnly: true
        - mountPath: /var
          name: grafana-storage
        env:
        - name: INFLUXDB_HOST
          value: monitoring-influxdb
        #- name: GF_SERVER_HTTP_PORT
        - name: GRAFANA_PORT
          value: "3000"
          # The following env variables are required to make Grafana accessible via
          # the kubernetes api-server proxy. On production clusters, we recommend
          # removing these env variables, setup auth for grafana, and expose the grafana
          # service using a LoadBalancer or a public IP.
        - name: GF_AUTH_BASIC_ENABLED
          value: "false"
        - name: GF_AUTH_ANONYMOUS_ENABLED
          value: "true"
        - name: GF_AUTH_ANONYMOUS_ORG_ROLE
          value: Admin
        - name: GF_SERVER_ROOT_URL
          # If you're only using the API Server proxy, set this value instead:
          value: /api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
          # value: /
      volumes:
      # - name: ca-certificates
      #  hostPath:
      #    path: /etc/ssl/certs
      - name: grafana-storage
        emptyDir: {}
---
#------------Grafana Service----------------#

apiVersion: v1
kind: Service
metadata:
  labels:
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-grafana
  name: monitoring-grafana
  namespace: kube-system
spec:
  # In a production setup, we recommend accessing Grafana through an external Loadbalancer
  # or through a public IP.
  # type: LoadBalancer
  # You could also use NodePort to expose the service at a randomly-generated port
  # type: NodePort
  ports:
  - port: 80
    targetPort: 3000
  selector:
    k8s-app: grafana


執行所有定義檔案

[root@k8s-master influxdb]# pwd
/root/heapster-1.5.4/deploy/kube-config/influxdb

[root@k8s-master influxdb]# ls
grafana.yaml  heapster.yaml  influxdb.yaml

[root@k8s-master influxdb]# kubectl create -f .
deployment.extensions/monitoring-grafana created
service/monitoring-grafana created
serviceaccount/heapster created
clusterrolebinding.rbac.authorization.k8s.io/heapster created
service/heapster created
deployment.extensions/heapster created
deployment.extensions/monitoring-influxdb created
service/monitoring-influxdb created
configmap/influxdb-config created
Error from server (AlreadyExists): error when creating "heapster.yaml": serviceaccounts "heapster" already exists
Error from server (AlreadyExists): error when creating "heapster.yaml": clusterrolebindings.rbac.authorization.k8s.io "heapster" already exists
Error from server (AlreadyExists): error when creating "heapster.yaml": services "heapster" already exists


檢查執行結果

# 檢查Deployment
[root@k8s-master influxdb]# kubectl get deployments -n kube-system | grep -E 'heapster|monitoring'
heapster               1/1     1            1           10m
monitoring-grafana     1/1     1            1           10m
monitoring-influxdb    1/1     1            1           10m

# 檢查Pods
[root@k8s-master influxdb]# kubectl get pods -n kube-system | grep -E 'heapster|monitoring'
heapster-75d646bf58-9x9tz               1/1     Running   0          10m
monitoring-grafana-77997bd67d-5khvp     1/1     Running   0          10m
monitoring-influxdb-7d6c5fb944-jmrv6    1/1     Running   0          10m

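heapster 採集到資料後(通常需要等待 1~2 分鐘),還可以用 kubectl top 做一個快速驗證;如果你的 kubectl 版本已不再支援從 heapster 讀取指標,則以後面的 dashboard/Grafana 頁面顯示為準:

# 檢視各節點的 CPU/記憶體使用情況
kubectl top node
# 檢視 kube-system 名稱空間下各 Pod 的資源使用情況
kubectl top pod -n kube-system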

訪問各dashboard介面

錯誤一:system:anonymous問題

訪問dashboard網頁時,可能出現以下問題:

{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {
    
  },
  "status": "Failure",
  "message": "services \"heapster\" is forbidden: User \"system:anonymous\" cannot get resource \"services/proxy\" in API group \"\" in the namespace \"kube-system\"",
  "reason": "Forbidden",
  "details": {
    "name": "heapster",
    "kind": "services"
  },
  "code": 403
}


分析問題:Kubernetes API Server新增了--anonymous-auth選項,允許匿名請求訪問secure port。沒有被其他authentication方法拒絕的請求即Anonymous requests,這樣的匿名請求的username為system:anonymous,歸屬的組為system:unauthenticated,並且該選項預設開啟。這樣一來,當採用chrome瀏覽器訪問dashboard UI時很可能無法彈出使用者名稱、密碼輸入對話方塊,導致後續authorization失敗。為了保證使用者名稱、密碼輸入對話方塊的彈出,需要將--anonymous-auth設定為false。
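一個參考的修改方式如下(此處假設 kube-apiserver 的啟動引數寫在 /etc/kubernetes/apiserver 這類環境變數檔案中,實際路徑請以前文 apiserver 的部署方式為準):

# 在 kube-apiserver 的啟動引數中追加 --anonymous-auth=false(檔案路徑為示例)
vim /etc/kubernetes/apiserver
# 重啟 kube-apiserver 使配置生效
systemctl daemon-reload && systemctl restart kube-apiserver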

  • 再次訪問dashboard,發現多了CPU使用率和記憶體使用率的表格。

(2)訪問grafana頁面

  • 通過kube-apiserver訪問:

獲取 monitoring-grafana 服務 URL

[root@k8s-master ~]# kubectl cluster-info
Kubernetes master is running at https://172.16.4.12:6443
Heapster is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/heapster/proxy
KubeDNS is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
monitoring-grafana is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
monitoring-influxdb is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/monitoring-influxdb:http/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.


訪問瀏覽器URL:https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy

  • 通過kubectl proxy訪問:

建立代理

[root@k8s-master ~]# kubectl proxy --address='172.16.4.12' --port=8084 --accept-hosts='^*$'
Starting to serve on 172.16.4.12:8084


訪問influxdb admin UI

獲取 influxdb http 8086 對映的 NodePort

[root@k8s-master influxdb]# kubectl get svc -n kube-system|grep influxdb
monitoring-influxdb    NodePort    10.10.10.154   <none>        8086:43444/TCP,8083:49123/TCP   53m


通過 kube-apiserver 的非安全埠訪問 influxdb 的 admin UI 介面: http://172.16.4.12:8080/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb:8083/

在頁面的 “Connection Settings” 的 Host 中輸入 node IP, Port 中輸入 8086 對映的 nodePort 如上面的 43444,點選 “Save” 即可(我的叢集中的地址是172.16.4.12:32299)

  • 錯誤一:通過kube-apiserver訪問不到influxdb admin UI,只返回如下Service物件的JSON描述內容。
{
  "kind": "Service",
  "apiVersion": "v1",
  "metadata": {
    "name": "monitoring-influxdb",
    "namespace": "kube-system",
    "selfLink": "/api/v1/namespaces/kube-system/services/monitoring-influxdb",
    "uid": "22c9ab6c-8f72-11e9-b92b-e67418705759",
    "resourceVersion": "215237",
    "creationTimestamp": "2019-06-15T13:33:18Z",
    "labels": {
      "kubernetes.io/cluster-service": "true",
      "kubernetes.io/name": "monitoring-influxdb",
      "task": "monitoring"
    }
  },
  "spec": {
    "ports": [
      {
        "name": "http",
        "protocol": "TCP",
        "port": 8086,
        "targetPort": 8086,
        "nodePort": 43444
      },
      {
        "name": "admin",
        "protocol": "TCP",
        "port": 8083,
        "targetPort": 8083,
        "nodePort": 49123
      }
    ],
    "selector": {
      "k8s-app": "influxdb"
    },
    "clusterIP": "10.10.10.154",
    "type": "NodePort",
    "sessionAffinity": "None",
    "externalTrafficPolicy": "Cluster"
  },
  "status": {
    "loadBalancer": {
      
    }
  }
}


安裝EFK外掛

在Kubernetes叢集中,一個完整的應用或服務都會涉及為數眾多的元件執行,各元件所在的Node及例項數量都是可變的。日誌子系統如果不做集中化管理,則會給系統的運維支撐造成很大的困難,因此有必要在叢集層面對日誌進行統一收集和檢索等工作。

在容器中輸出到控制檯的日誌,都會以“*-json.log”的命名方式儲存到/var/lib/docker/containers/目錄下,這就為日誌採集和後續處理奠定了基礎。

Kubernetes推薦用Fluentd+Elasticsearch+Kibana完成對系統和容器日誌的採集、查詢和展現工作。

部署統一日誌管理系統,需要以下兩個前提條件:

  • API Server正確配置了CA證書。
  • DNS服務啟動、執行。

系統部署架構


我們通過在每臺node上部署一個以DaemonSet方式執行的fluentd來收集每臺node上的日誌。Fluentd將docker日誌目錄/var/lib/docker/containers和/var/log目錄掛載到Pod中,然後Pod會在node節點的/var/log/pods目錄中建立新的目錄,可以區別不同的容器日誌輸出,該目錄下有一個日誌檔案連結到/var/lib/docker/containers目錄下的容器日誌輸出。注意:兩個目錄下的日誌都會彙集到ElasticSearch叢集,最終通過Kibana完成和使用者的互動工作。

這裡有一個特殊需求:Fluentd必須在每個Node上執行,為了滿足這一需求,我們通過以下幾種方式部署Fluentd。

  • 直接在Node主機上部署Fluentd.
  • 利用kubelet的--config引數,為每個node都載入Fluentd Pod。
  • 利用DaemonSet讓Fluentd Pod在每個Node上執行。

官方檔案目錄:https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/fluentd-elasticsearch

配置EFK服務配置檔案

建立目錄盛放檔案

[root@k8s-master ~]# mkdir EFK && cd EFK


配置EFK-RABC服務

[root@k8s-master EFK]# cat efk-rbac.yaml 
apiVersion: v1
kind: ServiceAccount
metadata:
  name: efk
  namespace: kube-system

---

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: efk
subjects:
  - kind: ServiceAccount
    name: efk
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
# 注意配置的ServiceAccount為efk。


配置ElasticSearch服務

# 此處將官方的三個文件合併成了一個elasticsearch.yaml,內容如下:

[root@k8s-master EFK]# cat elasticsearch.yaml 
#------------ElasticSearch RBAC---------#

apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    addonmanager.kubernetes.io/mode: Reconcile
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: elasticsearch-logging
  labels:
    k8s-app: elasticsearch-logging
    addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
  - ""
  resources:
  - "services"
  - "namespaces"
  - "endpoints"
  verbs:
  - "get"
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: kube-system
  name: elasticsearch-logging
  labels:
    k8s-app: elasticsearch-logging
    addonmanager.kubernetes.io/mode: Reconcile
subjects:
- kind: ServiceAccount
  name: elasticsearch-logging
  namespace: kube-system
  apiGroup: ""
roleRef:
  kind: ClusterRole
  name: elasticsearch-logging
  apiGroup: ""
---

# -----------ElasticSearch Service--------------#
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "Elasticsearch"
spec:
  ports:
  - port: 9200
    protocol: TCP
    targetPort: db
  selector:
    k8s-app: elasticsearch-logging
---

#-------------------ElasticSearch StatefulSet-------#
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    version: v6.6.1
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  serviceName: elasticsearch-logging
  replicas: 2
  selector:
    matchLabels:
      k8s-app: elasticsearch-logging
      version: v6.6.1
  template:
    metadata:
      labels:
        k8s-app: elasticsearch-logging
        version: v6.6.1
    spec:
      serviceAccountName: elasticsearch-logging
      containers:
      - image: docker.elastic.co/elasticsearch/elasticsearch:6.6.1
        name: elasticsearch-logging
        resources:
          # need more cpu upon initialization, therefore burstable class
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: db
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        volumeMounts:
        - name: elasticsearch-logging
          mountPath: /data
        env:
        - name: "NAMESPACE"
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: ES_JAVA_OPTS
          value: -Xms1024m -Xmx1024m
      volumes:
      - name: elasticsearch-logging
        emptyDir: {}
       # Elasticsearch requires vm.max_map_count to be at least 262144.
       # If your OS already sets up this number to a higher value, feel free
       # to remove this init container.
      initContainers:
      - image: alpine:3.6
        command: ["/sbin/sysctl", "-w", "vm.max_map_count=262144"]
        name: elasticsearch-logging-init
        securityContext:
          privileged: true


配置Fluentd服務的configmap,此處通過td-agent建立

# td-agent提供了官方安裝文件,但步驟較繁瑣,可以直接採用其一鍵安裝指令碼:
curl -L https://toolbelt.treasuredata.com/sh/install-redhat-td-agent3.sh | sh

# 正式配置configmap,其配置檔案如下,可以自己手動建立。
[root@k8s-master fluentd-es-image]# cat td-agent.conf 
kind: ConfigMap
apiVersion: v1
metadata:
  name: td-agent-config
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  td-agent.conf: |
    <filter kubernetes.**>
      @type kubernetes_metadata
      tls-cert-file /etc/kubernetes/ssl/server.pem
      tls-private-key-file /etc/kubernetes/ssl/server-key.pem
      client-ca-file /etc/kubernetes/ssl/ca.pem
      service-account-key-file /etc/kubernetes/ssl/ca-key.pem
    </filter>

    <match **>
      @id elasticsearch
      @type elasticsearch
      @log_level info
      type_name _doc
      include_tag_key true
      host 172.16.4.12
      port 9200
      logstash_format true
      <buffer>
        @type file
        path /var/log/fluentd-buffers/kubernetes.system.buffer
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_forever
        retry_max_interval 30
        chunk_limit_size 2M
        queue_limit_length 8
        overflow_action block
      </buffer>
    </match>

    <source>
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/es-containers.log.pos
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      tag kubernetes.*
      format json
      read_from_head true
    </source>
    
# 注意:上面的td-agent.conf本身已經是一個ConfigMap清單(名稱空間為kube-system),直接建立即可。
kubectl create -f ./td-agent.conf

# 建立fluentd的DaemonSet
[root@k8s-master EFK]# cat fluentd.yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd-es-v1.22
  namespace: kube-system
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    version: v1.22
spec:
  template:
    metadata:
      labels:
        k8s-app: fluentd-es
        kubernetes.io/cluster-service: "true"
        version: v1.22
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      serviceAccountName: efk
      containers:
      - name: fluentd-es
        image: travix/fluentd-elasticsearch:1.22
        command:
          - '/bin/sh'
          - '-c'
          - '/usr/sbin/td-agent 2>&1 >> /var/log/fluentd.log'
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      nodeSelector:
        beta.kubernetes.io/fluentd-ds-ready: "true"
      tolerations:
      - key : "node.alpha.kubernetes.io/ismaster"
        effect: "NoSchedule"
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
# 此處採用了一個dockerhub上公共映象,官方映象需要翻牆。


配置Kibana服務

[root@k8s-master EFK]# cat kibana.yaml 
#---------------Kibana Deployment-------------------#

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: kibana-logging
  template:
    metadata:
      labels:
        k8s-app: kibana-logging
      annotations:
        seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
    spec:
      serviceAccountName: efk
      containers:
      - name: kibana-logging
        image: docker.elastic.co/kibana/kibana-oss:6.6.1
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        env:
          - name: "ELASTICSEARCH_URL"
            value: "http://172.16.4.12:9200"
          # modified by gzr
          #  value: "http://elasticsearch-logging:9200"
          - name: "SERVER_BASEPATH"
            value: "/api/v1/proxy/namespaces/kube-system/services/kibana-logging/proxy"
        ports:
        - containerPort: 5601
          name: ui
          protocol: TCP
---

#------------------Kibana Service---------------------#

apiVersion: v1
kind: Service
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "Kibana"
spec:
  ports:
  - port: 5601
    protocol: TCP
    targetPort: ui
  selector:
    k8s-app: kibana-logging



Label the Nodes

Because the DaemonSet fluentd-es-v1.22 is defined with the nodeSelector beta.kubernetes.io/fluentd-ds-ready=true, this label must be set on every Node that is expected to run fluentd, as done below (a verification sketch follows the commands).

[root@k8s-master EFK]# kubectl get nodes
NAME          STATUS   ROLES    AGE     VERSION
172.16.4.12   Ready    <none>   18h     v1.14.3
172.16.4.13   Ready    <none>   2d15h   v1.14.3
172.16.4.14   Ready    <none>   2d15h   v1.14.3

[root@k8s-master EFK]# kubectl label nodes 172.16.4.14 beta.kubernetes.io/fluentd-ds-ready=true
node "172.16.4.14" labeled

[root@k8s-master EFK]# kubectl label nodes 172.16.4.13 beta.kubernetes.io/fluentd-ds-ready=true
node "172.16.4.13" labeled

[root@k8s-master EFK]# kubectl label nodes 172.16.4.12 beta.kubernetes.io/fluentd-ds-ready=true
node "172.16.4.12" labeled
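
To confirm that the label landed on all three nodes before the DaemonSet is rolled out, list the nodes filtered by that label:

# All three nodes should be returned
kubectl get nodes -l beta.kubernetes.io/fluentd-ds-ready=true

# Or inspect the full label set of a single node
kubectl get node 172.16.4.12 --show-labels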


Apply the manifests

[root@k8s-master EFK]# kubectl create -f .
serviceaccount/efk created
clusterrolebinding.rbac.authorization.k8s.io/efk created
service/elasticsearch-logging created
serviceaccount/elasticsearch-logging created
clusterrole.rbac.authorization.k8s.io/elasticsearch-logging created
clusterrolebinding.rbac.authorization.k8s.io/elasticsearch-logging created
statefulset.apps/elasticsearch-logging created
daemonset.extensions/fluentd-es-v1.22 created
deployment.apps/kibana-logging created
service/kibana-logging created
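
The objects are created asynchronously, so it can take a while before everything is up. To block until the rollouts finish, the rollout status subcommand can be used on the workloads created above (the StatefulSet check assumes the default RollingUpdate strategy):

kubectl -n kube-system rollout status deployment/kibana-logging
kubectl -n kube-system rollout status statefulset/elasticsearch-logging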


Verify the result

[root@k8s-master EFK]# kubectl get po -n kube-system -o wide| grep -E 'elastic|fluentd|kibana'
elasticsearch-logging-0                 1/1     Running            0          115m    172.30.69.5   172.16.4.14   <none>           <none>
elasticsearch-logging-1                 1/1     Running            0          115m    172.30.20.8   172.16.4.13   <none>           <none>
fluentd-es-v1.22-4bmtm                  0/1     CrashLoopBackOff   16         58m     172.30.53.2   172.16.4.12   <none>           <none>
fluentd-es-v1.22-f9hml                  1/1     Running            0          58m     172.30.69.6   172.16.4.14   <none>           <none>
fluentd-es-v1.22-x9rf4                  1/1     Running            0          58m     172.30.20.9   172.16.4.13   <none>           <none>
kibana-logging-7db9f954ff-mkbhr         1/1     Running            0          25s     172.30.69.7   172.16.4.14   <none>           <none>
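
Note that the fluentd pod on the master node (172.16.4.12) is in CrashLoopBackOff in the listing above. To find the cause, inspect the log of the last failed run and the pod events; a generic troubleshooting sketch using the pod name from the listing:

# Log of the previous (crashed) container instance
kubectl -n kube-system logs fluentd-es-v1.22-4bmtm --previous

# Events show image pull failures, OOM kills, failed volume mounts, etc.
kubectl -n kube-system describe pod fluentd-es-v1.22-4bmtm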


The Kibana Pod takes a fairly long time (10-20 minutes) on its first start to optimize and cache the status page; you can follow the Pod's log to watch the progress.

[root@k8s-master EFK]# kubectl logs kibana-logging-7db9f954ff-mkbhr -n kube-system
{"type":"log","@timestamp":"2019-06-18T09:23:33Z","tags":["plugin","warning"],"pid":1,"path":"/usr/share/kibana/src/legacy/core_plugins/ems_util","message":"Skipping non-plugin directory at /usr/share/kibana/src/legacy/core_plugins/ems_util"}
{"type":"log","@timestamp":"2019-06-18T09:23:34Z","tags":["warning","elasticsearch","config","deprecation"],"pid":1,"message":"Config key \"url\" is deprecated. It has been replaced with \"hosts\""}
{"type":"log","@timestamp":"2019-06-18T09:23:34Z","tags":["status","plugin:[email protected]","info"],"pid":1,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2019-06-18T09:23:34Z","tags":["status","plugin:[email protected]","info"],"pid":1,"state":"yellow","message":"Status changed from uninitialized to yellow - Waiting for Elasticsearch","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2019-06-18T09:23:34Z","tags":["status","plugin:[email protected]","info"],"pid":1,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2019-06-18T09:23:34Z","tags":["status","plugin:[email protected]","info"],"pid":1,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2019-06-18T09:23:34Z","tags":["status","plugin:[email protected]","info"],"pid":1,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2019-06-18T09:23:35Z","tags":["status","plugin:[email protected]","info"],"pid":1,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2019-06-18T09:23:35Z","tags":["status","plugin:[email protected]","info"],"pid":1,"state":"green","message":"Status changed from yellow to green - Ready","prevState":"yellow","prevMsg":"Waiting for Elasticsearch"}
{"type":"log","@timestamp":"2019-06-18T09:23:35Z","tags":["listening","info"],"pid":1,"message":"Server running at http://0:5601"}
......


Access Kibana

  1. Access through the kube-apiserver:

Get the Kibana service URL:

[root@k8s-master ~]# kubectl cluster-info
Kubernetes master is running at https://172.16.4.12:6443
Elasticsearch is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/elasticsearch-logging/proxy
Heapster is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/heapster/proxy
Kibana is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/kibana-logging/proxy
KubeDNS is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
monitoring-grafana is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
monitoring-influxdb is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/monitoring-influxdb:http/proxy


Open the following URL in a browser: https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/kibana-logging/proxy
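
Access through the API server requires client authentication (for example, importing the admin client certificate into the browser). A simpler alternative is to run kubectl proxy on a machine that has a working kubeconfig and use the local proxy port instead; a minimal sketch (8001 is kubectl's default proxy port):

# Expose the API server on localhost; authentication is handled by kubectl
kubectl proxy --port=8001

# Then open in a local browser:
#   http://127.0.0.1:8001/api/v1/namespaces/kube-system/services/kibana-logging/proxy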

  • Error 1: Kibana did not load properly. Check the server output for more information.


Solution:

  • Error 2: accessing Kibana returns a 503 error with the following response body:
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {
    
  },
  "status": "Failure",
  "message": "no endpoints available for service \"kibana-logging\"",
  "reason": "ServiceUnavailable",
  "code": 503
}
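
The "no endpoints available" message means the kibana-logging Service currently has no Ready pod behind it, either because the Kibana pod is still starting, has crashed, or its labels do not match the Service selector. Some checks based on the names used above:

# ENDPOINTS should list the pod IP with port 5601 once Kibana is Ready
kubectl -n kube-system get endpoints kibana-logging

# Verify that a pod matching the Service selector exists and is Running
kubectl -n kube-system get pods -l k8s-app=kibana-logging -o wide

# Follow the pod log until the "Server running at http://0:5601" line appears
kubectl -n kube-system logs -f kibana-logging-7db9f954ff-mkbhr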
