Deploying a k8s Cluster with kubeadm, 01 - Initialization
2018/1/3
Node configuration
- master x3
OS
- version: centos7
swapoff
### Default on Aliyun: swap is already off
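kubeadm's preflight checks expect swap to be disabled. If it is not already off on your nodes, a minimal sketch (the fstab sed pattern is an assumption; review it against your own /etc/fstab first):

```bash
### Turn swap off now and keep it off after reboots
[root@tvm-00 ~]# swapoff -a
### Comment out swap entries in /etc/fstab (pattern is an assumption; check before running)
[root@tvm-00 ~]# sed -i '/\sswap\s/s/^/#/' /etc/fstab
```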
hosts
### Configure on every node:
[root@tvm-00 ~]# cat /etc/hosts
### k8s master @envDev
10.10.9.67 tvm-00
10.10.9.68 tvm-01
10.10.9.69 tvm-02
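A minimal sketch for adding these entries on a node that does not have them yet (run it on every node; the grep keeps it idempotent):

```bash
### Append the three host entries if they are not already present
for entry in "10.10.9.67 tvm-00" "10.10.9.68 tvm-01" "10.10.9.69 tvm-02"; do
    grep -q "${entry}" /etc/hosts || echo "${entry}" >> /etc/hosts
done
```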
Docker
- version: latest(17.09.1-ce)
Install
### Install
[root@tvm-00 ~]# yum -y install yum-utils
[root@tvm-00 ~]# yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
[root@tvm-00 ~]# yum makecache fast
### A plain "yum -y install docker-ce" also works, but to keep the version consistent across nodes, specify the full package name, e.g.:
[root@tvm-00 ~]# yum -y install docker-ce-17.09.1.ce-1.el7.centos.x86_64
### Custom configuration
[root@tvm-00 ~]# mkdir -p /data2/docker
[root@tvm-00 ~]# mkdir -p /etc/docker; tee /etc/docker/daemon.json <<-'EOF'
{
  "exec-opts": ["native.cgroupdriver=cgroupfs"],
  "graph": "/data2/docker",
  "storage-driver": "overlay",
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "registry-mirrors": ["https://xxx.mirror.aliyuncs.com"]
}
EOF
### Note: the docker cgroupdriver is set here so that it matches what k8s (kubelet) will use
### References: #2, #3 (at the end)
[root@tvm-00 ~]# systemctl daemon-reload && systemctl enable docker && systemctl start docker
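To double-check that the daemon actually picked up the custom settings (cgroup driver, storage driver, data root), a quick verification could be:

```bash
### The output should show cgroupfs, overlay and the data directory from daemon.json
[root@tvm-00 ~]# docker info 2>/dev/null | grep -iE 'cgroup driver|storage driver|docker root dir'
```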
Images
registry mirror
- After enabling the Container Registry service on Aliyun, you get a dedicated mirror (accelerator) address
- It was already used in the docker configuration step above
Images required by kubeadm
- Pull them to the local node in advance; if the network is slow, consider distributing the images to the other nodes with docker save && docker load
### The following images are needed:
gcr.io/google_containers/kube-apiserver-amd64:v1.9.0
gcr.io/google_containers/kube-controller-manager-amd64:v1.9.0
gcr.io/google_containers/kube-scheduler-amd64:v1.9.0
gcr.io/google_containers/kube-proxy-amd64:v1.9.0
gcr.io/google_containers/etcd-amd64:3.1.10
gcr.io/google_containers/pause-amd64:3.0
gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.7
gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.7
gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.7
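If the node can reach gcr.io (for example through a proxy), the whole list can be pulled with a short loop; this is just a sketch that mirrors the list above:

```bash
### Pull every image required by kubeadm for v1.9.0
for img in \
    kube-apiserver-amd64:v1.9.0 \
    kube-controller-manager-amd64:v1.9.0 \
    kube-scheduler-amd64:v1.9.0 \
    kube-proxy-amd64:v1.9.0 \
    etcd-amd64:3.1.10 \
    pause-amd64:3.0 \
    k8s-dns-sidecar-amd64:1.14.7 \
    k8s-dns-kube-dns-amd64:1.14.7 \
    k8s-dns-dnsmasq-nanny-amd64:1.14.7; do
    docker pull "gcr.io/google_containers/${img}"
done
```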
Create the image tarball for the master node
[root@tvm-00 ~]# cd ~/k8s_install/master/gcr.io
[root@tvm-00 gcr.io]# docker save -o gcr.io-all.tar \
gcr.io/google_containers/kube-apiserver-amd64:v1.9.0 \
gcr.io/google_containers/kube-controller-manager-amd64:v1.9.0 \
gcr.io/google_containers/kube-scheduler-amd64:v1.9.0 \
gcr.io/google_containers/kube-proxy-amd64:v1.9.0 \
gcr.io/google_containers/pause-amd64:3.0 \
gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.7 \
gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.7 \
gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.7
Create the image tarball for worker nodes
[root@tvm-00 gcr.io]# docker save -o gcr.io-worker.tar \
gcr.io/google_containers/kube-proxy-amd64:v1.9.0 \
gcr.io/google_containers/pause-amd64:3.0
[root@tvm-00 gcr.io]# ls
gcr.io-all.tar gcr.io-worker.tar
After syncing to the target nodes, load the images:
[root@tvm-00 ~]# docker load -i gcr.io-all.tar
[root@tvm-00 ~]# docker load -i gcr.io-worker.tar
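A small sketch for pushing the tarballs to the other nodes and loading them remotely (assumes root ssh access from tvm-00 to the other nodes):

```bash
### Distribute the worker tarball and load it on tvm-01 and tvm-02
for node in tvm-01 tvm-02; do
    scp gcr.io-worker.tar "${node}:/tmp/"
    ssh "${node}" docker load -i /tmp/gcr.io-worker.tar
done
```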
##### private registry
- Use the Aliyun container registry service
### Prepare the base services needed to set up the k8s cluster
- version: 1.9.0
- Install the 3 services kubelet, kubeadm and kubectl on all nodes
- Reference: #2 (at the end)
##### System configuration adjustments
```bash
### Disable SELinux
[root@tvm-00 ~]# getenforce
Disabled
### If it is not Disabled, then:
[root@tvm-00 ~]# setenforce 0
### Kernel parameters
[root@tvm-00 ~]# cat <<'_EOF' > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
_EOF
[root@tvm-00 ~]# sysctl --system
```
Download the rpm packages and install them locally
- Because of the GFW, you know. Ideally you have your own local yum repository to cache these packages
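One way to fetch these rpms on a machine that does have access is yumdownloader from yum-utils; the repo definition below is the upstream Kubernetes el7 repo and is only an assumption about where you choose to download from, so treat this as a sketch:

```bash
### On a machine with access: define the upstream repo, then download the rpms plus dependencies
cat <<'_EOF' > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
_EOF
yum -y install yum-utils
yumdownloader --resolve --destdir=./k8s_rpms_1.9 kubelet-1.9.0 kubeadm-1.9.0 kubectl-1.9.0 kubernetes-cni
```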
### Install
[root@tvm-00 ~]# cd ~/k8s_install/k8s_rpms_1.9
[root@tvm-00 k8s_rpms_1.9]# ls
k8s/kubeadm-1.9.0-0.x86_64.rpm  k8s/kubectl-1.9.0-0.x86_64.rpm  k8s/kubelet-1.9.0-0.x86_64.rpm  k8s/kubernetes-cni-0.6.0-0.x86_64.rpm  k8s/socat-1.7.3.2-2.el7.x86_64.rpm
[root@tvm-00 k8s_rpms_1.9]# yum localinstall *.rpm -y
[root@tvm-00 k8s_rpms_1.9]# systemctl enable kubelet
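A quick sanity check that all three components landed at the same version:

```bash
### Confirm the installed package versions are consistent
[root@tvm-00 ~]# rpm -q kubelet kubeadm kubectl kubernetes-cni
[root@tvm-00 ~]# kubelet --version
```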
##### cgroupfs vs systemd
- Reference: #3 (at the end)
```bash
### Adjust --cgroup-driver to match the cgroupfs driver used by the docker service by default:
[root@tvm-00 ~]# sed -i 's#--cgroup-driver=systemd#--cgroup-driver=cgroupfs#' /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
[root@tvm-00 ~]# systemctl daemon-reload
```
###### On centos7, using --cgroup-driver=systemd will cause the kube-dns service to fail later on; examples:
### (example: the kubedns container failing)
[root@tvm-00 ~]# kubectl logs -n kube-system --tail=20 kube-dns-6f4fd4bdf-ntcgn -c kubedns
container_linux.go:265: starting container process caused "process_linux.go:284: applying cgroup configuration for process caused \"No such device or address\""
### (the sidecar container failing)
[root@tvm-00 ~]# kubectl logs -n kube-system --tail=1 kube-dns-6f4fd4bdf-ntcgn -c sidecar
W1226 06:21:40.170896 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:44903->127.0.0.1:53: read: connection refused
### (the dnsmasq container is fine)
[root@tvm-00 ~]# kubectl logs -n kube-system --tail=20 kube-dns-6f4fd4bdf-ntcgn -c dnsmasq
I1226 06:21:40.214148 1 main.go:76] opts: {{/usr/sbin/dnsmasq [-k --cache-size=1000 --no-negcache --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053] true} /etc/k8s/dns/dnsmasq-nanny 10000000000}
I1226 06:21:40.214233 1 nanny.go:94] Starting dnsmasq [-k --cache-size=1000 --no-negcache --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053]
I1226 06:21:40.222440 1 nanny.go:119]
W1226 06:21:40.222453 1 nanny.go:120] Got EOF from stdout
I1226 06:21:40.222537 1 nanny.go:116] dnsmasq[9]: started, version 2.78 cachesize 1000
### (output omitted)
Initialize the k8s cluster
- Before initialization
- If errors occur, refer to the reset section
- Run the initialization
- Inspect the k8s cluster
- Add-ons: network plugin - calico
- Pass --pod-network-cidr to kubeadm init first
- Configure the CALICO_IPV4POOL_CIDR subnet
Before initialization
### Note 1: since this is an offline install, the version is pinned explicitly with --kubernetes-version=v1.9.0
### Note 2: the CIDR is specified because the network add-on used later is calico; defining the subnet up front avoids possible future conflicts (the same subnet is used again when configuring calico): --pod-network-cidr=172.30.0.0/20
The following IP address pool is enough for a small cluster
Subnet: 172.30.0.0/20
Host range: 172.30.0.1 - 172.30.15.254 = 4094 addresses
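The 4094 figure is simply the usable host count of a /20 network; a purely illustrative one-liner to reproduce it:

```bash
### Usable hosts in a /20: 2^(32-20) - 2 (network and broadcast addresses excluded)
[root@tvm-00 ~]# echo $(( 2**(32-20) - 2 ))
4094
```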
##### If errors occur, refer to the reset docs
- Reference: #4 (at the end)
```bash
[root@tvm-00 ~]# kubeadm reset
[preflight] Running pre-flight checks.
[reset] Stopping the kubelet service.
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Removing kubernetes-managed containers.
[reset] Deleting contents of stateful directories: [/var/lib/kubelet /etc/cni/net.d /var/lib/dockershim /var/run/kubernetes /var/lib/etcd]
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
```
Run the initialization
[root@tvm-00 ~]# kubeadm init --pod-network-cidr=172.30.0.0/20 --kubernetes-version=v1.9.0
[init] Using Kubernetes version: v1.9.0
[init] Using Authorization modes: [Node RBAC]
[preflight] Running pre-flight checks.
[WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 17.09.1-ce. Max validated version: 17.03
[WARNING FileExisting-crictl]: crictl not found in system path
[preflight] Starting the kubelet service
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [tvm-00 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.10.9.67]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "scheduler.conf"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests".
### (output omitted)
Your Kubernetes master has initialized successfully!
Inspect the k8s cluster
### To run kubectl commands conveniently, do the following:
[root@tvm-00 ~]# mkdir -p $HOME/.kube
[root@tvm-00 ~]# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
### View node status:
[root@tvm-00 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
tvm-00 NotReady master 19h v1.9.0
### View the logs:
[root@tvm-00 ~]# journalctl -xeu kubelet
### View cluster info:
[root@tvm-00 ~]# kubectl cluster-info
Kubernetes master is running at https://10.10.9.67:6443
KubeDNS is running at https://10.10.9.67:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
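At this stage the node shows NotReady and kube-dns stays Pending, because no pod network add-on has been installed yet; a quick way to see this (pod name suffixes will differ in your cluster):

```bash
### kube-dns remains Pending until a network plugin (calico, next section) is deployed
[root@tvm-00 ~]# kubectl get pods -n kube-system -o wide
```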
Add-ons: network plugin - calico
- Prepare the following images required by calico
- Pull them to the local node in advance; worker nodes also need the node and cni images
[root@tvm-00 ~]# grep image calico.yaml |uniq |sed -e 's#^.*image: quay.io#docker pull quay.io#g'
docker pull quay.io/coreos/etcd:v3.1.10
docker pull quay.io/calico/node:v2.6.5
docker pull quay.io/calico/cni:v1.11.2
docker pull quay.io/calico/kube-controllers:v1.0.2
The images can be saved and copied to the other nodes, where a plain docker load is enough
[root@tvm-00 ~]# cd ~/k8s_install/master/network/
[root@tvm-00 network]# docker save -o calico-all.tar quay.io/coreos/etcd quay.io/calico/node quay.io/calico/cni quay.io/calico/kube-controllers
[root@tvm-00 network]# docker save -o calico-worker.tar quay.io/calico/node quay.io/calico/cni
[root@tvm-00 network]# ls
calico-all.tar calico-worker.tar calico.yaml
- Deploy calico
```bash
### Prepare the calico.yaml config file
[root@tvm-00 ~]# mkdir -p ~/k8s_install/master/network
[root@tvm-00 ~]# cd !$
[root@tvm-00 network]# curl -so calico.yaml https://docs.projectcalico.org/v2.6/getting-started/kubernetes/installation/hosted/kubeadm/1.6/calico.yaml
[root@tvm-00 network]# sed -i 's#192.168.0.0/16#172.30.0.0/20#' calico.yaml
### Deploy
[root@tvm-00 network]# kubectl apply -f calico.yaml
configmap "calico-config" created
daemonset "calico-etcd" created
service "calico-etcd" created
daemonset "calico-node" created
deployment "calico-kube-controllers" created
deployment "calico-policy-controller" created
clusterrolebinding "calico-cni-plugin" created
clusterrole "calico-cni-plugin" created
serviceaccount "calico-cni-plugin" created
clusterrolebinding "calico-kube-controllers" created
clusterrole "calico-kube-controllers" created
serviceaccount "calico-kube-controllers" created
### Confirm the kube-dns pod is Running
[root@tvm-00 ~]# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-etcd-djrtb 1/1 Running 1 1d
kube-system calico-kube-controllers-d6c6b9b8-7ssrn 1/1 Running 1 1d
kube-system calico-node-mff7x 2/2 Running 3 1d
kube-system etcd-tvm-00 1/1 Running 1 4h
kube-system kube-apiserver-tvm-00 1/1 Running 0 2m
kube-system kube-controller-manager-tvm-00 1/1 Running 2 3d
kube-system kube-dns-6f4fd4bdf-ntcgn 3/3 Running 7 3d
kube-system kube-proxy-pfmh8 1/1 Running 1 3d
kube-system kube-scheduler-tvm-00 1/1 Running 2 3d
### Confirm the status of the cluster nodes
[root@tvm-00 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
tvm-00 Ready master 2d v1.9.0
```
Join the other 2 nodes to the k8s cluster
- kubeadm token
### Note: the token in the join command printed by kubeadm init is only valid for 24h; once it expires, generate a new one as follows:
[root@tvm-00 ~]# kubeadm token create --print-join-command
kubeadm join --token 84d7d1.e4ed7451c620436e 10.10.9.67:6443 --discovery-token-ca-cert-hash sha256:42cfdc412e731793ce2fa20aad1d8163ee8e6e5c05c30765f204ff086823c653
[root@tvm-00 ~]# kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
84d7d1.e4ed7451c620436e 23h 2017-12-26T14:46:16+08:00 authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
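If the token is still valid but the --discovery-token-ca-cert-hash value was lost, it can be recomputed on the master from the CA certificate with the standard openssl pipeline:

```bash
### Compute the sha256 hash of the cluster CA public key
[root@tvm-00 ~]# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt \
    | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex | sed 's/^.* //'
```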
- kubeadm join (run on each node being added, i.e. tvm-01 and tvm-02)
```bash
[root@tvm-01 ~]# kubeadm join --token 84d7d1.e4ed7451c620436e 10.10.9.67:6443 --discovery-token-ca-cert-hash sha256:42cfdc412e731793ce2fa20aad1d8163ee8e6e5c05c30765f204ff086823c653
```
- View cluster info
[root@tvm-00 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
tvm-00 Ready master 3d v1.9.0
tvm-01 Ready <none> 2h v1.9.0
tvm-02 Ready <none> 27s v1.9.0
[root@tvm-00 ~]# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-etcd-djrtb 1/1 Running 1 1d
kube-system calico-kube-controllers-d6c6b9b8-7ssrn 1/1 Running 1 1d
kube-system calico-node-9bncs 2/2 Running 4 19h
kube-system calico-node-mff7x 2/2 Running 3 1d
kube-system calico-node-mw96v 2/2 Running 3 19h
kube-system etcd-tvm-00 1/1 Running 1 4h
kube-system kube-apiserver-tvm-00 1/1 Running 0 2m
kube-system kube-controller-manager-tvm-00 1/1 Running 2 3d
kube-system kube-dns-6f4fd4bdf-ntcgn 3/3 Running 7 3d
kube-system kube-proxy-6nqwv 1/1 Running 1 19h
kube-system kube-proxy-7xtv4 1/1 Running 1 19h
kube-system kube-proxy-pfmh8 1/1 Running 1 3d
kube-system kube-scheduler-tvm-00 1/1 Running 2 3d
As expected, there are 3 calico-node pods and 3 kube-proxy pods in the cluster
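Since calico-node and kube-proxy both run as DaemonSets, another way to confirm full coverage is to compare the DESIRED and READY columns (the DaemonSet names here come from calico.yaml and the kubeadm defaults):

```bash
### Each DaemonSet should report DESIRED = READY = 3
[root@tvm-00 ~]# kubectl -n kube-system get ds calico-node kube-proxy
```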
### ZYXW. References
1. [一步步打造基於Kubeadm的高可用Kubernetes集群-第一部分](http://tonybai.com/2017/05/15/setup-a-ha-kubernetes-cluster-based-on-kubeadm-part1/)
2. [install docker for kubeadm](https://kubernetes.io/docs/setup/independent/install-kubeadm/#installing-docker)
3. [kube-dns crashloopbackoff after flannel/weave install #54910](https://github.com/kubernetes/kubernetes/issues/54910)
4. [kubeadm reset](https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#tear-down)