Troubleshooting ClusterIP Service Access Failures in Kubernetes (K8S) IPVS Proxy Mode
This post describes how to handle the case where Kubernetes (K8S) runs kube-proxy in IPVS mode and a Service of type ClusterIP can be created normally, yet requests to the Service never reach the backend Pods.
Background
With Kubernetes (K8S) using the IPVS proxy mode and a Service of type ClusterIP, accessing the Service fails to reach the backend Pods.
Host planning

| Hostname | OS version | Spec | Internal IP | External IP (simulated) |
|---|---|---|---|---|
| k8s-master | CentOS 7.7 | 2C/4G/20G | 172.16.1.110 | 10.0.0.110 |
| k8s-node01 | CentOS 7.7 | 2C/4G/20G | 172.16.1.111 | 10.0.0.111 |
| k8s-node02 | CentOS 7.7 | 2C/4G/20G | 172.16.1.112 | 10.0.0.112 |
Reproducing the scenario
Deployment YAML information
YAML file
```
[root@k8s-master service]# pwd
/root/k8s_practice/service
[root@k8s-master service]# cat myapp-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deploy
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
      release: v1
  template:
    metadata:
      labels:
        app: myapp
        release: v1
        env: test
    spec:
      containers:
      - name: myapp
        image: registry.cn-beijing.aliyuncs.com/google_registry/myapp:v1
        imagePullPolicy: IfNotPresent
        ports:
        - name: http
          containerPort: 80
```
Create the Deployment and check its status
```
[root@k8s-master service]# kubectl apply -f myapp-deploy.yaml
deployment.apps/myapp-deploy created
[root@k8s-master service]#
[root@k8s-master service]# kubectl get deploy -o wide
NAME           READY   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS   IMAGES                                                       SELECTOR
myapp-deploy   3/3     3            3           14s   myapp        registry.cn-beijing.aliyuncs.com/google_registry/myapp:v1   app=myapp,release=v1
[root@k8s-master service]# kubectl get rs -o wide
NAME                      DESIRED   CURRENT   READY   AGE   CONTAINERS   IMAGES                                                       SELECTOR
myapp-deploy-5695bb5658   3         3         3       21s   myapp        registry.cn-beijing.aliyuncs.com/google_registry/myapp:v1   app=myapp,pod-template-hash=5695bb5658,release=v1
[root@k8s-master service]#
[root@k8s-master service]# kubectl get pod -o wide --show-labels
NAME                            READY   STATUS    RESTARTS   AGE   IP             NODE         NOMINATED NODE   READINESS GATES   LABELS
myapp-deploy-5695bb5658-7tgfx   1/1     Running   0          39s   10.244.2.111   k8s-node02   <none>           <none>            app=myapp,env=test,pod-template-hash=5695bb5658,release=v1
myapp-deploy-5695bb5658-95zxm   1/1     Running   0          39s   10.244.3.165   k8s-node01   <none>           <none>            app=myapp,env=test,pod-template-hash=5695bb5658,release=v1
myapp-deploy-5695bb5658-xtxbp   1/1     Running   0          39s   10.244.3.164   k8s-node01   <none>           <none>            app=myapp,env=test,pod-template-hash=5695bb5658,release=v1
```
Access the Pods with curl
```
[root@k8s-master service]# curl 10.244.2.111/hostname.html
myapp-deploy-5695bb5658-7tgfx
[root@k8s-master service]#
[root@k8s-master service]# curl 10.244.3.165/hostname.html
myapp-deploy-5695bb5658-95zxm
[root@k8s-master service]#
[root@k8s-master service]# curl 10.244.3.164/hostname.html
myapp-deploy-5695bb5658-xtxbp
```
ClusterIP Service information
YAML file
```
[root@k8s-master service]# pwd
/root/k8s_practice/service
[root@k8s-master service]# cat myapp-svc-ClusterIP.yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp-clusterip
  namespace: default
spec:
  type: ClusterIP        # optional; ClusterIP is the default type
  selector:
    app: myapp
    release: v1
  ports:
  - name: http
    port: 8080           # port exposed by the Service
    targetPort: 80       # target port on the backend Pods
```
Create the Service and check its status
```
[root@k8s-master service]# kubectl apply -f myapp-svc-ClusterIP.yaml
service/myapp-clusterip created
[root@k8s-master service]#
[root@k8s-master service]# kubectl get svc -o wide
NAME              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE   SELECTOR
kubernetes        ClusterIP   10.96.0.1        <none>        443/TCP    16d   <none>
myapp-clusterip   ClusterIP   10.102.246.104   <none>        8080/TCP   6s    app=myapp,release=v1
```
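Before suspecting the network, it is worth confirming that the Service selector actually matched the three Pods. The Endpoints object should list the Pod IPs seen earlier; a quick check (the AGE and ordering below are illustrative):

```
[root@k8s-master service]# kubectl get endpoints myapp-clusterip
NAME              ENDPOINTS                                         AGE
myapp-clusterip   10.244.2.111:80,10.244.3.164:80,10.244.3.165:80   10s
```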
Check the IPVS information
```
[root@k8s-master service]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
………………
TCP  10.102.246.104:8080 rr
  -> 10.244.2.111:80              Masq    1      0          0
  -> 10.244.3.164:80              Masq    1      0          0
  -> 10.244.3.165:80              Masq    1      0          0
```
As shown above, under normal circumstances traffic to the Service should be passed on to the backend Pods and their responses returned: IPVS load-balances the ClusterIP 10.102.246.104:8080 across the three Pod IPs using round-robin (rr).
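If the full `ipvsadm -Ln` table is long, the virtual server for just this Service can also be listed on its own; a minimal check, assuming the ClusterIP assigned above:

```
ipvsadm -Ln -t 10.102.246.104:8080
```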
curl access results
Accessing the Pods directly works fine, as shown below.
```
[root@k8s-master service]# curl 10.244.2.111/hostname.html
myapp-deploy-5695bb5658-7tgfx
[root@k8s-master service]#
[root@k8s-master service]# curl 10.244.3.165/hostname.html
myapp-deploy-5695bb5658-95zxm
[root@k8s-master service]#
[root@k8s-master service]# curl 10.244.3.164/hostname.html
myapp-deploy-5695bb5658-xtxbp
```
However, accessing through the Service fails, as shown below.
```
[root@k8s-master service]# curl 10.102.246.104:8080
curl: (7) Failed connect to 10.102.246.104:8080; Connection timed out
```
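A timeout (rather than an immediate "connection refused") hints that the SYN is being forwarded by IPVS but the reply never comes back. One quick way to see this, as a sketch, is to watch the IPVS connection table on the node while the curl above is still waiting; entries for the ClusterIP stuck in a half-open state (for example SYN_RECV) point to lost return traffic:

```
# Run while `curl 10.102.246.104:8080` is waiting in another terminal
ipvsadm -Lnc | grep 10.102.246.104
```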
Troubleshooting
Verify with a packet capture
Capture packets with the following command and analyze the result with Wireshark.
```
tcpdump -i any -n -nn port 80 -w ./$(date +%Y%m%d%H%M%S).pcap
```
Analyzing the capture in Wireshark shows that requests were sent to the Pod but no replies ever came back, so TCP kept retransmitting them ([TCP Retransmission]).
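Because inter-node Pod traffic rides on flannel's VXLAN overlay, it can also be revealing to capture the encapsulated packets on the physical interface. This is only a sketch: it assumes flannel uses the Linux default VXLAN UDP port 8472 and that the node's physical NIC is eth0 (adjust both to your environment). With -vv, tcpdump verifies checksums and will print something like "bad udp cksum" for broken outer packets:

```
# Watch the VXLAN-encapsulated traffic on the physical NIC and verify checksums
tcpdump -i eth0 -nn -vv udp port 8472
```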
Check the kube-proxy logs
```
[root@k8s-master service]# kubectl get pod -A | grep 'kube-proxy'
kube-system   kube-proxy-6bfh7   1/1   Running   1   3h52m
kube-system   kube-proxy-6vfkf   1/1   Running   1   3h52m
kube-system   kube-proxy-bvl9n   1/1   Running   1   3h52m
[root@k8s-master service]#
[root@k8s-master service]# kubectl logs -n kube-system kube-proxy-6bfh7
W0601 13:01:13.170506       1 feature_gate.go:235] Setting GA feature gate SupportIPVSProxyMode=true. It will be removed in a future release.
I0601 13:01:13.338922       1 node.go:135] Successfully retrieved node IP: 172.16.1.112
I0601 13:01:13.338960       1 server_others.go:172] Using ipvs Proxier.   ##### shows that the ipvs mode is in use
W0601 13:01:13.339400       1 proxier.go:420] IPVS scheduler not specified, use rr by default
I0601 13:01:13.339638       1 server.go:571] Version: v1.17.4
I0601 13:01:13.340126       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
I0601 13:01:13.340159       1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0601 13:01:13.340500       1 conntrack.go:83] Setting conntrack hashsize to 32768
I0601 13:01:13.346991       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
I0601 13:01:13.347035       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
I0601 13:01:13.347703       1 config.go:313] Starting service config controller
I0601 13:01:13.347718       1 shared_informer.go:197] Waiting for caches to sync for service config
I0601 13:01:13.347736       1 config.go:131] Starting endpoints config controller
I0601 13:01:13.347743       1 shared_informer.go:197] Waiting for caches to sync for endpoints config
I0601 13:01:13.448223       1 shared_informer.go:204] Caches are synced for endpoints config
I0601 13:01:13.448236       1 shared_informer.go:204] Caches are synced for service config
```
The kube-proxy logs show nothing abnormal.
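As a further sanity check that the ipvs mode really is active (independent of the startup logs), kube-proxy normally reports the mode on its metrics endpoint; assuming the default metrics bind address, it can be queried on a node roughly like this:

```
# kube-proxy usually serves /proxyMode on its metrics port (default 127.0.0.1:10249)
curl 127.0.0.1:10249/proxyMode
# expected output: ipvs
```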
Check and change the NIC (flannel.1) settings
Note: the following was performed on the k8s-master node.
Further research suggested that this might be caused by checksum offloading. The current settings are as follows:
```
[root@k8s-master service]# ethtool -k flannel.1 | grep checksum
rx-checksumming: on
tx-checksumming: on                    ##### currently on
        tx-checksum-ipv4: off [fixed]
        tx-checksum-ip-generic: on     ##### currently on
        tx-checksum-ipv6: off [fixed]
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
```
Flannel's network setup leaves TX checksum offloading enabled on the flannel.1 interface, but it should be turned off so that checksum calculation is left to the physical NIC. Proceed as follows:
```
# Temporarily turn it off
[root@k8s-master service]# ethtool -K flannel.1 tx-checksum-ip-generic off
Actual changes:
tx-checksumming: off
        tx-checksum-ip-generic: off
tcp-segmentation-offload: off
        tx-tcp-segmentation: off [requested on]
        tx-tcp-ecn-segmentation: off [requested on]
        tx-tcp6-segmentation: off [requested on]
        tx-tcp-mangleid-segmentation: off [requested on]
udp-fragmentation-offload: off [requested on]
[root@k8s-master service]#
# Query again
[root@k8s-master service]# ethtool -k flannel.1 | grep checksum
rx-checksumming: on
tx-checksumming: off                   ##### now off
        tx-checksum-ipv4: off [fixed]
        tx-checksum-ip-generic: off    ##### now off
        tx-checksum-ipv6: off [fixed]
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
```
Of course, this change is only temporary: after the machine reboots, the flannel virtual interface will enable checksum offloading again.
Try curl again
```
[root@k8s-master ~]# curl 10.102.246.104:8080
Hello MyApp | Version: v1 | <a href="hostname.html">Pod Name</a>
[root@k8s-master ~]#
[root@k8s-master ~]# curl 10.102.246.104:8080/hostname.html
myapp-deploy-5695bb5658-7tgfx
[root@k8s-master ~]#
[root@k8s-master ~]# curl 10.102.246.104:8080/hostname.html
myapp-deploy-5695bb5658-95zxm
[root@k8s-master ~]#
[root@k8s-master ~]# curl 10.102.246.104:8080/hostname.html
myapp-deploy-5695bb5658-xtxbp
[root@k8s-master ~]#
[root@k8s-master ~]# curl 10.102.246.104:8080/hostname.html
myapp-deploy-5695bb5658-7tgfx
```
As shown above, the Service can now be accessed normally and requests are round-robined across the Pods.
Permanently disable TX checksum offloading on the flannel interface
Note: perform this on all machines.
Create a systemd service using the following unit file:
```
[root@k8s-node02 ~]# cat /etc/systemd/system/k8s-flannel-tx-checksum-off.service
[Unit]
Description=Turn off checksum offload on flannel.1
After=sys-devices-virtual-net-flannel.1.device

[Install]
WantedBy=sys-devices-virtual-net-flannel.1.device

[Service]
Type=oneshot
ExecStart=/sbin/ethtool -K flannel.1 tx-checksum-ip-generic off
```
Enable the service at boot and start it now:
```
systemctl enable k8s-flannel-tx-checksum-off
systemctl start k8s-flannel-tx-checksum-off
```
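To confirm the permanent fix is in place on every node, the unit state and the offload flag can be checked in one pass; a small sketch assuming passwordless SSH from k8s-master to the hostnames in the planning table above:

```
for node in k8s-master k8s-node01 k8s-node02; do
  echo "== ${node} =="
  # unit should be "enabled" and tx-checksum-ip-generic should be "off"
  ssh "${node}" "systemctl is-enabled k8s-flannel-tx-checksum-off; ethtool -k flannel.1 | grep tx-checksum-ip-generic"
done
```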
Related reading
1. Analysis of slow access to Services with k8s IPVS forwarding (Part 1)
2. Kubernetes + Flannel: UDP packets dropped for wrong checksum – Workaround
———END———