Istio Multicluster (1): Multiple Control Planes

阿新 • Published: 2020-10-16

Based on the official documentation.

Replicated control planes

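This section deploys an Istio multicluster mesh across multiple primary clusters (clusters that run a control plane). Each cluster has its own control plane, and clusters communicate with each other through gateways.

Because no shared control plane manages the mesh, each cluster's own control plane manages its backend applications. For policy enforcement and security purposes, all clusters are placed under shared administrative control.

By replicating shared services and namespaces and using a common root CA certificate in all clusters, a single Istio service mesh can communicate across clusters.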

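Requirements

- Two or more Kubernetes clusters running version 1.17, 1.18, or 1.19.
- The Istio control plane deployed on each Kubernetes cluster.
- The service IP address of the istio-ingressgateway in each cluster must be reachable from every other cluster, ideally through an L4 network load balancer (NLB).
- A root CA. Cross-cluster communication requires mutual TLS between services. To enable it, the Istio CA in each cluster must be configured with an intermediate CA generated from the shared root CA certificate. For demonstration purposes, the root CA certificate under samples/certs in the Istio installation directory is used.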

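Deploy the Istio control plane in each cluster

Generate an intermediate CA certificate for each cluster from a custom root CA. The shared root CA is what enables mutual TLS for cross-cluster communication.

For demonstration purposes, the certificates from the Istio samples directory are used below. In a real deployment, each cluster should use a different CA certificate, all signed by a common root CA.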
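For a real deployment, per-cluster intermediate CAs signed by one root could be produced roughly as follows. This is a minimal sketch using openssl, not the procedure from the Istio documentation: the working directory, subject names, key sizes, validity periods, and the make_intermediate helper are all assumptions chosen for illustration. Only the output file names (ca-cert.pem, ca-key.pem, root-cert.pem, cert-chain.pem) are taken from the cacerts secret created in the next step.

```shell
#!/bin/sh
# Sketch: one shared root CA plus one intermediate CA per cluster.
# All names, subjects and lifetimes below are illustrative assumptions.
set -eu

WORKDIR=$(mktemp -d)

# Minimal OpenSSL config so the example does not depend on a system openssl.cnf.
cat > "$WORKDIR/ca.cnf" <<'EOF'
[req]
distinguished_name = dn
x509_extensions = v3_ca
prompt = no
[dn]
O = Example Org
CN = Example Root CA
[v3_ca]
basicConstraints = critical,CA:TRUE
keyUsage = critical,keyCertSign,cRLSign
EOF

# 1. Shared root CA (its key would be kept offline in a real setup).
openssl genrsa -out "$WORKDIR/root-key.pem" 2048 2>/dev/null
openssl req -x509 -new -nodes -key "$WORKDIR/root-key.pem" -days 3650 \
    -config "$WORKDIR/ca.cnf" -out "$WORKDIR/root-cert.pem"

# 2. One intermediate CA per cluster, signed by the shared root.
make_intermediate() {
    cluster=$1
    dir="$WORKDIR/$cluster"
    mkdir -p "$dir"
    openssl genrsa -out "$dir/ca-key.pem" 2048 2>/dev/null
    openssl req -new -key "$dir/ca-key.pem" -config "$WORKDIR/ca.cnf" \
        -subj "/O=Example Org/CN=Intermediate CA $cluster" -out "$dir/ca.csr"
    openssl x509 -req -in "$dir/ca.csr" -days 730 \
        -CA "$WORKDIR/root-cert.pem" -CAkey "$WORKDIR/root-key.pem" -CAcreateserial \
        -extfile "$WORKDIR/ca.cnf" -extensions v3_ca \
        -out "$dir/ca-cert.pem" 2>/dev/null
    cp "$WORKDIR/root-cert.pem" "$dir/root-cert.pem"
    # Chain file: intermediate first, then the root.
    cat "$dir/ca-cert.pem" "$dir/root-cert.pem" > "$dir/cert-chain.pem"
}

make_intermediate cluster1
make_intermediate cluster2

# Both intermediates should chain back to the same root.
openssl verify -CAfile "$WORKDIR/root-cert.pem" \
    "$WORKDIR/cluster1/ca-cert.pem" "$WORKDIR/cluster2/ca-cert.pem"
```

Each cluster's directory could then be passed to kubectl create secret generic cacerts in place of samples/certs.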

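Run the following commands in every cluster to deploy an identical Istio control plane in all of them.

Create a Kubernetes secret for the generated CA certificates. The procedure is similar to the one described in the Istio task on plugging in custom CA certificates.

Do not use the certificates from the samples directory in production: their private keys are publicly available, which is a security risk.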

$ kubectl create namespace istio-system
$ kubectl create secret generic cacerts -n istio-system \
    --from-file=samples/certs/ca-cert.pem \
    --from-file=samples/certs/ca-key.pem \
    --from-file=samples/certs/root-cert.pem \
    --from-file=samples/certs/cert-chain.pem

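Deploy Istio with the istioctl install command shown after the Corefile below. The deployment creates an istiocoredns pod in the istio-system namespace, which provides DNS resolution for the global domain. Its configuration file is as follows: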

# cat Corefile
.:53 {
      errors
      health

      # Removed support for the proxy plugin: https://coredns.io/2019/03/03/coredns-1.4.0-release/
      grpc global 127.0.0.1:8053
      forward . /etc/resolv.conf {
        except global
      }

      prometheus :9153
      cache 30
      reload
    }

$ istioctl install -f manifests/examples/multicluster/values-istio-multicluster-gateways.yaml

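Configure DNS

Providing DNS resolution for services in remote clusters allows existing applications to run unmodified, because applications typically resolve a service by its DNS name and then access the resulting IP. Istio itself does not need DNS to route requests between services. Local services share a common DNS suffix (for example, svc.cluster.local), and Kubernetes DNS resolves these names.

To provide a similar setup for services in remote clusters, name them using the format <name>.<namespace>.global. Istio ships with a CoreDNS server that resolves these names. To use it, Kubernetes DNS must be configured with a stub domain for .global.

In every cluster that needs to call services in a remote cluster, create or update the existing Kubernetes DNS ConfigMap. This environment uses CoreDNS 1.7.0 with the configuration shown below.

Note: do not copy the configuration from the official documentation verbatim. Differences between CoreDNS versions may prevent the Kubernetes CoreDNS from starting. Instead, fetch the existing coredns ConfigMap from the kube-system namespace and append the configuration for the global domain to it.

After applying the configuration with the command below, the change may not take effect in the Kubernetes CoreDNS right away; restarting the Kubernetes DNS pods manually makes it take effect. The configuration must be applied in both cluster1 and cluster2.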

kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 { # original configuration of the Kubernetes CoreDNS
        errors
        health {
           lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf {
           max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
    global:53 { # added: forwards queries for global to the istiocoredns service in istio-system (the CoreDNS created by Istio above)
        errors
        cache 30
        forward . $(kubectl get svc -n istio-system istiocoredns -o jsonpath={.spec.clusterIP}):53
    }
EOF
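The recommended approach (take the cluster's own Corefile and append a stanza for the global domain) can be sketched as a small shell helper. append_global_stanza is a hypothetical function name, and the sample Corefile is a shortened stand-in; the IP is the istiocoredns ClusterIP that appears in this environment's examples.

```shell
#!/bin/sh
# Sketch: append a "global:53" server block to an existing Corefile.
# append_global_stanza is a hypothetical helper, not part of kubectl or Istio.
set -eu

# Reads the current Corefile on stdin and writes the extended one to stdout.
# $1 = ClusterIP of the istiocoredns service.
append_global_stanza() {
    istio_dns_ip=$1
    cat
    cat <<EOF
global:53 {
    errors
    cache 30
    forward . ${istio_dns_ip}:53
}
EOF
}

# Stand-in for the Corefile of the cluster; in practice it could come from:
#   kubectl -n kube-system get configmap coredns -o jsonpath='{.data.Corefile}'
current_corefile='.:53 {
    errors
    cache 30
    forward . /etc/resolv.conf
}'

printf '%s\n' "$current_corefile" | append_global_stanza 10.96.199.197
```

The original server block is passed through untouched, so version-specific plugins in it are preserved. To restart the DNS pods afterwards, kubectl -n kube-system rollout restart deployment coredns is one option, assuming CoreDNS runs as a Deployment named coredns.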

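After the ServiceEntry for httpbin.bar.global has been created (see "Configure application services" later in this article), DNS resolution in cluster1 can be checked as follows.
First, from the sleep container in cluster1, resolve httpbin.bar.global through Istio's CoreDNS (10.96.199.197 is the address of the istiocoredns service):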
# nslookup -type=a httpbin.bar.global 10.96.199.197
Server:         10.96.199.197
Address:        10.96.199.197:53

Name:   httpbin.bar.global
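Address: 240.0.0.2 # resolved successfully
Next, from the same sleep container, resolve httpbin.bar.global through the Kubernetes CoreDNS (10.96.0.10 is the address of the Kubernetes DNS service). The Kubernetes CoreDNS forwards the query for httpbin.bar.global to Istio's CoreDNS, and resolution succeeds: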
# nslookup -type=a httpbin.bar.global 10.96.0.10
Server:         10.96.0.10
Address:        10.96.0.10:53

Name:   httpbin.bar.global
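Address: 240.0.0.2 # resolved successfully
The DNS resolution path is: /etc/resolv.conf in the sleep container --> Kubernetes CoreDNS --> Istio CoreDNS --> istio-coredns-plugin

istio-coredns-plugin is Istio's CoreDNS gRPC plugin, which serves DNS records from Istio ServiceEntries. (Note: in Istio 1.8 this functionality is integrated into the sidecar, and the plugin will subsequently be deprecated.)

Operations on the global domain can be seen in the istio-coredns-plugin log: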
# crictl inspect e8d3f73c4d38d|grep "logPath"
    "logPath": "/var/log/pods/istio-system_istiocoredns-75dd7c7dc8-cg55l_8c960b74-419c-44e8-8992-58293e36d6fd/istio-coredns-plugin/0.log"
For example, the plugin reads the ServiceEntry for httpbin.bar.global created below and adds a DNS mapping for it:
... info    Reading service entries at 2020-10-09 17:53:38.710306063 +0000 UTC m=+19500.216843506
... info    Have 1 service entries
... info    adding DNS mapping: httpbin.bar.global.->[240.0.0.2]
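To see at a glance which hosts the plugin currently maps, the log can be filtered for these lines. The sketch below is a hypothetical helper (list_dns_mappings) that only understands the "adding DNS mapping" line format shown in the excerpt above; other plugin versions may log differently.

```shell
#!/bin/sh
# Sketch: list host -> virtual IP mappings reported by istio-coredns-plugin.
# list_dns_mappings is a hypothetical helper tied to the log format above.
set -eu

# Reads plugin log lines on stdin, prints "<host> <ip>" per mapping line.
list_dns_mappings() {
    sed -n 's/.*adding DNS mapping: \(.*\)->\[\([0-9. ]*\)].*/\1 \2/p'
}

# Sample input copied from the log excerpt above.
sample_log='... info    Have 1 service entries
... info    adding DNS mapping: httpbin.bar.global.->[240.0.0.2]'

printf '%s\n' "$sample_log" | list_dns_mappings
# prints: httpbin.bar.global. 240.0.0.2
```

On a node, the helper could be fed the file at the logPath reported by crictl inspect.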

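Configure application services

For a service in one cluster to be reachable from a remote cluster, a ServiceEntry must be configured in the remote cluster. The host in the service entry has the form <name>.<namespace>.global, where name and namespace are the name and namespace of the service.

To demonstrate cross-cluster access, the sleep service is deployed in one cluster and used to call the httpbin service in the other.

Choose two Istio clusters, referred to as cluster1 and cluster2.

List the cluster contexts with the following command: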

# kubectl config get-contexts
CURRENT   NAME            CLUSTER         AUTHINFO        NAMESPACE
*         kind-cluster1   kind-cluster1   kind-cluster1
          kind-cluster2   kind-cluster2   kind-cluster2

Store the context names in environment variables:

# export CTX_CLUSTER1=$(kubectl config view -o jsonpath='{.contexts[0].name}')
# export CTX_CLUSTER2=$(kubectl config view -o jsonpath='{.contexts[1].name}')
# echo "CTX_CLUSTER1 = ${CTX_CLUSTER1}, CTX_CLUSTER2 = ${CTX_CLUSTER2}"
CTX_CLUSTER1 = kind-cluster1, CTX_CLUSTER2 = kind-cluster2

Configure the example services

Deploy the sleep application in cluster1:

$ kubectl create --context=$CTX_CLUSTER1 namespace foo
$ kubectl label --context=$CTX_CLUSTER1 namespace foo istio-injection=enabled
$ kubectl apply --context=$CTX_CLUSTER1 -n foo -f samples/sleep/sleep.yaml
$ export SLEEP_POD=$(kubectl get --context=$CTX_CLUSTER1 -n foo pod -l app=sleep -o jsonpath={.items..metadata.name})

Deploy the httpbin application in cluster2:

$ kubectl create --context=$CTX_CLUSTER2 namespace bar
$ kubectl label --context=$CTX_CLUSTER2 namespace bar istio-injection=enabled
$ kubectl apply --context=$CTX_CLUSTER2 -n bar -f samples/httpbin/httpbin.yaml

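Expose the gateway address of cluster2

A locally deployed Kubernetes cluster has no LoadBalancer, so NodePort is used instead. (If Kubernetes is deployed with kind, the nodePort of the istio-ingressgateway service must be edited manually to match a port that kind exposes.)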

export INGRESS_PORT=$(kubectl -n istio-system --context=$CTX_CLUSTER2 get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}')

INGRESS_HOST is obtained as follows:

export INGRESS_HOST=$(kubectl --context=$CTX_CLUSTER2 get po -l istio=ingressgateway -n istio-system -o jsonpath='{.items[0].status.hostIP}')

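To allow sleep in cluster1 to access httpbin in cluster2, create a service entry for httpbin in cluster1. The host name in the service entry must have the form <name>.<namespace>.global, where name and namespace are the name and namespace of the remote service.

For DNS to resolve services under the .global domain, each of these services must be assigned a virtual IP address.

Every service under the .global DNS domain must have a virtual IP that is unique within the cluster.

If the global services already have real VIPs, those addresses can be used directly. Otherwise, addresses from the class E range 240.0.0.0/4 are recommended. When an application sends traffic to one of these IPs, the sidecar captures it and routes it to the appropriate remote service.

Multicast addresses (224.0.0.0 ~ 239.255.255.255) cannot be used, because by default there is no route to them. Loopback addresses (127.0.0.0/8) cannot be used either, because traffic sent to them is redirected to the sidecar's inbound listener.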
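The address rules above can be turned into a quick sanity check before a virtual IP is written into a ServiceEntry. This is a minimal sketch; classify_vip is a hypothetical helper, and it only looks at the first octet after basic validation. Uniqueness within the cluster still has to be checked separately.

```shell
#!/bin/sh
# Sketch: sanity-check a candidate virtual IP for a .global ServiceEntry.
# classify_vip is a hypothetical helper based on the address rules above.
set -eu

# Prints one of: class-e, multicast, loopback, other, invalid.
classify_vip() {
    ip=$1
    # Reject anything that is not made of digits and single dots.
    case $ip in
        *[!0-9.]*|*..*|.*|*.) echo invalid; return 0 ;;
    esac
    old_ifs=$IFS
    IFS=.
    set -- $ip
    IFS=$old_ifs
    if [ $# -ne 4 ]; then echo invalid; return 0; fi
    for octet in "$@"; do
        if [ "$octet" -gt 255 ]; then echo invalid; return 0; fi
    done
    first=$1
    if [ "$first" -ge 240 ]; then echo class-e      # 240.0.0.0/4, recommended
    elif [ "$first" -ge 224 ]; then echo multicast  # 224.0.0.0 - 239.255.255.255
    elif [ "$first" -eq 127 ]; then echo loopback   # 127.0.0.0/8
    else echo other                                 # e.g. a real VIP
    fi
}

for candidate in 240.0.0.2 224.0.0.1 127.0.0.1 10.96.0.10; do
    printf '%s %s\n' "$candidate" "$(classify_vip "$candidate")"
done
```

Only class-e (or other, when it is a real VIP of the service) is acceptable; multicast and loopback should be rejected.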

$ kubectl apply --context=$CTX_CLUSTER1 -n foo -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: httpbin-bar
spec:
  hosts: # host name of the remote service
  # must be of form name.namespace.global
  - httpbin.bar.global
  # Treat remote cluster services as part of the service mesh
  # as all clusters in the service mesh share the same root of trust.
  location: MESH_INTERNAL # treated as a mesh-internal service, so mTLS is used
  ports: # port of the remote httpbin service
  - name: http1
    number: 8000
    protocol: http
  resolution: DNS # resolve endpoints via DNS; an endpoint address may be a host name
  addresses: # virtual IP for httpbin.bar.global:8000, backed by the gateway endpoint listed under endpoints
  # the IP address to which httpbin.bar.global will resolve to
  # must be unique for each remote service, within a given cluster.
  # This address need not be routable. Traffic for this IP will be captured
  # by the sidecar and routed appropriately.
  - 240.0.0.2 # virtual address of the host; required, otherwise istio-coredns-plugin cannot resolve the name
  endpoints:
  # This is the routable address of the ingress gateway in cluster2 that
  # sits in front of the httpbin.bar service. Traffic from the sidecar will be
  # routed to this address.
  - address: ${INGRESS_HOST} # replace with the address of the corresponding node
    ports:
      http1: 30001 # replace with the corresponding nodePort
EOF

Because NodePort is used, the endpoint port must be the nodePort that maps to gateway port 15443. It can be obtained as follows:
# kubectl --context=$CTX_CLUSTER2 get svc -n istio-system istio-ingressgateway -o=jsonpath='{.spec.ports[?(@.port==15443)].nodePort}'
30001

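With the configuration above, traffic from cluster1 to any port of httpbin.bar.global is routed (over mutual TLS) to $INGRESS_HOST:15443, here via the nodePort mapped to that port.

Port 15443 on the gateway is an SNI-aware Envoy listener that is set up when the Istio control plane is installed. Traffic arriving on port 15443 is load balanced across the pods of the target service inside the destination cluster (here, httpbin.bar in cluster2).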
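Since every remote service needs a ServiceEntry of the same shape, the manifest can be rendered from a handful of parameters. The sketch below uses a hypothetical helper, render_service_entry; the field values mirror the httpbin example above, and the port name http1 and the http protocol are hard-coded assumptions.

```shell
#!/bin/sh
# Sketch: render the ServiceEntry for a remote service from a few parameters.
# render_service_entry is a hypothetical helper; values mirror the httpbin example.
set -eu

# $1 service name, $2 namespace, $3 service port, $4 virtual IP,
# $5 routable address of the remote ingress gateway,
# $6 port on that address that reaches gateway port 15443.
render_service_entry() {
    cat <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: $1-$2
spec:
  hosts:
  - $1.$2.global
  location: MESH_INTERNAL
  ports:
  - name: http1
    number: $3
    protocol: http
  resolution: DNS
  addresses:
  - $4
  endpoints:
  - address: $5
    ports:
      http1: $6
EOF
}

render_service_entry httpbin bar 8000 240.0.0.2 172.18.0.4 30001
```

The output could be piped to kubectl apply --context=$CTX_CLUSTER1 -n foo -f -, with the last two arguments taken from the INGRESS_HOST and nodePort lookups shown earlier.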

Below is the istio-proxy configuration dumped from the sleep pod in cluster1. The backend of httpbin.bar.global is 172.18.0.4:30001, that is, $INGRESS_HOST followed by the nodePort:
"cluster": {
"load_assignment": {
 "cluster_name": "outbound|8000||httpbin.bar.global",
 "endpoints": [
  {
   "locality": {},
   "lb_endpoints": [
    {
     "endpoint": {
      "address": {
       "socket_address": {
        "address": "172.18.0.4",
        "port_value": 30001
       }
      }
     },
     "load_balancing_weight": 1
    }
   ],
   "load_balancing_weight": 1
  }
 ]
},
	  ...
},
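The corresponding route is shown below. 240.0.0.2 appears only as one of the domains matched by the virtual host; matching requests are forwarded to the cluster "outbound|8000||httpbin.bar.global" shown above: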
"route_config": {
"@type": "type.googleapis.com/envoy.config.route.v3.RouteConfiguration",
"name": "8000",
"virtual_hosts": [
 ...
 {
  "name": "httpbin.bar.global:8000",
  "domains": [
   "httpbin.bar.global",
   "httpbin.bar.global:8000",
   "240.0.0.2",
   "240.0.0.2:8000"
  ],
  "routes": [
   {
    "match": {
     "prefix": "/"
    },
    "route": {
     "cluster": "outbound|8000||httpbin.bar.global",
     ...
},
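Note also that both cluster1 and cluster2 use a Gateway and a DestinationRule: requests from sleep to httpbin in the *.global domain are encrypted with mTLS, and the gateway runs in AUTO_PASSTHROUGH mode, which forwards requests directly to the backend application based on SNI, without a VirtualService binding.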
apiVersion: networking.istio.io/v1beta1
kind: Gateway
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - '*.global'
    port:
      name: tls
      number: 15443
      protocol: TLS
    tls:
      mode: AUTO_PASSTHROUGH

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
spec:
  host: '*.global'
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL

Verify that the httpbin service is reachable from the sleep service.

# kubectl exec --context=$CTX_CLUSTER1 $SLEEP_POD -n foo -c sleep -- curl -I httpbin.bar.global:8000/headers

HTTP/1.1 200 OK
server: envoy
date: Wed, 14 Oct 2020 22:44:00 GMT
content-type: application/json
content-length: 554
access-control-allow-origin: *
access-control-allow-credentials: true
x-envoy-upstream-service-time: 8

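According to the official documentation, the command above is all that is needed for the sleep Pod in cluster1 to reach the httpbin service in cluster2. However, the analysis above shows that when a request with SNI httpbin.bar.global arrives at the ingress pod in cluster2, the name is resolved according to the Kubernetes CoreDNS configuration, which forwards the query to Istio's CoreDNS. Because cluster2 has no ServiceEntry for httpbin.bar.global, Istio's CoreDNS cannot resolve the name either, and a 503 error is returned. The log of the istio-coredns-plugin container in cluster2 contains the following: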

... info    Query A record: httpbin.bar.global.->{httpbin.bar.global. 1 1}
... info    Could not find the service requested
... info    DNS query  ;; opcode: QUERY, status: NOERROR, id: 64168

Create the following ServiceEntry in cluster2:

$ kubectl apply --context=$CTX_CLUSTER2 -n bar -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: httpbin-bar
spec:
  hosts:
  - httpbin.bar.global
  location: MESH_INTERNAL
  ports:
  - name: http1
    number: 8000
    protocol: http
  resolution: DNS
  addresses:
  - 240.0.0.3
  endpoints:
  - address: httpbin.bar.svc.cluster.local # Kubernetes Service of httpbin
EOF

The cluster generated for httpbin is shown below. Its backend address is httpbin.bar.svc.cluster.local, which Kubernetes DNS resolves directly.

     "cluster": {
      "@type": "type.googleapis.com/envoy.config.cluster.v3.Cluster",
      "name": "outbound|8000||httpbin.bar.global",
      "type": "STRICT_DNS",
	  ...
      "load_assignment": {
       "cluster_name": "outbound|8000||httpbin.bar.global",
       "endpoints": [
        {
         "locality": {},
         "lb_endpoints": [
          {
           "endpoint": {
            "address": {
             "socket_address": {
              "address": "httpbin.bar.svc.cluster.local",
              "port_value": 8000
             }
            }
           },
           "load_balancing_weight": 1
          }
         ],
         "load_balancing_weight": 1
        }
       ]
      },
	  ...
     },

Create a sleep pod in cluster2 and, from that pod, access the httpbin service in the bar namespace of cluster2. The request succeeds:

# curl -I httpbin.bar.global:8000/headers
HTTP/1.1 200 OK
server: envoy
date: Sat, 10 Oct 2020 12:40:23 GMT
content-type: application/json
content-length: 554
access-control-allow-origin: *
access-control-allow-credentials: true
x-envoy-upstream-service-time: 206

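The listener configuration for port 15443, dumped from the ingress pod in cluster2, is shown below. In AUTO_PASSTHROUGH mode the listener does not use route_config_name to select a route to a cluster (no VirtualService is needed to map the service); requests are forwarded based on SNI alone.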

   "dynamic_listeners": [
    {
     "name": "0.0.0.0_15443",
     "active_state": {
      "version_info": "2020-10-14T19:52:24Z/14",
      "listener": {
       "@type": "type.googleapis.com/envoy.config.listener.v3.Listener",
       "name": "0.0.0.0_15443",
       "address": {
        "socket_address": {
         "address": "0.0.0.0",
         "port_value": 15443
        }
       },
       "filter_chains": [
        {
         "filter_chain_match": {
          "server_names": [
           "*.global"
          ]
         },
         "filters": [
          ...
          {
           "name": "istio.stats",
           ...
          },
          {
           "name": "envoy.filters.network.tcp_proxy",
           "typed_config": {
            "@type": "type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy",
            "stat_prefix": "BlackHoleCluster",
            "cluster": "BlackHoleCluster"
           }
          }
         ]
        }
       ],
       "listener_filters": [
        {
         ...
       ],
       "traffic_direction": "OUTBOUND"
      },
      "last_updated": "2020-10-14T19:53:22.089Z"
     }
    }
   ]
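In the listener above, the tcp_proxy names BlackHoleCluster, which presumably acts only as a placeholder while the real upstream cluster is derived from the server name of each connection. The sketch below illustrates such a mapping. It assumes that sidecars set the SNI of mTLS connections to outbound_.<port>_.<subset>_.<hostname>; this format does not appear in the dumps in this article and should be treated as an assumption. build_sni and sni_to_cluster are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: relate an assumed Istio SNI string to an Envoy cluster name.
# The SNI format "outbound_.<port>_.<subset>_.<hostname>" is an assumption;
# build_sni and sni_to_cluster are hypothetical helpers.
set -eu

# $1 port, $2 subset (may be empty), $3 hostname.
build_sni() {
    printf 'outbound_.%s_.%s_.%s\n' "$1" "$2" "$3"
}

# Replace the first three "_." separators with "|" to get the cluster name.
sni_to_cluster() {
    printf '%s\n' "$1" | sed -e 's/_\./|/' -e 's/_\./|/' -e 's/_\./|/'
}

sni=$(build_sni 8000 "" httpbin.bar.global)
printf '%s\n' "$sni"
sni_to_cluster "$sni"
```

Under that assumption the SNI ends in .global, so it matches the *.global entry in server_names, and the derived name equals the outbound|8000||httpbin.bar.global cluster seen in the earlier dumps.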

Uninstall

$ kubectl delete --context=$CTX_CLUSTER1 -n foo -f samples/sleep/sleep.yaml
$ kubectl delete --context=$CTX_CLUSTER1 -n foo serviceentry httpbin-bar
$ kubectl delete --context=$CTX_CLUSTER1 ns foo

$ kubectl delete --context=$CTX_CLUSTER2 -n bar -f samples/httpbin/httpbin.yaml
$ kubectl delete --context=$CTX_CLUSTER2 ns bar

$ unset SLEEP_POD INGRESS_HOST INGRESS_PORT CTX_CLUSTER1 CTX_CLUSTER2

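FAQ

For cluster1 and cluster2 to communicate, their root certificates must be identical. To check, compare the ROOTCA entry (shown in the excerpt below) in the Istio sidecar configuration dumped from sleep in cluster1 and from httpbin in cluster2. Possible causes of a mismatch:
- Istio was installed before the cacerts root certificate secret was created.
- When Istio was reinstalled, stale certificates from the previous, incorrect installation were left in the istio-system namespace.
- After Istio was reinstalled, the pods and secrets in the foo or bar namespaces were not cleaned up.
Therefore, delete all resources in the istio-system and foo/bar namespaces before reinstalling Istio.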
```
   "dynamic_active_secrets": [
    {
     "name": "default",
     "secret": {
      "@type": "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret",
      "name": "default",
      ...
     }
    },
    {
     "name": "ROOTCA",
     "secret": {
      "@type": "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret",
      "name": "ROOTCA",
      "validation_context": {
       "trusted_ca": {
        "inline_bytes": "LS0tLxxx="
       }
      }
     }
    }
   ]
```
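The comparison can be scripted once the two configuration dumps have been saved to files. This is a minimal sketch with hypothetical helpers (extract_rootca, same_rootca); it assumes pretty-printed JSON with one key per line, as in the excerpt above, and uses tiny stand-in files instead of real dumps.

```shell
#!/bin/sh
# Sketch: compare the ROOTCA secret found in two Envoy config dumps.
# extract_rootca and same_rootca are hypothetical helpers; they assume
# pretty-printed JSON with one key per line.
set -eu

# Prints the base64 "inline_bytes" of the secret named ROOTCA (stdin = dump).
extract_rootca() {
    awk '
        /"name": *"ROOTCA"/ { in_rootca = 1 }
        in_rootca && /"inline_bytes"/ {
            split($0, parts, "\"")
            print parts[4]
            exit
        }
    '
}

# Succeeds only when both dumps contain the same, non-empty ROOTCA.
same_rootca() {
    a=$(extract_rootca < "$1")
    b=$(extract_rootca < "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Stand-in dumps; real ones would come from the sleep and httpbin sidecars.
DUMP_DIR=$(mktemp -d)
cat > "$DUMP_DIR/sleep.json" <<'EOF'
{
 "name": "ROOTCA",
 "validation_context": {
  "trusted_ca": {
   "inline_bytes": "LS0tLxxx="
  }
 }
}
EOF
cp "$DUMP_DIR/sleep.json" "$DUMP_DIR/httpbin.json"

if same_rootca "$DUMP_DIR/sleep.json" "$DUMP_DIR/httpbin.json"; then
    echo "root CA matches"
else
    echo "root CA differs"
fi
```

A real dump could be taken from the Envoy admin endpoint /config_dump of each sidecar, which listens on port 15000 by default.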
Reference:
Using CoreDNS to Conceal Network Identities of Services in Istio
