Monitoring Calico with Prometheus-Operator

阿新 • Published: 2020-06-29

Original link: https://fuckcloudnative.io/posts/monitoring-calico-with-prometheus-operator/

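Felix is the core component of Calico. It programs the routing table, ACL rules, and anything else the host needs so that the endpoints on that host have the network connectivity they require. It also reports data about network health (for example, errors and problems encountered while configuring its host). This data is written to etcd so that other components and the operators of the network can see it.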

It follows that monitoring Calico essentially means monitoring Felix, which acts as Calico's brain. This article shows how to monitor Calico with Prometheus-Operator.

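This article does not cover how to deploy Calico or Prometheus-Operator. If you are unsure how to deploy them, consult the official documentation and related blog posts.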

1. Configure Calico to enable metrics

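Felix metrics are disabled by default. To enable them, you have to change the Felix configuration manually with the calicoctl command-line tool, so calicoctl must be set up first.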

This article uses Calico v3.15.0; other versions are similar. First, download calicoctl:

$ wget https://github.com/projectcalico/calicoctl/releases/download/v3.15.0/calicoctl -O /usr/local/bin/calicoctl
$ chmod +x /usr/local/bin/calicoctl

Next, create the calicoctl configuration file (/etc/calico/calicoctl.cfg by default). If Calico uses the Kubernetes API as its datastore, the file looks like this:

apiVersion: projectcalico.org/v3
kind: CalicoAPIConfig
metadata:
spec:
  datastoreType: "kubernetes"
  kubeconfig: "/root/.kube/config"

If Calico uses etcd as its datastore, the file looks like this:

apiVersion: projectcalico.org/v3
kind: CalicoAPIConfig
metadata:
spec:
  datastoreType: "etcdv3"
  etcdEndpoints: https://192.168.57.51:2379,https://192.168.57.52:2379,https://192.168.57.53:2379
  etcdKeyFile: /opt/kubernetes/ssl/server-key.pem
  etcdCertFile: /opt/kubernetes/ssl/server.pem
  etcdCACertFile: /opt/kubernetes/ssl/ca.pem

Replace the etcd endpoints and certificate paths with the values for your own etcd cluster.
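As a rough sketch of the step above, the Kubernetes-datastore variant of the file can be generated from a small shell function. The helper name write_calicoctl_cfg and its two arguments are made up for illustration; calicoctl itself only reads the resulting file.

```shell
# write_calicoctl_cfg is a hypothetical helper (not part of calicoctl).
# $1: path of the config file to create
# $2: path of the kubeconfig that calicoctl should use
write_calicoctl_cfg() {
  local cfg_path="$1" kubeconfig="$2"
  # Create the parent directory (for example /etc/calico) if it is missing.
  mkdir -p "$(dirname "$cfg_path")"
  cat > "$cfg_path" <<EOF
apiVersion: projectcalico.org/v3
kind: CalicoAPIConfig
metadata:
spec:
  datastoreType: "kubernetes"
  kubeconfig: "${kubeconfig}"
EOF
}

# Example (needs write access to /etc/calico and a reachable cluster):
#   write_calicoctl_cfg /etc/calico/calicoctl.cfg /root/.kube/config
#   calicoctl get nodes   # lists the Calico nodes if the config is usable
```

Running a read-only command such as calicoctl get nodes right after writing the file is a cheap way to catch a wrong kubeconfig path before changing any Felix settings.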

With calicoctl configured, you can view and modify the Calico configuration. Start by looking at the default Felix configuration:

$ calicoctl get felixConfiguration default -o yaml
apiVersion: projectcalico.org/v3
kind: FelixConfiguration
metadata:
  creationTimestamp: "2020-06-25T14:37:28Z"
  name: default
  resourceVersion: "269031"
  uid: 52146c95-ff97-40a9-9ba7-7c3b4dd3ba57
spec:
  bpfLogLevel: ""
  ipipEnabled: true
  logSeverityScreen: Info
  reportingInterval: 0s

The default configuration does not enable metrics, so enable them manually with the following command:

$ calicoctl patch felixConfiguration default  --patch '{"spec":{"prometheusMetricsEnabled": true}}'
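To confirm that the patch took effect, one option is to read the configuration back and look for the new field. The sketch below assumes the YAML layout shown earlier; metrics_enabled is a hypothetical helper name, not a calicoctl feature.

```shell
# metrics_enabled is a hypothetical helper: it reads FelixConfiguration YAML
# on stdin and exits 0 only if prometheusMetricsEnabled is set to true.
metrics_enabled() {
  grep -Eq '^[[:space:]]*prometheusMetricsEnabled:[[:space:]]*true[[:space:]]*$'
}

# Example against a live cluster:
#   if calicoctl get felixConfiguration default -o yaml | metrics_enabled; then
#     echo "Felix metrics are enabled"
#   else
#     echo "Felix metrics are still disabled"
#   fi
```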

Felix exposes metrics on port 9091. To verify that metrics are enabled, check that the port is listening and fetch the metrics:

$ ss -tulnp|grep 9091
tcp    LISTEN     0      4096   [::]:9091               [::]:*                   users:(("calico-node",pid=13761,fd=9))

$ curl -s http://localhost:9091/metrics
# HELP felix_active_local_endpoints Number of active endpoints on this host.
# TYPE felix_active_local_endpoints gauge
felix_active_local_endpoints 1
# HELP felix_active_local_policies Number of active policies on this host.
# TYPE felix_active_local_policies gauge
felix_active_local_policies 0
# HELP felix_active_local_selectors Number of active selectors on this host.
# TYPE felix_active_local_selectors gauge
felix_active_local_selectors 0
...
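For a quick check of a single value without scrolling through the full output, the text format can be filtered with awk. This is a minimal sketch that only handles metrics without labels, such as the three shown above; metric_value is a hypothetical helper name.

```shell
# metric_value is a hypothetical helper: print the value of an unlabelled
# metric from Prometheus text-format input on stdin.
# $1: exact metric name, for example felix_active_local_endpoints
metric_value() {
  # HELP and TYPE lines start with "#", so their first field never matches.
  awk -v name="$1" '$1 == name { print $2 }'
}

# Example on a node running calico-node:
#   curl -s http://localhost:9091/metrics | metric_value felix_active_local_endpoints
```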

2. Scrape Felix metrics with Prometheus

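Once Felix metrics are enabled, Prometheus-Operator can collect them. When deployed, Prometheus-Operator creates five CRDs: Prometheus, PodMonitor, ServiceMonitor, Alertmanager, and PrometheusRule. It then continuously watches resources of these kinds and reconciles their state. The Prometheus resource is an abstraction of a Prometheus server. PodMonitor and ServiceMonitor are abstractions of the scrape targets, such as exporters, that expose a metrics endpoint: they declare which Pods or Services to scrape, and Prometheus pulls data from the endpoints they describe.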

ServiceMonitor requires the monitored workload to have a corresponding Service, whereas PodMonitor does not. This article uses a PodMonitor to scrape Felix metrics.

Although PodMonitor does not need a Service, the Pod must declare the metrics port and give it a name. So first modify the calico-node DaemonSet to declare the port and its name. Open the DaemonSet for editing with:

$ kubectl -n kube-system edit ds calico-node

Then edit it in place, adding the following to the container entry under spec.template.spec.containers:

        ports:
        - containerPort: 9091
          name: http-metrics
          protocol: TCP
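If you would rather not edit the DaemonSet interactively, the same change can be expressed as a strategic-merge patch for kubectl patch. The sketch below only builds the patch body; metrics_port_patch is a hypothetical helper, and the container name calico-node is an assumption based on the upstream Calico manifests, so check the name in your own DaemonSet first.

```shell
# metrics_port_patch is a hypothetical helper: print a strategic-merge patch
# that adds a named metrics port to one container of a DaemonSet.
# $1: container name (assumed to be calico-node, verify in your cluster)
# $2: container port (9091, the Felix metrics port used in this article)
metrics_port_patch() {
  local container="${1:-calico-node}" port="${2:-9091}"
  printf '{"spec":{"template":{"spec":{"containers":[{"name":"%s","ports":[{"containerPort":%s,"name":"http-metrics","protocol":"TCP"}]}]}}}}' \
    "$container" "$port"
}

# Example against a live cluster:
#   kubectl -n kube-system patch ds calico-node -p "$(metrics_port_patch)"
#   kubectl -n kube-system rollout status ds calico-node
```

Containers are merged by name in a strategic-merge patch, so the other fields of the existing container are left as they are; the DaemonSet then rolls out new Pods with the named port.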

Create the PodMonitor for these Pods:

# prometheus-podMonitorCalico.yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  labels:
    k8s-app: calico-node
  name: felix
  namespace: monitoring
spec:
  podMetricsEndpoints:
  - interval: 15s
    path: /metrics
    port: http-metrics
  namespaceSelector:
    matchNames:
    - kube-system
  selector:
    matchLabels:
      k8s-app: calico-node

$ kubectl apply -f prometheus-podMonitorCalico.yaml

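A few parameters deserve attention:

The PodMonitor's name ends up in the generated Prometheus configuration as part of the job_name.
podMetricsEndpoints.port must match ports.name in the monitored Pod, here http-metrics.
namespaceSelector.matchNames must match the namespace of the monitored Pod, here kube-system.
selector.matchLabels must match labels on the monitored Pod that uniquely identify it.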

Prometheus-Operator then updates the Prometheus configuration according to the PodMonitor so that the matching Pods are scraped. You can open the Prometheus UI to check the scrape targets.

Note that the Labels include pod="calico-node-xxx", which shows that the target being monitored is a Pod.
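The same check can be made from the command line through the Prometheus HTTP API. This is a sketch under assumptions: prom_query is a hypothetical helper, and the Service name prometheus-k8s in the example is the one kube-prometheus normally creates, which may differ in your deployment.

```shell
# prom_query is a hypothetical helper: run an instant PromQL query against
# the Prometheus HTTP API and print the raw JSON response.
# $1: base URL of Prometheus, for example http://localhost:9090
# $2: PromQL expression
prom_query() {
  local base_url="$1" query="$2"
  # -G turns the url-encoded data into a GET query string.
  curl -sG --data-urlencode "query=${query}" "${base_url}/api/v1/query"
}

# Example (assumes the Service is called prometheus-k8s):
#   kubectl -n monitoring port-forward svc/prometheus-k8s 9090 &
#   prom_query http://localhost:9090 'up{pod=~"calico-node-.*"}'
# A healthy target shows up in the result with the value "1".
```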

3. Visualize the metrics

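With the metrics collected, you can display them on a Grafana dashboard. The Grafana deployed by Prometheus-Operator does not allow dashboards to be changed on the fly (the dashboard JSON files have to be mounted into the Grafana Pod in advance), and it is not the latest version (7.0 or later) either. I therefore chose to delete the Grafana bundled with Prometheus-Operator and deploy the Grafana chart from the helm repository myself. Go to the manifests directory of the kube-prometheus project, move all Grafana-related manifests into a single directory, and then delete Grafana: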

$ cd kube-prometheus/manifests
$ mkdir grafana
$ mv grafana-* grafana/
$ kubectl delete -f grafana/

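Then deploy the latest Grafana with helm (the stable chart repository used here has since been deprecated, so on a current setup the chart may have to come from Grafana's own chart repository instead):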

$ helm install grafana stable/grafana -n monitoring

The Grafana login password is stored in a Secret, which you can view with:

$ kubectl -n monitoring get secret grafana -o yaml
apiVersion: v1
data:
  admin-password: MnpoV3VaMGd1b3R3TDY5d3JwOXlIak4yZ3B2cTU1RFNKcVY0RWZsUw==
  admin-user: YWRtaW4=
  ldap-toml: ""
kind: Secret
metadata:
...

Decode the password (the value is base64-encoded, not encrypted):

$ echo -n "MnpoV3VaMGd1b3R3TDY5d3JwOXlIak4yZ3B2cTU1RFNKcVY0RWZsUw=="|base64 -d
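The copy-and-paste step can be avoided by asking kubectl for just the one field and piping it to base64. The sketch below wraps the decoding in a hypothetical helper, decode_b64, and assumes a base64 implementation that accepts -d, as GNU coreutils does.

```shell
# decode_b64 is a hypothetical helper: decode a single base64 string.
decode_b64() {
  printf '%s' "$1" | base64 -d
}

# Example with the admin-user value from the Secret above:
#   decode_b64 "YWRtaW4="
#
# Against a live cluster, fetch and decode the password in one step:
#   kubectl -n monitoring get secret grafana \
#     -o jsonpath='{.data.admin-password}' | base64 -d
```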

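The decoded output is the login password, and the username is admin. Log in to the Grafana UI with this username and password.

Then add the Prometheus instance managed by Prometheus-Operator as a data source.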

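Calico does not publish the dashboard as a standalone JSON file; it is embedded in a ConfigMap, so you have to extract the JSON from there. Extract the contents of felix-dashboard.json, then replace the datasource value in it with prometheus. You can do the replacement with sed or with an editor; most editors have a global search-and-replace feature. If you really cannot work out how to extract it, you can use the JSON I have already extracted (linked from the original post).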
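A minimal sketch of the sed variant follows. It assumes the dashboard stores each data source as a plain string value and that your Grafana data source is named prometheus; set_datasource is a hypothetical helper name.

```shell
# set_datasource is a hypothetical helper: print the dashboard JSON in file $1
# with every "datasource": "<any string>" pair rewritten to
# "datasource": "prometheus". Non-string values (such as null) are left alone.
set_datasource() {
  sed -E 's/("datasource"[[:space:]]*:[[:space:]]*)"[^"]*"/\1"prometheus"/g' "$1"
}

# Example:
#   set_datasource felix-dashboard.json > felix-dashboard-prometheus.json
```

Writing the result to a new file keeps the extracted original around in case the data source in Grafana is later renamed.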

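After making the change, import the JSON into Grafana.

[Figure: the resulting Felix dashboard]

If you like the Grafana theme colors in my screenshots, see this article: Grafana 自定義主題 (custom Grafana themes).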

