
A Kubernetes Platform Monitoring Solution: Exporters + Prometheus + Grafana

1. Overview

1.1 Overall Goals

Judging from the business requirements of the monitoring platform itself, it should collect at least the following monitoring data:

  • Performance metrics (e.g. CPU, memory, load, disk, network)

    • Performance metrics for containers and Pods
    • Performance metrics for host nodes
    • Metrics that processes inside containers expose themselves
    • Network performance of applications on k8s, e.g. HTTP and TCP data
  • State metrics

    • Status metrics of k8s resource objects (Deployment, DaemonSet, Pod, etc.)
    • Status metrics of k8s platform components (such as kube-apiserver and kube-scheduler)

After the monitoring data has been collected, it also needs to be visualized, and alerts must be raised for any anomalies found in it.

This article puts a complete monitoring solution into practice. Due to space constraints it does not cover the fundamentals of Prometheus and the other components; for the basics of Prometheus, refer to this book:

prometheus-book

1.2 Mainstream Monitoring Solutions

The mainstream monitoring solutions for Kubernetes currently include the following:

  • Heapster+InfluxDB+Grafana
    The kubelet on every K8s node embeds cAdvisor and exposes its API; Heapster collects container monitoring data by querying these endpoints. It supports several storage backends, InfluxDB being the most common. The drawbacks of this approach are the single data source, the lack of alerting, and InfluxDB being a single point of failure; moreover, Heapster has been deprecated in newer versions (replaced by metrics-server). A detailed introduction to this approach can be found in this article.
  • Metrics-Server+InfluxDB+Grafana

    Since k8s 1.8, metrics for resources such as CPU and memory can be obtained through the Metrics API, and users can also query them directly with kubectl top (see the sketch after this list). The Metrics API requires Metrics-Server to be deployed.
  • Various Exporters+Prometheus+Grafana
    Different exporters collect metrics across different dimensions and expose them in the data format Prometheus supports; Prometheus pulls the data periodically, Grafana visualizes it, and AlertManager raises alerts on anomalies. This is the solution described in detail below.
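As a quick aside on the kubectl top path mentioned above, the commands below are a minimal sketch (assuming Metrics-Server is already deployed in the cluster):

# node-level CPU/memory usage served through the Metrics API
kubectl top nodes
# pod-level usage in a given namespace
kubectl top pods -n kube-system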

2. Architecture

2.1 Implementation Approach

The overall approach is as follows:

  • Collection
    • cAdvisor collects container and Pod performance metrics, which Prometheus scrapes from the exposed /metrics endpoint
    • prometheus-node-exporter collects host performance metrics, which Prometheus scrapes from the exposed /metrics endpoint
    • Applications collect and expose metrics for the processes inside their containers themselves (the application implements the metrics endpoint and adds the annotations agreed with the platform; the platform configures Prometheus scraping based on those annotations)
    • blackbox-exporter collects application network performance data (HTTP, TCP, ICMP, etc.), which Prometheus scrapes from the exposed /metrics endpoint
    • kube-state-metrics collects state metrics of k8s resource objects, which Prometheus scrapes from the exposed /metrics endpoint
    • The /metrics endpoints exposed by etcd, kubelet, kube-apiserver, kube-controller-manager, and kube-scheduler themselves provide cluster-related metrics from the nodes.
  • Storage (aggregation): Prometheus pulls and aggregates the monitoring data from the various exporters
  • Visualization: Grafana displays the monitoring information
  • Alerting: Alertmanager handles alerts

2.2 Overall Architecture

The overall architecture is shown in the figure below:

3. Implementing Metric Collection

3.1 Container and Pod Performance Metrics: cAdvisor

cAdvisor is an open-source container monitoring tool from Google. It collects performance metrics for the containers on a host, and Pod-level metrics can be further derived from the container metrics.

The main metrics provided by cAdvisor include:

container_cpu_*	
container_fs_*	
container_memory_*	
container_network_*	
container_spec_*(cpu/memory)		
container_start_time_*	
container_tasks_state_*

As you can see, these are essentially container-level resource usage metrics.

3.1.1 cAdvisor Endpoints

cAdvisor is now built into the kubelet, so on every node in the cluster that runs a kubelet you can use the cAdvisor metrics endpoint to obtain performance metrics for all containers on that node. Before version 1.7.3, the cAdvisor metrics were served as part of the kubelet's own metrics; since 1.7.3 they have been split out of the kubelet metrics, so Prometheus scrapes them as two separate jobs.

cAdvisor serves on port 4194 by default and provides two kinds of endpoints:

  • Prometheus-format metrics endpoint: nodeIP:4194/metrics (or the cAdvisor endpoint exposed by the kubelet, nodeIP:10255/metrics/cadvisor);
  • Web UI: nodeIP:4194/containers/

Prometheus is a service for collecting, processing, and storing time-series data, so anything it monitors must expose its monitoring data over an HTTP API in the data model Prometheus understands. The metrics exposed by the cAdvisor endpoint (nodeIP:4194/metrics) look like this:

# HELP cadvisor_version_info A metric with a constant '1' value labeled by kernel version, OS version, docker version, cadvisor version & cadvisor revision.
# TYPE cadvisor_version_info gauge
cadvisor_version_info{cadvisorRevision="",cadvisorVersion="",dockerVersion="1.12.6",kernelVersion="4.9.0-1.2.el7.bclinux.x86_64",osVersion="CentOS Linux 7 (Core)"} 1
# HELP container_cpu_cfs_periods_total Number of elapsed enforcement period intervals.
# TYPE container_cpu_cfs_periods_total counter
container_cpu_cfs_periods_total{container_name="",id="/kubepods/burstable/pod1b0c1f83322defae700f33b1b8b7f572",image="",name="",namespace="",pod_name=""} 7.062239e+06
container_cpu_cfs_periods_total{container_name="",id="/kubepods/burstable/pod7f86ba308f28df9915b802bc48cfee3a",image="",name="",namespace="",pod_name=""} 1.574206e+06
container_cpu_cfs_periods_total{container_name="",id="/kubepods/burstable/podb0c8f695146fe62856bc23709a3e056b",image="",name="",namespace="",pod_name=""} 7.107043e+06
container_cpu_cfs_periods_total{container_name="",id="/kubepods/burstable/podc8cf73836b3caba7bf952ce1ac5a5934",image="",name="",namespace="",pod_name=""} 5.932159e+06
container_cpu_cfs_periods_total{container_name="",id="/kubepods/burstable/podfaa9db59-64b7-11e8-8792-00505694eb6a",image="",name="",namespace="",pod_name=""} 6.979547e+06
container_cpu_cfs_periods_total{container_name="calico-node",id="/kubepods/burstable/podfaa9db59-64b7-11e8-8792-
...

As shown above, the endpoint outputs its data in the Prometheus exposition format.

3.1.2 Prometheus Configuration

Configure Prometheus as follows to periodically pull the cAdvisor metrics:

      - job_name: 'cadvisor'
        # access the apiserver over https and fetch the data through the apiserver API
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

        # discover targets by k8s role, e.g. node, service, pod, endpoints, ingress, etc.
        kubernetes_sd_configs:
        # discover targets from k8s node objects
        - role: node

        relabel_configs:
        # replace the original label-name prefix with a new one; with no replacement this simply strips the prefix
        # e.g. the following two lines turn __meta_kubernetes_node_label_kubernetes_io_hostname
        # into kubernetes_io_hostname
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)

        # the value of replacement overwrites the label named in target_label,
        # i.e. __address__ is replaced with kubernetes.default.svc:443
        - target_label: __address__
          replacement: kubernetes.default.svc:443

        # take the value of __meta_kubernetes_node_name
        - source_labels: [__meta_kubernetes_node_name]
          # match one or more characters and capture the source_labels value as ${1}
          regex: (.+)
          # the value of replacement overwrites the label named in target_label,
          # i.e. __metrics_path__ becomes /api/v1/nodes/${1}/proxy/metrics/cadvisor,
          # where ${1} is the value of __meta_kubernetes_node_name
          target_label: __metrics_path__
          replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
          
        metric_relabel_configs:
          - action: replace
            source_labels: [id]
            regex: '^/machine\.slice/machine-rkt\\x2d([^\\]+)\\.+/([^/]+)\.service$'
            target_label: rkt_container_name
            replacement: '${2}-${1}'
          - action: replace
            source_labels: [id]
            regex: '^/system\.slice/(.+)\.service$'
            target_label: systemd_service_name
            replacement: '${1}'

Afterwards, the corresponding cadvisor targets show up on Prometheus's targets page (IP:Port/targets):

Notes

A few remarks on the configuration file:

  1. The configuration above follows the official example: the cAdvisor metrics are fetched through the API provided by the apiserver acting as a proxy ( https://kubernetes.default.svc:443/api/v1/nodes/k8smaster01/proxy/metrics/cadvisor ), which returns the same content as nodeIP:4194/metrics, rather than scraping the nodes directly (see the verification sketch after this list). The official rationale: This means it will work if Prometheus is running out of cluster, or can't connect to nodes for some other reason (e.g. because of firewalling).
  2. The syntax of Prometheus configuration files is fairly involved; I added comments to make them easier to follow. For the full syntax, see the official Prometheus documentation.
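That apiserver-proxy path can also be exercised by hand; a minimal sketch (the node name k8smaster01 is just the example used above):

# fetch cAdvisor metrics for one node through the apiserver proxy,
# which is exactly the path Prometheus is configured to scrape
kubectl get --raw /api/v1/nodes/k8smaster01/proxy/metrics/cadvisor | head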

A few remarks on the labels of a target. Every target carries the following source labels:

  1. __address__ (set manually via targets under static_configs, or obtained from the apiserver when using kubernetes_sd_configs)
  2. __metrics_path__ (defaults to /metrics)
  3. __scheme__ (defaults to http)
  4. job

The other source labels are extracted from the labels, annotations, and other attributes of the k8s resource objects, depending on the role set in kubernetes_sd_configs (endpoints, node, service, pod, etc.).

3.2 Host Node Performance Metrics: node-exporter

The NodeExporter project from the Prometheus community monitors the key metrics of a host. Deployed as a Kubernetes DaemonSet, exactly one NodeExporter instance runs on each node, providing host performance metrics. The main metrics collected by node-exporter are:

 node_cpu_*		
 node_disk_*	
 node_entropy_*		
 node_filefd_*
 node_filesystem_*	
 node_forks_*	
 node_intr_total_*
 node_ipvs_*	
 node_load_*	
 node_memory_*		
 node_netstat_*		
 node_network_*		
 node_nf_conntrack_*		
 node_scrape_*		
 node_sockstat_*		
 node_time_seconds_*
 node_timex_*
 node_xfs_*

As you can see, these are all node-level resource usage metrics.

3.2.1 Deploying prometheus-node-exporter

My NodeExporter deployment file node-exporter-daemonset.yaml is shown below and can be downloaded from GitHub.

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: prometheus-node-exporter
  namespace: kube-system
  labels:
    app: prometheus-node-exporter
spec:
  template:
    metadata:
      name: prometheus-node-exporter
      labels:
        app: prometheus-node-exporter
    spec:
      containers:
      - image: prom/node-exporter:v0.16.0
        imagePullPolicy: IfNotPresent
        name: prometheus-node-exporter
        ports:
        - name: prom-node-exp
          #^ must be an IANA_SVC_NAME (at most 15 characters, ..)
          containerPort: 9100
          hostPort: 9100
      tolerations:
      - key: "node-role.kubernetes.io/master"
        effect: "NoSchedule"
      hostNetwork: true
      hostPID: true
      hostIPC: true
      restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/app-metrics: 'true'
    prometheus.io/app-metrics-path: '/metrics'
  name: prometheus-node-exporter
  namespace: kube-system
  labels:
    app: prometheus-node-exporter
spec:
  clusterIP: None
  ports:
    - name: prometheus-node-exporter
      port: 9100
      protocol: TCP
  selector:
    app: prometheus-node-exporter
  type: ClusterIP

Notes:

1. So that the node-exporter container can read the host's network, PID, and IPC metrics, hostNetwork: true, hostPID: true, and hostIPC: true are set to share these three namespaces with the host.
2. The annotation prometheus.io/scrape: 'true' on the Service marks it as a Service that Prometheus should discover and scrape.
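Once applied, the DaemonSet can be sanity-checked with a few commands (a minimal sketch; <nodeIP> is any node in the cluster):

kubectl apply -f node-exporter-daemonset.yaml
# one pod per node is expected
kubectl -n kube-system get daemonset prometheus-node-exporter
# node-exporter listens on hostPort 9100
curl -s http://<nodeIP>:9100/metrics | head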

Looking at the data collected via the metrics endpoint exposed by NodeExporter (nodeIP:9100/metrics), you can see it is output in the Prometheus format:

# HELP node_arp_entries ARP entries by device
# TYPE node_arp_entries gauge
node_arp_entries{device="calid63983a5754"} 1
node_arp_entries{device="calid67ce395c9e"} 1
node_arp_entries{device="calid857f2bf9d5"} 1
node_arp_entries{device="calief3a4b64165"} 1
node_arp_entries{device="eno16777984"} 9
# HELP node_boot_time Node boot time, in unixtime.
# TYPE node_boot_time gauge
node_boot_time 1.527752719e+09
# HELP node_context_switches Total number of context switches.
# TYPE node_context_switches counter
node_context_switches 3.1425612674e+10
# HELP node_cpu Seconds the cpus spent in each mode.
# TYPE node_cpu counter
node_cpu{cpu="cpu0",mode="guest"} 0
node_cpu{cpu="cpu0",mode="guest_nice"} 0
node_cpu{cpu="cpu0",mode="idle"} 2.38051096e+06
node_cpu{cpu="cpu0",mode="iowait"} 11904.19
node_cpu{cpu="cpu0",mode="irq"} 0
node_cpu{cpu="cpu0",mode="nice"} 2990.94
node_cpu{cpu="cpu0",mode="softirq"} 8038.3
...

3.2.2 Prometheus Configuration

Configure Prometheus to scrape node-exporter's metrics:

      - job_name: 'prometheus-node-exporter'
        
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
        #The endpoints role discovers targets from listed endpoints of a service. For each
        #endpoint address one target is discovered per port. If the endpoint is backed by
        #a pod, all additional container ports of the pod, not bound to an endpoint port,
        #are discovered as targets as well
        - role: endpoints
        relabel_configs:
        # keep only endpoints whose Service annotations contain prometheus.io/scrape: 'true' and whose port name is prometheus-node-exporter
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_endpoint_port_name]
          regex: true;prometheus-node-exporter
          action: keep
        # Match regex against the concatenated source_labels. Then, set target_label to replacement, 
        # with match group references (${1}, ${2}, ...) in replacement substituted by their value. 
        # If regex does not match, no replacement takes place.
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
          action: replace
          target_label: __scheme__
          regex: (https?)
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
          action: replace
          target_label: __address__
          regex: (.+)(?::\d+);(\d+)
          replacement: $1:$2
        # strip the prefix __meta_kubernetes_service_label_ from the label names
        - action: labelmap
          regex: __meta_kubernetes_service_label_(.+)
        # rename __meta_kubernetes_namespace to kubernetes_namespace
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        # rename __meta_kubernetes_service_name to kubernetes_name
        - source_labels: [__meta_kubernetes_service_name]
          action: replace
          target_label: kubernetes_name

The corresponding targets can be seen in Prometheus:

3.3 Collecting Metrics Exposed by Processes Inside Application Instances

Some applications need to expose performance metrics for specific processes inside their containers. Those metrics are collected and exposed by the application side; the platform side aggregates them.

3.3.1 Identifying Applications That Expose Their Own Metrics, and Scraping Them

The platform side can agree on a set of annotation prefixes that mark a Service as one that exposes its own metrics. The application adds these agreed annotations, and the platform configures Prometheus scraping based on them.

For example, the application side adds the following platform-agreed annotations to its Service (one way to apply them is shown in the sketch after the list below):

prometheus.io/scrape: 'true'
prometheus.io/app-metrics: 'true'
prometheus.io/app-metrics-port: '8080'
prometheus.io/app-metrics-path: '/metrics'

Prometheus can then:

  • learn from prometheus.io/scrape: 'true' that the corresponding endpoint should be scraped
  • learn from prometheus.io/app-metrics: 'true' that the endpoint serves metrics exposed by an application process
  • learn from prometheus.io/app-metrics-port: '8080' the port on which the process exposes its metrics
  • learn from prometheus.io/app-metrics-path: '/metrics' the path at which the process exposes its metrics
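One way to attach these annotations to an existing Service is kubectl annotate; a minimal sketch, where my-app and my-namespace are hypothetical names:

kubectl -n my-namespace annotate service my-app \
  prometheus.io/scrape='true' \
  prometheus.io/app-metrics='true' \
  prometheus.io/app-metrics-port='8080' \
  prometheus.io/app-metrics-path='/metrics'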

3.3.2 Attaching Application Metadata and Carrying It Through to Prometheus

Depending on platform and business needs, you may also add annotations prefixed with prometheus.io/app-info-. Prometheus strips the prefix and keeps the remainder as the label key, together with the value, which lets the platform attach additional identifying information to an application. For example, the following annotations record the application's environment, tenant, and name:

prometheus.io/app-info-env: 'test'
prometheus.io/app-info-tenant: 'test-tenant'
prometheus.io/app-info-name: 'test-app'

The Prometheus configuration is as follows:

      - job_name: 'kubernetes-app-metrics'
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
        #The endpoints role discovers targets from listed endpoints of a service. For each
        #endpoint address one target is discovered per port. If the endpoint is backed by
        #a pod, all additional container ports of the pod, not bound to an endpoint port,
        #are discovered as targets as well
        - role: endpoints
        relabel_configs:
        # keep only endpoints whose annotations contain prometheus.io/scrape: 'true'
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_service_annotation_prometheus_io_app_metrics]
          regex: true;true
          action: keep
        # replace the default metrics_path with the metrics path the user specified for the process
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_app_metrics_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        # replace the original __address__ with the pod IP plus the user-specified metrics port, i.e. the real address the data can be fetched from
        - source_labels: [__meta_kubernetes_pod_ip, __meta_kubernetes_service_annotation_prometheus_io_app_metrics_port]
          action: replace
          target_label: __address__
          regex: (.+);(.+)
          replacement: $1:$2
        # strip the prefix __meta_kubernetes_service_annotation_prometheus_io_app_info_ from the label names
        - action: labelmap
          regex: __meta_kubernetes_service_annotation_prometheus_io_app_info_(.+)

Note:

The last two lines turn an annotation name such as prometheus.io/app-info-tenant into a label named tenant.

The corresponding application process targets then show up in Prometheus:

3.4 Collecting Application Network Performance Data with blackbox-exporter

blackbox-exporter is a black-box probing tool that can probe services over HTTP, TCP, ICMP, and more.

3.4.1 Deploying blackbox-exporter

My blackbox-exporter deployment file is as follows:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: prometheus-blackbox-exporter
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: prometheus-blackbox-exporter
  replicas: 1
  template:
    metadata:
      labels:
        app: prometheus-blackbox-exporter
    spec:
      restartPolicy: Always
      containers:
      - name: prometheus-blackbox-exporter
        image: prom/blackbox-exporter:v0.12.0
        imagePullPolicy: IfNotPresent
        ports:
        - name: blackbox-port
          containerPort: 9115
        readinessProbe:
          tcpSocket:
            port: 9115
          initialDelaySeconds: 5
          timeoutSeconds: 5
        resources:
          requests:
            memory: 50Mi
            cpu: 100m
          limits:
            memory: 60Mi
            cpu: 200m
        volumeMounts:
        - name: config
          mountPath: /etc/blackbox_exporter
        args:
        - --config.file=/etc/blackbox_exporter/blackbox.yml
        - --log.level=debug
        - --web.listen-address=:9115
      volumes:
      - name: config
        configMap:
          name: prometheus-blackbox-exporter
      nodeSelector:
        node-role.kubernetes.io/master: "true"
      tolerations:
      - key: "node-role.kubernetes.io/master"
        effect: "NoSchedule"
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: prometheus-blackbox-exporter
  name: prometheus-blackbox-exporter
  namespace: kube-system
  annotations:
    prometheus.io/scrape: 'true'
spec:
  type: NodePort
  selector:
    app: prometheus-blackbox-exporter
  ports:
  - name: blackbox
    port: 9115
    targetPort: 9115
    nodePort: 30009
    protocol: TCP

The corresponding ConfigMap is as follows:

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app: prometheus-blackbox-exporter
  name: prometheus-blackbox-exporter
  namespace: kube-system
data:
  blackbox.yml: |-
    modules:
      http_2xx:
        prober: http
        timeout: 10s
        http:
          valid_http_versions: ["HTTP/1.1", "HTTP/2"]
          valid_status_codes: []
          method: GET
          preferred_ip_protocol: "ip4"
      http_post_2xx: # HTTP POST probe module
        prober: http
        timeout: 10s
        http:
          valid_http_versions: ["HTTP/1.1", "HTTP/2"]
          method: POST
          preferred_ip_protocol: "ip4"
      tcp_connect:
        prober: tcp
        timeout: 10s
      icmp:
        prober: icmp
        timeout: 10s
        icmp:
          preferred_ip_protocol: "ip4"

Note:

blackbox-exporter's configuration file is /etc/blackbox_exporter/blackbox.yml. It can be reloaded dynamically at runtime; if the reload fails, the running configuration is left untouched. Reload with: curl -XPOST http://IP:9115/-/reload
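Individual modules can also be exercised by hand against blackbox-exporter's /probe endpoint; a minimal sketch (the NodePort 30009 comes from the Service above, and the target URL is only an example):

# run the http_2xx module against an example target and check the probe result
curl -s 'http://<nodeIP>:30009/probe?module=http_2xx&target=http://www.example.com' | grep probe_success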

3.4.2 Prometheus Configuration

Configure HTTP and TCP probing separately in the Prometheus config file:

      - job_name: 'kubernetes-service-http-probe'
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
        - role: service
        # change metrics_path from the default /metrics to /probe
        metrics_path: /probe
        # Optional HTTP URL parameters.
        # generates the label __param_module="http_2xx"
        params:
          module: [http_2xx]
        relabel_configs:
        # keep only services whose annotations contain prometheus.io/scrape: 'true' and prometheus.io/http-probe: 'true'
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_service_annotation_prometheus_io_http_probe]
          regex: true;true
          action: keep
        - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_namespace, __meta_kubernetes_service_annotation_prometheus_io_http_probe_port, __meta_kubernetes_service_annotation_prometheus_io_http_probe_path]
          action: replace
          target_label: __param_target
          regex: (.+);(.+);(.+);(.+)
          replacement: $1.$2:$3$4
        # (alternative) build a label named __param_target from the value of __address__, holding the in-cluster service address, for blackbox-exporter to probe
        #- source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_http_probe_path]
        #  action: replace
        #  target_label: __param_target
        #  regex: (.+);(.+)
        #  replacement: $1$2
        # replace the original __address__ value with the blackbox-exporter service address "prometheus-blackbox-exporter:9115"
        - target_label: __address__
          replacement: prometheus-blackbox-exporter:9115
        - source_labels: [__param_target]
          target_label: instance
        # strip the prefix __meta_kubernetes_service_annotation_prometheus_io_app_info_ from the label names
        - action: labelmap
          regex: __meta_kubernetes_service_annotation_prometheus_io_app_info_(.+)
        #- source_labels: [__meta_kubernetes_namespace]
        #  target_label: kubernetes_namespace
        #- source_labels: [__meta_kubernetes_service_name]
        #  target_label: kubernetes_name
      ## kubernetes-services and kubernetes-ingresses are blackbox_exporter related
      
      # Example scrape config for probing services via the Blackbox Exporter.
      # 
      # The relabeling allows the actual service scrape endpoint to be configured
      # for all or only some services.
      - job_name: 'kubernetes-service-tcp-probe'
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
        - role: service
        # change metrics_path from the default /metrics to /probe
        metrics_path: /probe
        # Optional HTTP URL parameters.
        # generates the label __param_module="tcp_connect"
        params:
          module: [tcp_connect]
        relabel_configs:
        # keep only services whose annotations contain prometheus.io/scrape: 'true' and prometheus.io/tcp-probe: 'true'
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_service_annotation_prometheus_io_tcp_probe]
          regex: true;true
          action: keep
        - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_namespace, __meta_kubernetes_service_annotation_prometheus_io_tcp_probe_port]
          action: replace
          target_label: __param_target
          regex: (.+);(.+);(.+)
          replacement: $1.$2:$3
        # (alternative) build a label named __param_target from the value of __address__, holding the in-cluster service address, for blackbox-exporter to probe
        #- source_labels: [__address__]
        #  target_label: __param_target
        # replace the original __address__ value with the blackbox-exporter service address "prometheus-blackbox-exporter:9115"
        - target_label: __address__
          replacement: prometheus-blackbox-exporter:9115
        - source_labels: [__param_target]
          target_label: instance
        # strip the prefix __meta_kubernetes_service_annotation_prometheus_io_app_info_ from the label names
        - action: labelmap
          regex: __meta_kubernetes_service_annotation_prometheus_io_app_info_(.+)

3.4.3 Application-Side Configuration

An application can add the platform-agreed annotations to its Service so that the monitoring platform probes its network services:

  • HTTP probing
   prometheus.io/scrape: 'true'
   prometheus.io/http-probe: 'true'
   prometheus.io/http-probe-port: '8080'
   prometheus.io/http-probe-path: '/healthz'
  • TCP probing
   prometheus.io/scrape: 'true'
   prometheus.io/tcp-probe: 'true'
   prometheus.io/tcp-probe-port: '80'

From these annotations Prometheus learns that the Service should be probed, which protocol to use (HTTP, TCP, or something else), and which port; for HTTP probing it also learns the specific URL path to probe.

The corresponding targets then show up in Prometheus:

3.5 State of Resource Objects (Deployment, Pod, etc.): kube-state-metrics

kube-state-metrics collects state information for the various resource objects in k8s:

kube_daemonset_* (creation time, current phase, desired number of scheduled nodes, number of nodes that should be running the daemon pod, number of nodes running a daemon pod that should not be, number of nodes whose daemon pod is ready)
kube_deployment_* (creation time, whether k8s labels are converted to Prometheus labels, current phase, whether the deployment is paused and no longer processed by the deployment controller, desired replicas, max unavailable replicas during a rolling update, generation observed by the deployment controller, actual replicas, available replicas, unavailable replicas, updated replicas)
kube_job_* (whether the job completed, creation timestamp, ...)
kube_namespace_*
kube_node_*
kube_persistentvolumeclaim_*
kube_pod_container_*
kube_pod_*
kube_replicaset_*
kube_service_*
kube_statefulset_*

Looking at the data (IP:Port/metrics), you can see it is output in the Prometheus format:

3.5.1 Deploying kube-state-metrics

My kube-state-metrics deployment file kube-state-metrics-deployment.yaml is shown below and can be downloaded directly from GitHub.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-state-metrics
  namespace: kube-system
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kube-state-metrics
  namespace: kube-system
  labels:
    app: kube-state-metrics
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: kube-state-metrics
    spec:
      serviceAccountName: kube-state-metrics
      containers:
      - name: kube-state-metrics
        image: daocloud.io/liukuan73/kube-state-metrics:v1.1.0
        ports:
        - containerPort: 8080
      restartPolicy: Always
      nodeSelector:
        node-role.kubernetes.io/master: "true"
      tolerations:
      - key: "node-role.kubernetes.io/master"
        effect: "NoSchedule"
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/http-probe: 'true'
    prometheus.io/http-probe-path: '/healthz'
    prometheus.io/http-probe-port: '8080'
  name: kube-state-metrics
  namespace: kube-system
  labels:
    app: kube-state-metrics
spec:
  type: NodePort
  ports:
  - name: kube-state-metrics
    port: 8080
    targetPort: 8080
    nodePort: 30005
  selector:
    app: kube-state-metrics
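After applying the manifest, a quick check (a minimal sketch; the NodePort 30005 comes from the Service above, and kube_pod_status_phase is one of the metrics kube-state-metrics serves):

kubectl apply -f kube-state-metrics-deployment.yaml
kubectl -n kube-system get deploy,svc kube-state-metrics
# sample one of the state metrics exposed on the NodePort
curl -s http://<nodeIP>:30005/metrics | grep '^kube_pod_status_phase' | head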

3.5.2 Prometheus Configuration

Configure Prometheus to scrape the metrics from kube-state-metrics:

      - job_name: 'kube-state-metrics'
        
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:

        #The endpoints role discovers targets from listed endpoints of a service. For each
        #endpoint address one target is discovered per port. If the endpoint is backed by
        #a pod, all additional container ports of the pod, not bound to an endpoint port,
        #are discovered as targets as well
        - role: endpoints
        relabel_configs:
        # keep only endpoints whose Service annotations contain prometheus.io/scrape: 'true' and whose port name is kube-state-metrics
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape,__meta_kubernetes_endpoint_port_name]
          regex: true;kube-state-metrics
          action: keep
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
          action: replace
          target_label: __scheme__
          regex: (https?)
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
          action: replace
          target_label: __address__
          regex: (.+)(?::\d+);(\d+)
          replacement: $1:$2
        # strip the prefix __meta_kubernetes_service_label_ from the label names
        - action: labelmap
          regex: __meta_kubernetes_service_label_(.+)
        # rename __meta_kubernetes_namespace to kubernetes_namespace
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        # rename __meta_kubernetes_service_name to kubernetes_name
        - source_labels: [__meta_kubernetes_service_name]
          action: replace
          target_label: kubernetes_name

The corresponding target can be seen in Prometheus:

3.6 Collecting Status Metrics of the k8s Cluster Components

The k8s platform components etcd, kube-controller-manager, kube-scheduler, kube-proxy, kube-apiserver, and kubelet each expose a standard Prometheus /metrics endpoint, and Prometheus can be configured to read them.

3.6.1 Collecting etcd Metrics

In a cluster bootstrapped with kubeadm, etcd runs as a static pod, so by default there is no Service or Endpoints object that an in-cluster Prometheus could use to reach it. We therefore first create a Service (and thus Endpoints) for Prometheus; etcd-svc.yaml is as follows:

apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: etcd-prometheus-discovery
  labels:
    component: etcd
  annotations:
    prometheus.io/scrape: 'true'
spec:
  selector:
    component: etcd
  type: ClusterIP
  clusterIP: None
  ports:
  - name: http-metrics
    port: 2379
    targetPort: 2379
    protocol: TCP
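After applying the Service, confirm that the headless Service picks up the etcd static pod (a minimal sketch; the selector relies on the component: etcd label that kubeadm puts on the static pod):

kubectl apply -f etcd-svc.yaml
# the Endpoints object should list the etcd member address(es) on port 2379
kubectl -n kube-system get endpoints etcd-prometheus-discovery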

Add the following job to the Prometheus scrape configuration:

      - job_name: 'etcd'

        # credentials for talking to the apiserver over https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

        # discover targets by k8s role, e.g. node, service, pod, endpoints, ingress, etc.
        kubernetes_sd_configs:
        # discover targets from endpoints
        - role: endpoints

        # relabel_configs allow targets and their labels to be modified before scraping
        relabel_configs:
        # choose which labels to match on
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_namespace, __meta_kubernetes_service_name]
          # the values of the labels above must match this regex
          regex: true;kube-system;etcd-prometheus-discovery
          # endpoints whose source labels match the regex are kept
          action: keep

3.6.2 Collecting kube-proxy Metrics

kube-proxy exposes /metrics on port 10249. Following the same approach as 3.6.1, kube-proxy-svc.yaml is as follows:

apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-proxy-prometheus-discovery
  labels:
    k8s-app: kube-proxy
  annotations:
    prometheus.io/scrape: 'true'
spec:
  selector:
    k8s-app: kube-proxy
  type: ClusterIP
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10249
    targetPort: 10249
    protocol: TCP

Add the following job to the Prometheus scrape configuration:

      - job_name: 'kube-proxy'

        # credentials for talking to the apiserver over https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

        # discover targets by k8s role, e.g. node, service, pod, endpoints, ingress, etc.
        kubernetes_sd_configs:
        # discover targets from endpoints
        - role: endpoints

        # relabel_configs allow targets and their labels to be modified before scraping
        relabel_configs:
        # choose which labels to match on
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_namespace, __meta_kubernetes_service_name]
          # the values of the labels above must match this regex
          regex: true;kube-system;kube-proxy-prometheus-discovery
          # endpoints whose source labels match the regex are kept
          action: keep

3.6.3 Collecting kube-scheduler Metrics

kube-scheduler exposes /metrics on port 10251. Following the same approach as 3.6.1, kube-scheduler-svc.yaml is as follows:

apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler-prometheus-discovery
  labels:
    k8s-app: kube-scheduler
  annotations:
    prometheus.io/scrape: 'true'
spec:
  selector:
    k8s-app: kube-scheduler
  type: ClusterIP
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10251
    targetPort: 10251
    protocol: TCP

The corresponding Prometheus configuration is as follows:

      - job_name: 'kube-scheduler'

        # credentials for talking to the apiserver over https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

        # discover targets by k8s role, e.g. node, service, pod, endpoints, ingress, etc.
        kubernetes_sd_configs:
        # discover targets from endpoints
        - role: endpoints

        # relabel_configs allow targets and their labels to be modified before scraping
        relabel_configs:
        # choose which labels to match on
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_namespace, __meta_kubernetes_service_name]
          # the values of the labels above must match this regex
          regex: true;kube-system;kube-scheduler-prometheus-discovery
          # endpoints whose source labels match the regex are kept
          action: keep

3.6.4 Collecting kube-controller-manager Metrics

kube-controller-manager exposes /metrics on port 10252. Following the same approach as 3.6.1, kube-controller-manager-svc.yaml is as follows:

apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-controller-manager-prometheus-discovery
  labels:
    k8s-app: kube-controller-manager
  annotations:
    prometheus.io/scrape: 'true'
spec:
  selector:
    k8s-app: kube-controller-manager
  type: ClusterIP
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10252
    targetPort: 10252
    protocol: TCP

Add the following job to the Prometheus scrape configuration:

      - job_name: 'kube-controller-manager'

        # credentials for talking to the apiserver over https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

        # discover targets by k8s role, e.g. node, service, pod, endpoints, ingress, etc.
        kubernetes_sd_configs:
        # discover targets from endpoints
        - role: endpoints

        # relabel_configs allow targets and their labels to be modified before scraping
        relabel_configs:
        # choose which labels to match on
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_namespace, __meta_kubernetes_service_name]
          # the values of the labels above must match this regex
          regex: true;kube-system;kube-controller-manager-prometheus-discovery
          # endpoints whose source labels match the regex are kept
          action: keep

With the four steps above in place, four new Endpoints appear: etcd-prometheus-discovery, kube-controller-manager-prometheus-discovery, kube-scheduler-prometheus-discovery, and kube-proxy-prometheus-discovery:

Prometheus also shows the corresponding targets being scraped:

3.6.5 Collecting kube-apiserver Metrics

Unlike the four components above, kube-apiserver comes with a default Service named kubernetes and a matching Endpoints object named kubernetes once the cluster is up; this Endpoints object is the in-cluster access point for kube-apiserver. Prometheus can be configured to scrape it as follows (a quick check of that built-in Service appears after the config):

      - job_name: 'kube-apiservers'

        # access the apiserver over https
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

        # discover targets by k8s role, e.g. node, service, pod, endpoints, ingress, etc.
        kubernetes_sd_configs:
        # discover the apiserver from endpoints
        - role: endpoints

        # relabel_configs allow targets and their labels to be modified before scraping
        relabel_configs:
        # choose which labels to match on
        - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
          # the values of the labels above must match this regex
          regex: default;kubernetes;https
          # endpoints whose source labels match the regex are kept
          action: keep
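The built-in Service and Endpoints that this job relies on can be confirmed with (a minimal sketch):

# the default/kubernetes Service and its Endpoints point at the apiserver(s)
kubectl -n default get service kubernetes
kubectl -n default get endpoints kubernetes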

Afterwards Prometheus shows an additional kube-apiserver target:

3.6.6 Collecting kubelet Metrics

The kubelet exposes its metrics on port 10255 by default:

  • Prometheus-format metrics endpoint: nodeIP:10255/metrics, which Prometheus scrapes
  • kubelet stats/summary endpoint: nodeIP:10255/stats/summary, which Heapster and the newer metrics-server read from (a fetch sketch follows this list)
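The same kubelet metrics can also be fetched through the apiserver proxy, which is the path the Prometheus job below uses; a minimal sketch (<nodeName> is any node in the cluster):

# kubelet metrics via the apiserver proxy
kubectl get --raw /api/v1/nodes/<nodeName>/proxy/metrics | head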

The main metrics provided by the kubelet include:

apiserver_client_certificate_expiration_seconds_bucket
apiserver_client_certificate_expiration_seconds_sum
apiserver_client_certificate_expiration_seconds_count
etcd_helper_cache_entry_count
etcd_helper_cache_hit_count
etcd_helper_cache_miss_count
etcd_request_cache_add_latencies_summary
etcd_request_cache_add_latencies_summary_sum
etcd_request_cache_add_latencies_summary_count
etcd_request_cache_get_latencies_summary
etcd_request_cache_get_latencies_summary_sum
etcd_request_cache_get_latencies_summary_count
kubelet_cgroup_manager_latency_microseconds
kubelet_containers_per_pod_count
kubelet_docker_operations
kubelet_network_plugin_operations_latency_microseconds
kubelet_pleg_relist_interval_microseconds
kubelet_pleg_relist_latency_microseconds
kubelet_pod_start_latency_microseconds
kubelet_pod_worker_latency_microseconds
kubelet_running_container_count
kubelet_running_pod_count
kubelet_runtime_operations*
kubernetes_build_info
process_cpu_seconds_total
reflector*
rest_client_request_*
storage_operation_duration_seconds_*

Looking at the kubelet metrics (nodeIP:10255/metrics):

# HELP apiserver_audit_event_total Counter of audit events generated and sent to the audit backend.
# TYPE apiserver_audit_event_total counter
apiserver_audit_event_total 0
# HELP apiserver_client_certificate_expiration_seconds Distribution of the remaining lifetime on the certificate used to authenticate a request.
# TYPE apiserver_client_certificate_expiration_seconds histogram
apiserver_client_certificate_expiration_seconds_bucket{le="0"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="21600"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="43200"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="86400"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="172800"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="345600"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="604800"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="2.592e+06"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="7.776e+06"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="1.5552e+07"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="3.1104e+07"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="+Inf"} 4161
apiserver_client_certificate_expiration_seconds_sum 1.3091542942737878e+12
apiserver_client_certificate_expiration_seconds_count 4161
...

Since there is exactly one kubelet per node, kubelet metrics can be discovered via the k8s node objects. The Prometheus configuration is as follows:

      - job_name: 'kubelet'
        # access the apiserver over https and fetch the data through the apiserver API
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

        # discover targets by k8s role, e.g. node, service, pod, endpoints, ingress, etc.
        kubernetes_sd_configs:
        # discover targets from k8s node objects
        - role: node
        relabel_configs:
        # replace the original label-name prefix with a new one; with no replacement this simply strips the prefix
        # e.g. the following two lines turn __meta_kubernetes_node_label_kubernetes_io_hostname
        # into kubernetes_io_hostname
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        # the value of replacement overwrites the label named in target_label,
        # i.e. __address__ is replaced with kubernetes.default.svc:443
        - target_label: __address__
          replacement: kubernetes.default.svc:443
          #replacement: 10.142.21.21:6443
        # take the value of __meta_kubernetes_node_name
        - source_labels: [__meta_kubernetes_node_name]
          # match one or more characters and capture the source_labels value as ${1}
          regex: (.+)
          # the value of replacement overwrites the label named in target_label,
          # i.e. __metrics_path__ becomes /api/v1/nodes/${1}/proxy/metrics,
          # where ${1} is the value of __meta_kubernetes_node_name
          target_label: __metrics_path__
          replacement: /api/v1/nodes/${1}/proxy/metrics
        #or:
        #- source_labels: [__address__]
        #  regex: '(.*):10250'
        #  replacement: '${1}:4194'
        #  target_label: __address__
        #- source_labels: [__meta_kubernetes_node_label_role]
        #  action: replace
        #  target_label: role

4. The Aggregation Layer: Deploying and Configuring Prometheus

The previous sections covered how the monitoring data of the K8s platform is collected; this section covers how Prometheus gathers and processes it.

4.1 The Prometheus Global Configuration File

My ConfigMap for the Prometheus global configuration is adapted from the official example on GitHub, with some changes and added comments. For the full syntax of the Prometheus config file, see the official documentation.

4.2 The Prometheus Alerting Rule Files

Alerting rules in Prometheus let you define alert conditions as PromQL expressions. The Prometheus backend evaluates these rules periodically and fires an alert notification when a condition is met. The rules (/rules) and their firing state (/alerts) can be inspected in the Prometheus web UI. Once Prometheus is connected to Alertmanager, alerts can be sent to an external service such as Alertmanager for further processing.

In the Prometheus global configuration, rule_files specifies a set of paths to alerting rule files. After startup, Prometheus scans the rule files under these paths and evaluates the rules they define to decide whether to send notifications. The evaluation interval for alert conditions is configured with evaluation_interval under global:

global:
  [ evaluation_interval: <duration> | default = 1m ]
rule_files:
  [ - <filepath_glob> ... ]

4.2.1 Defining Alerting Rules

Alerting rule files are written in YAML. A typical alerting rule looks like this:

groups:
- name: example
  rules:
  - alert: HighErrorRate
    expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
    for: 10m
    labels:
      severity: page
    annotations:
      summary: High request latency
      description: description info

In a rule file, a set of related rules is defined under one group, and each group can contain multiple alerting rules (rule). An alerting rule consists of the following parts:

  • alert: the name of the alerting rule.
  • expr: the alert condition, a PromQL expression used to determine whether any time series satisfies it.
  • for: optional evaluation wait time; the alert is only sent after the condition has held for this duration. While waiting, newly triggered alerts are in the pending state.
  • labels: custom labels to attach to the alert.
  • annotations: additional information, such as text describing the alert in detail.

Prometheus evaluates the PromQL expression at the interval defined by global.evaluation_interval. If the expression matches any time series, one alert instance is produced per matching series.

For the full syntax of rule files, see the official documentation (a rule-file check sketch follows).
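Rule files can be validated before Prometheus loads them; a minimal sketch using promtool, which ships with Prometheus (the path and file glob here are only an example matching the rules mount used later):

# syntax-check the alerting rule files
promtool check rules /etc/prometheus-rules/*.yml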

4.2.2 Templating

Typically, the summary annotation of an alerting rule carries a short description of the alert, and description carries the details; the Alertmanager UI also displays alerts based on these two values. To make alerts more readable, Prometheus supports templating the values of labels and annotations.

$labels.<labelname> accesses the value of a label on the current alert instance, and $value is the sample value computed by the PromQL expression for that instance.

# To insert a firing element's label values:
{{ $labels.<labelname> }}
# To insert the numeric expression value of the firing element:
{{ $value }}

For example, templating can make the contents of summary and description more readable:

groups:
- name: example
  rules:
​
  # Alert for any instance that is unreachable for >5 minutes.
  - alert: InstanceDown
    expr: up == 0
    for: 5m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} down"
      description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."
​
  # Alert for any instance that has a median request latency >1s.
  - alert: APIHighRequestLatency
    expr: api_http_request_latencies_second{quantile="0.5"} > 1
    for: 10m
    annotations:
      summary: "High request latency on {{ $labels.instance }}"
      description: "{{ $labels.instance }} has a median request latency above 1s (current value: {{ $value }}s)"

4.2.3 Creating the Alerting Rules

This is my ConfigMap containing the Prometheus alerting rules, adapted from the reference here.
Alerting rules let you define alert conditions in Prometheus's expression language and send notifications to an external service such as AlertManager. For the full syntax of rule files, see the official documentation.

The alerting rules can be seen on Prometheus's Alerts page:

Fired alerts can also be seen on Prometheus's Alerts page:

4.3 The Prometheus Deployment File

prometheus.yaml deploys Prometheus on Kubernetes as a Deployment; its contents are as follows:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: prometheus
  namespace: kube-system
  labels:
    app: prometheus
spec:
  replicas: 1
  template:
    metadata:
      name: prometheus
      labels:
        app: prometheus
    spec:
#      hostNetwork: true
      containers:
      - name: prometheus
        image: prom/prometheus:v2.3.2
        imagePullPolicy: IfNotPresent
        args:
          - '--storage.tsdb.path=/prometheus/data/'
          - '--storage.tsdb.retention=1d'
          - '--config.file=/etc/prometheus/prometheus.yaml'
          - '--web.enable-lifecycle'
        ports:
        - name: webui
          containerPort: 9090
        resources:
          requests:
            cpu: 500m
            memory: 500M
        #  limits:
        #    cpu: 500m
        #    memory: 500M
        volumeMounts:
        - name: config-volume
          mountPath: /etc/prometheus
        - name: rules-volume
          mountPath: /etc/prometheus-rules
      volumes:
      - name: config-volume
        configMap:
          name: prometheus
      - name: rules-volume
        configMap:
          name: prometheus-rules
      nodeSelector:
        node-role.kubernetes.io/master: "true"
      tolerations:
      - key: "node-role.kubernetes.io/master"
        effect: "NoSchedule"
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: kube-system
  labels:
    app: prometheus
  annotations:
    prometheus.io/scrape: 'true'
spec:
  type: NodePort
#  type: ClusterIP
  ports:
    - name: webui
      port: 9090
      protocol: TCP
      nodePort: 30006
  selector:
    app: prometheus

A few notes on the args:

  • --storage.tsdb.path: the TSDB storage path
  • --storage.tsdb.retention: how long data is retained; see the storage section of the official documentation
  • --config.file: the path to the Prometheus config file
  • --web.enable-lifecycle: with this flag, an HTTP POST to /-/reload (curl -XPOST 10.142.232.150:30006/-/reload) makes Prometheus reload its config file dynamically after changes; see the official documentation for details
  • --web.enable-admin-api: this flag exposes database administration APIs for advanced users, e.g. snapshot backups (curl -XPOST http://<prometheus>/api/v2/admin/tsdb/snapshot); see the TSDB Admin APIs section of the official documentation

Once Prometheus is running, all targets can be viewed at IP:Port/targets:
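The same information is also available from Prometheus's HTTP API; a minimal sketch (the NodePort 30006 comes from the Service above):

# list the discovered targets and their health
curl -s http://<nodeIP>:30006/api/v1/targets | head -c 500
# run an ad-hoc query, e.g. how many targets are currently up
curl -s 'http://<nodeIP>:30006/api/v1/query?query=sum(up)'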

5. Visualization

5.1 Installing Grafana

RPM installation:

wget https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-5.0.4-1.x86_64.rpm
yum -y localinstall grafana-5.0.4-1.x86_64.rpm 
systemctl enable grafana-server
systemctl start grafana-server

Docker installation:

docker run -d -p 3000:3000 --name=grafana -e "GF_SERVER_HTTP_PORT=3000" -e "GF_AUTH_BASIC_ENABLED=false" -e "GF_AUTH_ANONYMOUS_ENABLED=true" -e "GF_AUTH_ANONYMOUS_ORG_ROLE=Admin"  -e "GF_SERVER_ROOT_URL=/" daocloud.io/liukuan73/grafana:5.0.0
		

Kubernetes deployment (recommended):
Install Grafana with the following grafana.yaml:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: monitoring-grafana
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: grafana
    spec:
      containers:
      - name: grafana
        image: daocloud.io/liukuan73/grafana:5.0.0
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 3000
          protocol: TCP
        volumeMounts:
        - mountPath: /var
          name: grafana-storage
        env:
        - name: INFLUXDB_HOST
          value: monitoring-influxdb
        - name: GF_SERVER_HTTP_PORT
          value: "3000"
          # The following env variables are required to make Grafana accessible via
          # the kubernetes api-server proxy. On production clusters, we recommend
          # removing these env variables, setup auth for grafana, and expose the grafana
          # service using a LoadBalancer or a public IP.
        - name: GF_AUTH_BASIC_ENABLED
          value: "false"
        - name: GF_AUTH_ANONYMOUS_ENABLED
          value: "true"
        - name: GF_AUTH_ANONYMOUS_ORG_ROLE
          value: Admin
        - name: GF_SERVER_ROOT_URL
          # If you're only using the API Server proxy, set this value instead:
          # value: /api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
          value: /
      volumes:
      - name: grafana-storage
        emptyDir: {}
      nodeSelector:
        node-role.kubernetes.io/master: "true"
      tolerations:
      - key: "node-role.kubernetes.io/master"
        effect: "NoSchedule"
---
apiVersion: v1
kind: Service
metadata:
  labels:
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-grafana
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/tcp-probe: 'true'
    prometheus.io/tcp-probe-port: '80'
  name: monitoring-grafana
  namespace: kube-system
spec:
  type: NodePort
  # In a production setup, we recommend accessing Grafana through an external Loadbalancer
  # or through a public IP.
  # type: LoadBalancer
  # You could also use NodePort to expose the service at a randomly-generated port
  # type: NodePort
  ports:
  - port: 80
    targetPort: 3000
    nodePort: 30007
  selector:
    k8s-app: grafana

5.2 Configuring Grafana

Configure the Prometheus data source:
Click "Add data source" and configure the data source:
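The data source can also be created through Grafana's HTTP API instead of the UI; a minimal sketch (assuming the NodePort 30007 from the Service above, the anonymous Admin access configured in grafana.yaml, and the in-cluster Prometheus service address):

curl -X POST http://<nodeIP>:30007/api/datasources \
  -H 'Content-Type: application/json' \
  -d '{"name":"prometheus","type":"prometheus","access":"proxy","url":"http://prometheus.kube-system:9090","isDefault":true}'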

Configure dashboards:
You can build dashboards by hand, or simply import ready-made templates shared by others on the official site by entering the ID of the template to import.
The result:

Some particularly useful templates:

  • 315: charts for the various metrics collected by cAdvisor
  • 1860: charts for the host-related metrics collected by node-exporter
  • 6417: charts for the states of the various k8s resource objects collected by kube-state-metrics
  • 4859 and 4865: charts for the HTTP status metrics of services probed by blackbox-exporter (the two are nearly identical; pick either one)
  • 5345: charts for the network status metrics of services probed by blackbox-exporter

6. Alerting

The main steps to set up alerts and notifications are:

  • Install and configure Alertmanager
  • Configure Prometheus to talk to Alertmanager
  • Create alerting rules in Prometheus

6.1 Deploying the AlertManager Component

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: alertmanager
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      name: alertmanager
      labels:
        app: alertmanager
    spec:
      containers:
      - name: alertmanager
        image: prom/alertmanager:v0.11.0
        args:
          - '-config.file=/etc/alertmanager/config.yml'
          - '-storage.path=/alertmanager'
          - '-web.external-url=http://alertmanager:9093'
        ports:
        - name: alertmanager
          containerPort: 9093
#        env:
#        - name: EXTERNAL_URL
#          valueFrom:
#            configMapKeyRef