Deploying Redis Exporter with Prometheus Operator to Monitor Redis in Kubernetes

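Before you begin

Before following the steps below, make sure Kubernetes, Prometheus, Prometheus Operator, and Redis are already deployed on your servers. Installing these components is outside the scope of this article; refer to their own documentation.

The rough architecture of the Prometheus monitoring stack is shown in the diagram below. Interested readers can explore it on their own; it is not covered in detail here.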

1. Deploying the Deployment (workload) and the Service

The following YAML can be used as a reference:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-standalone-exporter
  namespace: prometheus-exporter
  labels:
    app: redis-standalone-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis-standalone-exporter
  template:
    metadata:
      labels:
        app: redis-standalone-exporter
    spec:
      containers:
      - name: redis-standalone-exporter
        image: oliver006/redis_exporter:latest
        imagePullPolicy: IfNotPresent
        # Redis connection settings go here, e.g. address and password.
        # To monitor a Redis instance running outside the k8s cluster, prefix the redis.addr value with redis://, as in the commented line below.
        # args: ["-redis.addr", "redis://10.128.27.22:6379", "-redis.password", "123456@redis"]
        args: ["-redis.addr", "redis-standalone.monitorsoftware:6379", "-redis.password", "admin@123"]
        ports:
        - containerPort: 9121
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: redis-standalone-exporter
  name: redis-standalone-exporter
  namespace: prometheus-exporter
spec:
  type: ClusterIP
  ports:
  - name: metrics
    port: 9121
    protocol: TCP
    targetPort: 9121
  selector:
    app: redis-standalone-exporter
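The manifest above passes the Redis password as a plain command-line argument. As a hedged alternative, the sketch below keeps the password in a Secret and hands it to the exporter through the REDIS_ADDR and REDIS_PASSWORD environment variables, which the redis_exporter README lists as equivalents of -redis.addr and -redis.password. The Secret name and key are assumptions made for illustration, and the second document is only the containers fragment of the Deployment, not a complete manifest.

```yaml
# Hypothetical Secret holding the Redis password (name and key are assumptions).
apiVersion: v1
kind: Secret
metadata:
  name: redis-standalone-exporter-auth
  namespace: prometheus-exporter
type: Opaque
stringData:
  password: "admin@123"
---
# Fragment of spec.template.spec in the Deployment: replaces the args line.
containers:
- name: redis-standalone-exporter
  image: oliver006/redis_exporter:latest
  imagePullPolicy: IfNotPresent
  env:
  # Same address as the -redis.addr argument in the original manifest.
  - name: REDIS_ADDR
    value: redis-standalone.monitorsoftware:6379
  # Password is read from the Secret instead of appearing in the pod spec.
  - name: REDIS_PASSWORD
    valueFrom:
      secretKeyRef:
        name: redis-standalone-exporter-auth
        key: password
  ports:
  - containerPort: 9121
```

This keeps the password out of the Deployment manifest and out of the container's process arguments; the Service definition is unaffected.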

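Notes:

1> The redis exporter project documents the options used in the YAML above: https://github.com/oliver006/redis_exporter

2> Choose the redis exporter image version that fits your needs; the available tags are listed at: https://hub.docker.com/r/oliver006/redis_exporter/tags

3> For monitoring a Redis cluster, the redis exporter documentation says to change the configuration as follows:

However, when I configured it that way in my k8s environment it did not work, and searching similar issues on GitHub turned up no suitable fix. In the end I settled for a fairly crude approach: inside the k8s environment, deploy one redis exporter per Redis instance. If you have a better solution, feel free to share it.

4> Screenshots of the successful deployment:

(1) Deployment (workload)

(2) Service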
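For reference on note 3> above, the multi-target pattern described in the redis_exporter README looks roughly like the sketch below: Prometheus lists the Redis addresses as targets and relabeling redirects every scrape to a single exporter's /scrape endpoint. This is only an illustration of that pattern, not a configuration verified in this environment. The redis:// hostnames are placeholders, and using the exporter Service defined above as the scrape address is an assumption.

```yaml
# Sketch of a multi-target scrape job, adapted from the pattern in the
# redis_exporter README. The redis:// hostnames are placeholders.
scrape_configs:
  - job_name: redis_exporter_targets
    static_configs:
      - targets:
          - redis://first-redis-host:6379
          - redis://second-redis-host:6379
    metrics_path: /scrape
    relabel_configs:
      # Pass each listed Redis address to the exporter as ?target=...
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the Redis address as the instance label.
      - source_labels: [__param_target]
        target_label: instance
      # Send the actual HTTP request to the exporter Service (assumed address).
      - target_label: __address__
        replacement: redis-standalone-exporter.prometheus-exporter:9121
```

With Prometheus Operator, raw jobs like this are normally supplied through the additionalScrapeConfigs field of the Prometheus resource, as a bare list of jobs without the scrape_configs: key. Authentication for the individual Redis instances has to be handled separately.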

2. Creating the ServiceMonitor configuration file

The YAML configuration file is as follows:

---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: redis-standalone-exporter
    prometheus: k8s
  name: redis-standalone-exporter
  namespace: prometheus-exporter
spec:
  endpoints:
    - interval: 1m
      port: metrics
      params:
        target:
        # Redis address goes here
        - redis-standalone.monitorsoftware:6379
      relabelings:
      - sourceLabels: [__param_target]
        targetLabel: instance
  namespaceSelector:
    matchNames:
    - prometheus-exporter
  selector:
    matchLabels:
      app: redis-standalone-exporter

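Notes:

1> Prometheus Operator discovers monitoring targets through ServiceMonitor objects and scrapes them. A ServiceMonitor defines how metrics are collected from a Service.

  • prometheus-operator uses a ServiceMonitor to automatically select Services that carry certain labels and to scrape metrics from those Services.
  • ServiceMonitors themselves are also discovered automatically by prometheus-operator.

2> The Prometheus monitoring flow is as follows:

3> Screenshots of the successful deployment:

(1) ServiceMonitor deployment

(2) Prometheus after successful deployment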
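As note 1> says, ServiceMonitors are picked up automatically, and that selection is driven by label selectors on the Prometheus custom resource. The fragment below is a minimal sketch of such a resource, assuming it is named k8s and lives in a monitoring namespace (both names are assumptions, not taken from this setup). It would match the prometheus: k8s label carried by the ServiceMonitor above.

```yaml
# Hypothetical Prometheus resource; name and namespace are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
  namespace: monitoring
spec:
  # Only ServiceMonitors carrying this label are selected.
  serviceMonitorSelector:
    matchLabels:
      prometheus: k8s
  # An empty selector matches ServiceMonitors in all namespaces,
  # including prometheus-exporter.
  serviceMonitorNamespaceSelector: {}
```

If the exporter never shows up as a target, comparing the ServiceMonitor's labels and namespace against these two selectors is a reasonable first check.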

3. Prometheus alerting rule configuration

PrometheusRule configuration:

---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: redis-exporter-rules
  namespace: prometheus-exporter
spec:
  groups:
    - name: redis-exporter
      rules:
        - alert: RedisDown
          expr: redis_up == 0
          for: 0m
          labels:
            severity: critical
          annotations:
            summary: Redis down (instance {{ $labels.instance }})
            description: "Redis instance is down\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
        - alert: RedisClusterFlapping
          expr: changes(redis_connected_slaves[1m]) > 1
          for: 2m
          labels:
            severity: critical
          annotations:
            summary: Redis cluster flapping (instance {{ $labels.instance }})
            description: "Changes have been detected in Redis replica connection. This can occur when replica nodes lose connection to the master and reconnect (a.k.a flapping).\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
        - alert: RedisReplicationBroken
          expr: delta(redis_connected_slaves[1m]) < 0
          for: 0m
          labels:
            severity: critical
          annotations:
            summary: Redis replication broken (instance {{ $labels.instance }})
            description: "Redis instance lost a slave\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
        - alert: RedisDisconnectedSlaves
          expr: count without (instance, job) (redis_connected_slaves) - sum without (instance, job) (redis_connected_slaves) - 1 > 1
          for: 0m
          labels:
            severity: critical
          annotations:
            summary: Redis disconnected slaves (instance {{ $labels.instance }})
            description: "Redis not replicating for all slaves. Consider reviewing the redis replication status.\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
        - alert: RedisRejectedConnections
          expr: increase(redis_rejected_connections_total[1m]) > 0
          for: 0m
          labels:
            severity: critical
          annotations:
            summary: Redis rejected connections (instance {{ $labels.instance }})
            description: "Some connections to Redis have been rejected\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

1> The PrometheusRule configuration can be based on the template rules at: https://awesome-prometheus-alerts.grep.to/rules#redis

2> Screenshot of the successful deployment:

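4. Grafana dashboards

4.1 Grafana dashboards can be found at: https://grafana.com/grafana/dashboards

        For single-node monitoring, the recommended dashboard ID is 11835.

        For cluster monitoring, the recommended dashboard ID is 14615.

4.2 Dashboard screenshots:

Single-node monitoring:

Cluster monitoring: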