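Monitoring Redis in Kubernetes with Redis Exporter Deployed via Prometheus Operator
Before you begin
Before following the steps below, make sure Kubernetes (k8s), Prometheus, Prometheus Operator, and Redis are already deployed. Installing these components is well documented elsewhere and is not covered here.
The overall architecture of the Prometheus monitoring stack is roughly as shown in the diagram below. Interested readers can explore it on their own; it is not explained in detail here.
1. Deploying the Deployment (workload) and Service
A reference YAML configuration: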
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-standalone-exporter
  namespace: prometheus-exporter
  labels:
    app: redis-standalone-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis-standalone-exporter
  template:
    metadata:
      labels:
        app: redis-standalone-exporter
    spec:
      containers:
        - name: redis-standalone-exporter
          image: oliver006/redis_exporter:latest
          imagePullPolicy: IfNotPresent
          # Redis settings go here, for example the address and password.
          # To monitor a Redis instance outside the k8s cluster, the redis.addr value
          # needs the redis:// prefix, as in the commented-out line below.
          # args: ["-redis.addr", "redis://10.128.27.22:6379", "-redis.password", "123456@redis"]
          args: ["-redis.addr", "redis-standalone.monitorsoftware:6379", "-redis.password", "admin@123"]
          ports:
            - containerPort: 9121
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: redis-standalone-exporter
  name: redis-standalone-exporter
  namespace: prometheus-exporter
spec:
  type: ClusterIP
  ports:
    - name: metrics
      port: 9121
      protocol: TCP
      targetPort: 9121
  selector:
    app: redis-standalone-exporter
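The Deployment above passes the Redis password as a plain-text command-line argument. One possible variation, shown only as a sketch, keeps the password in a Kubernetes Secret and hands it to the exporter through the REDIS_ADDR and REDIS_PASSWORD environment variables, which the redis_exporter README lists as equivalents of the redis.addr and redis.password flags. The Secret name redis-standalone-exporter-auth and its key password are hypothetical names chosen for illustration.

```yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: redis-standalone-exporter-auth   # hypothetical name
  namespace: prometheus-exporter
type: Opaque
stringData:
  password: "admin@123"                  # same example password as in the Deployment above
---
# Fragment only: this replaces the containers section of the Deployment
# (under spec.template.spec); the rest of the Deployment stays unchanged.
containers:
  - name: redis-standalone-exporter
    image: oliver006/redis_exporter:latest
    imagePullPolicy: IfNotPresent
    env:
      - name: REDIS_ADDR
        value: "redis-standalone.monitorsoftware:6379"
      - name: REDIS_PASSWORD
        valueFrom:
          secretKeyRef:
            name: redis-standalone-exporter-auth   # must match the Secret above
            key: password
    ports:
      - containerPort: 9121
```

With this layout the args line is no longer needed, and the password does not appear in the Deployment manifest or in the container's command line.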
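Notes:
1> redis exporter documents its configuration options, with examples, at: https://github.com/oliver006/redis_exporter
2> Choose the redis exporter image version as needed; the image repository is at: https://hub.docker.com/r/oliver006/redis_exporter/tags
3> For Redis cluster monitoring, the redis exporter documentation describes changing the configuration as follows
However, configuring it this way in my k8s environment did not work, and I found no suitable solution among similar issues on GitHub. In the end I settled for a rather crude workaround: configuring it within the k8s container environment
4> A successful deployment looks like this:
(1) Deployment (workload)
(2) Service
2. Creating the ServiceMonitor configuration file
The YAML configuration file is as follows: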
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: redis-standalone-exporter
    prometheus: k8s
  name: redis-standalone-exporter
  namespace: prometheus-exporter
spec:
  endpoints:
    - interval: 1m
      port: metrics
      params:
        target:
          # The Redis address goes here
          - redis-standalone.monitorsoftware:6379
      relabelings:
        - sourceLabels: [__param_target]
          targetLabel: instance
  namespaceSelector:
    matchNames:
      - prometheus-exporter
  selector:
    matchLabels:
      app: redis-standalone-exporter
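Notes:
1> Prometheus Operator discovers monitoring targets through ServiceMonitor resources and scrapes them. A ServiceMonitor describes how metrics are collected from a Service.
- prometheus-operator uses a ServiceMonitor to automatically identify Services that carry certain labels and to scrape metrics from them.
- ServiceMonitors themselves are also discovered automatically by prometheus-operator.
2> The Prometheus monitoring flow is as follows:
3> A successful deployment looks like this:
(1) ServiceMonitor deployment
(2) Prometheus after successful deployment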
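For the Redis cluster case mentioned in section 1, the redis_exporter README describes a multi-target pattern: a single exporter exposes a /scrape endpoint that takes the Redis address as a target query parameter. The following is an untested sketch of how that pattern might be expressed as a ServiceMonitor, with one endpoint entry per Redis node. The resource name and the node addresses (redis-cluster-0 and redis-cluster-1 under redis-cluster.monitorsoftware) are hypothetical, and it assumes every node accepts the credentials the exporter was started with.

```yaml
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: redis-standalone-exporter
    prometheus: k8s
  name: redis-cluster-nodes              # hypothetical name
  namespace: prometheus-exporter
spec:
  endpoints:
    # One entry per Redis node; every entry scrapes the same exporter Service.
    - interval: 1m
      port: metrics
      path: /scrape                      # multi-target endpoint of redis_exporter
      params:
        target:
          - redis://redis-cluster-0.redis-cluster.monitorsoftware:6379   # hypothetical address
      relabelings:
        - sourceLabels: [__param_target]
          targetLabel: instance
    - interval: 1m
      port: metrics
      path: /scrape
      params:
        target:
          - redis://redis-cluster-1.redis-cluster.monitorsoftware:6379   # hypothetical address
      relabelings:
        - sourceLabels: [__param_target]
          targetLabel: instance
  namespaceSelector:
    matchNames:
      - prometheus-exporter
  selector:
    matchLabels:
      app: redis-standalone-exporter
```

Copying the target parameter into the instance label, as the original ServiceMonitor does, keeps the per-node series distinguishable even though they all come from one exporter pod.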
3. Configuring Prometheus alerting rules
PrometheusRule configuration:
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: redis-exporter-rules
  namespace: prometheus-exporter
spec:
  groups:
    - name: redis-exporter
      rules:
        - alert: RedisDown
          expr: redis_up == 0
          for: 0m
          labels:
            severity: critical
          annotations:
            summary: Redis down (instance {{ $labels.instance }})
            description: "Redis instance is down\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
        - alert: RedisClusterFlapping
          expr: changes(redis_connected_slaves[1m]) > 1
          for: 2m
          labels:
            severity: critical
          annotations:
            summary: Redis cluster flapping (instance {{ $labels.instance }})
            description: "Changes have been detected in Redis replica connection. This can occur when replica nodes lose connection to the master and reconnect (a.k.a flapping).\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
        - alert: RedisReplicationBroken
          expr: delta(redis_connected_slaves[1m]) < 0
          for: 0m
          labels:
            severity: critical
          annotations:
            summary: Redis replication broken (instance {{ $labels.instance }})
            description: "Redis instance lost a slave\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
        - alert: RedisDisconnectedSlaves
          expr: count without (instance, job) (redis_connected_slaves) - sum without (instance, job) (redis_connected_slaves) - 1 > 1
          for: 0m
          labels:
            severity: critical
          annotations:
            summary: Redis disconnected slaves (instance {{ $labels.instance }})
            description: "Redis not replicating for all slaves. Consider reviewing the redis replication status.\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
        - alert: RedisRejectedConnections
          expr: increase(redis_rejected_connections_total[1m]) > 0
          for: 0m
          labels:
            severity: critical
          annotations:
            summary: Redis rejected connections (instance {{ $labels.instance }})
            description: "Some connections to Redis have been rejected\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
1> The PrometheusRule configuration can be based on the templates at: https://awesome-prometheus-alerts.grep.to/rules#redis
2> A successful deployment looks like this:
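Both the ServiceMonitor in section 2 and the PrometheusRule above carry the label prometheus: k8s. Those labels only matter if the Prometheus custom resource managed by the operator selects them. The fragment below is a sketch of the selector fields that would pick both up; the resource name k8s and the namespace monitoring are assumptions (they follow common kube-prometheus defaults) and should be checked against the actual cluster.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s                  # assumed name
  namespace: monitoring      # assumed namespace
spec:
  # Select ServiceMonitors labelled prometheus: k8s ...
  serviceMonitorSelector:
    matchLabels:
      prometheus: k8s
  # ... in any namespace (an empty selector matches all namespaces).
  serviceMonitorNamespaceSelector: {}
  # Select PrometheusRules carrying the labels used in the rule above.
  ruleSelector:
    matchLabels:
      prometheus: k8s
      role: alert-rules
  ruleNamespaceSelector: {}
```

If a ServiceMonitor or rule does not show up in the Prometheus UI, comparing its labels and namespace against these selectors is a reasonable first check. The Prometheus service account also needs RBAC permission to read Services, Endpoints, and Pods in the prometheus-exporter namespace.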
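4. Grafana dashboards
4.1. Grafana dashboards are available at: https://grafana.com/grafana/dashboards
For single-node monitoring, the recommended dashboard ID is 11835;
for cluster monitoring, the recommended dashboard ID is 14615.
4.2. The dashboards look like this:
Single-node monitoring:
Cluster monitoring: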