
Monitoring Redis with Prometheus

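To monitor Redis with Prometheus, the recommended choice is redis_exporter, a community-maintained exporter: https://github.com/oliver006/redis_exporter. We currently run Redis with master-replica replication plus Sentinel as our high-availability setup.

Deployment: copy the exporter to a virtual machine, unpack it, and run: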

redis_exporter -web.listen-address=:9121 -redis.addr redis://xxxx:26379 -redis.password password
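
Once the exporter is running, a quick sanity check is to confirm that its `/metrics` endpoint reports `redis_up 1`. The Python sketch below parses text in the Prometheus exposition format and pulls out one gauge. The helper name `read_gauge` and the `SAMPLE` payload are illustrative assumptions written by hand, not output captured from a real exporter.

```python
from typing import Optional

# Hand-written sample in the Prometheus text exposition format (assumption).
SAMPLE = """\
# HELP redis_up Information about the Redis instance
# TYPE redis_up gauge
redis_up 1
redis_connected_clients 4
"""


def read_gauge(metrics_text: str, name: str) -> Optional[float]:
    """Return the value of the first sample named `name`, or None if absent."""
    for raw in metrics_text.splitlines():
        line = raw.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if "{" in line:
            # Labelled sample: name{labels} value [timestamp]
            metric = line[: line.index("{")]
            rest = line[line.rindex("}") + 1:]
        else:
            # Unlabelled sample: name value [timestamp]
            metric, _, rest = line.partition(" ")
        if metric.strip() == name:
            fields = rest.split()
            if fields:
                return float(fields[0])
    return None


if __name__ == "__main__":
    up = read_gauge(SAMPLE, "redis_up")
    print("exporter can reach Redis" if up == 1.0 else "Redis looks down")
```

Against a live exporter, the text could be fetched with `urllib.request.urlopen("http://xxxx:9121/metrics")`, using the listen address from the command above.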

redis_exporter also provides a Grafana dashboard for graphing these metrics.

Prometheus server configuration:

global:
  scrape_interval: 10s
  scrape_timeout: 10s
  evaluation_interval: 1m
alerting:
  alertmanagers:
  - scheme: http
    static_configs:
    - targets:
      - "xxxx:9093"
rule_files:
  - /usr/local/prometheus/rules/*.rules
scrape_configs:
- job_name: 'file-ds'
  file_sd_configs:
  - refresh_interval: 1m
    files:
    - ./conf.d/targets*.json
- job_name: 'redis'
  file_sd_configs:
  - refresh_interval: 1m
    files:
    - ./conf.d/redis*.json

[prometheus]# cat conf.d/redis-discovery.json
[
  {
    "targets": ["xxxx:9121"],
    "labels": {
      "instance": "preproduct-redis-xxx:9121"
    }
  }
]
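
A malformed target file is easy to overlook, since the problem typically shows up only in the Prometheus log. As a rough pre-flight check, the sketch below validates the shape that `file_sd_configs` expects from a JSON file: an array of groups, each with a `targets` list of strings and an optional `labels` object whose values are strings. The function name `check_file_sd` is a hypothetical helper, not part of Prometheus.

```python
import json
from typing import List


def check_file_sd(text: str) -> List[str]:
    """Return a list of problems found in a file_sd JSON document (empty if none)."""
    try:
        groups = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(groups, list):
        return ["top level must be a JSON array of target groups"]

    problems = []
    for i, group in enumerate(groups):
        if not isinstance(group, dict):
            problems.append(f"group {i}: must be a JSON object")
            continue
        targets = group.get("targets")
        if not isinstance(targets, list) or not all(isinstance(t, str) for t in targets):
            problems.append(f"group {i}: 'targets' must be a list of strings")
        labels = group.get("labels", {})
        if not isinstance(labels, dict) or not all(isinstance(v, str) for v in labels.values()):
            problems.append(f"group {i}: 'labels' values must all be strings")
    return problems


if __name__ == "__main__":
    doc = '[{"targets": ["xxxx:9121"], "labels": {"instance": "preproduct-redis-xxx:9121"}}]'
    print(check_file_sd(doc) or "target file looks well-formed")
```

Running it over the contents of `conf.d/redis-discovery.json` before Prometheus reloads the file catches an unbalanced bracket or a numeric label value early.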

Add alerting rules:

[prometheus]# cat /usr/local/prometheus/rules/redis_alert.rules
groups:
- name: RedisStatsAlert
  rules:
  - alert: Redis is down
    expr: redis_up == 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.instance }} Redis is down"
      description: "Redis database is down. This requires immediate action!"
#  - alert: Last RDB save failed
#    expr: redis_rdb_last_bgsave_duration_sec == 0
#    for: 1m
#    labels:
#      severity: warning
#    annotations:
#      summary: "Instance {{ $labels.instance }} rdb_last_bgsave_status"
#      description: "The most recent attempt to create an RDB file failed"
  - alert: master link status (replication link state)
    expr: redis_master_link_up == 0
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} replication link is down"
      description: "redis_master_link_up is 0: the link to the master is down"
#  - alert: Last AOF rewrite failed
#    expr: redis_aof_last_rewrite_duration_sec == 0
#    for: 1m
#    labels:
#      severity: warning
#    annotations:
#      summary: "Instance {{ $labels.instance }} redis aof last rewrite duration sec"
#      description: "The most recent attempt to create an AOF file failed"
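
Both active rules use `for: 1m`, and the server configuration sets `evaluation_interval: 1m`, so it helps to see when an alert would move from pending to firing. The sketch below is a simplified model of that state machine, not Prometheus code: it assumes evaluations happen exactly every `interval_s` seconds and ignores staleness and missed scrapes. The function name `alert_states` is hypothetical.

```python
from typing import List, Optional


def alert_states(expr_true: List[bool], interval_s: int, for_s: int) -> List[str]:
    """Model the alert state after each rule evaluation.

    expr_true[i] says whether the rule expression (for example
    `redis_up == 0`) returned a result at evaluation number i.
    """
    states = []
    active_since: Optional[int] = None  # time the expression first became true
    for i, is_true in enumerate(expr_true):
        now = i * interval_s
        if not is_true:
            # The condition cleared, so the alert resets completely.
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            active_since = now
        # The alert fires once the condition has held for at least `for`.
        states.append("firing" if now - active_since >= for_s else "pending")
    return states


if __name__ == "__main__":
    # Redis down at three consecutive evaluations, 60 s apart, with for: 1m.
    print(alert_states([True, True, True], interval_s=60, for_s=60))
```

Under these assumptions an alert is pending at the first evaluation that sees the failure and fires at the next one, so a notification can lag the outage by up to roughly two evaluation intervals plus the scrape interval.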