Prometheus: Configuring Alerts with Alertmanager
阿新 • Published: 2021-11-18
1 Alertmanager configuration
1.1 Edit the Alertmanager configuration file
root@node-02:~# cat /usr/local/alertmanager/alertmanager.yml
global:
  smtp_from: '[email protected]'
  smtp_smarthost: 'smtp.qq.com:465'
  smtp_auth_username: '[email protected]'
  smtp_auth_password: 'xxxxxxxxx'
  smtp_hello: '@qq.com'
  smtp_require_tls: false
route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5s
  repeat_interval: 1m
  receiver: 'web.hook'
receivers:
  - name: 'web.hook'
    email_configs:
      - to: '[email protected]'
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'critical'
    equal: ['alertname', 'dev', 'instance']
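The route above sets group_by: ['alertname'], so alerts that share an alertname are batched into one notification. The following is a minimal Python sketch of that bucketing idea, not Alertmanager's actual implementation. The instance addresses and the HighLoad alert are made-up sample data.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Bucket alert label sets by the values of the group_by labels.

    A label that is missing from an alert is treated as an empty string.
    """
    groups = defaultdict(list)
    for labels in alerts:
        key = tuple(labels.get(name, "") for name in group_by)
        groups[key].append(labels)
    return dict(groups)


# Hypothetical alerts; only the alertname label matters for grouping here.
alerts = [
    {"alertname": "InstanceDown", "instance": "192.168.174.101:9100", "severity": "critical"},
    {"alertname": "InstanceDown", "instance": "192.168.174.102:9100", "severity": "critical"},
    {"alertname": "HighLoad", "instance": "192.168.174.101:9100", "severity": "warning"},
]

groups = group_alerts(alerts, ["alertname"])
for key, members in groups.items():
    print(key, len(members))
```

With this sample data, the two InstanceDown alerts land in one group and HighLoad in another, so the receiver would get two notifications instead of three.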
1.2 Restart the Alertmanager service
root@node-02:~# systemctl restart alertmanager
2 Prometheus alerting settings
2.1 Modify the Prometheus configuration file
root@prometheus-01:~# cat /usr/local/prometheus/prometheus.yml
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 192.168.174.104:9093
rule_files:
  - "rules/*.yaml"
  - "alert_rules/*.yaml"
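Before relying on Prometheus to deliver alerts, it can help to push a hand-made alert to the Alertmanager target configured above and confirm that the email arrives. The sketch below builds a payload for Alertmanager's v2 API endpoint /api/v2/alerts using only the Python standard library. The instance label test-instance:9100 and the annotation text are placeholders, and post_alerts is only defined here, not called, so the script makes no network request unless you invoke it yourself.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_test_alert(instance="test-instance:9100"):
    """Build a one-alert payload; the instance label is a placeholder."""
    return [{
        "labels": {
            "alertname": "InstanceDown",
            "instance": instance,
            "severity": "critical",
        },
        "annotations": {
            "title": "Instance down",
            "description": "Manual test alert.",
        },
        # RFC 3339 timestamp in UTC
        "startsAt": datetime.now(timezone.utc).isoformat(),
    }]


def post_alerts(alerts, base_url="http://192.168.174.104:9093"):
    """POST the alerts to Alertmanager and return the HTTP status code."""
    req = urllib.request.Request(
        f"{base_url}/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status


payload = build_test_alert()
print(json.dumps(payload, indent=2))
```

Calling post_alerts(payload) from a host that can reach 192.168.174.104:9093 should make the test alert show up in the Alertmanager UI and, after group_wait, trigger an email.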
2.2 Create the alert rule file
root@prometheus-01:~# cat /usr/local/prometheus/alert_rules/instance_down.yaml
groups:
  - name: ALLInstances
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        annotations:
          title: 'Instance down'
          description: 'Instance has been down for more than 1 minute.'
        labels:
          severity: 'critical'
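The for: 1m clause means the alert does not fire the moment up == 0 is first seen. It stays pending until the condition has held for a full minute, and returns to inactive as soon as the target is up again. The sketch below is a simplified Python model of that behaviour, not Prometheus code. The 15-second evaluation interval and the sample values are assumptions for illustration.

```python
def alert_states(samples, hold_seconds=60):
    """Return the alert state at each evaluation.

    samples: list of (timestamp_seconds, up_value) pairs in time order.
    hold_seconds: the rule's `for` duration (1m in the rule above).
    """
    states = []
    active_since = None  # time at which `up == 0` first became true
    for ts, up in samples:
        if up == 0:
            if active_since is None:
                active_since = ts
            held = ts - active_since
            states.append("firing" if held >= hold_seconds else "pending")
        else:
            active_since = None  # condition cleared, so the timer resets
            states.append("inactive")
    return states


# Assumed evaluations every 15 s: the target goes down at t=15, recovers at t=90.
samples = [(0, 1), (15, 0), (30, 0), (45, 0), (60, 0), (75, 0), (90, 1)]
states = alert_states(samples)
for (ts, up), state in zip(samples, states):
    print(f"t={ts:>3}s up={up} -> {state}")
```

In this model the alert is pending from t=15 to t=60 and first fires at t=75, once a full 60 seconds have passed since the condition became true.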
2.3 Validate the rules
root@prometheus-01:~# /usr/local/prometheus/promtool check rules /usr/local/prometheus/alert_rules/instance_down.yaml
Checking /usr/local/prometheus/alert_rules/instance_down.yaml
SUCCESS: 1 rules found
2.4 Restart the Prometheus service
root@prometheus-01:~# systemctl restart prometheus.service
2.5 Stop the node_exporter service to trigger the alert
root@k8s-master-01:~# systemctl stop node-exporter
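Once node_exporter has been stopped for longer than the rule's for duration, the InstanceDown alert should be listed as firing by the Prometheus HTTP API at /api/v1/alerts. The sketch below filters such a response in Python. The base URL http://localhost:9090 is an assumption, since the Prometheus address is not given above. The sample payload only mimics the shape of the API response and is not captured output.

```python
import json
import urllib.request


def firing_alerts(payload, alertname="InstanceDown"):
    """Return the label sets of firing alerts with the given alertname."""
    alerts = payload.get("data", {}).get("alerts", [])
    return [
        a["labels"]
        for a in alerts
        if a.get("state") == "firing"
        and a.get("labels", {}).get("alertname") == alertname
    ]


def fetch_alerts(base_url="http://localhost:9090"):
    """Fetch current alerts from Prometheus (assumed address; not called here)."""
    with urllib.request.urlopen(f"{base_url}/api/v1/alerts", timeout=5) as resp:
        return json.load(resp)


# Illustrative payload shaped like the API response; all values are made up.
sample = {
    "status": "success",
    "data": {
        "alerts": [
            {
                "labels": {"alertname": "InstanceDown", "instance": "node-a:9100", "severity": "critical"},
                "annotations": {"title": "Instance down"},
                "state": "firing",
            },
            {
                "labels": {"alertname": "InstanceDown", "instance": "node-b:9100", "severity": "critical"},
                "annotations": {"title": "Instance down"},
                "state": "pending",
            },
        ]
    },
}

down = firing_alerts(sample)
print([labels["instance"] for labels in down])
```

Against a live server, firing_alerts(fetch_alerts()) would return the instances whose alert has moved from pending to firing. The notification email should follow after Alertmanager's group_wait of 30s.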