Monitoring Envoy with Prometheus Operator
Install a Kubernetes cluster in three steps
Overview
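Prometheus Operator is arguably the best practice for running a monitoring system. It builds the whole monitoring stack in one step and lets you configure things such as monitoring targets non-invasively, and it also provides automatic failure recovery, highly available alerting, and more.
That said, there is still a small learning curve for newcomers. This article uses Envoy monitoring as a worked example to show how to use Prometheus Operator properly.
How to write alerting rules and Prometheus queries is not the focus here and will be covered in later articles. This article concentrates on how to use Prometheus Operator itself.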
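Installing Prometheus Operator
The sealyun offline installation package already includes Prometheus Operator, so it is ready to use as soon as installation finishes.
Configuring monitoring targets
How it works: the operator's CRD is used to discover the Service that exposes the metrics.
Start Envoy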
apiVersion: apps/v1
kind: Deployment
metadata:
  name: envoy
  labels:
    app: envoy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: envoy
  template:
    metadata:
      labels:
        app: envoy
    spec:
      volumes:
      - hostPath:   # the Envoy config file is mounted from the host for convenience
          path: /root/envoy
          type: DirectoryOrCreate
        name: envoy
      containers:
      - name: envoy
        volumeMounts:
        - mountPath: /etc/envoy
          name: envoy
          readOnly: true
        image: envoyproxy/envoy:latest
        ports:
        - containerPort: 10000 # data port
        - containerPort: 9901  # admin port; metrics are exposed on this port
---
kind: Service
apiVersion: v1
metadata:
  name: envoy
  labels:
    app: envoy  # label the Service; the operator looks for the Service by this label
spec:
  selector:
    app: envoy
  ports:
  - protocol: TCP
    port: 80
    targetPort: 10000
    name: user
  - protocol: TCP   # the port on which the Service exposes metrics
    port: 81
    targetPort: 9901
    name: metrics   # the name matters: the ServiceMonitor selects the port by name
Envoy configuration file:
The admin listen address must be changed to 0.0.0.0; otherwise the metrics cannot be fetched through the Service.
/root/envoy/envoy.yaml
admin:
  access_log_path: /tmp/admin_access.log
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0   # this must be 0.0.0.0, not 127.0.0.1
      port_value: 9901
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        protocol: TCP
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.http_connection_manager
        config:
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  host_rewrite: sealyun.com
                  cluster: service_sealyun   # must match the cluster name defined below
          http_filters:
          - name: envoy.router
  clusters:
  - name: service_sealyun
    connect_timeout: 0.25s
    type: LOGICAL_DNS
    # Comment out the following line to test on v6 networks
    dns_lookup_family: V4_ONLY
    lb_policy: ROUND_ROBIN
    hosts:
    - socket_address:
        address: sealyun.com
        port_value: 443
    tls_context: { sni: sealyun.com }
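One way to confirm that the admin address change took effect is to fetch the metrics through the Service from inside the cluster. The helper below is only a sketch under assumptions that are not part of the original setup: the function name `check_envoy_metrics`, the throwaway pod name `metrics-check`, and the `busybox` image are illustrative, and the Service is assumed to live in the `default` namespace as in this article.

```shell
# Fetch Envoy's Prometheus metrics through the Service (port 81 -> admin port 9901)
# from a temporary pod that is removed when the command finishes.
# Pod name and image are illustrative assumptions.
check_envoy_metrics() {
  kubectl run metrics-check --rm -i --restart=Never --image=busybox -- \
    wget -qO- http://envoy.default.svc:81/stats/prometheus
}
```

Running `check_envoy_metrics` against a working setup should print metric lines in Prometheus text format. A connection failure would suggest the admin listener is still bound to 127.0.0.1.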
Using a ServiceMonitor
envoyServiceMonitor.yaml:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: envoy
  name: envoy
  namespace: monitoring  # does not have to be the same namespace as the Service
spec:
  endpoints:
  - interval: 15s
    port: metrics            # port name on the envoy Service
    path: /stats/prometheus  # metrics path
  namespaceSelector:
    matchNames:              # namespace where the envoy Service lives
    - default
  selector:
    matchLabels:
      app: envoy             # selects the envoy Service
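The manifest above can be created with ordinary kubectl commands. The wrapper below is a minimal sketch; the function name `apply_envoy_servicemonitor` is hypothetical, and it assumes the manifest was saved as `envoyServiceMonitor.yaml` in the current directory.

```shell
# Create (or update) the ServiceMonitor, then confirm the object exists
# in the monitoring namespace.
apply_envoy_servicemonitor() {
  kubectl apply -f envoyServiceMonitor.yaml &&
    kubectl get servicemonitor envoy -n monitoring
}
```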
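Once it has been created, the Envoy target shows up in Prometheus:
and the metrics become visible:
From there you can set up dashboards in Grafana. How to use Prometheus itself is outside the scope of this article.
Alerting configuration
Alertmanager configuration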
kubectl get secret -n monitoring
NAME                TYPE     DATA   AGE
alertmanager-main   Opaque   1      27d
There is a Secret named alertmanager-main. Let's look at what it contains:
kubectl get secret alertmanager-main -o yaml -n monitoring
apiVersion: v1
data:
  alertmanager.yaml: Imdsb2JhbCI6IAogICJyZXNvbHZlX3RpbWVvdXQiOiAiNW0iCiJyZWNlaXZlcnMiOiAKLSAibmFtZSI6ICJudWxsIgoicm91dGUiOiAKICAiZ3JvdXBfYnkiOiAKICAtICJqb2IiCiAgImdyb3VwX2ludGVydmFsIjogIjVtIgogICJncm91cF93YWl0IjogIjMwcyIKICAicmVjZWl2ZXIiOiAibnVsbCIKICAicmVwZWF0X2ludGVydmFsIjogIjEyaCIKICAicm91dGVzIjogCiAgLSAibWF0Y2giOiAKICAgICAgImFsZXJ0bmFtZSI6ICJEZWFkTWFuc1N3aXRjaCIKICAgICJyZWNlaXZlciI6ICJudWxsIg==
kind: Secret
Decoding the base64 value gives:
"global":
  "resolve_timeout": "5m"
"receivers":
- "name": "null"
"route":
  "group_by":
  - "job"
  "group_interval": "5m"
  "group_wait": "30s"
  "receiver": "null"
  "repeat_interval": "12h"
  "routes":
  - "match":
      "alertname": "DeadMansSwitch"
    "receiver": "null"
So configuring Alertmanager is very simple: just create a Secret that holds the configuration.
A sample alertmanager.yaml:
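Before replacing the Secret, it can be handy to fetch and decode the current configuration in one step. The helper below is a sketch: the function name `show_secret_key` is hypothetical, and it assumes a `base64` that accepts `-d` (GNU coreutils; some older macOS versions use `-D` instead).

```shell
# Print the decoded value of one key in a Secret.
# Usage: show_secret_key <secret> <namespace> <key>
# Dots inside the key must be escaped for jsonpath, e.g. 'alertmanager\.yaml'.
show_secret_key() {
  kubectl get secret "$1" -n "$2" -o "jsonpath={.data.$3}" | base64 -d
}
```

For the Secret discussed here, the call would be `show_secret_key alertmanager-main monitoring 'alertmanager\.yaml'`.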
global:
  smtp_smarthost: 'smtp.qq.com:465'
  smtp_from: '[email protected]'
  smtp_auth_username: '[email protected]'
  smtp_auth_password: 'xxx'  # authorization code generated when SMTP is enabled; see the mailbox section below
  smtp_require_tls: false
route:
  group_by: ['alertmanager','cluster','service']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 3h
  receiver: 'fanux'
  routes:
  - receiver: 'fanux'
receivers:
- name: 'fanux'
  email_configs:
  - to: '[email protected]'
    send_resolved: true
Delete the old Secret and recreate it from your own configuration:
kubectl delete secret alertmanager-main -n monitoring
kubectl create secret generic alertmanager-main --from-file=alertmanager.yaml -n monitoring
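As an alternative to delete-then-create, the Secret can be regenerated and applied in place, which avoids a window in which the Secret does not exist. This is a sketch rather than the article's original method: the function name `replace_alertmanager_config` is hypothetical, and `--dry-run=client` assumes kubectl 1.18 or newer (older releases spell it `--dry-run`).

```shell
# Render the Secret manifest locally from the given file, then apply it
# over the existing alertmanager-main Secret.
# Usage: replace_alertmanager_config <path-to-alertmanager.yaml>
replace_alertmanager_config() {
  kubectl create secret generic alertmanager-main \
    --from-file=alertmanager.yaml="$1" -n monitoring \
    --dry-run=client -o yaml | kubectl apply -n monitoring -f -
}
```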
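Mailbox configuration, using QQ Mail as an example
Enable the SMTP/POP3 service
Follow the on-screen steps. At the end a dialog shows an authorization code; put that code into the configuration file above.
After that, alerts start arriving by email: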
Alerting rule configuration
Prometheus Operator defines a custom PrometheusRule CRD to describe alerting rules.
kubectl get PrometheusRule -n monitoring
NAME                   AGE
prometheus-k8s-rules   6m
Editing this rule directly is enough (you can also create a separate PrometheusRule of your own):
kubectl edit PrometheusRule prometheus-k8s-rules -n monitoring
For example, add an alert as a new group:
spec:
  groups:
  - name: ./example.rules
    rules:
    - alert: ExampleAlert
      expr: vector(1)
  - name: k8s.rules
    rules:
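Instead of editing the bundled rule, the same example alert can live in a PrometheusRule of its own. The sketch below makes several assumptions that are not stated in this article: the function name `create_example_rule` and the object name `envoy-example-rules` are hypothetical, and the labels `prometheus: k8s` and `role: alert-rules` are assumed to match the `ruleSelector` of the Prometheus resource. This can be checked with `kubectl get prometheus -n monitoring -o yaml`.

```shell
# Create a standalone PrometheusRule that carries the example alert.
# The labels must match the Prometheus resource's ruleSelector,
# otherwise the rule is ignored (label values here are assumptions).
create_example_rule() {
  kubectl apply -n monitoring -f - <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: envoy-example-rules
  labels:
    prometheus: k8s
    role: alert-rules
spec:
  groups:
  - name: ./example.rules
    rules:
    - alert: ExampleAlert
      expr: vector(1)
EOF
}
```

Keeping custom alerts in a separate object means they are not mixed into the rules shipped with the installation.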
Restart the Prometheus pods:
kubectl delete pod prometheus-k8s-0 prometheus-k8s-1 -n monitoring
The new rule then shows up in the web UI:
For discussion, join QQ group 98488045.