How to Implement Prometheus Monitoring for Kubernetes (k8s)
Kubernetes is an open-source container orchestration tool developed by Google. As the de facto standard in the container field, it greatly simplifies the deployment and management of applications. For monitoring both the Kubernetes components themselves and the many containers running on a Kubernetes cluster, Prometheus provides an excellent solution.
Prometheus is an open-source monitoring system. Driven by its configured jobs, it periodically scrapes metrics from specified targets and stores the data as time series, either locally (the default) or on remote storage, enabling real-time analysis of the system's state and providing a basis for performance tuning. Its main features include:
- A multi-dimensional data model.
- A flexible query language that lets users query and aggregate the collected time-series data in real time.
- Time-series data is collected over HTTP in pull mode; push is also supported through an intermediary gateway.
- Target services are discovered via service discovery or static configuration.
- Each server runs standalone, with no external dependencies.
Its technical architecture is shown in the figure below:
Core components
Prometheus Server: fetches, stores, and queries the data according to its configuration.
Exporters: the collective name for Prometheus's data-collection components. An exporter provides a collection endpoint that exposes an existing third-party service's metrics to Prometheus: it gathers data from the target and converts it into a format Prometheus supports. Unlike traditional collection agents, an exporter does not actively send data; it waits for the Prometheus Server to come and scrape it.
Push Gateway: a component provided for push-style scenarios; monitoring data is first pushed to the Push Gateway, and the Prometheus Server then scrapes it from there.
AlertManager: the Prometheus Server supports alerting rules written in PromQL; whenever a rule's expression is satisfied an alert is fired, and AlertManager then handles it (deduplicating, grouping, and routing the notifications).
Data model
Prometheus stores data as time series; each time series is uniquely identified by its metric name together with a set of key-value label pairs.
Format:
<metric name>{<label name>=<label value>, …}
Labels: they give the same metric different dimensional identities. For example, a metric named http_requests_total with the labels code="200", handler="prometheus", instance="node2", job="kubernetes-nodes", method="get" is written as:
http_requests_total{code="200",handler="prometheus",instance="node2",job="kubernetes-nodes",method="get"}
Samples: the actual time-series data; each sample consists of a 64-bit floating-point value and a millisecond-precision timestamp.
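To make the model concrete, here is a hedged PromQL illustration (the exact series available depend on what your cluster exposes): the query filters by label, then aggregates away the remaining dimensions.
# per-second request rate over the last 5 minutes, summed per status code
sum by (code) (rate(http_requests_total{job="kubernetes-nodes"}[5m]))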
The metrics we care about fall into two groups: status metrics and performance metrics.
Performance metrics
These mainly cover CPU, memory, load, disk, and network. Concretely:
- Performance metrics for containers and Pods
- Performance metrics for the host nodes themselves
- Node-level metrics related to the k8s cluster
- Network performance data for applications on k8s
Status metrics
- Run-state metrics for k8s resource objects (Deployment, DaemonSet, Pod, etc.)
- Run-state metrics for k8s platform components (kube-apiserver, kube-scheduler, etc.)
The mainstream monitoring approach is: collect metrics across the different dimensions with the various exporters, expose them in the data format Prometheus supports, have Prometheus pull the data on a schedule and display it with Grafana, and raise alerts for abnormal situations with AlertManager.
The overall collection scheme is:
- cadvisor collects the performance metrics of containers and Pods.
- node-exporter collects the performance metrics of the host nodes themselves.
- kube-state-metrics collects the (health) status metrics of k8s resource objects and k8s components.
- The kubelet's own endpoint provides the node-level metrics related to the k8s cluster.
- blackbox-exporter collects the applications' network performance (http, tcp, etc.) data.
Implementing Prometheus monitoring of k8s breaks down into the following steps (this assumes a k8s environment is already in place).
1.1 Create the namespace
First we create a namespace called monitoring; everything that follows will be installed into this namespace.
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
Create command: kubectl create -f namespace.yaml
Correspondingly, if you change a yaml file you must run kubectl delete -f namespace.yaml and then create it again for the change to take effect.
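A quick optional sanity check that the namespace now exists:
kubectl get namespace monitoring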
1.2 Create node-exporter
The node-exporter probe collects the host's performance metrics, such as memory usage. Its default port is 9100 and its metric names are prefixed with node_ (note: not every metric starting with node_ comes from node-exporter).
Edit the node-exporter-service.yaml file.
The annotation prometheus.io/scrape: 'true' defined on the Service marks it as something Prometheus should discover and scrape.
When monitoring a whole cluster, the node-exporter image must be present on every host.
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: 'true'
  name: prometheus-node-exporter
  namespace: monitoring
  labels:
    app: prometheus
    component: node-exporter
spec:
  clusterIP: None
  ports:
  - name: prometheus-node-exporter
    port: 9100
    protocol: TCP
  selector:
    app: prometheus
    component: node-exporter
  type: ClusterIP
Edit the node-exporter-daemonset.yaml file.
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: prometheus-node-exporter
  namespace: monitoring
  labels:
    app: prometheus
    component: node-exporter
spec:
  template:
    metadata:
      name: prometheus-node-exporter
      labels:
        app: prometheus
        component: node-exporter
    spec:
      containers:
      - image: yourip/prometheus/node-exporter:latest
        name: prometheus-node-exporter
        ports:
        - name: prom-node-exp
          containerPort: 9100
          hostPort: 9100
      hostNetwork: true
      hostPID: true
Create command:
kubectl create -f node-exporter-service.yaml -f node-exporter-daemonset.yaml
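As an optional check (not part of the original steps; <node-ip> is a placeholder for one of your node IPs), confirm the DaemonSet pods are up and the endpoint answers:
kubectl -n monitoring get pods -l component=node-exporter -o wide
curl -s http://<node-ip>:9100/metrics | grep '^node_load'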
1.3 Create kube-state-metrics
Edit the state-metrics-deployment.yaml file.
kube-state-metrics collects the health/status metrics of K8S resource objects and K8S components, chiefly the state of the various resource objects on the Kubernetes cluster such as Pods, DaemonSets, Deployments, and Jobs. The metric names it produces are prefixed with kube_, and its default port is 8080.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kube-state-metrics
  namespace: monitoring
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: kube-state-metrics
    spec:
      serviceAccountName: kube-state-metrics
      containers:
      - name: kube-state-metrics
        image: yourip/prometheus/kube-state-metrics:latest
        ports:
        - containerPort: 8080
Edit the state-metrics-service.yaml file.
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: 'true'
  name: kube-state-metrics
  namespace: monitoring
  labels:
    app: kube-state-metrics
spec:
  ports:
  - name: kube-state-metrics
    port: 8080
    protocol: TCP
  selector:
    app: kube-state-metrics
Create command:
kubectl create -f state-metrics-deployment.yaml -f state-metrics-service.yaml
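Optionally, verify the service answers; this sketch assumes a kubectl recent enough to port-forward a Service:
kubectl -n monitoring port-forward svc/kube-state-metrics 8080:8080 &
curl -s http://localhost:8080/metrics | grep '^kube_deployment'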
1.4 Create node-directory-size-metrics (optional, can be skipped)
Edit the node-directory-size-metrics-daemonset.yaml file.
It is mainly used to read directories on a node and produce disk-usage metric data. Note that it only watches disk usage under the /mnt directory, so it can track changes of a particular directory on a machine provided that directory is mounted under /mnt.
Its series can be selected with {app="node-directory-size-metrics"}. Its default port is 9102, and its metric names are also prefixed with node_.
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: node-directory-size-metrics
  namespace: monitoring
  annotations:
    description: |
      This `DaemonSet` provides metrics in Prometheus format about disk usage on the nodes.
      The container `read-du` reads in sizes of all directories below /mnt and writes that to `/tmp/metrics`. It only reports directories larger than `100M` for now.
      The other container `caddy` just hands out the contents of that file on request via `http` on `/metrics` at port `9102` which are the defaults for Prometheus.
      These are scheduled on every node in the Kubernetes cluster.
      To choose directories from the node to check, just mount them on the `read-du` container below `/mnt`.
spec:
  template:
    metadata:
      labels:
        app: node-directory-size-metrics
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '9102'
        description: |
          This `Pod` provides metrics in Prometheus format about disk usage on the node.
          The container `read-du` reads in sizes of all directories below /mnt and writes that to `/tmp/metrics`. It only reports directories larger than `100M` for now.
          The other container `caddy` just hands out the contents of that file on request on `/metrics` at port `9102` which are the defaults for Prometheus.
          This `Pod` is scheduled on every node in the Kubernetes cluster.
          To choose directories from the node to check just mount them on `read-du` below `/mnt`.
    spec:
      containers:
      - name: read-du
        image: yourip/prometheus/tiny-tools:latest
        imagePullPolicy: Always
        # FIXME threshold via env var
        command:
        - fish
        - --command
        - |
          touch /tmp/metrics-temp
          while true
            for directory in (du --bytes --separate-dirs --threshold=100M /mnt)
              echo $directory | read size path
              echo "node_directory_size_bytes{path=\"$path\"} $size" \
                >> /tmp/metrics-temp
            end
            mv /tmp/metrics-temp /tmp/metrics
            sleep 300
          end
        volumeMounts:
        - name: host-fs-var
          mountPath: /mnt/var
          readOnly: true
        - name: metrics
          mountPath: /tmp
      - name: caddy
        image: yourip/prometheus/caddy:latest
        command:
        - "caddy"
        - "-port=9102"
        - "-root=/var/www"
        ports:
        - containerPort: 9102
        volumeMounts:
        - name: metrics
          mountPath: /var/www
      volumes:
      - name: host-fs-var
        hostPath:
          path: /var
      - name: metrics
        emptyDir:
          medium: Memory
Create command:
kubectl create -f node-directory-size-metrics-daemonset.yaml
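To inspect its output directly (optional; the pod exposes 9102 only on its pod IP, so run the curl from somewhere that can reach pod IPs, and treat <pod-ip> as a placeholder):
kubectl -n monitoring get pods -l app=node-directory-size-metrics -o wide
curl -s http://<pod-ip>:9102/metrics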
1.5 Create the configmap
Edit the configmap.yaml file.
The five categories configured in the ConfigMap are nodes, endpoints, services, pods, and cadvisor; these are the five (job) types shown on the targets page of the Prometheus web UI. The part of the ConfigMap that actually takes effect in the Prometheus configuration is the content under the prometheus.yaml key:
apiVersion: v1
data:
  prometheus.yaml: |
    global:
      scrape_interval: 10s
      scrape_timeout: 10s
      evaluation_interval: 10s
    alerting:
      alertmanagers:
      - static_configs:
        - targets: ["alertmanager:9093"]
    rule_files:
      - "/etc/prometheus/rules/test.yml"
    scrape_configs:
    # https://github.com/prometheus/prometheus/blob/master/documentation/examples/prometheus-kubernetes.yml#L37
    - job_name: 'kubernetes-nodes'
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:10255'
        target_label: __address__
    # https://github.com/prometheus/prometheus/blob/master/documentation/examples/prometheus-kubernetes.yml#L79
    - job_name: 'kubernetes-endpoints'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: (.+)(?::\d+);(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name
    # https://github.com/prometheus/prometheus/blob/master/documentation/examples/prometheus-kubernetes.yml#L119
    - job_name: 'kubernetes-services'
      metrics_path: /probe
      params:
        module: [http_2xx]
      kubernetes_sd_configs:
      - role: service
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
        action: keep
        regex: true
      - source_labels: [__address__]
        target_label: __param_target
      - target_label: __address__
        replacement: blackbox
      - source_labels: [__param_target]
        target_label: instance
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        target_label: kubernetes_name
    # https://github.com/prometheus/prometheus/blob/master/documentation/examples/prometheus-kubernetes.yml#L156
    - job_name: 'kubernetes-pods'
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: (.+):(?:\d+);(\d+)
        replacement: ${1}:${2}
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name
      - source_labels: [__meta_kubernetes_pod_container_port_number]
        action: keep
        regex: 9\d{3}
    - job_name: 'kubernetes-cadvisor'
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
      metric_relabel_configs:
      - action: replace
        source_labels: [id]
        regex: '^/machine\.slice/machine-rkt\\x2d([^\\]+)\\.+/([^/]+)\.service$'
        target_label: rkt_container_name
        replacement: '${2}-${1}'
      - action: replace
        source_labels: [id]
        regex: '^/system\.slice/(.+)\.service$'
        target_label: systemd_service_name
        replacement: '${1}'
    - job_name: 'discovery-node'
      file_sd_configs:
      - files: ['/etc/prometheus/test_sd_config/*.yml']
        refresh_interval: 5s
    - job_name: 'consul-prometheus'
      consul_sd_configs:
      - server: '10.4.**.**:8500'
        services: []
      relabel_configs:
      - source_labels: ['__meta_consul_service']
        regex: .*node.*
        action: keep
kind: ConfigMap
metadata:
  creationTimestamp: null
  name: prometheus-core
  namespace: monitoring
Create command:
kubectl create -f configmap.yaml
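If you have promtool locally, you can lint the scrape configuration before loading it (a hedged extra step: it assumes the prometheus.yaml block above was first saved to a standalone file, and note that promtool also tries to resolve rule_files paths, so expect complaints unless those files exist locally):
promtool check config prometheus.yaml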
1.6 Create prometheus-core
Edit deployment.yaml.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: prometheus-core
  namespace: monitoring
  labels:
    app: prometheus
    component: core
spec:
  replicas: 1
  template:
    metadata:
      name: prometheus-main
      labels:
        app: prometheus
        component: core
    spec:
      serviceAccountName: prometheus-k8s
      containers:
      - name: prometheus
        image: 10.4.41.221/prometheus/prometheus:latest
        args: ['--config.file=/etc/prometheus/prometheus.yaml','--storage.tsdb.path=/prometheus/data/','--storage.tsdb.retention=1d']
        ports:
        - name: webui
          containerPort: 9090
        resources:
          requests:
            cpu: 500m
            memory: 500M
          limits:
            cpu: 500m
            memory: 500M
        volumeMounts:
        - name: config-volume
          mountPath: /etc/prometheus
        - name: rules-volume
          mountPath: /etc/prometheus/rules
        - name: discovery-volume
          mountPath: /etc/prometheus/test_sd_config
      volumes:
      - name: config-volume
        configMap:
          name: prometheus-core
      - name: rules-volume
        configMap:
          name: prometheus-rules
      - name: discovery-volume
        configMap:
          name: prometheus-discovery
Create command:
kubectl create -f deployment.yaml
Note: in args: ['--config.file=/etc/prometheus/prometheus.yaml'], every flag must be prefixed with --. If you write only a single -, it fails with:
Error parsing commandline arguments: unknown short flag '-c'
prometheus: error: unknown short flag '-c'
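Once the pod is running, an optional way to confirm Prometheus sees healthy targets without exposing a port (assumes a kubectl recent enough to port-forward a Deployment):
kubectl -n monitoring port-forward deploy/prometheus-core 9090:9090 &
curl -s http://localhost:9090/api/v1/targets | grep -o '"health":"[a-z]*"' | sort | uniq -c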
1.7 Create the prometheus service
Edit the service.yaml file.
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: monitoring
  labels:
    app: prometheus
    component: core
  annotations:
    prometheus.io/scrape: 'true'
spec:
  type: NodePort
  ports:
  - port: 9090
    protocol: TCP
    name: webui
  selector:
    app: prometheus
    component: core
Create command:
kubectl create -f service.yaml
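The NodePort that gets assigned can also be read directly with jsonpath (optional, handy for scripting):
kubectl -n monitoring get svc prometheus -o jsonpath='{.spec.ports[0].nodePort}'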
1.8 Create rbac
Review the rbac.yaml file.
If this file is wrong, everything can look up and running in the k8s web UI while the targets page of the Prometheus web UI stays completely empty.
rbac grants the service account in the namespace created above read access to the cluster, so that Prometheus can fetch the cluster's resource metrics through the Kubernetes API.
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus-k8s
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["get"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-k8s
  namespace: monitoring
Create command:
kubectl create -f rbac.yaml
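To verify the binding took effect, you can impersonate the service account (kubectl auth can-i accepts a service-account principal via --as); both commands should print yes:
kubectl auth can-i list nodes --as=system:serviceaccount:monitoring:prometheus-k8s
kubectl auth can-i list pods --as=system:serviceaccount:monitoring:prometheus-k8s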
1.9 Create the alerting rules
apiVersion: v1
data:
  test.yml: |
    groups:
    - name: goroutines_monitoring
      rules:
      - alert: TooMuchGoroutines
        expr: go_goroutines > 20
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "too much goroutines of job prometheus."
          description: "testing"
kind: ConfigMap
metadata:
  creationTimestamp: null
  name: prometheus-rules
  namespace: monitoring
Create command (assuming the ConfigMap above is saved as rules-configmap.yaml):
kubectl create -f rules-configmap.yaml
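If you keep the rule body in a local test.yml, promtool can validate it before it reaches the cluster (assumes promtool is installed):
promtool check rules test.yml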
1.10 Open the Prometheus web UI
Look up the port Prometheus exposes; here it turns out to be 32075:
kubectl -n monitoring get svc
The IP to use here is not the cluster IP but the IP of the host where Prometheus is deployed, so the address is http://yourip:32075/targets
Note 1: if the address cannot be reached, check the logs in the k8s UI to track down the error.
Note 2: after the Prometheus page becomes reachable, the node targets report the error:
Get http://10.4.41.161:10255/metrics: dial tcp ip:10255: connect: connection refused
Fix: enable the kubelet read-only port on each node and restart kubelet:
cd /etc/kubernetes/
vi kubelet.env
# add or modify this flag
--read-only-port=10255 \
service kubelet restart
Note 3: while monitoring the cluster, docker ps -a may show no /prometheus/prometheus (and similar) containers on node1; check node2 in that case, as the cluster may have scheduled them onto another node.
1.11 Configure AlertManager
kind: ConfigMap
apiVersion: v1
metadata:
  name: alertmanager
  namespace: monitoring
data:
  config.yml: |-
    global:
      resolve_timeout: 1m
      smtp_smarthost: 'smtp.neusoft.com:587'
      smtp_from: 'mail.neusoft.com'
      smtp_auth_username: '[email protected]'
      smtp_auth_password: 'BABYNJSWDw1'
    templates:
    - '/etc/alertmanager-templates/*.tmpl'
    route:
      group_by: ['alertname']
      group_wait: 30s
      group_interval: 30s
      repeat_interval: 1m
      receiver: 'webhook'
    receivers:
    - name: 'webhook'
      webhook_configs:
      - url: '[email protected]'
    inhibit_rules:
    - source_match:
        severity: 'critical'
      target_match:
        severity: 'warning'
      equal: ['alertname', 'dev', 'instance']
Create command: kubectl create -f configmap.yaml
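If amtool is available, it can validate this configuration before the Deployment loads it (a hedged extra step assuming the config.yml block above is saved locally):
amtool check-config config.yml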
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: alertmanager
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: alertmanager
  template:
    metadata:
      name: alertmanager
      labels:
        app: alertmanager
    spec:
      containers:
      - name: alertmanager
        image: 10.4.41.221/prometheus/alertmanager:latest
        args: ['--config.file=/etc/alertmanager/config.yml','--storage.path=/alertmanager']
        ports:
        - name: alertmanager
          containerPort: 9093
        volumeMounts:
        - name: config-volume
          mountPath: /etc/alertmanager
        - name: templates-volume
          mountPath: /etc/alertmanager-templates
        - name: alertmanager
          mountPath: /alertmanager
      volumes:
      - name: config-volume
        configMap:
          name: alertmanager
      - name: templates-volume
        configMap:
          name: alertmanager-templates
      - name: alertmanager
        emptyDir: {}
Create command: kubectl create -f deployment.yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/path: '/metrics'
  labels:
    name: alertmanager
  name: alertmanager
  namespace: monitoring
spec:
  selector:
    app: alertmanager
  type: NodePort
  ports:
  - name: alertmanager
    protocol: TCP
    port: 9093
    targetPort: 9093
Create command: kubectl create -f service.yaml
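As an optional end-to-end test, a fake alert can be pushed straight to AlertManager's v1 API (<node-ip> and <nodeport> are placeholders; the v1 endpoint applies to the older AlertManager releases used in this write-up):
curl -XPOST http://<node-ip>:<nodeport>/api/v1/alerts -d '[{"labels":{"alertname":"ManualTest","severity":"warning"}}]'
The alert should then appear on the AlertManager page described in the next section.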
1.12 Open the AlertManager web UI
The IP is that of the host where AlertManager is deployed, and the port is the one found with kubectl -n monitoring get svc, here 31555. Accessing it directly through 9093 will not work.
The figure below shows the Alerts page in Prometheus.
The figure below shows the alert page in AlertManager.
======================================================
About cAdvisor
cAdvisor is an open-source container monitoring tool from Google. It collects container-level performance metrics on a host, and pod-level metrics can be further computed from the container metrics.
The main metric families cAdvisor provides include:
container_cpu_*
container_fs_*
container_memory_*
container_network_*
container_spec_*(cpu/memory)
container_start_time_*
container_tasks_state_*
cAdvisor is now built into the kubelet component, so on every node of a Kubernetes cluster running kubelet you can fetch performance metrics for all containers on that node from the cAdvisor metrics endpoint. The monitoring data cAdvisor exposes is already output in Prometheus's exposition format, i.e. it matches the data model Prometheus expects.
Before kubelet 1.7.3, cAdvisor's metrics were merged into the kubelet's own metrics; from 1.7.3 onward they were split out of the kubelet metrics, which is why Prometheus scrapes them as two separate jobs.
The cAdvisor-related part of configmap.yaml configures Prometheus to pull cAdvisor's metrics periodically; Prometheus reaches the cAdvisor metrics through the apiserver's proxy API.
Note: the data scraped from the metrics endpoint exposed by NodeExporter is likewise output in Prometheus's format, and the same holds for the other exporters.
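As a hedged example of putting these series to use, pod-level CPU usage can be derived from the container metrics (label names such as pod_name and container_name vary across kubelet/cAdvisor versions; adjust them to what your cluster actually exposes):
# CPU cores consumed per pod over the last 5 minutes, excluding the pause container
sum by (pod_name) (rate(container_cpu_usage_seconds_total{container_name!="POD"}[5m]))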
Further reading:
1. The difference between cadvisor and kubelet:
https://www.cnblogs.com/aguncn/p/9929684.html
2. cAdvisor in depth (section 3.1 is the key part):
https://blog.csdn.net/liukuan73/article/details/78881008
======================================================
Vocabulary notes:
spec — specification
metadata — data describing the object itself
DaemonSet — a per-node daemon workload
regex — regular expression
annotation — a note attached to an object
timeout — the operation exceeded its time limit
=======================================================
Notes:
1. An instance is the target endpoint that data is scraped from, usually corresponding to a single process; a job is a group of instances that serve the same function or purpose.
2. To list the names, status, and other details of the pods in the monitoring namespace:
kubectl get pod -n monitoring
With a pod name from the command above, you can then view that pod's logs:
kubectl logs prometheus-core-69f86f78d7-xgclc -n monitoring
=======================================================
References:
1. Primary references (essential reading):
https://blog.csdn.net/wenwst/article/details/76624019
https://blog.csdn.net/liukuan73/article/details/78881008
2. Primary GitHub reference:
https://github.com/giantswarm/kubernetes-prometheus/tree/master/manifests/prometheus
3. Prometheus query basics (official docs):
https://prometheus.io/docs/prometheus/latest/querying/basics/
4. Example configuration on GitHub:
https://github.com/prometheus/prometheus/blob/master/documentation/examples/prometheus-kubernetes.yml