Kubernetes K8S: kube-prometheus Overview and Deployment
Host Configuration Plan

| Hostname | OS Version | Specs | Internal IP | External IP (simulated) |
|---|---|---|---|---|
| k8s-master | CentOS 7.7 | 2C/4G/20G | 172.16.1.110 | 10.0.0.110 |
| k8s-node01 | CentOS 7.7 | 2C/4G/20G | 172.16.1.111 | 10.0.0.111 |
| k8s-node02 | CentOS 7.7 | 2C/4G/20G | 172.16.1.112 | 10.0.0.112 |
Prometheus Overview
Prometheus is an open-source systems monitoring and alerting toolkit. Since its inception in 2012, it has been adopted by many companies and organizations. It is now a standalone open-source project, maintained independently of any company. In 2016, Prometheus joined the Cloud Native Computing Foundation (CNCF) as its second hosted project, after Kubernetes.
Prometheus also performs well enough to support clusters on the scale of ten thousand nodes.
Key Features of Prometheus
- A multi-dimensional data model
- A flexible query language (PromQL)
- No reliance on distributed storage; single server nodes are autonomous
- Time series are collected via an HTTP pull model
- Pushing time series is supported via an intermediary gateway
- Targets are discovered through service discovery or static configuration
- Rich support for graphs and dashboards, such as Grafana
Architecture Diagram
How It Works
Prometheus works by periodically scraping the state of monitored components over HTTP. Any component can be brought under monitoring simply by exposing a suitable HTTP endpoint; no SDK or other integration work is required.
This makes it a good fit for monitoring virtualized environments such as VMs, Docker, and Kubernetes. An HTTP endpoint that exposes a monitored component's metrics is called an exporter. Exporters already exist for most of the components in common use at internet companies, for example Varnish, HAProxy, Nginx, MySQL, and Linux system information (disk, memory, CPU, network, and so on).
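As a quick, hedged illustration of what a scrape sees: the Prometheus server deployed later in this article serves its own metrics at /metrics in the same text exposition format as any exporter (the NodePort 30200 is configured below; the sample output lines are representative, not captured from this cluster):

```bash
# Fetch a /metrics endpoint by hand; Prometheus does the same thing on a schedule.
# Each line is one sample; the {label="value"} pairs are the multi-dimensional data model.
curl -s http://172.16.1.110:30200/metrics | head
# Representative output (names and values vary):
#   go_goroutines 135
#   prometheus_http_requests_total{code="200",handler="/metrics"} 42
```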
The Three Main Prometheus Components
- Server: responsible for scraping and storing data, and provides the PromQL query language.
- Alertmanager: the alert manager, responsible for delivering alerts.
- Push Gateway: an intermediary gateway through which short-lived jobs can push their metrics (a hedged push example follows this list).
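A sketch of the push flow, assuming a Pushgateway reachable at pushgateway:9091 (the hostname is a placeholder; the /metrics/job/&lt;name&gt; path is the standard Pushgateway push API):

```bash
# A short-lived batch job pushes its result to the Pushgateway before exiting;
# Prometheus then scrapes the gateway on its regular schedule.
echo "nightly_backup_last_success_timestamp $(date +%s)" | \
  curl --data-binary @- http://pushgateway:9091/metrics/job/nightly_backup
```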
Workflow
- The Prometheus daemon periodically scrapes metrics from its targets; each target must expose an HTTP endpoint for it to scrape. Targets can be specified via configuration files, text files, Zookeeper, Consul, DNS SRV lookups, and more. Prometheus monitors with a PULL model: the server either pulls data from targets directly or, indirectly, via an intermediary gateway that targets push to (a minimal configuration sketch follows this list).
- Prometheus stores all scraped data locally, cleans and aggregates it according to configured rules, and records the results as new time series.
- Prometheus visualizes the collected data through PromQL and other APIs. Many charting options are supported, such as Grafana, the bundled Promdash, and its own template engine. It also provides an HTTP query API for producing custom output.
- PushGateway lets clients push metrics to it actively; Prometheus simply scrapes the gateway on a schedule.
- Alertmanager is a component independent of Prometheus. It works with alerting rules written as Prometheus query expressions and provides very flexible alert delivery options.
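For reference, a minimal sketch of a static scrape configuration as it would appear in a hand-written prometheus.yml (the target is the node-exporter port on k8s-master from the plan above; kube-prometheus generates this configuration for you, so this is purely illustrative):

```yaml
# prometheus.yml -- minimal static target definition (illustrative only)
global:
  scrape_interval: 15s        # how often targets are pulled
scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['172.16.1.110:9100']   # any HTTP endpoint exposing /metrics
```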
Deploying kube-prometheus
The kube-prometheus GitHub repository:
https://github.com/coreos/kube-prometheus/
For this deployment we use the release-0.2 version (v0.2.0) rather than a newer release.
Downloading kube-prometheus and Modifying the Configuration
Download
```
[root@k8s-master prometheus]# pwd
/root/k8s_practice/prometheus
[root@k8s-master prometheus]#
[root@k8s-master prometheus]# wget https://github.com/coreos/kube-prometheus/archive/v0.2.0.tar.gz
[root@k8s-master prometheus]# tar xf v0.2.0.tar.gz
[root@k8s-master prometheus]# ll
total 432
drwxrwxr-x 10 root root   4096 Sep 13  2019 kube-prometheus-0.2.0
-rw-r--r--  1 root root 200048 Jul 19 11:41 v0.2.0.tar.gz
```
Configuration Changes
```
# current directory
[root@k8s-master manifests]# pwd
/root/k8s_practice/prometheus/kube-prometheus-0.2.0/manifests
[root@k8s-master manifests]#
# change 1
[root@k8s-master manifests]# vim grafana-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  type: NodePort     # added
  ports:
  - name: http
    port: 3000
    targetPort: http
    nodePort: 30100  # added
  selector:
    app: grafana
[root@k8s-master manifests]#
# change 2
[root@k8s-master manifests]# vim prometheus-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    prometheus: k8s
  name: prometheus-k8s
  namespace: monitoring
spec:
  type: NodePort     # added
  ports:
  - name: web
    port: 9090
    targetPort: web
    nodePort: 30200  # added
  selector:
    app: prometheus
    prometheus: k8s
  sessionAffinity: ClientIP
[root@k8s-master manifests]#
# change 3
[root@k8s-master manifests]# vim alertmanager-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    alertmanager: main
  name: alertmanager-main
  namespace: monitoring
spec:
  type: NodePort     # added
  ports:
  - name: web
    port: 9093
    targetPort: web
    nodePort: 30300  # added
  selector:
    alertmanager: main
    app: alertmanager
  sessionAffinity: ClientIP
[root@k8s-master manifests]#
# change 4
[root@k8s-master manifests]# vim grafana-deployment.yaml
# change apps/v1beta2 to apps/v1
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  replicas: 1
  selector:
    ………………
```
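Change 4 is needed because the apps/v1beta1 and apps/v1beta2 Deployment APIs were removed in Kubernetes 1.16. You can confirm what your cluster serves before editing (a quick check; output varies by cluster):

```bash
# List the apps API versions this cluster serves; grafana-deployment.yaml must use one of them.
kubectl api-versions | grep '^apps/'
# On Kubernetes 1.16+ this prints only apps/v1, hence the edit above.
```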
Checking and Downloading the kube-prometheus Image Versions
Because the images are hosted abroad, downloads from mainland China frequently fail. To download them quickly, we pull equivalents from a domestic registry and then tag them with the foreign image names referenced in the manifests.
Inspecting the kube-prometheus image references
```
# current working directory
[root@k8s-master manifests]# pwd
/root/k8s_practice/prometheus/kube-prometheus-0.2.0/manifests
[root@k8s-master manifests]#
# all image references
[root@k8s-master manifests]# grep -riE 'quay.io|k8s.gcr|grafana/' *
0prometheus-operator-deployment.yaml:        - --config-reloader-image=quay.io/coreos/configmap-reload:v0.0.1
0prometheus-operator-deployment.yaml:        - --prometheus-config-reloader=quay.io/coreos/prometheus-config-reloader:v0.33.0
0prometheus-operator-deployment.yaml:        image: quay.io/coreos/prometheus-operator:v0.33.0
alertmanager-alertmanager.yaml:  baseImage: quay.io/prometheus/alertmanager
grafana-deployment.yaml:      - image: grafana/grafana:6.2.2
kube-state-metrics-deployment.yaml:        image: quay.io/coreos/kube-rbac-proxy:v0.4.1
kube-state-metrics-deployment.yaml:        image: quay.io/coreos/kube-rbac-proxy:v0.4.1
kube-state-metrics-deployment.yaml:        image: quay.io/coreos/kube-state-metrics:v1.7.2
kube-state-metrics-deployment.yaml:        image: k8s.gcr.io/addon-resizer:1.8.4
node-exporter-daemonset.yaml:        image: quay.io/prometheus/node-exporter:v0.18.1
node-exporter-daemonset.yaml:        image: quay.io/coreos/kube-rbac-proxy:v0.4.1
prometheus-adapter-deployment.yaml:    image: quay.io/coreos/k8s-prometheus-adapter-amd64:v0.4.1
prometheus-prometheus.yaml:  baseImage: quay.io/prometheus/prometheus
##### Note that the image versions for alertmanager and prometheus are not shown above
### Get the alertmanager image version
[root@k8s-master manifests]# cat alertmanager-alertmanager.yaml
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  labels:
    alertmanager: main
  name: main
  namespace: monitoring
spec:
  baseImage: quay.io/prometheus/alertmanager
  nodeSelector:
    kubernetes.io/os: linux
  replicas: 3
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: alertmanager-main
  version: v0.18.0
##### As shown above, the alertmanager image version is v0.18.0
### Get the prometheus image version
[root@k8s-master manifests]# cat prometheus-prometheus.yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  labels:
    prometheus: k8s
  name: k8s
  namespace: monitoring
spec:
  alerting:
    alertmanagers:
    - name: alertmanager-main
      namespace: monitoring
      port: web
  baseImage: quay.io/prometheus/prometheus
  nodeSelector:
    kubernetes.io/os: linux
  podMonitorSelector: {}
  replicas: 2
  resources:
    requests:
      memory: 400Mi
  ruleSelector:
    matchLabels:
      prometheus: k8s
      role: alert-rules
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: prometheus-k8s
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector: {}
  version: v2.11.0
##### As shown above, the prometheus image version is v2.11.0
```
Script: download and re-tag the images [run on every machine in the cluster]
```
[root@k8s-master software]# vim download_prometheus_image.sh
#!/bin/bash
# bash (not sh): the script relies on bash arrays

##### Run on both the master and worker nodes [all machines]

# load environment variables
. /etc/profile
. /etc/bashrc

###############################################
# Pull the images prometheus needs from a registry inside China, then re-tag them
src_registry="registry.cn-beijing.aliyuncs.com/cloud_registry"
# array of images to fetch
images=(
    kube-rbac-proxy:v0.4.1
    kube-state-metrics:v1.7.2
    k8s-prometheus-adapter-amd64:v0.4.1
    configmap-reload:v0.0.1
    prometheus-config-reloader:v0.33.0
    prometheus-operator:v0.33.0
)
# loop over the images from the domestic source
for img in "${images[@]}";
do
    # pull from the domestic registry
    docker pull ${src_registry}/$img
    # re-tag to the original image name
    docker tag ${src_registry}/$img quay.io/coreos/$img
    # remove the source image
    docker rmi ${src_registry}/$img
    # print a separator
    echo "======== $img download OK ========"
done

##### other images
image_name="alertmanager:v0.18.0"
docker pull ${src_registry}/${image_name} && docker tag ${src_registry}/${image_name} quay.io/prometheus/${image_name} && docker rmi ${src_registry}/${image_name}
echo "======== ${image_name} download OK ========"

image_name="node-exporter:v0.18.1"
docker pull ${src_registry}/${image_name} && docker tag ${src_registry}/${image_name} quay.io/prometheus/${image_name} && docker rmi ${src_registry}/${image_name}
echo "======== ${image_name} download OK ========"

image_name="prometheus:v2.11.0"
docker pull ${src_registry}/${image_name} && docker tag ${src_registry}/${image_name} quay.io/prometheus/${image_name} && docker rmi ${src_registry}/${image_name}
echo "======== ${image_name} download OK ========"

image_name="grafana:6.2.2"
docker pull ${src_registry}/${image_name} && docker tag ${src_registry}/${image_name} grafana/${image_name} && docker rmi ${src_registry}/${image_name}
echo "======== ${image_name} download OK ========"

image_name="addon-resizer:1.8.4"
docker pull ${src_registry}/${image_name} && docker tag ${src_registry}/${image_name} k8s.gcr.io/${image_name} && docker rmi ${src_registry}/${image_name}
echo "======== ${image_name} download OK ========"


echo "********** prometheus docker images OK! **********"
```
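One hedged way to run the script on every machine from the master, assuming password-less SSH to the worker nodes (hostnames from the plan at the top):

```bash
# Run locally on the master, then copy to and run on each worker node.
bash download_prometheus_image.sh
for host in k8s-node01 k8s-node02; do
    scp download_prometheus_image.sh root@${host}:/root/
    ssh root@${host} 'bash /root/download_prometheus_image.sh'
done
```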
After running the script, the following images are available:
```
[root@k8s-master software]# docker images | grep 'quay.io/coreos'
quay.io/coreos/kube-rbac-proxy                v0.4.1          a9d1a87e4379   6 days ago      41.3MB
quay.io/coreos/flannel                        v0.12.0-amd64   4e9f801d2217   4 months ago    52.8MB   ## already present
quay.io/coreos/kube-state-metrics             v1.7.2          3fd71b84d250   6 months ago    33.1MB
quay.io/coreos/prometheus-config-reloader     v0.33.0         64751efb2200   8 months ago    17.6MB
quay.io/coreos/prometheus-operator            v0.33.0         8f2f814d33e1   8 months ago    42.1MB
quay.io/coreos/k8s-prometheus-adapter-amd64   v0.4.1          5f0fc84e586c   15 months ago   60.7MB
quay.io/coreos/configmap-reload               v0.0.1          3129a2ca29d7   3 years ago     4.79MB
[root@k8s-master software]#
[root@k8s-master software]# docker images | grep 'quay.io/prometheus'
quay.io/prometheus/node-exporter   v0.18.1   d7707e6f5e95   11 days ago     22.9MB
quay.io/prometheus/prometheus      v2.11.0   de242295e225   2 months ago    126MB
quay.io/prometheus/alertmanager    v0.18.0   30594e96cbe8   10 months ago   51.9MB
[root@k8s-master software]#
[root@k8s-master software]# docker images | grep 'grafana'
grafana/grafana   6.2.2   a532fe3b344a   9 months ago   248MB
[root@k8s-node01 software]#
[root@k8s-node01 software]# docker images | grep 'addon-resizer'
k8s.gcr.io/addon-resizer   1.8.4   5ec630648120   20 months ago   38.3MB
```
Starting kube-prometheus
Start Prometheus
```
[root@k8s-master kube-prometheus-0.2.0]# pwd
/root/k8s_practice/prometheus/kube-prometheus-0.2.0
[root@k8s-master kube-prometheus-0.2.0]#
### If anything fails, re-running the command one or more times usually resolves it
[root@k8s-master kube-prometheus-0.2.0]# kubectl apply -f manifests/
```
Checking svc and Pod Status After Startup
```
[root@k8s-master ~]# kubectl top node
NAME         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
k8s-master   152m         7%     1311Mi          35%
k8s-node01   100m         5%     928Mi           54%
k8s-node02   93m          4%     979Mi           56%
[root@k8s-master ~]#
[root@k8s-master ~]# kubectl get svc -n monitoring
NAME                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
alertmanager-main       NodePort    10.97.249.249    <none>        9093:30300/TCP               7m21s
alertmanager-operated   ClusterIP   None             <none>        9093/TCP,9094/TCP,9094/UDP   7m13s
grafana                 NodePort    10.101.183.103   <none>        3000:30100/TCP               7m20s
kube-state-metrics      ClusterIP   None             <none>        8443/TCP,9443/TCP            7m20s
node-exporter           ClusterIP   None             <none>        9100/TCP                     7m20s
prometheus-adapter      ClusterIP   10.105.174.86    <none>        443/TCP                      7m19s
prometheus-k8s          NodePort    10.109.179.233   <none>        9090:30200/TCP               7m19s
prometheus-operated     ClusterIP   None             <none>        9090/TCP                     7m3s
prometheus-operator     ClusterIP   None             <none>        8080/TCP                     7m21s
[root@k8s-master ~]#
[root@k8s-master ~]# kubectl get pod -n monitoring -o wide
NAME                                  READY   STATUS    RESTARTS   AGE     IP             NODE         NOMINATED NODE   READINESS GATES
alertmanager-main-0                   2/2     Running   0          2m11s   10.244.4.164   k8s-node01   <none>           <none>
alertmanager-main-1                   2/2     Running   0          2m11s   10.244.2.225   k8s-node02   <none>           <none>
alertmanager-main-2                   2/2     Running   0          2m11s   10.244.4.163   k8s-node01   <none>           <none>
grafana-5cd56df4cd-6d75r              1/1     Running   0          29s     10.244.2.227   k8s-node02   <none>           <none>
kube-state-metrics-7d4bb66d8d-gx7w4   4/4     Running   0          2m18s   10.244.2.223   k8s-node02   <none>           <none>
node-exporter-pl47v                   2/2     Running   0          2m17s   172.16.1.110   k8s-master   <none>           <none>
node-exporter-tmmbw                   2/2     Running   0          2m17s   172.16.1.111   k8s-node01   <none>           <none>
node-exporter-w8wd9                   2/2     Running   0          2m17s   172.16.1.112   k8s-node02   <none>           <none>
prometheus-adapter-c676d8764-phj69    1/1     Running   0          2m17s   10.244.2.224   k8s-node02   <none>           <none>
prometheus-k8s-0                      3/3     Running   1          2m1s    10.244.2.226   k8s-node02   <none>           <none>
prometheus-k8s-1                      3/3     Running   0          2m1s    10.244.4.165   k8s-node01   <none>           <none>
prometheus-operator-7559d67ff-lk86l   1/1     Running   0          2m18s   10.244.4.162   k8s-node01   <none>           <none>
```
Accessing kube-prometheus
Accessing the prometheus Service
The URL is:
http://172.16.1.110:30200/
Visiting the following address shows that Prometheus has successfully connected to the Kubernetes apiserver.
http://172.16.1.110:30200/targets
Viewing service discovery
http://172.16.1.110:30200/service-discovery
Viewing Prometheus's own metrics
http://172.16.1.110:30200/metrics
The Prometheus web UI offers basic querying. For example, to see each pod's CPU usage across the K8S cluster, use the following query:
```
# Query container_cpu_usage_seconds_total on its own first to see which label fields are available
sum(rate(container_cpu_usage_seconds_total{image!="", pod!=""}[1m])) by (pod)
```
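Along the same lines, a hedged example for per-pod memory, built from the standard cAdvisor working-set metric (adjust the label filters to your cluster):

```
# Per-pod memory working set, in bytes
sum(container_memory_working_set_bytes{image!="", pod!=""}) by (pod)
```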
Table view
Graph view
Accessing the grafana Service
The URL is:
http://172.16.1.110:30100/
On first login, the default username/password is admin/admin.
Adding a Data Source
You will see the following page.
As shown above, the data source has already been added by default.
Click into it, scroll to the bottom, and click the Test button to verify that the data source is working.
After that, you can import some dashboard templates.
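The same check can be done from the command line through Grafana's HTTP API (a sketch using the default admin/admin credentials and the NodePort configured earlier):

```bash
# List configured data sources; the pre-provisioned Prometheus entry should appear in the JSON.
curl -s -u admin:admin http://172.16.1.110:30100/api/datasources
```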
Viewing the data as graphs
Troubleshooting
If `kubectl apply -f manifests/` produces messages like the following:
```
unable to recognize "manifests/alertmanager-alertmanager.yaml": no matches for kind "Alertmanager" in version "monitoring.coreos.com/v1"
unable to recognize "manifests/alertmanager-serviceMonitor.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "manifests/grafana-serviceMonitor.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "manifests/kube-state-metrics-serviceMonitor.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "manifests/node-exporter-serviceMonitor.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "manifests/prometheus-operator-serviceMonitor.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "manifests/prometheus-prometheus.yaml": no matches for kind "Prometheus" in version "monitoring.coreos.com/v1"
unable to recognize "manifests/prometheus-rules.yaml": no matches for kind "PrometheusRule" in version "monitoring.coreos.com/v1"
unable to recognize "manifests/prometheus-serviceMonitor.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "manifests/prometheus-serviceMonitorApiserver.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "manifests/prometheus-serviceMonitorCoreDNS.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "manifests/prometheus-serviceMonitorKubeControllerManager.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "manifests/prometheus-serviceMonitorKubeScheduler.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "manifests/prometheus-serviceMonitorKubelet.yaml": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
```
then simply run `kubectl apply -f manifests/` again. The errors come from ordering dependencies: kinds such as Alertmanager, Prometheus, and ServiceMonitor are CustomResourceDefinitions created alongside the operator, and the custom resources cannot be applied until those CRDs are registered.
However, if you are running kube-prometheus v0.3.0, v0.4.0, or v0.5.0 and the messages above persist no matter how many times you re-run `kubectl apply -f manifests/`, the cause was unclear at the time of writing.
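One likely explanation, worth checking against those releases' READMEs: starting with v0.3.0, kube-prometheus moved the namespace and CRD manifests into a separate manifests/setup directory, which the upstream quickstart applies first:

```bash
# For kube-prometheus v0.3.0 and later (per the upstream quickstart):
kubectl apply -f manifests/setup   # namespace, CRDs, operator
kubectl apply -f manifests/        # everything else, once the CRDs are registered
```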
Done!
———END———