How to Write Prometheus Monitoring with Python and Flask

阿新 • Published: 2020-11-26

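Introduction

Prometheus works by periodically scraping the state of monitored components over HTTP.

Any component can be plugged into Prometheus monitoring as long as it exposes an HTTP endpoint that serves data in the format Prometheus defines.

Prometheus Server scrapes metrics from its targets on a schedule and stores them in local storage. It uses a pull model to fetch data. This keeps clients simple, since a client only has to collect data and needs to know nothing about the server, and it also makes the server easier to scale horizontally.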
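To make the pull model concrete, here is a minimal sketch that uses only the Python standard library: a tiny target that serves one hand-written metric in the text format, and a single "scrape" that fetches it the way a server would. The metric name demo_up and the helper names scrape_once and demo_scrape are made up for illustration; a real Prometheus Server repeats the fetch on a schedule and stores the result.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hand-written payload in the Prometheus text format (illustrative metric).
METRICS_TEXT = (
    "# HELP demo_up Whether the demo target is up\n"
    "# TYPE demo_up gauge\n"
    "demo_up 1.0\n"
)


class MetricsHandler(BaseHTTPRequestHandler):
    """The monitored side: it only answers GET requests with its metrics."""

    def do_GET(self):
        body = METRICS_TEXT.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def scrape_once(url):
    """The monitoring side: pull the current metrics from a target."""
    # An empty ProxyHandler makes sure the local request bypasses any proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def demo_scrape():
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return scrape_once(f"http://127.0.0.1:{port}/metrics")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo_scrape())
```

The target never contacts the server; it just answers when asked, which is why the client side stays simple.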

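If the monitored data reaches an alert threshold, Prometheus Server sends the alert over HTTP to the alerting component, Alertmanager, which applies alert inhibition and then triggers an email or a Webhook. Through PromQL, Prometheus offers a multi-dimensional data model and flexible queries: because each metric can carry multiple labels, monitoring data can be combined and aggregated along any dimension.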
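The idea of aggregating along a label can be sketched in plain Python. The samples below are hypothetical, and sum_by is an illustrative helper rather than a Prometheus API; it is roughly what a PromQL query such as sum by (method) (requests_total) computes over the current samples.

```python
from collections import defaultdict

# Hypothetical samples of a single metric: (labels, value).
samples = [
    ({"method": "get", "clientip": "127.0.0.1"}, 1.0),
    ({"method": "post", "clientip": "192.168.0.1"}, 3.0),
    ({"method": "get", "clientip": "192.168.0.1"}, 1.0),
]


def sum_by(samples, label):
    """Sum sample values grouped by one label, ignoring all other labels."""
    totals = defaultdict(float)
    for labels, value in samples:
        totals[labels[label]] += value
    return dict(totals)


by_method = sum_by(samples, "method")
by_client = sum_by(samples, "clientip")
```

The same samples yield per-method totals or per-client totals depending only on which label is kept.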

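In Python we implement the server side and expose an HTTP endpoint. In Prometheus we configure the URL to request, and Prometheus then periodically sends a request to that URL to fetch whatever data you choose to return.

Prometheus provides four metric types: Counter, Gauge, Summary, and Histogram.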

Setup

pip install flask
pip install prometheus_client

Counter

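A Counter can only go up, and it is reset to 0 when the program restarts. It is typically used for values that only ever increase, such as request counts, number of tasks, total processing time, and error counts.

Defining one takes two arguments: the first is the metric name, the second is the metric description:

c = Counter('c1', 'A counter')

A counter can only increase, so it has a single method:

def inc(self, amount=1):
    '''Increment counter by the given amount.'''
    if amount < 0:
        raise ValueError('Counters can only be incremented by non-negative amounts.')
    self._value.inc(amount)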
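A short sketch of that behaviour, assuming prometheus_client is installed. The metric name jobs_done and the private registry are illustrative choices, not part of the original example; the private registry simply keeps this demo out of the default global one.

```python
from prometheus_client import CollectorRegistry, Counter

registry = CollectorRegistry()  # private registry for the demo
jobs = Counter("jobs_done", "Finished jobs (illustrative)", registry=registry)

jobs.inc()      # add 1
jobs.inc(2.5)   # any non-negative amount is accepted

try:
    jobs.inc(-1)          # a counter refuses to go down
    rejected = False
except ValueError:
    rejected = True

# The exposed sample carries the _total suffix.
total = registry.get_sample_value("jobs_done_total")
print(rejected, total)
```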

Test example:

import prometheus_client
from prometheus_client import Counter
from prometheus_client.core import CollectorRegistry

from flask import Response, Flask

app = Flask(__name__)
# A dedicated registry, so this endpoint exposes only the metric defined here
registry = CollectorRegistry()
requests_total = Counter('c1', 'A counter', registry=registry)

@app.route("/api/metrics/count/")
def requests_count():
    # Count every request made to this endpoint
    requests_total.inc(1)
    # requests_total.inc(2)
    return Response(prometheus_client.generate_latest(registry), mimetype="text/plain")


if __name__ == "__main__":
    app.run(host="127.0.0.1", port=8081)

Visit http://127.0.0.1:8081/api/metrics/count/:

# HELP c1_total A counter
# TYPE c1_total counter
c1_total 1.0
# HELP c1_created A counter
# TYPE c1_created gauge
c1_created 1.6053265493727107e+09

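HELP is the description of c1, supplied when the Counter was created.

TYPE is the type of c1.

c1_total is the metric we defined. Notice the extra _total suffix: it is there for compatibility between OpenMetrics and the Prometheus text format, because OpenMetrics requires the _total suffix on counters. c1_created is a gauge that the client adds automatically; it records when the counter was created as a Unix timestamp.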
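The suffix can be checked without starting Flask. This is a minimal sketch, assuming prometheus_client is installed and using a private registry for the demo; it lists the sample names that appear in the generated text.

```python
from prometheus_client import CollectorRegistry, Counter, generate_latest

registry = CollectorRegistry()
c = Counter("c1", "A counter", registry=registry)
c.inc()

text = generate_latest(registry).decode("utf-8")

# Keep only sample lines (skip "# HELP" / "# TYPE") and take the name part.
sample_names = sorted(
    line.split(" ")[0]
    for line in text.splitlines()
    if line and not line.startswith("#")
)
print(sample_names)
```

The counter was declared as c1, yet no sample is called plain c1: the value is exposed as c1_total.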

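Gauge

A gauge can go up or down and can be set to any value.

For example, it can hold the current CPU temperature, memory usage, or disk and network throughput.

It is defined in much the same way as a counter: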

from prometheus_client import Gauge
g = Gauge('my_inprogress_requests','Description of gauge')
g.inc()   # Increment by 1
g.dec(10)  # Decrement by given value
g.set(4.2)  # Set to a given value

Methods:

def inc(self, amount=1):
    '''Increment gauge by the given amount.'''
    self._value.inc(amount)

def dec(self, amount=1):
    '''Decrement gauge by the given amount.'''
    self._value.inc(-amount)

def set(self, value):
    '''Set gauge to the given value.'''
    self._value.set(float(value))

Test example:

import random
import prometheus_client
from prometheus_client import Gauge
from prometheus_client.core import CollectorRegistry
from flask import Response, Flask


app = Flask(__name__)
# A dedicated registry, so this endpoint exposes only the metric defined here
registry = CollectorRegistry()
random_value = Gauge("g1", 'A gauge', registry=registry)

@app.route("/api/metrics/gauge/")
def r_value():
    # Overwrite the gauge with a fresh random value on every request
    random_value.set(random.randint(0, 10))
    return Response(prometheus_client.generate_latest(registry), mimetype="text/plain")

if __name__ == "__main__":
    app.run(host="127.0.0.1", port=8081)

Visit http://127.0.0.1:8081/api/metrics/gauge/:

# HELP g1 A gauge
# TYPE g1 gauge
g1 5.0
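Because a gauge can go both up and down, it suits "how many right now" values. The sketch below uses the client's track_inprogress() helper, which increments the gauge on entry and decrements it on exit. It assumes prometheus_client is installed; the metric name demo_inprogress and the function handle_request are illustrative.

```python
from prometheus_client import CollectorRegistry, Gauge

registry = CollectorRegistry()  # private registry for the demo
in_progress = Gauge(
    "demo_inprogress", "Requests currently being handled", registry=registry
)


def handle_request():
    # +1 when the block is entered, -1 when it is left (also on exceptions).
    with in_progress.track_inprogress():
        return registry.get_sample_value("demo_inprogress")


during = handle_request()                              # read inside the block
after = registry.get_sample_value("demo_inprogress")   # read after it finished
print(during, after)
```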

Using labels

Labels distinguish the characteristics of a metric. A metric can have a single label or several.

from prometheus_client import Counter
c = Counter('requests_total','HTTP requests total',['method','clientip'])
c.labels('get','127.0.0.1').inc()
c.labels('post','192.168.0.1').inc(3)
c.labels(method="get",clientip="192.168.0.1").inc()

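Test example:

import random
import prometheus_client
from prometheus_client import Gauge
from prometheus_client.core import CollectorRegistry
from flask import Response, Flask


app = Flask(__name__)
# A dedicated registry, so this endpoint exposes only the metric defined here
registry = CollectorRegistry()
c = Gauge("c1", 'A counter', ['method', 'clientip'], registry=registry)

@app.route("/api/metrics/counter/")
def r_value():
    # Each distinct label combination gets its own time series
    c.labels(method='get', clientip='192.168.0.%d' % random.randint(1, 10)).inc()
    return Response(prometheus_client.generate_latest(registry), mimetype="text/plain")

if __name__ == "__main__":
    app.run(host="127.0.0.1", port=8081)

Visit http://127.0.0.1:8081/api/metrics/counter/ 9 times in a row: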

# HELP c1 A counter
# TYPE c1 gauge
c1{clientip="192.168.0.7",method="get"} 2.0
c1{clientip="192.168.0.1",method="get"} 1.0
c1{clientip="192.168.0.8",method="get"} 1.0
c1{clientip="192.168.0.5",method="get"} 2.0
c1{clientip="192.168.0.4",method="get"} 1.0
c1{clientip="192.168.0.10",method="get"} 1.0
c1{clientip="192.168.0.2",method="get"} 1.0
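Each label combination is its own child series, which can be checked directly. This is a small sketch, assuming prometheus_client is installed; the metric name requests and the private registry are illustrative.

```python
from prometheus_client import CollectorRegistry, Counter

registry = CollectorRegistry()
c = Counter(
    "requests", "HTTP requests total", ["method", "clientip"], registry=registry
)

c.labels("get", "127.0.0.1").inc()                        # positional label values
c.labels(method="post", clientip="192.168.0.1").inc(3)    # keyword label values
c.labels(method="get", clientip="127.0.0.1").inc()        # same child as the first call

get_local = registry.get_sample_value(
    "requests_total", {"method": "get", "clientip": "127.0.0.1"}
)
# A combination that was never used has no sample at all.
missing = registry.get_sample_value(
    "requests_total", {"method": "put", "clientip": "127.0.0.1"}
)
print(get_local, missing)
```

Positional and keyword forms address the same child, so the first and third calls add up.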

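Histogram

This type is mainly used to compute percentiles. What is a percentile? The usual English term is quantile.

Suppose you have the latencies of 100 requests. Sort them in ascending order; if the 90th value is 200 ms, we can say that 90% of requests took no more than 200 ms, in other words "the 90th percentile is 200 ms". This reflects the baseline quality of the service. Of course, the 91st value might be 2000 ms, and the 90th percentile says nothing about that.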
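The sort-and-pick procedure just described can be written down directly. This is a sketch of the exact nearest-rank percentile over a full data set; the function name and the latency values are made up to mirror the example (90th value 200 ms, 91st value 2000 ms).

```python
import math


def nearest_rank_percentile(values, percent):
    """Return the smallest value with at least `percent`% of values <= it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # 1-based rank; multiply before dividing to avoid float rounding surprises.
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[rank - 1]


# 100 hypothetical latencies in milliseconds.
latencies_ms = [100] * 89 + [200] + [2000] + [3000] * 9

p90 = nearest_rank_percentile(latencies_ms, 90)
p91 = nearest_rank_percentile(latencies_ms, 91)
print(p90, p91)
```

This needs every observation in memory and a sort, which is exactly what becomes impractical at high volume.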

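In practice we serve at least several hundred million requests a day, so it is not feasible to store every request's data and then sort it to find the 90th percentile. Problems like this are therefore handled with estimation algorithms that do not need to keep all the data. The mathematics behind them is fairly advanced, so let's go straight to how Prometheus is used.

First, define a histogram: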

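h = Histogram('hh', 'A histogram', buckets=(-5, 0, 5))

The first argument is the metric name, the second is the description, and the third is the bucket configuration. The buckets deserve a closer look.

Here (-5, 0, 5) actually produces these buckets: (-∞, -5], (-5, 0], (0, 5], (5, +∞).

If we feed it the value 8:

h.observe(8)

the metrics output looks like this: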

# HELP hh A histogram
# TYPE hh histogram
hh_bucket{le="-5.0"} 0.0
hh_bucket{le="0.0"} 0.0
hh_bucket{le="5.0"} 0.0
hh_bucket{le="+Inf"} 1.0
hh_count 1.0
hh_sum 8.0

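hh_sum records the sum of the observed values, hh_count records the number of observations, and the hh_bucket lines are the buckets, where le means "less than or equal to" the given value.

As you can see, 8 is only <= +Inf, so only the last bucket was incremented. Note that a bucket is just a counter, and the counts are cumulative: each bucket reports how many samples were at or below its upper bound.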
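The cumulative counting can be reproduced in a few lines of plain Python. This sketch only illustrates the bookkeeping and is not the client library's implementation; cumulative_bucket_counts is a hypothetical helper.

```python
import bisect
import math


def cumulative_bucket_counts(upper_bounds, observations):
    """For each upper bound, count the observations that are <= that bound."""
    bounds = sorted(upper_bounds)
    if bounds[-1] != math.inf:
        bounds.append(math.inf)  # the implicit +Inf bucket
    counts = [0] * len(bounds)
    for value in observations:
        # Index of the first bucket whose upper bound is >= value ...
        first = bisect.bisect_left(bounds, value)
        # ... and every larger bucket contains the value too.
        for i in range(first, len(bounds)):
            counts[i] += 1
    return dict(zip(bounds, counts))


single = cumulative_bucket_counts((-5, 0, 5), [8])
several = cumulative_bucket_counts((-5, 0, 5), [8, -3, 0, 4])
print(single)
print(several)
```

With the single observation 8, only the +Inf bucket is non-zero, which matches the output above.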

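The bucket boundaries have to be chosen by hand, as a rule of thumb based on how the data is distributed; sensible boundaries let PromQL estimate percentiles more accurately. When using a histogram we only need to define the buckets up front and then keep recording observations. The percentiles can be computed later from the raw histogram data.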
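To show why the boundaries matter, here is a simplified sketch of estimating a quantile from cumulative bucket counts by linear interpolation inside the bucket that contains the target rank. It illustrates the general idea behind PromQL's histogram_quantile, not its exact implementation; estimate_quantile and the bucket values are illustrative.

```python
import math


def estimate_quantile(q, cumulative_buckets):
    """Estimate the q-quantile (0 <= q <= 1) from (upper_bound, count) pairs."""
    buckets = sorted(cumulative_buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total  # how many observations lie at or below the quantile
    prev_bound, prev_count = None, 0
    for bound, count in buckets:
        if count >= rank and count > prev_count:
            if prev_bound is None or math.isinf(bound):
                # No finite interval to interpolate in: fall back to a bound.
                return bound if prev_bound is None else prev_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


# Cumulative counts for buckets (-5, 0, 5, +Inf) after 10 observations.
h1 = [(-5, 0), (0, 5), (5, 10), (math.inf, 10)]

p50 = estimate_quantile(0.5, h1)
p90 = estimate_quantile(0.9, h1)
print(p50, p90)
```

The estimate can only be as fine as the bucket that contains the rank, so wide buckets give coarse answers.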

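Test example:

import random
import prometheus_client
from prometheus_client import Histogram
from prometheus_client.core import CollectorRegistry
from flask import Response, Flask

app = Flask(__name__)
# A dedicated registry, so this endpoint exposes only the metric defined here
registry = CollectorRegistry()
h = Histogram("h1", 'A Histogram', buckets=(-5, 0, 5), registry=registry)

@app.route("/api/metrics/histogram/")
def r_value():
    # Record one random observation per request
    h.observe(random.randint(-5, 5))
    return Response(prometheus_client.generate_latest(registry), mimetype="text/plain")

if __name__ == "__main__":
    app.run(host="127.0.0.1", port=8081)

Visit http://127.0.0.1:8081/api/metrics/histogram/ several times in a row: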

# HELP h1 A Histogram
# TYPE h1 histogram
h1_bucket{le="-5.0"} 0.0
h1_bucket{le="0.0"} 5.0
h1_bucket{le="5.0"} 10.0
h1_bucket{le="+Inf"} 10.0
h1_count 10.0
# HELP h1_created A Histogram
# TYPE h1_created gauge
h1_created 1.6053319432993534e+09

Summary

The Python client does not fully implement the summary algorithm (it exposes no quantiles), so it is not covered here.
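For reference only, a brief sketch of what the client's Summary does expose, assuming prometheus_client is installed: a running count and sum of the observations, without quantile series. The metric name s1 and the private registry are illustrative.

```python
from prometheus_client import CollectorRegistry, Summary, generate_latest

registry = CollectorRegistry()
s = Summary("s1", "A summary", registry=registry)

for value in (1, 2, 3):
    s.observe(value)

count = registry.get_sample_value("s1_count")   # number of observations
total = registry.get_sample_value("s1_sum")     # sum of observed values
text = generate_latest(registry).decode("utf-8")
print(count, total)
```

An average can still be derived as sum divided by count, but percentiles have to come from a Histogram.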

That is all for this article. I hope it helps with your learning, and thank you for your support.
