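(8) The Prometheus PromQL Query Language
Prometheus provides a functional query language called PromQL (Prometheus Query Language) that lets users select and aggregate time series data in real time. The result of an expression can be shown as a graph, viewed as tabular data in the Prometheus expression browser, or consumed by external systems through the HTTP API.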
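Because results can also be consumed over the HTTP API, here is a minimal Python sketch of building an instant-query URL. It assumes the server listens on localhost:9090 and uses the `/api/v1/query` endpoint with its `query` parameter; the helper name `build_query_url` is made up for illustration.

```python
from urllib.parse import urlencode

# Assumption: Prometheus is reachable at this address.
PROMETHEUS_URL = "http://localhost:9090"

def build_query_url(expr, base_url=PROMETHEUS_URL):
    """Return the instant-query URL for a PromQL expression."""
    # urlencode escapes braces, quotes and other special characters.
    return f"{base_url}/api/v1/query?{urlencode({'query': expr})}"

url = build_query_url('node_forks_total{job="node"}')
print(url)
```

The resulting URL can be fetched with any HTTP client, for example `urllib.request.urlopen(url)`; the response is JSON.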
1. Preparation
Before running any queries, here are the configuration files I am using:
[root@node00 prometheus]# cat prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
    - targets: ['localhost:9090']

  - job_name: "node"
    file_sd_configs:
    - refresh_interval: 1m
      files:
      - "/usr/local/prometheus/prometheus/conf/node*.yml"

remote_write:
  - url: "http://localhost:8086/api/v1/prom/write?db=prometheus"

remote_read:
  - url: "http://localhost:8086/api/v1/prom/read?db=prometheus"

[root@node00 prometheus]# cat conf/node-dis.yml
- targets:
  - "192.168.100.10:20001"
  labels:
    __datacenter__: dc0
    __hostname__: node00
    __businees_line__: "line_a"
    __region_id__: "cn-beijing"
    __availability_zone__: "a"
- targets:
  - "192.168.100.11:20001"
  labels:
    __datacenter__: dc1
    __hostname__: node01
    __businees_line__: "line_a"
    __region_id__: "cn-beijing"
    __availability_zone__: "a"
- targets:
  - "192.168.100.12:20001"
  labels:
    __datacenter__: dc0
    __hostname__: node02
    __businees_line__: "line_c"
    __region_id__: "cn-beijing"
    __availability_zone__: "b"
2. Simple time series queries
2.1 Querying a metric by name
The total number of forks on each node:
node_forks_total
The result:
Element | Value |
---|---|
node_forks_total{instance="192.168.100.10:20001",job="node"} | 201518 |
node_forks_total{instance="192.168.100.11:20001",job="node"} | 23951 |
node_forks_total{instance="192.168.100.12:20001",job="node"} | 24127 |
2.2 Querying with a label
node_forks_total{instance="192.168.100.10:20001"}
The result:
Element | Value |
---|---|
node_forks_total{instance="192.168.100.10:20001",job="node"} | 201816 |
2.3 Querying with multiple labels
node_forks_total{instance="192.168.100.10:20001",job="node"}
The result:
Element | Value |
---|---|
node_forks_total{instance="192.168.100.10:20001",job="node"} | 201932 |
2.4 Querying the samples from the last 2 minutes
node_forks_total{instance="192.168.100.10:20001",job="node"}[2m]
2.5 Regular expression matching
node_forks_total{instance=~"192.168.*:20001",job="node"}
Element | Value |
---|---|
node_forks_total{instance="192.168.100.10:20001",job="node"} | 202107 |
node_forks_total{instance="192.168.100.11:20001",job="node"} | 24014 |
node_forks_total{instance="192.168.100.12:20001",job="node"} | 24186 |
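PromQL regex matchers are fully anchored, so the pattern has to match the whole label value. The Python sketch below imitates that with `re.fullmatch` (Prometheus itself uses RE2 syntax, so this is only an approximation) and shows that the unescaped dots in the pattern above match any character. The helper `label_matches` and the test values are made up for illustration.

```python
import re

def label_matches(pattern, value):
    """Mimic PromQL's =~ matcher, which anchors the pattern at both ends."""
    return re.fullmatch(pattern, value) is not None

loose = "192.168.*:20001"        # pattern used in the query above
strict = r"192\.168\..*:20001"   # dots escaped so they match only a literal "."

for value in ["192.168.100.10:20001",    # a real target
              "192x168.100.10:20001",    # hypothetical value with a stray character
              "192.168.100.10:200010"]:  # hypothetical value with a longer port
    print(value, label_matches(loose, value), label_matches(strict, value))
```

In a double-quoted PromQL string the stricter pattern would be written with doubled backslashes: `instance=~"192\\.168\\..*:20001"`.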
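3. Commonly used functions
Prometheus ships with quite a few functions. For the full list, see: https://prometheus.io/docs/prometheus/latest/querying/functions/
This section only demonstrates the most commonly used ones.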
3.1 irate
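irate computes the per-second instant rate of increase of a counter, based on the last two samples in the range.
Using label matchers for a specific instance, job, and CPU, compute the per-second rate at which that CPU accumulates time in the idle mode: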
irate(node_cpu_seconds_total{cpu="0",instance="192.168.100.10:20001",job="node",mode="idle"}[1m])
Element | Value |
---|---|
{cpu="0",instance="192.168.100.10:20001",job="node",mode="idle"} | 0.9833988932595507 |
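To make the calculation concrete, here is a rough Python sketch of what irate does with the samples inside the range: it takes the last two and divides the increase by the time between them. The sample values are invented, and the function is a simplified model rather than the actual Prometheus implementation.

```python
def irate(samples):
    """samples: list of (timestamp_seconds, counter_value), oldest first."""
    if len(samples) < 2:
        return None  # at least two samples are needed
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    # A drop in the counter is treated as a reset to zero.
    delta = v_last if v_last < v_prev else v_last - v_prev
    return delta / (t_last - t_prev)

# Hypothetical idle-time counter values scraped every 15 seconds.
samples = [(0, 1000.0), (15, 1014.7), (30, 1029.4), (45, 1044.25)]
print(irate(samples))
```

Only the last two points matter here, which is why irate reacts quickly to changes, whereas rate averages over the whole range.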
3.2 count_over_time
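Counts the number of samples each series has within the given time range.
The count depends on the scrape frequency: with our 15s scrape interval, one minute contains 4 samples.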
count_over_time(node_boot_time_seconds[1m])
Element | Value |
---|---|
{instance="192.168.100.10:20001",job="node"} | 4 |
{instance="192.168.100.11:20001",job="node"} | 4 |
{instance="192.168.100.12:20001",job="node"} | 4 |
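The relationship between the scrape interval and the count can be sketched in Python as follows. The timestamps are synthetic, and the window is assumed to exclude its left boundary; with a window that includes both ends, a sample falling exactly on the boundary would add one more.

```python
def count_over_time(timestamps, eval_time, range_seconds):
    """Count samples in the window (eval_time - range_seconds, eval_time]."""
    start = eval_time - range_seconds
    return sum(1 for t in timestamps if start < t <= eval_time)

scrape_interval = 15  # seconds, as in the configuration
timestamps = list(range(0, 301, scrape_interval))  # one synthetic sample every 15 s
print(count_over_time(timestamps, eval_time=300, range_seconds=60))
```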
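3.3 Subqueries
For the last 10 minutes, compute the 5-minute rate once per minute. With a 10m range and a 1m resolution, each series yields 10m / 1m = 10 values.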
rate(node_cpu_seconds_total{cpu="0",instance="192.168.100.10:20001",job="node",mode="idle"}[5m])[10m:1m]
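The evaluation points of the subquery can be sketched in Python. This assumes the points are aligned to multiples of the resolution and that the window excludes its left boundary; versions that include the boundary may return one extra point. The function name and the evaluation time are made up for illustration.

```python
def subquery_eval_times(eval_time, range_seconds, step_seconds):
    """Timestamps at which the inner expression is evaluated."""
    window_start = eval_time - range_seconds
    # First multiple of the step strictly after the window start.
    first = (window_start // step_seconds) * step_seconds
    if first <= window_start:
        first += step_seconds
    return list(range(first, eval_time + 1, step_seconds))

# [10m:1m] evaluated at a hypothetical time t = 3630 s.
points = subquery_eval_times(eval_time=3630, range_seconds=600, step_seconds=60)
print(len(points), points[0], points[-1])
```

At each of these points the inner `rate(...[5m])` is computed over the 5 minutes preceding that point.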
4. More complex queries
4.1 Computing the percentage of free memory
node_memory_MemFree_bytes / node_memory_MemTotal_bytes * 100
Element | Value |
---|---|
{instance="192.168.100.10:20001",job="node"} | 9.927579722322251 |
{instance="192.168.100.11:20001",job="node"} | 59.740727403673034 |
{instance="192.168.100.12:20001",job="node"} | 63.2080982675149 |
4.2 Getting the top 2 instances by percentage of free memory
topk(2,node_memory_MemFree_bytes / node_memory_MemTotal_bytes * 100 )
Element | Value |
---|---|
{instance="192.168.100.12:20001",job="node"} | 63.20129636298163 |
{instance="192.168.100.11:20001",job="node"} | 59.50586164125955 |
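As a rough illustration of what topk does with the instant vector, the following Python sketch picks the two largest values from a mapping of instance to free-memory percentage. The numbers are rounded from the 4.1 result table, and the function is only a model of the selection step.

```python
import heapq

# Free-memory percentages rounded from the 4.1 result table.
free_pct = {
    "192.168.100.10:20001": 9.93,
    "192.168.100.11:20001": 59.74,
    "192.168.100.12:20001": 63.21,
}

def topk(k, series):
    """Return the k (instance, value) pairs with the largest values."""
    return heapq.nlargest(k, series.items(), key=lambda item: item[1])

print(topk(2, free_pct))
```

PromQL also offers `bottomk`, which would return the instances with the least free memory instead.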
5. Practical query examples
5.1 Getting the number of CPU cores
# Count the CPU cores of every instance
count by (instance) ( count by (instance,cpu) (node_cpu_seconds_total{mode="system"}) )
# Count the CPU cores of a single instance
count by (instance) ( count by (instance,cpu) (node_cpu_seconds_total{mode="system",instance="192.168.100.11:20001"}) )
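The nested aggregation can be read as two grouping steps, sketched here in Python. The inner count collapses the series into one group per (instance, cpu) pair, and the outer count then counts those groups per instance. The series list is invented for illustration.

```python
from collections import defaultdict

# Hypothetical series, given as (instance, cpu, mode) label values.
series = [
    ("192.168.100.10:20001", "0", "system"),
    ("192.168.100.10:20001", "1", "system"),
    ("192.168.100.11:20001", "0", "system"),
]

def cpu_cores(series):
    # Inner step: count by (instance, cpu) leaves one group per core.
    per_core = defaultdict(int)
    for instance, cpu, _mode in series:
        per_core[(instance, cpu)] += 1
    # Outer step: count by (instance) counts those groups per instance.
    per_instance = defaultdict(int)
    for instance, _cpu in per_core:
        per_instance[instance] += 1
    return dict(per_instance)

print(cpu_cores(series))
```

Filtering on a single mode such as `mode="system"` keeps exactly one series per core, so the inner grouping is a safeguard rather than a necessity.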
5.2 Computing memory utilization
(1 - (node_memory_MemAvailable_bytes{instance=~"192.168.100.10:20001"} / (node_memory_MemTotal_bytes{instance=~"192.168.100.10:20001"})))* 100
Element | Value |
---|---|
{instance="192.168.100.10:20001",job="node"} | 87.09358620413717 |
5.3 Computing root partition usage
100 - ((node_filesystem_avail_bytes{instance="192.168.100.10:20001",mountpoint="/",fstype=~"ext4|xfs"} * 100) / node_filesystem_size_bytes {instance=~"192.168.100.10:20001",mountpoint="/",fstype=~"ext4|xfs"})
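5.4 Predicting disk space
# The expression has two parts joined by `and`: the first selects root partitions whose usage is at least 85%, and the second uses the last 6 hours of data to predict whether the available space will drop below 0 within the next 24 hours.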
(1- node_filesystem_avail_bytes{fstype=~"ext4|xfs",mountpoint="/"}
/ node_filesystem_size_bytes{fstype=~"ext4|xfs",mountpoint="/"}) * 100 >= 85 and (predict_linear(node_filesystem_avail_bytes[6h],3600 * 24) < 0)
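predict_linear fits a straight line through the samples in the range by simple linear regression and extrapolates it. The Python sketch below shows that idea with an ordinary least-squares fit; the disk numbers are invented, and the code is a simplified model, not the Prometheus implementation.

```python
def predict_linear(samples, eval_time, horizon_seconds):
    """Least-squares fit over (timestamp, value) samples, extrapolated
    horizon_seconds past eval_time."""
    n = len(samples)
    if n < 2:
        return None
    # Shift timestamps so eval_time is the origin; keeps the numbers small.
    xs = [t - eval_time for t, _ in samples]
    ys = [v for _, v in samples]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    if var == 0:
        return None  # all samples share one timestamp
    slope = cov / var
    intercept = mean_y - slope * mean_x  # fitted value at eval_time
    return intercept + slope * horizon_seconds

# Hypothetical disk: 50 GB free six hours ago, shrinking by 1 MB per second.
eval_time = 6 * 3600
samples = [(t, 50e9 - 1e6 * t) for t in range(0, eval_time + 1, 900)]
predicted = predict_linear(samples, eval_time, 24 * 3600)
print(predicted < 0)
```

A negative prediction means that, if the trend of the last 6 hours continues, the filesystem would run out of space within the next 24 hours.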