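Zabbix monitoring for Ceph (Luminous)
Goal
Ceph (Luminous) ships with built-in Zabbix monitoring support.
Configure the corresponding Zabbix monitoring.
Overview
Environment used here: Ceph Luminous, package ceph-12.2.0-0.el7.x86_64
To use the Zabbix support, the zabbix mgr module has to be enabled
The monitored items are provided by Ceph itself and pushed to the Zabbix server in trapper mode
The Zabbix monitoring covers the overall health of the whole Ceph cluster
The monitoring only needs to be enabled on one machine that can reach the ceph mgr service
Reference
ceph zabbix plugin
Important:
Run this on only one machine in the Ceph cluster that has permission to access the mgr
Enable the module
# ceph mgr module enable zabbix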
Configuration
Set the Zabbix server
# ceph zabbix config-set zabbix_host gx-yun-084044.vclound.com
Configuration option zabbix_host updated
Set the identifier of the monitored host
# ceph zabbix config-set identifier cephsvr-128040.vclound.com
Configuration option identifier updated
Set the location of zabbix_sender
# ceph zabbix config-set zabbix_sender /etc/apps/svr/zabbix/bin/zabbix_sender
Configuration option zabbix_sender updated
Set the Zabbix server port
# ceph zabbix config-set zabbix_port 10051
Configuration option zabbix_port updated
Set the item reporting interval (in seconds)
# ceph zabbix config-set interval 60
Configuration option interval updated
Show the configuration
# ceph zabbix config-show
{"zabbix_host": "gx-yun-084044.vclound.com", "identifier": "cephsvr-128040.vclound.com", "zabbix_sender": "/etc/apps/svr/zabbix/bin/zabbix_sender", "interval": 60, "zabbix_port": 10051}
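The five config-set calls above can also be applied in one pass. The following is a minimal sketch, not part of the Ceph tooling: the function name apply_zabbix_config and the CEPH_BIN variable are assumptions introduced here so that the loop can be dry-run (for example with CEPH_BIN=echo) before touching a real cluster.

```shell
#!/bin/sh
# Sketch: apply several "ceph zabbix config-set" options in one pass.
# CEPH_BIN is a hypothetical override; it defaults to the real ceph CLI.
CEPH_BIN=${CEPH_BIN:-ceph}

apply_zabbix_config() {
    # Each argument is a key=value pair, e.g. zabbix_port=10051
    for pair in "$@"; do
        key=${pair%%=*}      # text before the first "="
        value=${pair#*=}     # text after the first "="
        "$CEPH_BIN" zabbix config-set "$key" "$value" || return 1
    done
}

# Example call with the values used in this article:
# apply_zabbix_config \
#     zabbix_host=gx-yun-084044.vclound.com \
#     identifier=cephsvr-128040.vclound.com \
#     zabbix_sender=/etc/apps/svr/zabbix/bin/zabbix_sender \
#     zabbix_port=10051 \
#     interval=60
```

Stopping at the first failed option keeps a half-applied configuration easy to spot; ceph zabbix config-show can then be used to confirm the result.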
Zabbix server configuration
Template
Location of zabbix_template.xml
# rpm -ql ceph-mgr | grep xml
/usr/lib64/ceph/mgr/zabbix/zabbix_template.xml
Import the template
Note
The template targets Zabbix 3.x by default. To import it into Zabbix 2.x, edit the version in zabbix_template.xml
<?xml version="1.0" encoding="UTF-8"?>
<zabbix_export>
<version>2.0</version> <- change this to 2.0 so the template can be imported
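Rather than editing the packaged file in place, the version change can be made on a copy. This is a minimal sketch, assuming the template carries its <version> element near the top as in the excerpt above; the function name set_template_version and the output path in the example are illustrative and not part of Ceph or Zabbix.

```shell
#!/bin/sh
# Sketch: write a copy of the template with the first <version> element
# rewritten (2.0 by default). The source and destination must be different
# files, because the destination is truncated before awk reads the source.
set_template_version() {
    src=$1
    dst=$2
    ver=${3:-2.0}
    awk -v ver="$ver" '
        # Rewrite only the first <version>...</version> that appears.
        !changed && /<version>[^<]*<\/version>/ {
            sub(/<version>[^<]*<\/version>/, "<version>" ver "</version>")
            changed = 1
        }
        { print }
    ' "$src" > "$dst"
}

# Example (the output path is a placeholder):
# set_template_version /usr/lib64/ceph/mgr/zabbix/zabbix_template.xml /tmp/zabbix_template_2x.xml
```

The copy is then the file to select in the Zabbix import dialog; the packaged template stays untouched for later upgrades.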
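How to import the template
Browse to the local template file, then click Import to import the template
Add the host
Link the host to the template
Set allowed hosts in the database
To make sure every trapper item in the template has an allowed host set, the most direct method is to update the database
Example
Get the template ID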
MariaDB [(none)]> use zabbix;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
MariaDB [zabbix]> select hostid from hosts where name='ceph-mgr Zabbix module';
+--------+
| hostid |
+--------+
| 10395 |
+--------+
1 row in set (0.00 sec)
Get the corresponding items
MariaDB [zabbix]> select itemid, name, key_, type, trapper_hosts from items where hostid=10395;
+--------+-----------------------------------------------+-----------------------------+------+---------------+
| itemid | name | key_ | type | trapper_hosts |
+--------+-----------------------------------------------+-----------------------------+------+---------------+
| 35793 | Number of Monitors | ceph.num_mon | 2 | |
| 35794 | Number of OSDs | ceph.num_osd | 2 | |
| 35795 | Number of OSDs in state: IN | ceph.num_osd_in | 2 | |
| 35796 | Number of OSDs in state: UP | ceph.num_osd_up | 2 | |
| 35797 | Number of Placement Groups | ceph.num_pg | 2 | |
| 35798 | Number of Placement Groups in Temporary state | ceph.num_pg_temp | 2 | |
| 35799 | Number of Pools | ceph.num_pools | 2 | |
| 35800 | Ceph OSD avg fill | ceph.osd_avg_fill | 2 | |
| 35801 | Ceph backfill full ratio | ceph.osd_backfillfull_ratio | 2 | |
| 35802 | Ceph full ratio | ceph.osd_full_ratio | 2 | |
| 35803 | Ceph OSD Apply latency Avg | ceph.osd_latency_apply_avg | 2 | |
| 35804 | Ceph OSD Apply latency Max | ceph.osd_latency_apply_max | 2 | |
| 35805 | Ceph OSD Apply latency Min | ceph.osd_latency_apply_min | 2 | |
| 35806 | Ceph OSD Commit latency Avg | ceph.osd_latency_commit_avg | 2 | |
| 35807 | Ceph OSD Commit latency Max | ceph.osd_latency_commit_max | 2 | |
| 35808 | Ceph OSD Commit latency Min | ceph.osd_latency_commit_min | 2 | |
| 35809 | Ceph OSD max fill | ceph.osd_max_fill | 2 | |
| 35810 | Ceph OSD min fill | ceph.osd_min_fill | 2 | |
| 35811 | Ceph nearfull ratio | ceph.osd_nearfull_ratio | 2 | |
| 35812 | Overall Ceph status | ceph.overall_status | 2 | |
| 35813 | Overal Ceph status (numeric) | ceph.overall_status_int | 2 | |
| 35814 | Ceph Read bandwidth | ceph.rd_bytes | 2 | |
| 35815 | Ceph Read operations | ceph.rd_ops | 2 | |
| 35816 | Total bytes available | ceph.total_avail_bytes | 2 | |
| 35817 | Total bytes | ceph.total_bytes | 2 | |
| 35818 | Total number of objects | ceph.total_objects | 2 | |
| 35819 | Total bytes used | ceph.total_used_bytes | 2 | |
| 35820 | Ceph Write bandwidth | ceph.wr_bytes | 2 | |
| 35821 | Ceph Write operations | ceph.wr_ops | 2 | |
+--------+-----------------------------------------------+-----------------------------+------+---------------+
29 rows in set (0.00 sec)
Set the allowed hosts
Write the IP addresses of the servers on which the ceph zabbix module was enabled into the table
MariaDB [zabbix]> update items set trapper_hosts='xx.199.128.40,xx.199.128.214,xx.199.128.215' where hostid=10395;
Query OK, 29 rows affected (0.00 sec)
Rows matched: 29 Changed: 29 Warnings: 0
MariaDB [zabbix]> select itemid, name, key_, type, trapper_hosts from items where hostid=10395;
+--------+-----------------------------------------------+-----------------------------+------+---------------------------------------------+
| itemid | name | key_ | type | trapper_hosts |
+--------+-----------------------------------------------+-----------------------------+------+---------------------------------------------+
| 35793 | Number of Monitors | ceph.num_mon | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35794 | Number of OSDs | ceph.num_osd | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35795 | Number of OSDs in state: IN | ceph.num_osd_in | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35796 | Number of OSDs in state: UP | ceph.num_osd_up | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35797 | Number of Placement Groups | ceph.num_pg | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35798 | Number of Placement Groups in Temporary state | ceph.num_pg_temp | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35799 | Number of Pools | ceph.num_pools | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35800 | Ceph OSD avg fill | ceph.osd_avg_fill | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35801 | Ceph backfill full ratio | ceph.osd_backfillfull_ratio | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35802 | Ceph full ratio | ceph.osd_full_ratio | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35803 | Ceph OSD Apply latency Avg | ceph.osd_latency_apply_avg | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35804 | Ceph OSD Apply latency Max | ceph.osd_latency_apply_max | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35805 | Ceph OSD Apply latency Min | ceph.osd_latency_apply_min | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35806 | Ceph OSD Commit latency Avg | ceph.osd_latency_commit_avg | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35807 | Ceph OSD Commit latency Max | ceph.osd_latency_commit_max | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35808 | Ceph OSD Commit latency Min | ceph.osd_latency_commit_min | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35809 | Ceph OSD max fill | ceph.osd_max_fill | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35810 | Ceph OSD min fill | ceph.osd_min_fill | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35811 | Ceph nearfull ratio | ceph.osd_nearfull_ratio | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35812 | Overall Ceph status | ceph.overall_status | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35813 | Overal Ceph status (numeric) | ceph.overall_status_int | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35814 | Ceph Read bandwidth | ceph.rd_bytes | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35815 | Ceph Read operations | ceph.rd_ops | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35816 | Total bytes available | ceph.total_avail_bytes | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35817 | Total bytes | ceph.total_bytes | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35818 | Total number of objects | ceph.total_objects | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35819 | Total bytes used | ceph.total_used_bytes | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35820 | Ceph Write bandwidth | ceph.wr_bytes | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
| 35821 | Ceph Write operations | ceph.wr_ops | 2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
+--------+-----------------------------------------------+-----------------------------+------+---------------------------------------------+
29 rows in set (0.00 sec)
Note: the above is only an example of allowing several trapper hosts. In practice a single server IP address is enough.
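Because the statement is typed directly against the Zabbix database, it can help to generate it from checked inputs. The sketch below is only an illustration: build_allow_sql is a name introduced here, and the extra condition type=2 (Zabbix trapper, the type shown for every item in the listing above) is an added safeguard, not something the original statement contains.

```shell
#!/bin/sh
# Sketch: print the UPDATE statement for a template's trapper items.
# $1 = numeric hostid of the template, $2 = comma-separated allowed hosts.
build_allow_sql() {
    hostid=$1
    allowed=$2
    case $hostid in
        ''|*[!0-9]*)
            echo "hostid must be numeric" >&2
            return 1 ;;
    esac
    case $allowed in
        ''|*[!0-9A-Za-z.:,-]*)
            echo "unexpected character in allowed hosts" >&2
            return 1 ;;
    esac
    # type=2 limits the change to Zabbix trapper items.
    printf "UPDATE items SET trapper_hosts='%s' WHERE hostid=%s AND type=2;\n" \
        "$allowed" "$hostid"
}

# Example: review the statement first, then pipe it to the client.
# build_allow_sql 10395 'xx.199.128.40'
# build_allow_sql 10395 'xx.199.128.40' | mysql zabbix
```

Rejecting anything other than digits for the hostid and a small character set for the host list keeps a typo from turning into a broader update than intended.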
Verify the trapper
See the figure below
Open the newly added host in Zabbix, open one of its ceph items, and confirm that type = Zabbix trapper and allowed hosts = the IP addresses you wrote to the database
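The trapper path can also be checked from the command line on an allowed host by pushing a single value with zabbix_sender. This is a rough sketch under stated assumptions: send_test_value and the ZABBIX_SENDER variable are names introduced here, the host name passed with -s is assumed to match the identifier configured earlier, and any value sent this way is stored by Zabbix as a real sample, so it should be used with care.

```shell
#!/bin/sh
# Sketch: push one value to a trapper item with zabbix_sender.
# ZABBIX_SENDER is a hypothetical override; the default is the path
# configured for the ceph zabbix module earlier in this article.
ZABBIX_SENDER=${ZABBIX_SENDER:-/etc/apps/svr/zabbix/bin/zabbix_sender}

send_test_value() {
    # $1 Zabbix server, $2 server port, $3 host name as known to Zabbix,
    # $4 item key, $5 value
    "$ZABBIX_SENDER" -z "$1" -p "$2" -s "$3" -k "$4" -o "$5"
}

# Example with the values from this article (the value 3 is made up):
# send_test_value gx-yun-084044.vclound.com 10051 cephsvr-128040.vclound.com ceph.num_mon 3
```

If the sending address is not in the item's allowed hosts, the server is expected to reject the value, which makes this a quick way to confirm the database change took effect.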
Ceph cron job
Use a cron job to report the Ceph monitoring data automatically once a minute
# cat /etc/cron.d/ceph
*/1 * * * * root ceph zabbix send
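When the command runs from cron, a failed send is easy to miss. One possible approach is a small wrapper that stays quiet on success and writes a timestamped line to stderr on failure, which cron can then mail or redirect to a log. The function name report_ceph_to_zabbix and the CEPH_BIN variable are assumptions made for this sketch only.

```shell
#!/bin/sh
# Sketch: run "ceph zabbix send" and report only failures.
# CEPH_BIN is a hypothetical override; it defaults to the real ceph CLI.
CEPH_BIN=${CEPH_BIN:-ceph}

report_ceph_to_zabbix() {
    if "$CEPH_BIN" zabbix send >/dev/null 2>&1; then
        return 0
    fi
    # Timestamped message on stderr so cron output shows when it failed.
    echo "$(date '+%Y-%m-%d %H:%M:%S') ceph zabbix send failed" >&2
    return 1
}

# In a wrapper script called from /etc/cron.d/ceph, the last line would be:
# report_ceph_to_zabbix
```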
Monitoring screenshots
Ceph pool free space
Ceph I/O
Ceph bandwidth
Ceph OSD latency