Ceph (Luminous release) ships with built-in Zabbix monitoring support
Configuring the corresponding Zabbix monitoring


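Environment used here: Ceph Luminous, ceph-12.2.0-0.el7.x86_64
Zabbix monitoring support is provided by the zabbix mgr module, which has to be enabled first
The monitoring items are supplied by Ceph itself and pushed to the Zabbix server in trapper mode
The Zabbix monitoring covers the overall health of the Ceph cluster as a whole
The monitoring only needs to be enabled on one machine that can reach the ceph mgr service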


ceph zabbix plugin


Run this on just one machine in the Ceph cluster that has permission to access the mgr


# ceph mgr module enable zabbix


Set the Zabbix server

# ceph zabbix config-set zabbix_host gx-yun-084044.vclound.com
Configuration option zabbix_host updated


Set the identifier (the host name under which this cluster is registered in Zabbix)

# ceph zabbix config-set identifier cephsvr-128040.vclound.com
Configuration option identifier updated

Set the location of zabbix_sender

# ceph zabbix config-set zabbix_sender /etc/apps/svr/zabbix/bin/zabbix_sender
Configuration option zabbix_sender updated

Set the Zabbix server port

# ceph zabbix config-set zabbix_port 10051
Configuration option zabbix_port updated

Set the item reporting interval (in seconds)

# ceph zabbix config-set interval 60
Configuration option interval updated


# ceph zabbix config-show
{"zabbix_host": "gx-yun-084044.vclound.com", "identifier": "cephsvr-128040.vclound.com", "zabbix_sender": "/etc/apps/svr/zabbix/bin/zabbix_sender", "interval": 60, "zabbix_port": 10051}

Zabbix server configuration


Location of zabbix_template.xml

# rpm -ql ceph-mgr | grep xml



The template targets Zabbix 3.x by default. To import it into Zabbix 2.x, edit zabbix_template.xml:

<?xml version="1.0" encoding="UTF-8"?>
                <version>2.0</version>     <- change this to 2.0 so the template can be imported
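
The version edit described above can also be scripted. A minimal sketch, assuming the template carries its format version in a <version> element as in the excerpt above. set_template_version is a hypothetical helper name, and the result is written to standard output so that the packaged file is left untouched.

```shell
# Hypothetical helper: print a copy of a Zabbix template in which the first
# <version>...</version> element is rewritten to the requested version.
set_template_version() {
    template="$1"
    version="$2"
    awk -v ver="$version" '
        !done && sub(/<version>[^<]*<\/version>/, "<version>" ver "</version>") { done = 1 }
        { print }
    ' "$template"
}

# Example: set_template_version zabbix_template.xml 2.0 > zabbix_template_2x.xml
```

Only the first match is changed, so any later element that happens to be called <version> keeps its value.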


In the import dialog, browse to the local template file and click Import to load the template



Link the template to the host


Set the allowed hosts in the database

To make sure every trapper item in the template has an allowed host set, the most direct approach is to update the database


Get the template ID

MariaDB [(none)]> use zabbix;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [zabbix]> select hostid from hosts where name='ceph-mgr Zabbix module';
| hostid |
|  10395 |
1 row in set (0.00 sec)

List the corresponding items

MariaDB [zabbix]> select itemid, name, key_, type, trapper_hosts  from items where hostid=10395;
| itemid | name                                          | key_                        | type | trapper_hosts |
|  35793 | Number of Monitors                            | ceph.num_mon                |    2 |               |
|  35794 | Number of OSDs                                | ceph.num_osd                |    2 |               |
|  35795 | Number of OSDs in state: IN                   | ceph.num_osd_in             |    2 |               |
|  35796 | Number of OSDs in state: UP                   | ceph.num_osd_up             |    2 |               |
|  35797 | Number of Placement Groups                    | ceph.num_pg                 |    2 |               |
|  35798 | Number of Placement Groups in Temporary state | ceph.num_pg_temp            |    2 |               |
|  35799 | Number of Pools                               | ceph.num_pools              |    2 |               |
|  35800 | Ceph OSD avg fill                             | ceph.osd_avg_fill           |    2 |               |
|  35801 | Ceph backfill full ratio                      | ceph.osd_backfillfull_ratio |    2 |               |
|  35802 | Ceph full ratio                               | ceph.osd_full_ratio         |    2 |               |
|  35803 | Ceph OSD Apply latency Avg                    | ceph.osd_latency_apply_avg  |    2 |               |
|  35804 | Ceph OSD Apply latency Max                    | ceph.osd_latency_apply_max  |    2 |               |
|  35805 | Ceph OSD Apply latency Min                    | ceph.osd_latency_apply_min  |    2 |               |
|  35806 | Ceph OSD Commit latency Avg                   | ceph.osd_latency_commit_avg |    2 |               |
|  35807 | Ceph OSD Commit latency Max                   | ceph.osd_latency_commit_max |    2 |               |
|  35808 | Ceph OSD Commit latency Min                   | ceph.osd_latency_commit_min |    2 |               |
|  35809 | Ceph OSD max fill                             | ceph.osd_max_fill           |    2 |               |
|  35810 | Ceph OSD min fill                             | ceph.osd_min_fill           |    2 |               |
|  35811 | Ceph nearfull ratio                           | ceph.osd_nearfull_ratio     |    2 |               |
|  35812 | Overall Ceph status                           | ceph.overall_status         |    2 |               |
|  35813 | Overal Ceph status (numeric)                  | ceph.overall_status_int     |    2 |               |
|  35814 | Ceph Read bandwidth                           | ceph.rd_bytes               |    2 |               |
|  35815 | Ceph Read operations                          | ceph.rd_ops                 |    2 |               |
|  35816 | Total bytes available                         | ceph.total_avail_bytes      |    2 |               |
|  35817 | Total bytes                                   | ceph.total_bytes            |    2 |               |
|  35818 | Total number of objects                       | ceph.total_objects          |    2 |               |
|  35819 | Total bytes used                              | ceph.total_used_bytes       |    2 |               |
|  35820 | Ceph Write bandwidth                          | ceph.wr_bytes               |    2 |               |
|  35821 | Ceph Write operations                         | ceph.wr_ops                 |    2 |               |
29 rows in set (0.00 sec)

Set the allowed hosts

Update the table with the IP addresses of the servers on which the ceph zabbix module was set up earlier

MariaDB [zabbix]> update items set trapper_hosts='xx.199.128.40,xx.199.128.214,xx.199.128.215' where hostid=10395;
Query OK, 29 rows affected (0.00 sec)
Rows matched: 29  Changed: 29  Warnings: 0

MariaDB [zabbix]> select itemid, name, key_, type, trapper_hosts  from items where hostid=10395;
| itemid | name                                          | key_                        | type | trapper_hosts                               |
|  35793 | Number of Monitors                            | ceph.num_mon                |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35794 | Number of OSDs                                | ceph.num_osd                |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35795 | Number of OSDs in state: IN                   | ceph.num_osd_in             |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35796 | Number of OSDs in state: UP                   | ceph.num_osd_up             |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35797 | Number of Placement Groups                    | ceph.num_pg                 |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35798 | Number of Placement Groups in Temporary state | ceph.num_pg_temp            |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35799 | Number of Pools                               | ceph.num_pools              |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35800 | Ceph OSD avg fill                             | ceph.osd_avg_fill           |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35801 | Ceph backfill full ratio                      | ceph.osd_backfillfull_ratio |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35802 | Ceph full ratio                               | ceph.osd_full_ratio         |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35803 | Ceph OSD Apply latency Avg                    | ceph.osd_latency_apply_avg  |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35804 | Ceph OSD Apply latency Max                    | ceph.osd_latency_apply_max  |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35805 | Ceph OSD Apply latency Min                    | ceph.osd_latency_apply_min  |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35806 | Ceph OSD Commit latency Avg                   | ceph.osd_latency_commit_avg |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35807 | Ceph OSD Commit latency Max                   | ceph.osd_latency_commit_max |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35808 | Ceph OSD Commit latency Min                   | ceph.osd_latency_commit_min |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35809 | Ceph OSD max fill                             | ceph.osd_max_fill           |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35810 | Ceph OSD min fill                             | ceph.osd_min_fill           |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35811 | Ceph nearfull ratio                           | ceph.osd_nearfull_ratio     |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35812 | Overall Ceph status                           | ceph.overall_status         |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35813 | Overal Ceph status (numeric)                  | ceph.overall_status_int     |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35814 | Ceph Read bandwidth                           | ceph.rd_bytes               |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35815 | Ceph Read operations                          | ceph.rd_ops                 |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35816 | Total bytes available                         | ceph.total_avail_bytes      |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35817 | Total bytes                                   | ceph.total_bytes            |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35818 | Total number of objects                       | ceph.total_objects          |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35819 | Total bytes used                              | ceph.total_used_bytes       |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35820 | Ceph Write bandwidth                          | ceph.wr_bytes               |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35821 | Ceph Write operations                         | ceph.wr_ops                 |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
29 rows in set (0.00 sec)

Note: the above is only an example showing that several trapper allowed hosts can be listed. In practice a single server IP address is enough
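
The lookup and the update above can be folded into one statement so that the host ID does not have to be copied by hand. A sketch only: trapper_hosts_sql is a hypothetical helper that merely prints the SQL. It assumes the template is named 'ceph-mgr Zabbix module' as in the query above, and it restricts the change to items of type 2, the type shown for every item in the listing.

```shell
# Hypothetical helper: print an UPDATE statement that sets trapper_hosts
# on the trapper items (type 2) of the ceph-mgr template.
trapper_hosts_sql() {
    allowed="$1"
    # Accept only letters, digits, dots, commas, colons and hyphens,
    # so the value cannot break out of the quoted SQL string.
    case "$allowed" in
        ''|*[!A-Za-z0-9.,:-]*)
            echo "invalid host list: $allowed" >&2
            return 1
            ;;
    esac
    printf "UPDATE items SET trapper_hosts='%s' WHERE type=2 AND hostid=(SELECT hostid FROM hosts WHERE name='ceph-mgr Zabbix module');\n" "$allowed"
}

# Example: trapper_hosts_sql 'xx.199.128.40'
```

The printed statement can be pasted into the MariaDB prompt used above. Rerunning the select afterwards shows whether all 29 items were changed.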

Verify the trapper settings


Open the newly added host in Zabbix and open one of its ceph items. Confirm that type = Zabbix trapper and that allowed hosts = the IP addresses you wrote to the database


ceph cron job

Use a cron job to report the Ceph monitoring data automatically once a minute

# cat /etc/cron.d/ceph
*/1 * * * * root ceph zabbix send
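
A cron entry that fails silently is easy to miss. Below is a minimal sketch of a wrapper that the cron job could call instead, which logs a timestamped message when ceph zabbix send returns a non-zero status. ceph_zabbix_send and the CEPH override are hypothetical names, and whether the exit status of the command reflects delivery failures is an assumption that should be checked on your version.

```shell
# Hypothetical wrapper: run "ceph zabbix send" and report failures on stderr.
# CEPH can be overridden to point at a different binary.
ceph_zabbix_send() {
    if ${CEPH:-ceph} zabbix send; then
        return 0
    fi
    echo "$(date '+%F %T') ceph zabbix send failed" >&2
    return 1
}
```

Cron normally mails anything written to stderr to the job owner, so a failed run would then leave a trace.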

Monitoring screenshots

Ceph pool free space


Ceph I/O


Ceph bandwidth


Ceph OSD latency
