Monitoring Ceph (Luminous) with Zabbix

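Goal

Ceph (Luminous) ships with built-in Zabbix monitoring support
Configure the corresponding monitoring in Zabbix

Notes

Environment used here: Ceph Luminous, ceph-12.2.0-0.el7.x86_64
The Zabbix support is not active out of the box; the zabbix mgr module has to be enabled first
The monitored items are provided by Ceph itself and pushed to the Zabbix server in trapper mode
The monitoring covers the overall health of the whole Ceph cluster
It only needs to be enabled on one machine that can reach the ceph mgr service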

Reference

ceph zabbix plugin

Important:

Run the steps below on only one machine in the Ceph cluster that has permission to access the mgr

Load the module

# ceph mgr module enable zabbix

Configuration

Set the Zabbix server

# ceph zabbix config-set zabbix_host gx-yun-084044.vclound.com
Configuration option zabbix_host updated

Set the identifier of the monitored host (it has to match the host name configured in Zabbix)

# ceph zabbix config-set identifier cephsvr-128040.vclound.com
Configuration option identifier updated

Set the location of zabbix_sender

# ceph zabbix config-set zabbix_sender /etc/apps/svr/zabbix/bin/zabbix_sender
Configuration option zabbix_sender updated

Set the Zabbix server port

# ceph zabbix config-set zabbix_port 10051
Configuration option zabbix_port updated

Set the item reporting interval (in seconds)

# ceph zabbix config-set interval 60
Configuration option interval updated

Show the configuration

# ceph zabbix config-show
{"zabbix_host": "gx-yun-084044.vclound.com", "identifier": "cephsvr-128040.vclound.com", "zabbix_sender": "/etc/apps/svr/zabbix/bin/zabbix_sender", "interval": 60, "zabbix_port": 10051}
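The five config-set calls above can also be applied in one pass. The following is a minimal sketch, assuming a POSIX shell on a host where the ceph CLI works; the function name apply_zabbix_config is hypothetical, and the values are the ones used in this article and would need to be replaced for another environment.

```shell
# Sketch: apply the ceph-mgr zabbix settings shown above in one pass.
# apply_zabbix_config is a hypothetical helper name, not part of Ceph.
apply_zabbix_config() {
    while IFS='=' read -r key value; do
        # Skip empty lines in the list below.
        [ -n "$key" ] || continue
        # </dev/null keeps the command from consuming the here-document.
        ceph zabbix config-set "$key" "$value" </dev/null || return 1
    done <<'EOF'
zabbix_host=gx-yun-084044.vclound.com
identifier=cephsvr-128040.vclound.com
zabbix_sender=/etc/apps/svr/zabbix/bin/zabbix_sender
zabbix_port=10051
interval=60
EOF
}
```

After calling apply_zabbix_config, running ceph zabbix config-show should report the same five values as the output above.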

Zabbix server configuration

Template

Location of zabbix_template.xml

# rpm -ql ceph-mgr | grep xml
/usr/lib64/ceph/mgr/zabbix/zabbix_template.xml

Import the template

Note

The template targets Zabbix 3.x by default. To import it into Zabbix 2.x, edit zabbix_template.xml:

<?xml version="1.0" encoding="UTF-8"?>
        <zabbix_export>
                <version>2.0</version>     <- change to 2.0 so that it can be imported
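The version change can be made on a copy so that the file owned by the ceph-mgr package stays untouched. This is a sketch only, assuming awk is available; the function name downgrade_template_version and the output path in the usage line are hypothetical, and it rewrites just the first <version> element it finds, whatever value that element holds.

```shell
# Sketch: write a copy of the template with its first <version> element set to 2.0.
# downgrade_template_version is a hypothetical helper name.
# $1: original template, $2: path of the modified copy
downgrade_template_version() {
    awk '
        # Rewrite only the first <version>...</version> element encountered.
        !seen && /<version>[^<]*<\/version>/ {
            sub(/<version>[^<]*<\/version>/, "<version>2.0</version>")
            seen = 1
        }
        { print }
    ' "$1" > "$2"
}
```

For example: downgrade_template_version /usr/lib64/ceph/mgr/zabbix/zabbix_template.xml /tmp/zabbix_template_2x.xml, then import the copy.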

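How to import the template

[Screenshot: template]

When browsing, select the local template file and click Import to import the template

Add the host

[Screenshot: new_host]

Assign the template to the host

[Screenshot: add_template]

Set allowed hosts in the database

To make sure every trapper item in the template has an allowed host set, the most direct way is to update the database

For reference

Get the template id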

MariaDB [(none)]> use zabbix;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [zabbix]> select hostid from hosts where name='ceph-mgr Zabbix module';
+--------+
| hostid |
+--------+
|  10395 |
+--------+
1 row in set (0.00 sec)

Get the corresponding items

MariaDB [zabbix]> select itemid, name, key_, type, trapper_hosts  from items where hostid=10395;
+--------+-----------------------------------------------+-----------------------------+------+---------------+
| itemid | name                                          | key_                        | type | trapper_hosts |
+--------+-----------------------------------------------+-----------------------------+------+---------------+
|  35793 | Number of Monitors                            | ceph.num_mon                |    2 |               |
|  35794 | Number of OSDs                                | ceph.num_osd                |    2 |               |
|  35795 | Number of OSDs in state: IN                   | ceph.num_osd_in             |    2 |               |
|  35796 | Number of OSDs in state: UP                   | ceph.num_osd_up             |    2 |               |
|  35797 | Number of Placement Groups                    | ceph.num_pg                 |    2 |               |
|  35798 | Number of Placement Groups in Temporary state | ceph.num_pg_temp            |    2 |               |
|  35799 | Number of Pools                               | ceph.num_pools              |    2 |               |
|  35800 | Ceph OSD avg fill                             | ceph.osd_avg_fill           |    2 |               |
|  35801 | Ceph backfill full ratio                      | ceph.osd_backfillfull_ratio |    2 |               |
|  35802 | Ceph full ratio                               | ceph.osd_full_ratio         |    2 |               |
|  35803 | Ceph OSD Apply latency Avg                    | ceph.osd_latency_apply_avg  |    2 |               |
|  35804 | Ceph OSD Apply latency Max                    | ceph.osd_latency_apply_max  |    2 |               |
|  35805 | Ceph OSD Apply latency Min                    | ceph.osd_latency_apply_min  |    2 |               |
|  35806 | Ceph OSD Commit latency Avg                   | ceph.osd_latency_commit_avg |    2 |               |
|  35807 | Ceph OSD Commit latency Max                   | ceph.osd_latency_commit_max |    2 |               |
|  35808 | Ceph OSD Commit latency Min                   | ceph.osd_latency_commit_min |    2 |               |
|  35809 | Ceph OSD max fill                             | ceph.osd_max_fill           |    2 |               |
|  35810 | Ceph OSD min fill                             | ceph.osd_min_fill           |    2 |               |
|  35811 | Ceph nearfull ratio                           | ceph.osd_nearfull_ratio     |    2 |               |
|  35812 | Overall Ceph status                           | ceph.overall_status         |    2 |               |
|  35813 | Overal Ceph status (numeric)                  | ceph.overall_status_int     |    2 |               |
|  35814 | Ceph Read bandwidth                           | ceph.rd_bytes               |    2 |               |
|  35815 | Ceph Read operations                          | ceph.rd_ops                 |    2 |               |
|  35816 | Total bytes available                         | ceph.total_avail_bytes      |    2 |               |
|  35817 | Total bytes                                   | ceph.total_bytes            |    2 |               |
|  35818 | Total number of objects                       | ceph.total_objects          |    2 |               |
|  35819 | Total bytes used                              | ceph.total_used_bytes       |    2 |               |
|  35820 | Ceph Write bandwidth                          | ceph.wr_bytes               |    2 |               |
|  35821 | Ceph Write operations                         | ceph.wr_ops                 |    2 |               |
+--------+-----------------------------------------------+-----------------------------+------+---------------+
29 rows in set (0.00 sec)

Set the allowed hosts

Write the IP addresses of the servers where the ceph zabbix module was enabled into the table

MariaDB [zabbix]> update items set trapper_hosts='xx.199.128.40,xx.199.128.214,xx.199.128.215' where hostid=10395;
Query OK, 29 rows affected (0.00 sec)
Rows matched: 29  Changed: 29  Warnings: 0

MariaDB [zabbix]> select itemid, name, key_, type, trapper_hosts  from items where hostid=10395;
+--------+-----------------------------------------------+-----------------------------+------+---------------------------------------------+
| itemid | name                                          | key_                        | type | trapper_hosts                               |
+--------+-----------------------------------------------+-----------------------------+------+---------------------------------------------+
|  35793 | Number of Monitors                            | ceph.num_mon                |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35794 | Number of OSDs                                | ceph.num_osd                |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35795 | Number of OSDs in state: IN                   | ceph.num_osd_in             |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35796 | Number of OSDs in state: UP                   | ceph.num_osd_up             |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35797 | Number of Placement Groups                    | ceph.num_pg                 |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35798 | Number of Placement Groups in Temporary state | ceph.num_pg_temp            |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35799 | Number of Pools                               | ceph.num_pools              |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35800 | Ceph OSD avg fill                             | ceph.osd_avg_fill           |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35801 | Ceph backfill full ratio                      | ceph.osd_backfillfull_ratio |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35802 | Ceph full ratio                               | ceph.osd_full_ratio         |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35803 | Ceph OSD Apply latency Avg                    | ceph.osd_latency_apply_avg  |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35804 | Ceph OSD Apply latency Max                    | ceph.osd_latency_apply_max  |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35805 | Ceph OSD Apply latency Min                    | ceph.osd_latency_apply_min  |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35806 | Ceph OSD Commit latency Avg                   | ceph.osd_latency_commit_avg |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35807 | Ceph OSD Commit latency Max                   | ceph.osd_latency_commit_max |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35808 | Ceph OSD Commit latency Min                   | ceph.osd_latency_commit_min |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35809 | Ceph OSD max fill                             | ceph.osd_max_fill           |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35810 | Ceph OSD min fill                             | ceph.osd_min_fill           |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35811 | Ceph nearfull ratio                           | ceph.osd_nearfull_ratio     |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35812 | Overall Ceph status                           | ceph.overall_status         |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35813 | Overal Ceph status (numeric)                  | ceph.overall_status_int     |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35814 | Ceph Read bandwidth                           | ceph.rd_bytes               |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35815 | Ceph Read operations                          | ceph.rd_ops                 |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35816 | Total bytes available                         | ceph.total_avail_bytes      |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35817 | Total bytes                                   | ceph.total_bytes            |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35818 | Total number of objects                       | ceph.total_objects          |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35819 | Total bytes used                              | ceph.total_used_bytes       |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35820 | Ceph Write bandwidth                          | ceph.wr_bytes               |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
|  35821 | Ceph Write operations                         | ceph.wr_ops                 |    2 | xx.199.128.40,xx.199.128.214,xx.199.128.215 |
+--------+-----------------------------------------------+-----------------------------+------+---------------------------------------------+
29 rows in set (0.00 sec)

Note: the statement above only illustrates that several trapper allowed hosts can be listed; in practice a single server IP address is enough
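The UPDATE statement can also be generated from a list of addresses instead of being typed by hand. The following is a sketch under assumptions: build_trapper_update is a hypothetical helper, and the extra AND type=2 condition is an addition that is not in the original statement; it limits the change to trapper items, which are the rows shown with type 2 in the listing above.

```shell
# Sketch: print the UPDATE statement for a given hostid and list of allowed hosts.
# build_trapper_update is a hypothetical helper name.
# $1: hostid of the template, remaining arguments: allowed host addresses
build_trapper_update() {
    hostid=$1
    shift
    # Refuse anything that is not a plain number, since it ends up in SQL.
    case $hostid in
        ''|*[!0-9]*) echo "hostid must be numeric" >&2; return 1 ;;
    esac
    hosts=
    for ip in "$@"; do
        # Allow only characters that occur in addresses and host names.
        case $ip in
            ''|*[!0-9A-Za-z.:/-]*) echo "unexpected character in host: $ip" >&2; return 1 ;;
        esac
        hosts=${hosts:+$hosts,}$ip
    done
    [ -n "$hosts" ] || { echo "at least one allowed host is required" >&2; return 1; }
    printf "UPDATE items SET trapper_hosts='%s' WHERE hostid=%s AND type=2;\n" "$hosts" "$hostid"
}
```

The output can be reviewed first and then piped into the database client, for example: build_trapper_update 10395 xx.199.128.40 | mysql zabbix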

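Verify the trapper

See the screenshot below

Open the newly added host in Zabbix, open one of the ceph items, and confirm that type = Zabbix trapper and allowed hosts = the IP addresses you wrote to the database

[Screenshot: trapper]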
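Besides checking the item in the web interface, the trapper path can be exercised by hand with zabbix_sender, the tool the module is configured to call. This is a sketch under assumptions: send_test_value and the environment variable names are hypothetical, and the defaults are the values shown by ceph zabbix config-show in this article. A value sent this way is stored as a real data point, so it is best used once on a harmless item.

```shell
# Sketch: push one test value through the same trapper path the module uses.
# send_test_value and the ZABBIX_SENDER / ZABBIX_HOST / ZABBIX_PORT / IDENTIFIER
# variables are hypothetical names; the defaults come from this article.
# $1: item key, $2: value to send
send_test_value() {
    key=$1
    value=$2
    # -z server, -p port, -s host name as registered in Zabbix, -k key, -o value
    "${ZABBIX_SENDER:-/etc/apps/svr/zabbix/bin/zabbix_sender}" \
        -z "${ZABBIX_HOST:-gx-yun-084044.vclound.com}" \
        -p "${ZABBIX_PORT:-10051}" \
        -s "${IDENTIFIER:-cephsvr-128040.vclound.com}" \
        -k "$key" -o "$value"
}
```

For example: send_test_value ceph.num_mon 3. If the sending address is not in the item's allowed hosts, or the host name or key is unknown to the server, zabbix_sender would normally report the value as failed rather than processed.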

ceph cron job

Use a cron job to report Ceph monitoring data automatically once a minute

# cat /etc/cron.d/ceph
*/1 * * * * root ceph zabbix send
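With the bare cron entry above, a failing submission can go unnoticed. The following is a minimal sketch of a wrapper that cron could call instead; report_ceph_to_zabbix is a hypothetical name, and whether a failed submission produces a non-zero exit status from ceph zabbix send may depend on the Ceph version, so this only catches the failures the command itself reports.

```shell
# Sketch: run "ceph zabbix send" and make a failure visible.
# report_ceph_to_zabbix is a hypothetical helper name.
report_ceph_to_zabbix() {
    if out=$(ceph zabbix send 2>&1); then
        return 0
    else
        rc=$?
    fi
    # cron can mail whatever a job writes to stderr, so the failure is not silent.
    echo "$(date '+%Y-%m-%d %H:%M:%S') ceph zabbix send failed (exit $rc): $out" >&2
    return "$rc"
}
```

Saved in a script that ends by calling report_ceph_to_zabbix, it can replace the command part of the /etc/cron.d/ceph entry.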

Monitoring screenshots

Ceph pool available space

[Screenshot: storage]

Ceph IO

[Screenshot: io]

Ceph bandwidth

[Screenshot: bandwidth]

Ceph OSD latency

[Screenshot: latency]