Zabbix template for Microsoft SQL Server: A Summary

Introduction to the Zabbix template for Microsoft SQL Server

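This article introduces a very handy template for monitoring Microsoft SQL Server databases with Zabbix. It is called Zabbix template for Microsoft SQL Server and can be downloaded from the following locations.

Zabbix Share: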

https://share.zabbix.com/databases/microsoft-sql-server/template-for-microsoft-sql-server

GitHub:

https://github.com/MantasTumenas/Zabbix-template-for-Microsoft-SQL-Server

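All experiments and tests below were done on Zabbix 5.x; other Zabbix versions have not been tested or verified. I also recommend using the template in the Microsoft SQL Server directory of the GitHub repository, since in my experience it runs into fewer problems. The template from Zabbix Share caused me far more trouble, so use it only if you are able to fix those problems yourself.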

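Unpack the template archive from GitHub (Zabbix-template-for-Microsoft-SQL-Server-master.zip) and you will find three directories (the Zabbix Share archive has only two):

Microsoft SQL Server #the branch version; this is the template deployed here

Without SQL instance discovery #for monitoring a single SQL Server instance

With SQL instance discovery #for monitoring multiple SQL Server instances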

Directories in the Zabbix Share archive (Zabbix Template for Microsoft SQL Server.zip):

Without SQL instance discovery #for monitoring a single SQL Server instance

With SQL instance discovery #for monitoring multiple SQL Server instances

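The Microsoft SQL Server directory in turn contains the following subdirectories:

Documentation #documentation for Zabbix template for Microsoft SQL Server; easily the most detailed documentation I have seen for any Zabbix template

Scripts #the PowerShell monitoring scripts

Template #the template files

User parameters #contains userparams.conf, which defines sample user parameters

Zabbix Value Mapping #contains two files, SQL Agent Job status.xml and SQL Database status.xml, which define the value mappings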

The template provides the following features:

Features

• MS SQL performance counters.

• MS SQL instance Low Level Discovery.

• MS SQL database Low Level Discovery.

• MS SQL agent job Low Level Discovery.

• MS SQL database backup monitoring.

• MS SQL database mirroring monitoring.

• MS SQL Always On monitoring.

• MS SQL Log Shipping monitoring.

Supported versions are described below:

Supported versions

Tested on Microsoft SQL Server 2012, 2014 and 2016. It may work with earlier versions, but some items (with missing performance counters) may be unsupported. For the extensive overview on the performance counters difference between MS SQL 2008 and MS SQL 2012 you can read here (https://blog.dbi-services.com/sql-server-2012-new-perfmon-counters/).

Tested on Zabbix 3.4.0. It may work with earlier versions, but some items (for example service.info[service,<param>]) may be unsupported. The template was started on Zabbix 2.4.0 but after each new Zabbix version, objects were modified or new things were added.

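Note: the environment tested here runs Zabbix 5.x and the template worked there, so it can be used with Zabbix 5.x as well.

Deployment

Deploying the Without SQL instance discovery template

Deployment steps from the official documentation: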

1. Import templates via Configuration >> Templates:

• “Template Microsoft SQL Server DE Tier 3.xml”

• “Template Microsoft SQL Server DE Tier 2.xml”

• “Template Microsoft SQL Server DE Tier 1.xml”

• “Template Microsoft SQL Server SA Tier 3.xml”

2. Import value mappings via Administration >> General >> Value mapping:

• “SQL Agent Job status.xml”

• “SQL Database status.xml”

3. Copy catalog MSSQL with PowerShell scripts (*.ps1) to a location a Zabbix Agent can access (by default “C:\...\Zabbix\bin\”).

4. Copy 3 *.conf files from catalog “User parameters” to a location a Zabbix Agent can access (by default “C:\...\Zabbix\”).

5. Update “zabbix_agentd.win.conf”:

• add line “Include= C:\Program Files\Zabbix\mssql.agent.userparams.conf”.

• add line “Include= C:\Program Files\Zabbix\mssql.backup.userparams.conf”.

• add line “Include= C:\Program Files\Zabbix\mssql.basic.userparams.conf”.

6. Grant rights for Zabbix Agent service account. It needs read rights on tables:

• msdb.dbo.sysjobhistory

• msdb.dbo.sysjobs

• master.sys.databases

• msdb.dbo.backupset

• msdb.dbo.log_shipping_monitor_secondary.

7. By default, Zabbix Agent service account is NT AUTHORITY\SYSTEM which is already in SQL Server. If you need to monitor mirrored databases or databases in Always On, you will have to give Zabbix Agent’s service account (NT AUTHORITY\SYSTEM by default) sysadmin rights. More about it here.

8. Restart Zabbix Agent.

9. Depending on your SQL server edition and monitoring requirements select and add templates to a host.

10. Modify macros in templates according to your needs. Default values are below:

Macro | Macro meaning | Value | Meaning | Trigger
{$SYSDBFTIME1} | Sys db full backup time value 1 | 25 | 25 hours | Information
{$SYSDBFTIME2} | Sys db full backup time value 2 | 50 | 50 hours | Low
{$SYSDBFTIME3} | Sys db full backup time value 3 | 75 | 75 hours | Medium
{$UDBDTIME1} | User db diff backup time value 1 | 48 | 2 days | Information
{$UDBDTIME2} | User db diff backup time value 2 | 72 | 3 days | Low
{$UDBDTIME3} | User db diff backup time value 3 | 96 | 4 days | Medium
{$UDBFTIME1} | User db full backup time value 1 | 168 | 7 days | Information
{$UDBFTIME2} | User db full backup time value 2 | 192 | 8 days | Low
{$UDBFTIME3} | User db full backup time value 3 | 216 | 9 days | Medium
{$UDBLTIME1} | User db log backup time value 1 | 30 | 30 minutes | Information
{$UDBLTIME2} | User db log backup time value 2 | 60 | 60 minutes | Low
{$UDBLTIME3} | User db log backup time value 3 | 90 | 90 minutes | Medium
{$EVENTLOGTIME} | Event log recovery time value | 28h | 28 hours | Medium
{$DAYS} | Maintenance job time value | 7 | 7 days | None

11. “Template Microsoft SQL Server SA Tier 3.xml” lets you discover SQL agent jobs. Discovery rules consist of:

• “SQL Server Agent Discovery” – discover SQL Agent service.

• “SQL Server Agent Jobs P1 Discovery” – discover SQL Agent jobs.

• “SQL Server Agent Jobs P2 Discovery” – discover SQL Agent jobs.

• “SQL Server Agent Jobs P3 Discovery” – discover SQL Agent jobs.

12. Difference between “SQL Server Agent Jobs P1 / P2 / P3 Discovery” are triggers. They can be configured differently. For example:

• “SQL Server Agent Jobs P1 Discovery” – alerts after trigger failed. Good for monitoring jobs, which need immediate attention. Like failed job “CHECKDB”.

• “SQL Server Agent Jobs P2 Discovery” – alerts after trigger failed two times. Good for monitoring jobs, which need attention, but not immediate. For example, job “DB LOG BACKUP” failed 1st time, but it will run again in 30 minutes. If 2nd time it fails again, then alert is raised.

• “SQL Server Agent Jobs P3 Discovery” – alerts after trigger failed but with additional conditions. Good for monitoring jobs, which do not need immediate attention. Like failed job “IndexOptimize”. Alert will be raised only during Monday – Friday, during 08:00 – 16:00. If you want to change day and hour parameters, you can do it directly in triggers.

• In mssql.agent.userparams.conf I placed 2 additional user parameters. In case you need to create your own custom items for monitoring P(riority)4 and P(priority)5 jobs.

13. Every discovery rule “SQL Server Agent Jobs P1 / P2 / P3 Discovery” has its filters there you can enter the job name, you want to associate with a selected rule:

If you leave a filter empty, all agent jobs will be discovered. To avoid that, I entered a simple place holder for every rule – ENTER_JOB_NAME.
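
The difference between the P1, P2 and P3 rules described in step 12 can be sketched in a few lines of Python. This is only an illustration of the three alerting policies as the documentation describes them; the template itself implements them as Zabbix trigger expressions, and the function name `should_alert` and its arguments are hypothetical.

```python
from datetime import datetime
from typing import Sequence


def should_alert(priority: str, recent_failures: Sequence[bool], now: datetime) -> bool:
    """Decide whether a failed SQL Agent job should raise an alert.

    recent_failures lists job outcomes, oldest first; True means the run failed.
    The rules mirror the P1/P2/P3 descriptions in the documentation.
    """
    # No history, or the latest run succeeded: nothing to alert on.
    if not recent_failures or not recent_failures[-1]:
        return False
    if priority == "P1":
        # P1: alert on the first failure.
        return True
    if priority == "P2":
        # P2: alert only when the job failed two times in a row.
        return len(recent_failures) >= 2 and bool(recent_failures[-2])
    if priority == "P3":
        # P3: alert only Monday to Friday between 08:00 and 16:00.
        return now.weekday() < 5 and 8 <= now.hour < 16
    raise ValueError(f"unknown priority: {priority}")


if __name__ == "__main__":
    wednesday_morning = datetime(2020, 8, 26, 10, 0)
    print(should_alert("P2", [True], wednesday_morning))        # first failure only
    print(should_alert("P2", [True, True], wednesday_morning))  # second failure in a row
```

In this sketch a "DB LOG BACKUP" job that fails once stays quiet under P2 and alerts only if the next run fails as well, which matches the example given in the documentation.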

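Below is a short description of the same steps, based on my own deployment:

1: Under Configuration -> Templates, import the following four templates: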

• “Template Microsoft SQL Server DE Tier 3.xml”

• “Template Microsoft SQL Server DE Tier 2.xml”

• “Template Microsoft SQL Server DE Tier 1.xml”

• “Template Microsoft SQL Server SA Tier 3.xml”

Note that the archive downloaded from Zabbix Share contains only the following two templates:

“Template SQL Server Instance 0 DE.xml”

“Template SQL Server Instance 0 SA.xml”

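By default these templates are placed in the Templates group. I prefer to assign them to the Templates/Databases group, which makes them easier to use and manage later. Step 1 only needs to be done once, on the Zabbix Server.

2: Under Administration -> General -> Value mapping, import the value mappings:

• “SQL Agent Job status.xml”

• “SQL Database status.xml”

Note: step 2 also only needs to be done once.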

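3: Copy the MSSQL directory (which contains the PowerShell scripts) from the Scripts directory to a location the Zabbix Agent can access (by default “C:\...\Zabbix\bin\”). Here I copied it to C:\zabbix\bin\win64. You can adjust this to suit your environment or follow the official documentation.

4: Copy the 3 configuration files from the User parameters directory to a location the Zabbix Agent can access (by default C:\...\Zabbix\). Here I copied them to C:\zabbix\conf.

Because in step 3 I placed the PowerShell scripts in C:\zabbix\bin\win64\MSSQL, many of the paths in the three parameter files (mssql.agent.userparams.conf, mssql.backup.userparams.conf and mssql.basic.userparams.conf) have to be changed. Adjust them to your own layout, as in the following example:

Example (before modification)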

# User parameter to get agent name. Tier 3 template.

UserParameter=tier3.agent.mssql.discovery,powershell.exe -NoProfile -ExecutionPolicy Bypass -File "C:\Program Files\zabbix\bin\MSSQL\DiscoveryDatabaseAgent\Discovery.mssql.instanceagentname.ps1"

# User parameter to get job name. Priority 5. Tier 3 template.

UserParameter=tier3.jobsp5.mssql.discovery,powershell.exe -NoProfile -ExecutionPolicy Bypass -File "C:\Program Files\zabbix\bin\MSSQL\DiscoveryDatabaseAgent\Discovery.mssql.jobname.ps1"

Example (after modification)

# User parameter to get agent name. Tier 3 template.

UserParameter=tier3.agent.mssql.discovery,powershell.exe -NoProfile -ExecutionPolicy Bypass -File "C:\zabbix\bin\win64\MSSQL\DiscoveryDatabaseAgent\Discovery.mssql.instanceagentname.ps1"

# User parameter to get job name. Priority 5. Tier 3 template.

UserParameter=tier3.jobsp5.mssql.discovery,powershell.exe -NoProfile -ExecutionPolicy Bypass -File "C:\zabbix\bin\win64\MSSQL\DiscoveryDatabaseAgent\Discovery.mssql.jobname.ps1"

5: Update the configuration in zabbix_agentd.conf:

• add line “Include= C:\Program Files\Zabbix\mssql.agent.userparams.conf”.

• add line “Include= C:\Program Files\Zabbix\mssql.backup.userparams.conf”.

• add line “Include= C:\Program Files\Zabbix\mssql.basic.userparams.conf”.

My own settings are as follows; adjust them to your actual environment.

Include=C:\zabbix\conf\mssql.agent.userparams.conf

Include=C:\zabbix\conf\mssql.backup.userparams.conf

Include=C:\zabbix\conf\mssql.basic.userparams.conf

6: Grant permissions to the Zabbix Agent service account. It needs read (SELECT) access to the following tables:

• msdb.dbo.sysjobhistory

• msdb.dbo.sysjobs

• master.sys.databases

• msdb.dbo.backupset

• msdb.dbo.log_shipping_monitor_secondary.

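7: By default the Zabbix Agent service account is NT AUTHORITY\SYSTEM, which already exists as a login in SQL Server. If you need to monitor mirrored databases or databases in Always On, you have to grant the Zabbix Agent service account the sysadmin role. See the related documentation for details.

8: Restart the Zabbix Agent service.

9: On the Zabbix Server, add the appropriate templates to each host that needs monitoring.

Select the four templates imported in step 1.

After that, the host configuration shows a number of SQL Server monitoring entries under Applications.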
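
For step 6, the grants can be scripted instead of typed by hand. The sketch below only builds the T-SQL text for the five tables listed above; it does not connect to SQL Server. The function name `build_grant_script` is hypothetical, and the script assumes that a database user with the same name as the principal already exists in both msdb and master, which you should verify before running the generated statements.

```python
# Objects the Zabbix Agent service account needs to read, grouped by database
# (taken from step 6 above).
TABLES = {
    "msdb": [
        "dbo.sysjobhistory",
        "dbo.sysjobs",
        "dbo.backupset",
        "dbo.log_shipping_monitor_secondary",
    ],
    "master": ["sys.databases"],
}


def build_grant_script(principal: str = r"NT AUTHORITY\SYSTEM") -> str:
    """Return T-SQL text granting SELECT on the monitored tables.

    Assumption: a database user named `principal` exists in each database.
    """
    lines = []
    for database, objects in TABLES.items():
        lines.append(f"USE [{database}];")
        for obj in objects:
            lines.append(f"GRANT SELECT ON {obj} TO [{principal}];")
    return "\n".join(lines)


if __name__ == "__main__":
    # Review the output, then run it in SQL Server Management Studio or sqlcmd.
    print(build_grant_script())
```

Generating the text first keeps the change reviewable; if you run the agent under a dedicated account, pass that account name instead of the default.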

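The Zabbix Share template is configured slightly differently. It comes with detailed documentation, so you can test and verify it yourself if you are interested. Below are the short steps I compiled from my earlier tests.

1: Under Configuration -> Templates, import the following two templates: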

Template SQL Server Instance 0 DE.xml

Template SQL Server Instance 0 SA.xml

2: Under Administration -> General -> Value mapping, import the value mappings:

• “SQL Agent Job status.xml”

• “SQL Database status.xml”

3: Copy Discovery.mssql.server.ps1 to a location the Zabbix Agent can access. I placed it under C:\zabbix\bin\win64.

4: Edit Discovery.mssql.server.ps1. On line 14 of the file, find the following line and replace “EnterInstanceName” with the server name:

[Parameter(Mandatory = $false, Position = 2)]$SQLInstanceName="EnterInstanceName"

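Following the blog post at https://segmentfault.com/a/1190000019203337, you can instead modify Discovery.mssql.server.ps1 by adding the if block shown after the Param(...) section below. The script then falls back to the local host name whenever the placeholder is left unchanged, so you can copy the same file to every server without editing it, which is much more convenient.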

Param(

[Parameter(Mandatory = $true, Position = 0)] [string]$select,

[Parameter(Mandatory = $false, Position = 1)][string]$2,

[Parameter(Mandatory = $false, Position = 2)]$SQLInstanceName="EnterInstanceName"

)

if ($SQLInstanceName -eq "EnterInstanceName")

{

$SQLInstanceName = $(hostname.exe)

}

5: Set the UserParameter entries in zabbix_agentd.conf. If you place Discovery.mssql.server.ps1 under C:\Program Files\zabbix\bin, you can use the values from userparams.conf unchanged:

UserParameter=databases.mssql.discovery,powershell.exe -NoProfile -ExecutionPolicy Bypass -File "C:\Program Files\zabbix\bin\Discovery.mssql.server.ps1" JSONDBNAME

UserParameter=jobs.mssql.discovery,powershell.exe -NoProfile -ExecutionPolicy Bypass -File "C:\Program Files\zabbix\bin\Discovery.mssql.server.ps1" JSONJOBNAME

UserParameter=data.mssql.discovery[*],powershell.exe -NoProfile -ExecutionPolicy Bypass -File "C:\Program Files\zabbix\bin\Discovery.mssql.server.ps1" $1 "$2"

I made some changes, because I placed Discovery.mssql.server.ps1 under C:\zabbix\bin\win64:

UserParameter=databases.mssql.discovery,powershell.exe -NoProfile -ExecutionPolicy Bypass -File "C:\zabbix\bin\win64\Discovery.mssql.server.ps1" JSONDBNAME

UserParameter=jobs.mssql.discovery,powershell.exe -NoProfile -ExecutionPolicy Bypass -File "C:\zabbix\bin\win64\Discovery.mssql.server.ps1" JSONJOBNAME

UserParameter=data.mssql.discovery[*],powershell.exe -NoProfile -ExecutionPolicy Bypass -File "C:\zabbix\bin\win64\Discovery.mssql.server.ps1" $1 "$2"

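6: Grant the account that runs the Zabbix Agent service the required database permissions. It needs access to msdb.dbo.sysjobhistory and msdb.dbo.sysjobs. By default the Zabbix Agent service runs as NT AUTHORITY\SYSTEM, which already exists as a login in SQL Server.

Alternatively, you can create a dedicated login and set it in Discovery.mssql.server.ps1: uncomment the $uid and $pwd lines and fill in the login name and password you created.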

# Desenvolvido por Diego Cavalcante - 06/12/2017
# Monitoramento Windows SQLServer
# Versão: 1.1.0
# Criação = Versão 1.0.0 29/08/2017 (Script Básico).
# Update = Versão 1.1.0 02/01/2018 (Obrigado @bernardolankheet, JOBSTATUS Retornava N = 5 Nunca Executado).
# Update = by Oleg D. and Mantas T. Translated to EN, added SQL Insance name.
# Parameters. Change Line 14 $SQLInstanceName="InstanceName" to correct instance name
Param(
    [Parameter(Mandatory = $true, Position = 0)] [string]$select,
    [Parameter(Mandatory = $false, Position = 1)][string]$2,
    [Parameter(Mandatory = $false, Position = 2)]$SQLInstanceName="xxxx" # the actual instance name
)
#Login SQLInstanceName
#$uid = "Login" # the actual login name and password
#$pwd = "Password"

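7: Restart the Zabbix Agent service.

8: Add the templates to the relevant hosts.

9: Update the macros if needed.

10: By default both templates need to be added. If your database is SQL Server Express edition, you only need to add Template SQL Server Instance 0 DE Baseline.

11: It is best to assign both templates to the Templates/Databases group, which makes them easier to use and manage later.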
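
Since the three UserParameter lines in step 5 differ only in the script location, they can be generated for whatever directory you chose in step 3. This is a minimal sketch; the function name `build_user_parameters` is hypothetical, while the item keys and PowerShell options are copied from the lines shown in step 5.

```python
from pathlib import PureWindowsPath


def build_user_parameters(script_dir: str) -> list:
    """Build the three UserParameter lines for Discovery.mssql.server.ps1."""
    # PureWindowsPath joins with backslashes regardless of the OS running this.
    script = PureWindowsPath(script_dir) / "Discovery.mssql.server.ps1"
    prefix = f'powershell.exe -NoProfile -ExecutionPolicy Bypass -File "{script}"'
    return [
        f"UserParameter=databases.mssql.discovery,{prefix} JSONDBNAME",
        f"UserParameter=jobs.mssql.discovery,{prefix} JSONJOBNAME",
        f'UserParameter=data.mssql.discovery[*],{prefix} $1 "$2"',
    ]


if __name__ == "__main__":
    # Paste the output into zabbix_agentd.conf or an included .conf file.
    for line in build_user_parameters(r"C:\zabbix\bin\win64"):
        print(line)
```

Generating the lines avoids the copy-and-paste mistakes that are easy to make when the same path appears three times.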

Setting up the With SQL instance discovery template is just as simple and differs little from the above. Follow the steps in the official documentation one by one.

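Usage notes

1: Suppose the YourSQLDba database uses the simple recovery model and only full backups are taken. Monitoring will then raise alerts telling you that YourSQLDba has no differential backup and no transaction log backup.

If you do not want these alerts, find the corresponding Items, such as SQL Server Databases Discovery: SQL Instance MSSQLSERVER Database YourSQLDba: Diff Backup Status, and disable them.

2: If the instance has offline databases, you must disable the Items related to those databases. Otherwise you will find a large number of entries like the following in the Zabbix Agent log: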

...............................................................................

19120:20200826:154534.767 active check "perf_counter["\SQLServer:Databases(xxxx)\Log File(s) Used Size (KB)"]" is not supported: Cannot obtain performance information from collector.

19120:20200826:154534.768 active check "perf_counter["\SQLServer:Databases(xxxx)\Log Flush Wait Time"]" is not supported: Cannot obtain performance information from collector.

19120:20200826:154534.769 active check "perf_counter["\SQLServer:Databases(xxxx)\Log Flush Waits/sec"]" is not supported: Cannot obtain performance information from collector.

19120:20200826:154534.769 active check "perf_counter["\SQLServer:Databases(xxxx)\Log Flushes/sec"]" is not supported: Cannot obtain performance information from collector.

19120:20200826:154534.769 active check "perf_counter["\SQLServer:Databases(xxxx)\Log Growths"]" is not supported: Cannot obtain performance information from collector.

19120:20200826:154534.770 active check "perf_counter["\SQLServer:Databases(xxxx)\Log Shrinks"]" is not supported: Cannot obtain performance information from collector.

19120:20200826:154534.770 active check "perf_counter["\SQLServer:Databases(xxxx)\Log Truncations"]" is not supported: Cannot obtain performance information from collector.

..............................................................................

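In addition, if you do not disable the Items for such a database, you will see a large number of delayed Items in the Zabbix Queue. Once the Items for the offline database are disabled, the delayed Items disappear from the Queue.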
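
To find out which databases are producing these entries, the agent log can be scanned for the database name inside each unsupported perf_counter key. The following is a rough sketch: the function name `unsupported_databases` is hypothetical, and the pattern assumes the default-instance counter prefix \SQLServer:Databases exactly as it appears in the log excerpt above.

```python
import re
from collections import Counter

# Matches e.g.
#   "perf_counter["\SQLServer:Databases(xxxx)\Log Growths"]" is not supported
# and captures the database name between the parentheses.
PATTERN = re.compile(
    r'perf_counter\["\\SQLServer:Databases\((?P<db>[^)]+)\)\\[^"]*"\]" is not supported'
)


def unsupported_databases(log_lines):
    """Count unsupported perf_counter checks per database name."""
    counts = Counter()
    for line in log_lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group("db")] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python scan_log.py C:\zabbix\zabbix_agentd.log  (path is an example)
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for name, hits in unsupported_databases(handle).most_common():
            print(f"{name}: {hits} unsupported checks")
```

Databases that show up here are candidates for disabling Items as described above; confirm that they are really offline before doing so.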

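3: Once monitoring is running, you will see all kinds of alerts and informational messages. What remains is to understand each alert and resolve the underlying problem.

Every monitored metric has a graph, so you can view its trend over time.

Problem summary:

I ran into a few small problems while using the Zabbix template for Microsoft SQL Server, collected below. The great majority of them occur only with the template from Zabbix Share. Where possible I note which variant of the template is affected. I strongly recommend the branch version from GitHub, which lets you avoid many of these pitfalls.

Problem 1: The following errors appear in the Zabbix Agent log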

764:20200715:140830.588 active check "perf_counter["\SQLServer:Databases(DBAInventory)\Active Transactions"]" is not supported: Cannot obtain performance information from collector.

764:20200715:140830.588 active check "perf_counter["\SQLServer:Databases(DBAInventory)\Data File(s) Size (KB)"]" is not supported: Cannot obtain performance information from collector.

764:20200715:140830.589 active check "perf_counter["\SQLServer:Databases(DBAInventory)\Log Bytes Flushed/sec"]" is not supported: Cannot obtain performance information from collector.

764:20200715:140830.589 active check "perf_counter["\SQLServer:Databases(DBAInventory)\Log File(s) Size (KB)"]" is not supported: Cannot obtain performance information from collector.

764:20200715:140830.590 active check "perf_counter["\SQLServer:Databases(DBAInventory)\Log File(s) Used Size (KB)"]" is not supported: Cannot obtain performance information from collector.

764:20200715:140830.590 active check "perf_counter["\SQLServer:Databases(DBAInventory)\Log Flush Wait Time"]" is not supported: Cannot obtain performance information from collector.

764:20200715:140830.590 active check "perf_counter["\SQLServer:Databases(DBAInventory)\Log Flush Waits/sec"]" is not supported: Cannot obtain performance information from collector.

764:20200715:140830.591 active check "perf_counter["\SQLServer:Databases(DBAInventory)\Log Flushes/sec"]" is not supported: Cannot obtain performance information from collector.

764:20200715:140830.591 active check "perf_counter["\SQLServer:Databases(DBAInventory)\Log Growths"]" is not supported: Cannot obtain performance information from collector.

764:20200715:140830.592 active check "perf_counter["\SQLServer:Databases(DBAInventory)\Log Shrinks"]" is not supported: Cannot obtain performance information from collector.

764:20200715:140830.592 active check "perf_counter["\SQLServer:Databases(DBAInventory)\Log Truncations"]" is not supported: Cannot obtain performance information from collector.

764:20200715:140830.592 active check "perf_counter["\SQLServer:Databases(DBAInventory)\Percent Log Used"]" is not supported: Cannot obtain performance information from collector.

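Investigation showed that the DBAInventory database had been set offline. The server has the template "Template SQL Server Instance 0 DE Baseline" applied, which creates Items and Triggers for each database, and for an offline database those Items and Triggers are in the Not supported state. In the host configuration, filter the Items and Triggers by the database name DBAInventory and disable them; after that the errors no longer appear in zabbix_agentd.log.

Problem 2: The error "Timeout while executing a shell script." appears.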

1364:20200709:085346.828 active check "jobs.mssql.discovery" is not supported: Timeout while executing a shell script.

1364:20200709:085842.183 Failed to execute command "powershell.exe -NoProfile -ExecutionPolicy Bypass -File "C:\zabbix\bin\win64\Discovery.mssql.server.ps1" JSONDBNAME": Timeout while executing a shell script.

1364:20200709:085842.183 active check "databases.mssql.discovery" is not supported: Timeout while executing a shell script.

Change the Timeout parameter in zabbix_agentd.conf, for example raise it to 30:

### Option: Timeout

# Spend no more than Timeout seconds on processing.

#

# Mandatory: no

# Range: 1-30

# Default:

# Timeout=3

Timeout=30

After restarting the agent, the error no longer appears in zabbix_agentd.log.
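
Before restarting the agent, it can be worth checking that the value actually set lies in the allowed range, since the configuration comments above give a range of 1-30 and a default of 3. The sketch below parses the file text only; the function name `read_timeout` is hypothetical, and keeping the last occurrence when Timeout appears more than once is an assumption of this sketch rather than documented agent behavior.

```python
def read_timeout(conf_text: str, default: int = 3) -> int:
    """Return the effective Timeout from zabbix_agentd.conf text.

    Commented-out lines are ignored; the default of 3 and the 1-30 range
    come from the comments in the configuration file itself.
    """
    timeout = default
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "Timeout":
            timeout = int(value.strip())  # assumption: last occurrence wins
    if not 1 <= timeout <= 30:
        raise ValueError(f"Timeout={timeout} is outside the allowed range 1-30")
    return timeout


if __name__ == "__main__":
    sample = "# Timeout=3\n\nTimeout=30\n"
    print(read_timeout(sample))
```

If the discovery scripts still time out at 30 seconds, the limit cannot be raised further on the agent side, so the scripts themselves need to be made faster.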

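My notes originally listed a dozen or so small problems. Listing them all here felt messy and took too much space, so only a couple are given; when time permits I plan to cover the rest in separate posts.

References: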

https://share.zabbix.com/databases/microsoft-sql-server/template-for-microsoft-sql-server

https://github.com/MantasTumenas/Zabbix-template-for-Microsoft-SQL-Server