
cloudfoundry Research (1) ---- BOSH and Monit

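We usually deploy cloudfoundry with BOSH. The bosh vms command reports the state of every node in the deployment.

Its output shows at a glance whether each node is running, failing, and so on, and all of this information is gathered through Monit.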


What is Monit

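Monit is a cross-platform tool for monitoring Unix-like systems such as Linux, BSD, OS X and Solaris. It is easy to install, very lightweight (around 500 KB), and does not depend on any third-party programs, plugins or libraries. Despite that, Monit covers whole-system monitoring, process status monitoring, file system change monitoring, email notification, and custom actions for core services. Easy installation, a small footprint and a rich feature set make Monit a good fallback monitoring tool. Monit also includes an embedded HTTP(S) web interface, so the services it watches can be inspected from a browser.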


Monit in BOSH

Log in to any cloudfoundry node and you will find the executable /var/vcap/bosh/bin/monit. Run monit -h to see what monit can do:
/var/vcap/bosh/bin# ./monit -h
Usage: monit [options] {arguments}
Options are as follows:
 -c file       Use this control file
 -d n          Run as a daemon once per n seconds
 -g name       Set group name for start, stop, restart, monitor and unmonitor
 -l logfile    Print log information to this file
 -p pidfile    Use this lock file in daemon mode
 -s statefile  Set the file monit should write state information to
 -I            Do not run in background (needed for run from init)
 -t            Run syntax check for the control file
 -v            Verbose mode, work noisy (diagnostic output)
 -H [filename] Print SHA1 and MD5 hashes of the file or of stdin if the
               filename is omited; monit will exit afterwards
 -V            Print version number and patchlevel
 -h            Print this text
Optional action arguments for non-daemon mode are as follows:
 start all           - Start all services
 start name          - Only start the named service
 stop all            - Stop all services
 stop name           - Only stop the named service
 restart all         - Stop and start all services
 restart name        - Only restart the named service
 monitor all         - Enable monitoring of all services
 monitor name        - Only enable monitoring of the named service
 unmonitor all       - Disable monitoring of all services
 unmonitor name      - Only disable monitoring of the named service
 reload              - Reinitialize monit
 status              - Print full status information for each service
 summary             - Print short status information for each service
 quit                - Kill monit daemon process
 validate            - Check all services and start if not running
 procmatch <pattern> - Test process matching pattern

(Action arguments operate on services defined in the control file)

monit can do more than monitor services: it can also start, stop and restart them (start, stop, restart, ...). Start with monitoring by running monit summary:
/var/vcap/bosh/bin# ./monit summary
The Monit daemon 5.2.4 uptime: 1h 7m

Process 'nats'                      running
Process 'nats_stream_forwarder'     running
Process 'etcd'                      running
Process 'hm9000_listener'           running
Process 'hm9000_fetcher'            running
Process 'hm9000_analyzer'           running
Process 'hm9000_sender'             running
Process 'hm9000_metrics_server'     running
Process 'hm9000_api_server'         running
Process 'hm9000_evacuator'          running
Process 'hm9000_shredder'           running
Process 'cloud_controller_ng'       running
Process 'cloud_controller_worker_local_1' running
Process 'cloud_controller_worker_local_2' running
Process 'nginx_cc'                  running
Process 'cloud_controller_migration' running
Process 'cloud_controller_worker_1' running
Process 'cloud_controller_clock'    running
Process 'uaa'                       running
Process 'consul_template'           running
File 'haproxy_config'               accessible
Process 'haproxy'                   running
Process 'gorouter'                  running
Process 'warden'                    running
Process 'dea_next'                  running
Process 'dir_server'                running
Process 'loggregator_trafficcontroller' running
Process 'doppler'                   running
Process 'metron_agent'              running
Process 'dea_logging_agent'         running
Process 'etcd_metrics_server'       running
Process 'consul_agent'              running
Process 'route_registrar'           running
Process 'postgres'                  running
System 'system_ubuntu'              running
The output above lists the state of every monitored cloudfoundry component.
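When the summary has to be checked by a script rather than by eye, its lines can be parsed into a small table. The following is a minimal sketch, assuming the output keeps the `Type 'name' status` layout shown above; the function names and the abbreviated sample text are illustrative and not part of monit or BOSH.

```python
import re

# Matches lines such as:  Process 'nats'    not monitored - restart pending
# The header line ("The Monit daemon ...") has no quoted name and is skipped.
SUMMARY_LINE = re.compile(r"^(\w+)\s+'([^']+)'\s+(.+?)\s*$")


def parse_monit_summary(text):
    """Map each service name to its (type, status) pair."""
    services = {}
    for line in text.splitlines():
        match = SUMMARY_LINE.match(line)
        if match:
            kind, name, status = match.groups()
            services[name] = (kind, status)
    return services


def not_running(services):
    """Sorted names of Process entries whose status is not 'running'."""
    return sorted(
        name
        for name, (kind, status) in services.items()
        if kind == "Process" and status != "running"
    )


# Abbreviated, hand-written sample in the same layout as the real output.
sample = """The Monit daemon 5.2.4 uptime: 1h 11m

Process 'nats'                      not monitored - restart pending
Process 'nats_stream_forwarder'     not monitored
Process 'etcd'                      running
File 'haproxy_config'               accessible
System 'system_ubuntu'              running
"""

if __name__ == "__main__":
    print(not_running(parse_monit_summary(sample)))
```

On a real node the text would come from running /var/vcap/bosh/bin/monit summary, for example through subprocess.check_output.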
To restart a service, run monit restart followed by the service name:
/var/vcap/bosh/bin# ./monit restart nats
/var/vcap/bosh/bin# ./monit summary
The Monit daemon 5.2.4 uptime: 1h 11m

Process 'nats'                      not monitored - restart pending
Process 'nats_stream_forwarder'     not monitored
Process 'etcd'                      running
Process 'hm9000_listener'           running
Process 'hm9000_fetcher'            running
Process 'hm9000_analyzer'           running
Process 'hm9000_sender'             running
Process 'hm9000_metrics_server'     running
Process 'hm9000_api_server'         running
Process 'hm9000_evacuator'          running
Process 'hm9000_shredder'           running
Process 'cloud_controller_ng'       running
Process 'cloud_controller_worker_local_1' running
Process 'cloud_controller_worker_local_2' running
Process 'nginx_cc'                  running
Process 'cloud_controller_migration' running
Process 'cloud_controller_worker_1' running
Process 'cloud_controller_clock'    running
Process 'uaa'                       running
Process 'consul_template'           running
File 'haproxy_config'               accessible
Process 'haproxy'                   running
Process 'gorouter'                  running
Process 'warden'                    running
Process 'dea_next'                  running
Process 'dir_server'                running
Process 'loggregator_trafficcontroller' running
Process 'doppler'                   running
Process 'metron_agent'              running
Process 'dea_logging_agent'         running
Process 'etcd_metrics_server'       running
Process 'consul_agent'              running
Process 'route_registrar'           running
Process 'postgres'                  running
System 'system_ubuntu'              running

After a short while the nats service is running again:
/var/vcap/bosh/bin# ./monit summary
The Monit daemon 5.2.4 uptime: 1h 11m

Process 'nats'                      running
Process 'nats_stream_forwarder'     running
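
The restart-then-check sequence above can be automated by polling the summary until the service reports running. This is a rough sketch under stated assumptions: `get_summary` is a hypothetical callable that returns the text of monit summary, and the timeout and interval defaults are arbitrary choices, not values taken from BOSH.

```python
import time


def service_status(summary_text, name):
    """Return the status column for `name` in monit summary text, or None."""
    quoted = "'%s'" % name
    for line in summary_text.splitlines():
        parts = line.split(None, 2)  # e.g. ["Process", "'nats'", "running"]
        if len(parts) == 3 and parts[1] == quoted:
            return parts[2].strip()
    return None


def wait_until_running(get_summary, name, timeout=120.0, interval=5.0,
                       sleep=time.sleep):
    """Poll get_summary() until `name` is 'running'; return False on timeout."""
    waited = 0.0
    while True:
        if service_status(get_summary(), name) == "running":
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval


if __name__ == "__main__":
    # Stand-in for the real command: two canned summaries, no actual sleeping.
    canned = iter([
        "Process 'nats'    not monitored - restart pending",
        "Process 'nats'    running",
    ])
    print(wait_until_running(lambda: next(canned), "nats",
                             sleep=lambda seconds: None))
```

On a node, `get_summary` could be something like `lambda: subprocess.check_output(["/var/vcap/bosh/bin/monit", "summary"], text=True)`. Passing the callable in keeps the polling logic testable without a running monit daemon.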

monit offers many more commands than these; they are not all covered here.

Customizing Monit

The monit configuration file used by BOSH, monitrc, lives in /var/vcap/bosh/etc. Its contents look like this:
/var/vcap/bosh/etc# cat monitrc
set daemon 10
set logfile /var/vcap/monit/monit.log
set httpd port 2822 and use address 10.0.0.112
  allow cleartext /var/vcap/monit/monit.user

include /var/vcap/monit/*.monitrc
include /var/vcap/monit/job/*.monitrc

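set daemon 10 sets the check interval to 10 seconds.
allow cleartext /var/vcap/monit/monit.user names the file that stores the login username and password.
set httpd port 2822 and use address 10.0.0.112 sets the address and port of the embedded web server (opening the web page, which shows the monitoring information more directly, is covered later).
include /var/vcap/monit/*.monitrc and include /var/vcap/monit/job/*.monitrc pull in additional configuration files; wildcards are allowed.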
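
To read these settings from a script, the statements can be picked out with a few regular expressions. The sketch below is an illustration only: it recognizes just the four statement forms that appear in the monitrc above, not the full monit control-file grammar, and the function and key names are made up for this example.

```python
import re

HTTPD = re.compile(r"^set\s+httpd\s+port\s+(\d+)(?:\s+and\s+use\s+address\s+(\S+))?")
DAEMON = re.compile(r"^set\s+daemon\s+(\d+)")
CLEARTEXT = re.compile(r"^allow\s+cleartext\s+(\S+)")
INCLUDE = re.compile(r"^include\s+(\S+)")


def parse_monitrc(text):
    """Extract the check interval, httpd settings, user file and includes."""
    config = {
        "daemon_seconds": None,
        "httpd_port": None,
        "httpd_address": None,
        "user_file": None,
        "includes": [],
    }
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and indentation
        if not line:
            continue
        match = DAEMON.match(line)
        if match:
            config["daemon_seconds"] = int(match.group(1))
            continue
        match = HTTPD.match(line)
        if match:
            config["httpd_port"] = int(match.group(1))
            config["httpd_address"] = match.group(2)
            continue
        match = CLEARTEXT.match(line)
        if match:
            config["user_file"] = match.group(1)
            continue
        match = INCLUDE.match(line)
        if match:
            config["includes"].append(match.group(1))
    return config


# Same content as the monitrc shown above.
sample = """set daemon 10
set logfile /var/vcap/monit/monit.log
set httpd port 2822 and use address 10.0.0.112
  allow cleartext /var/vcap/monit/monit.user

include /var/vcap/monit/*.monitrc
include /var/vcap/monit/job/*.monitrc
"""

if __name__ == "__main__":
    for key, value in parse_monitrc(sample).items():
        print(key, "=", value)
```

The include patterns are returned unexpanded; they could be passed to glob.glob to find the per-job files.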

Enabling the web page

1. In the monitrc file shown above, set:
set httpd port 2822 and use address 10.0.0.112
where 2822 is a port number of your choice and 10.0.0.112 is the IP address of this machine.
2. Update the firewall
iptables -A INPUT -p tcp --dport 2822 -j ACCEPT
iptables -A OUTPUT -p tcp --sport 2822 -j ACCEPT

To make the firewall settings persistent:

Edit /etc/network/interfaces and add the following:

pre-up iptables-restore < /etc/iptables.rules
post-down iptables-restore < /etc/iptables.downrules


The edited file should look similar to this:

auto eth0
iface eth0 inet dhcp
  pre-up iptables-restore < /etc/iptables.rules
  post-down iptables-restore < /etc/iptables.downrules


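Run the following command, which saves the current rules to /etc/iptables.rules, the file restored by the pre-up line. It does not create /etc/iptables.downrules, which the post-down line expects.

sudo sh -c "iptables-save -c > /etc/iptables.rules"

3. Reload monit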
monit reload

Open http://10.0.0.112:2822 in a browser and log in with the username and password from /var/vcap/monit/monit.user (the path defined in monitrc) to see the monitoring page.
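
The same interface can be queried from a script instead of a browser. The sketch below only builds the authenticated request. It rests on two assumptions that should be checked against the Monit version in use: that monit.user holds a line of the form user:password, and that the embedded server answers on the /_status?format=xml path. The credentials in the demo are placeholders.

```python
import base64
import urllib.request


def read_credentials(user_file_text):
    """Return (user, password) from the first 'user:password' line."""
    for line in user_file_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and ":" in line:
            user, password = line.split(":", 1)
            return user, password
    raise ValueError("no user:password entry found")


def build_status_request(base_url, user, password):
    """Build a GET request with HTTP Basic authentication (nothing is sent)."""
    token = base64.b64encode(
        ("%s:%s" % (user, password)).encode("utf-8")
    ).decode("ascii")
    # Assumed status path; verify it for your Monit version.
    request = urllib.request.Request(base_url.rstrip("/") + "/_status?format=xml")
    request.add_header("Authorization", "Basic " + token)
    return request


if __name__ == "__main__":
    # Placeholder credentials; on a node, read /var/vcap/monit/monit.user instead.
    user, password = read_credentials("vcap:example-password\n")
    request = build_status_request("http://10.0.0.112:2822", user, password)
    print(request.full_url)
```

Sending the request is then a matter of calling urllib.request.urlopen(request, timeout=5) from a machine that can reach port 2822.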