OpenStack Source Code Analysis: The Nova-Compute Service Startup Process (Icehouse)
I have been working with OpenStack for over half a year, but mostly at the level of using it and troubleshooting. Recently I found time to dig into the code properly. My background is C++ on Windows, so Python development on Linux is still fairly new to me. This article is aimed at readers who have already used OpenStack and want a first look at how the code works; corrections are welcome.
This post analyzes the startup process of the Nova-Compute service; other services such as Nova-Scheduler and Nova-Conductor start in a similar way. (The code is based on the icehouse branch of nova on GitHub.)
After installing the Nova-Compute component of OpenStack, we usually start the service with one of the following two commands:
service nova-compute start
start nova-compute
Both of these commands invoke the upstart job to start the nova-compute service.
In /etc/init/nova-compute.conf we can find the following upstart job configuration:
chdir /var/run

pre-start script
    mkdir -p /var/run/nova
    chown root:root /var/run/nova/
    mkdir -p /var/lock/nova
    chown root:root /var/lock/nova/
    modprobe nbd
end script

exec start-stop-daemon --start --chuid root --exec /usr/local/bin/nova-compute -- --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf
The first few lines just prepare the run/lock directories and their permissions. The final exec start-stop-daemon line is what actually launches the /usr/local/bin/nova-compute service. The trailing --config-file arguments should be familiar: they point at the two configuration files of the nova-compute service, which the code will use later to initialize its configuration.
Next we can look at the /usr/local/bin/nova-compute script:
import sys
from nova.cmd.compute import main
if __name__ == "__main__":
sys.exit(main())
This looks a bit like a C++ program's main entry point: it calls the main function of nova.cmd.compute.
def main():
    config.parse_args(sys.argv)
    logging.setup('nova')
    utils.monkey_patch()
    objects.register_all()
    gmr.TextGuruMeditation.setup_autorun(version)

    if not CONF.conductor.use_local:
        block_db_access()
        objects_base.NovaObject.indirection_api = \
            conductor_rpcapi.ConductorAPI()

    server = service.Service.create(binary='nova-compute',
                                    topic=CONF.compute_topic,
                                    db_allowed=CONF.conductor.use_local)
    service.serve(server)
    service.wait()
Let us walk through the key parts of this code:
config.parse_args(sys.argv)
def parse_args(argv, default_config_files=None):
options.set_defaults(sql_connection=_DEFAULT_SQL_CONNECTION,
sqlite_db='nova.sqlite')
rpc.set_defaults(control_exchange='nova')
nova_default_log_levels = (log.DEFAULT_LOG_LEVELS +
["keystonemiddleware=WARN", "routes.middleware=WARN"])
log.set_defaults(default_log_levels=nova_default_log_levels)
debugger.register_cli_opts()
cfg.CONF(argv[1:],
project='nova',
version=version.version_string(),
default_config_files=default_config_files)
rpc.init(cfg.CONF)
parse_args reads, via sys.argv, the configuration files specified in /etc/init/nova-compute.conf. It first sets the default database connection, then the default exchange for the message queue, and then the default log levels.
cfg.CONF(argv[1:],
project='nova',
version=version.version_string(),
default_config_files=default_config_files)
This uses the config package from Oslo to initialize the global CONF object and read in the configuration.
rpc.init(cfg.CONF)
This initializes the RPC connection to the message queue according to the configuration.
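The same flow can be imitated with only the standard library. The following is an illustrative sketch, not oslo.config itself: it parses repeated --config-file flags from argv and loads them into a global settings object, roughly what cfg.CONF does for nova:

```python
import argparse
import configparser
import os
import tempfile

# Write a throwaway config file standing in for /etc/nova/nova.conf.
conf_path = os.path.join(tempfile.mkdtemp(), 'nova.conf')
with open(conf_path, 'w') as f:
    f.write('[DEFAULT]\ncompute_topic = compute\n')

# Minimal stand-in for cfg.CONF: collect --config-file flags, then read
# each listed file into one configuration object.
parser = argparse.ArgumentParser()
parser.add_argument('--config-file', action='append', default=[])
args = parser.parse_args(['--config-file=%s' % conf_path])

CONF = configparser.ConfigParser()
CONF.read(args.config_file)
print(CONF.get('DEFAULT', 'compute_topic'))  # -> compute
```

Unlike this sketch, oslo.config also registers typed options with defaults and exposes them as attributes (CONF.compute_topic), but the argv-to-file-to-global-object flow is the same.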
Back in main, the important part is:
server = service.Service.create(binary='nova-compute',
topic=CONF.compute_topic,
db_allowed=CONF.conductor.use_local)
service.serve(server)
service.wait()
First, the classmethod create is used to build a Service object:
@classmethod
def create(cls, host=None, binary=None, topic=None, manager=None,
report_interval=None, periodic_enable=None,
periodic_fuzzy_delay=None, periodic_interval_max=None,
db_allowed=True):
"""Instantiates class and passes back application object.
:param host: defaults to CONF.host
:param binary: defaults to basename of executable
:param topic: defaults to bin_name - 'nova-' part
:param manager: defaults to CONF.<topic>_manager
:param report_interval: defaults to CONF.report_interval
:param periodic_enable: defaults to CONF.periodic_enable
:param periodic_fuzzy_delay: defaults to CONF.periodic_fuzzy_delay
:param periodic_interval_max: if set, the max time to wait between runs
"""
if not host:
host = CONF.host
if not binary:
binary = os.path.basename(sys.argv[0])
if not topic:
topic = binary.rpartition('nova-')[2]
if not manager:
manager_cls = ('%s_manager' %
binary.rpartition('nova-')[2])
manager = CONF.get(manager_cls, None)
if report_interval is None:
report_interval = CONF.report_interval
if periodic_enable is None:
periodic_enable = CONF.periodic_enable
if periodic_fuzzy_delay is None:
periodic_fuzzy_delay = CONF.periodic_fuzzy_delay
debugger.init()
service_obj = cls(host, binary, topic, manager,
report_interval=report_interval,
periodic_enable=periodic_enable,
periodic_fuzzy_delay=periodic_fuzzy_delay,
periodic_interval_max=periodic_interval_max,
db_allowed=db_allowed)
return service_obj
host is the hostname of the current node, used for record keeping; topic is used to connect to the message queue; manager is the Manager class named in the configuration, which connects to the message queue and consumes the corresponding messages. For the compute service this is:
compute_manager=nova.compute.manager.ComputeManager
Finally the Service object is instantiated and returned.
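The naming conventions behind create's defaults are easy to verify in isolation. A small standalone sketch (argv0 is a made-up stand-in for sys.argv[0]):

```python
import os

# Service.create() defaulting logic, extracted: binary defaults to the
# basename of the executable, topic strips the 'nova-' prefix, and the
# manager option name is derived from the topic.
argv0 = '/usr/local/bin/nova-compute'  # stands in for sys.argv[0]
binary = os.path.basename(argv0)       # 'nova-compute'
topic = binary.rpartition('nova-')[2]  # 'compute'
manager_opt = '%s_manager' % topic     # 'compute_manager'
print(binary, topic, manager_opt)      # -> nova-compute compute compute_manager
```

This is why the compute_manager option in nova.conf ends up selecting nova.compute.manager.ComputeManager for this service.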
def serve(server, workers=None):
global _launcher
if _launcher:
raise RuntimeError(_('serve() can only be called once'))
_launcher = service.launch(server, workers=workers)
def wait():
_launcher.wait()
The serve function takes the Service object created above and calls the launch function in service.py under nova.openstack.common (note that the Service class in nova/service.py is a subclass of the Service class in nova/openstack/common/service.py):
def launch(service, workers=1):
if workers is None or workers == 1:
launcher = ServiceLauncher()
launcher.launch_service(service)
else:
launcher = ProcessLauncher()
launcher.launch_service(service, workers=workers)
return launcher
Since workers has its default value, a ServiceLauncher object is created and its launch_service method (defined in ServiceLauncher's parent class Launcher) is called:
def launch_service(self, service):
"""Load and start the given service.
:param service: The service you would like to start.
:returns: None
"""
service.backdoor_port = self.backdoor_port
self.services.add(service)
backdoor_port is a configuration option in nova.conf; its purpose is described in the comments of the configuration file, so I will skip it here.
Services maintains a list of Service objects; add appends the given service to that list and schedules it to run. Services also uses the Event class from eventlet: the done event is created when Services is set up and is later used to signal shutdown (done.wait() blocks until event.send() is called, as we will see in the code below).
class Services(object):
def __init__(self):
self.services = []
self.tg = threadgroup.ThreadGroup()
self.done = event.Event()
def add(self, service):
self.services.append(service)
self.tg.add_thread(self.run_service, service, self.done)
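The add/run/done pattern above can be reproduced with OS threads from the standard library. This is only an analogy (real nova uses eventlet green threads, not threading), but the event semantics are the same: each service thread starts its work and then parks on a shared event until shutdown is signalled:

```python
import threading

# Sketch of the Services/done-event pattern: each worker "starts" its
# service, then blocks on a shared Event until shutdown is triggered.
done = threading.Event()
started = []

def run_service(name, done):
    started.append(name)  # stands in for service.start()
    done.wait()           # park until shutdown is signalled

threads = [threading.Thread(target=run_service, args=('svc%d' % i, done))
           for i in range(3)]
for t in threads:
    t.start()

done.set()                # signal shutdown; every done.wait() returns
for t in threads:
    t.join()
print(sorted(started))    # -> ['svc0', 'svc1', 'svc2']
```

eventlet's Event differs slightly from threading.Event (it is one-shot and uses send() rather than set()), but the block-until-signalled shutdown flow is the same.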
The ThreadGroup class is defined in nova/openstack/common/threadgroup.py:
class ThreadGroup(object):
"""The point of the ThreadGroup class is to:
* keep track of timers and greenthreads (making it easier to stop them
when need be).
* provide an easy API to add timers.
"""
def __init__(self, thread_pool_size=10):
self.pool = greenpool.GreenPool(thread_pool_size)
self.threads = []
self.timers = []
def add_thread(self, callback, *args, **kwargs):
gt = self.pool.spawn(callback, *args, **kwargs)
th = Thread(gt, self)
self.threads.append(th)
return th
This uses a GreenPool to create green threads. Green threads are invisible to the operating system: the green-thread library does its own scheduling, cooperatively, inside a single native thread. So the pool of size 10 here does not map to 10 OS threads; switching between green threads needs no kernel involvement (to the OS it is all one thread), which avoids the user-to-kernel-mode cost of native context switches. The trade-off is that green threads must yield cooperatively, typically at I/O points, rather than being preempted.
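Cooperative scheduling is easy to illustrate without eventlet. In this toy sketch, Python generators play the role of green threads: each "thread" yields at its switch points and a round-robin loop resumes them in turn, all on one OS thread with no kernel context switches:

```python
from collections import deque

# Each worker is a "green thread": it yields control at explicit switch
# points, the way an eventlet green thread yields on I/O waits.
def worker(name, steps, log):
    for i in range(steps):
        log.append('%s:%d' % (name, i))
        yield  # cooperative yield point

log = []
runqueue = deque([worker('a', 2, log), worker('b', 2, log)])
while runqueue:
    g = runqueue.popleft()
    try:
        next(g)            # resume the thread until its next yield
        runqueue.append(g)  # re-queue it, round-robin style
    except StopIteration:
        pass                # thread finished
print(log)  # -> ['a:0', 'b:0', 'a:1', 'b:1']
```

The interleaved output shows the scheduler alternating between the two workers purely in user space, which is the essence of what GreenPool does (eventlet additionally hooks yields into socket and I/O operations via monkey patching).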
spawn registers run_service as the callback, passing it two arguments:
@staticmethod
def run_service(service, done):
"""Service start wrapper.
:param service: service to run
:param done: event to wait on until a shutdown is triggered
:returns: None
"""
service.start()
systemd.notify_once()
done.wait()
This is where the start method is called to actually bring up the service; done.wait() (i.e. event.wait()) then blocks the green thread until a shutdown is triggered.
Since the service passed in is a Service object from nova/service.py, the start invoked here is the overridden one:
def start(self):
verstr = version.version_string_with_package()
LOG.audit(_('Starting %(topic)s node (version %(version)s)'),
{'topic': self.topic, 'version': verstr})
self.basic_config_check()
self.manager.init_host()
self.model_disconnected = False
ctxt = context.get_admin_context()
try:
self.service_ref = self.conductor_api.service_get_by_args(ctxt,
self.host, self.binary)
self.service_id = self.service_ref['id']
except exception.NotFound:
try:
self.service_ref = self._create_service_ref(ctxt)
except (exception.ServiceTopicExists,
exception.ServiceBinaryExists):
# NOTE(danms): If we race to create a record with a sibling
# worker, don't fail here.
self.service_ref = self.conductor_api.service_get_by_args(ctxt,
self.host, self.binary)
self.manager.pre_start_hook()
if self.backdoor_port is not None:
self.manager.backdoor_port = self.backdoor_port
.......
The start method is quite long. Its main job is to initialize the manager object and create the corresponding message-queue consumers. In icehouse, the message-queue operations are all wrapped in Oslo's messaging library; I will dig into that code another time.
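To close, here is a rough, purely illustrative sketch of the manager/consumer relationship that start sets up. A stdlib queue stands in for the AMQP topic, and a fake manager stands in for nova.compute.manager.ComputeManager; the method names and message shape below are invented for illustration, not oslo.messaging's actual wire format:

```python
import queue

# Fake manager: in nova this would be ComputeManager, whose methods are
# invoked when matching RPC messages arrive on the compute topic.
class FakeComputeManager:
    def __init__(self):
        self.handled = []

    def start_instance(self, instance_id):
        self.handled.append(instance_id)

# A plain queue stands in for the message-queue topic the service
# subscribes to with its `topic` argument.
topic = queue.Queue()
topic.put({'method': 'start_instance', 'args': {'instance_id': 'vm-1'}})

# Consumer loop: dispatch each message to the manager method it names.
manager = FakeComputeManager()
while not topic.empty():
    msg = topic.get()
    getattr(manager, msg['method'])(**msg['args'])

print(manager.handled)  # -> ['vm-1']
```

The real consumer is created by oslo.messaging from the host/topic pair passed to Service.create, and dispatch includes versioning and a request context, but the shape (messages on a topic dispatched to manager methods) is the same.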