OpenStack Nova Analysis: The Nova Scheduler Scheduling Algorithm

阿新 • Published: 2019-01-26

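The previous article covered how the Nova Scheduler service starts up. As a scheduler, its core is the scheduling algorithm, so this article analyzes the algorithm that the Nova Scheduler service uses.

In the configuration file, the default scheduler driver class is FilterScheduler, defined in nova/nova/scheduler/filter_scheduler.py. The idea behind the algorithm is fairly simple: it is a process of "filtering" followed by "weighing".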

class FilterScheduler(driver.Scheduler):
    def schedule_run_instance(self, context, request_spec,
                              admin_password, injected_files,
                              requested_networks, is_first_time,
                              filter_properties):
        # Collect the parameters needed for scheduling
        payload = dict(request_spec=request_spec)
        # Notify Nova API that scheduling has started
        notifier.notify(context, notifier.publisher_id("scheduler"),
                        'scheduler.run_instance.start', notifier.INFO,
                        payload)
        ...
        # Run the scheduling algorithm to get the list of weighed hosts
        weighted_hosts = self._schedule(context, "compute", request_spec,
                                        filter_properties, instance_uuids)
        ...
        # Assign a compute node to each virtual machine
        for num, instance_uuid in enumerate(instance_uuids):
            ...
            try:
                try:
                    # Take the next selected compute node (best candidate first)
                    weighted_host = weighted_hosts.pop(0)
                except IndexError:
                    raise exception.NoValidHost(reason="")
                # Create the virtual machine on the selected compute node
                self._provision_resource(context, weighted_host,
                                         request_spec,
                                         filter_properties,
                                         requested_networks,
                                         injected_files, admin_password,
                                         is_first_time,
                                         instance_uuid=instance_uuid)
            except Exception as ex:
                ...
        # Notify Nova API that scheduling has finished
        notifier.notify(context, notifier.publisher_id("scheduler"),
                        'scheduler.run_instance.end', notifier.INFO, payload)

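The core of the algorithm lives in the _schedule method of the FilterScheduler class. The _provision_resource method that follows is essentially a remote call to the run_instance method of the Nova Compute service. Let us focus on the _schedule method, which contains the heart of the algorithm.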

def _schedule(self, context, topic, request_spec, filter_properties,
              instance_uuids=None):
    # Get an elevated copy of the request context
    elevated = context.elevated()
    # Get the properties of the virtual machine
    instance_properties = request_spec['instance_properties']
    # Get the instance type (flavor) of the virtual machine
    instance_type = request_spec.get("instance_type", None)
    ...
    # Get the configuration options
    config_options = self._get_configuration_options()
    properties = instance_properties.copy()
    if instance_uuids:
        properties['uuid'] = instance_uuids[0]
    self._populate_retry(filter_properties, properties)

    # Build the host filter properties
    filter_properties.update({'context': context,
                              'request_spec': request_spec,
                              'config_options': config_options,
                              'instance_type': instance_type})
    self.populate_filter_properties(request_spec,
                                    filter_properties)

    # Get the list of all active hosts
    hosts = self.host_manager.get_all_host_states(elevated)
    selected_hosts = []
    # Get the number of virtual machines to launch
    if instance_uuids:
        num_instances = len(instance_uuids)
    else:
        num_instances = request_spec.get('num_instances', 1)
    # For each virtual machine to create, pick one of the best hosts
    for num in xrange(num_instances):
        # Filter the hosts down to those that are available
        hosts = self.host_manager.get_filtered_hosts(hosts,
                filter_properties)
        if not hosts:
            break
        # Compute the weights of the available hosts
        weighed_hosts = self.host_manager.get_weighed_hosts(hosts,
                filter_properties)
        # With this option, a new instance is scheduled to a host chosen at
        # random from the subset of the N best (highest-scoring) hosts
        scheduler_host_subset_size = CONF.scheduler_host_subset_size
        if scheduler_host_subset_size > len(weighed_hosts):
            scheduler_host_subset_size = len(weighed_hosts)
        if scheduler_host_subset_size < 1:
            scheduler_host_subset_size = 1
        # Randomly pick one host from the subset of highest-scoring hosts
        chosen_host = random.choice(weighed_hosts[0:scheduler_host_subset_size])
        selected_hosts.append(chosen_host)
        # A host has been chosen, so update its resource information
        # before choosing a host for the next instance
        chosen_host.obj.consume_from_instance(instance_properties)
        ...
    return selected_hosts

The virtual machine scheduling algorithm consists of four main steps:

1. Get the list of available compute nodes

(1) hosts = self.host_manager.get_all_host_states(elevated)

class HostManager(object):
    def get_all_host_states(self, context):
        # Get all compute nodes
        compute_nodes = db.compute_node_get_all(context)
        seen_nodes = set()
        for compute in compute_nodes:
            # Get the service record of the node
            service = compute['service']
            # No service on the node; it may be a stale node
            if not service:
                continue
            # Get the host name of the node
            host = service['host']
            node = compute.get('hypervisor_hostname')
            state_key = (host, node)
            # Get the service state and node state cached by the HostManager object
            capabilities = self.service_states.get(state_key, None)
            host_state = self.host_state_map.get(state_key)
            # If host_state exists, this is an already known node
            if host_state:
                # Update the capability information of the node
                host_state.update_capabilities(capabilities,
                                               dict(service.iteritems()))
            else:
                # Add state information for the new node
                host_state = self.host_state_cls(
                    host, node, capabilities=capabilities,
                    service=dict(service.iteritems()))
                self.host_state_map[state_key] = host_state
            # Update the hardware resource information of the compute node
            host_state.update_from_compute_node(compute)
            seen_nodes.add(state_key)
        # Work out which cached nodes are no longer active
        dead_nodes = set(self.host_state_map.keys()) - seen_nodes
        # Remove the cached information of inactive nodes
        for state_key in dead_nodes:
            host, node = state_key
            del self.host_state_map[state_key]
        return self.host_state_map.itervalues()

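As shown, this method does two main things: it gets the list of all currently active compute nodes, and it updates and maintains the node state information cached by the HostManager object.

The method first calls db.compute_node_get_all to fetch the list of currently active compute nodes from the database. The list holds the latest CPU, memory, and disk resource information for each compute node, which is maintained by the Nova Compute service. Nova Compute updates a node's hardware resource information after every virtual machine operation, and it also runs a periodic task (update_available) that refreshes this information on a timer. The capabilities variable holds the compute node capability information cached by the HostManager object (including the node's CPU, memory, and disk usage); this information is likewise reported to the Nova Scheduler service on a timer by a Nova Compute periodic task (update_capabilities).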
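The cache maintenance described above can be sketched in isolation. The following is a minimal Python 3 illustration, not Nova's code: the HostState class, the sync_host_states function, and the free_ram_mb field are hypothetical stand-ins, and the compute node records are plain dictionaries rather than database rows.

```python
class HostState(object):
    """Hypothetical stand-in for the per-node state cached by the scheduler."""

    def __init__(self, host, node):
        self.host = host
        self.node = node
        self.free_ram_mb = 0

    def update_from_compute_node(self, compute):
        # Copy the latest resource figures from the compute node record.
        self.free_ram_mb = compute['free_ram_mb']


def sync_host_states(cache, compute_nodes):
    """Refresh `cache` in place and drop nodes that were not seen this time."""
    seen = set()
    for compute in compute_nodes:
        service = compute.get('service')
        if not service:
            continue  # no service record: skip the node
        key = (service['host'], compute.get('hypervisor_hostname'))
        state = cache.get(key)
        if state is None:
            # New node: create and cache its state object.
            state = cache[key] = HostState(*key)
        state.update_from_compute_node(compute)
        seen.add(key)
    # Anything cached but not seen in this pass is treated as inactive.
    for key in set(cache) - seen:
        del cache[key]
    return list(cache.values())


cache = {}
sync_host_states(cache, [
    {'service': {'host': 'node1'}, 'hypervisor_hostname': 'node1', 'free_ram_mb': 4096},
    {'service': {'host': 'node2'}, 'hypervisor_hostname': 'node2', 'free_ram_mb': 2048},
])
# Second pass: node2 has disappeared and one record has no service.
sync_host_states(cache, [
    {'service': {'host': 'node1'}, 'hypervisor_hostname': 'node1', 'free_ram_mb': 1024},
    {'service': None, 'hypervisor_hostname': 'ghost', 'free_ram_mb': 0},
])
```

After the second pass the cache holds only node1, with its refreshed memory figure; node2 was pruned and the record without a service was ignored.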

(2) self.host_manager.get_filtered_hosts(hosts, filter_properties)

class HostManager(object):
    def get_filtered_hosts(self, hosts, filter_properties, filter_class_names=None):
        ...
        # Get the list of filter classes
        filter_classes = self._choose_host_filters(filter_class_names)
        ...
        # Return the filtered list of hosts
        return self.filter_handler.get_filtered_objects(filter_classes,
                                                        hosts, filter_properties)

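To decide whether a compute node is usable, Nova Scheduler defines a number of filters, each of which checks one property of the node. Only nodes that pass every filter are considered available hosts. The method above first calls _choose_host_filters to get the filter list, then calls the get_filtered_objects method of filter_handler to apply those filters. In addition, get_filtered_hosts accepts two variables, force_hosts and ignore_hosts, through the filter_properties parameter.

a. The _choose_host_filters method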

class HostManager(object):
    def _choose_host_filters(self, filter_cls_names):
        # If the caller did not pass filter_cls_names, use the default filters
        if filter_cls_names is None:
            filter_cls_names = CONF.scheduler_default_filters
        # Wrap filter_cls_names in a list if it is a single name
        if not isinstance(filter_cls_names, (list, tuple)):
            filter_cls_names = [filter_cls_names]
        good_filters = []
        bad_filters = []
        # Iterate over all configured filter names
        for filter_name in filter_cls_names:
            found_class = False
            # Iterate over all registered filter classes
            for cls in self.filter_classes:
                # A name that matches a registered filter class is a good filter
                if cls.__name__ == filter_name:
                    good_filters.append(cls)
                    found_class = True
                    break
            # A name that matches no registered filter class is a bad filter
            if not found_class:
                bad_filters.append(filter_name)
        ...
        return good_filters

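This method walks through every filter named in the filter_cls_names parameter and extracts the good ones, where a good filter is one that has been registered beforehand. Registration happens in the initializer of the HostManager class through a call to the get_matching_classes method of the filter_handler object, which registers all filters defined in the nova.scheduler.filters package.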
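The name matching can be tried out on its own. Here is a minimal Python 3 sketch under stated assumptions: the two filter classes are empty placeholders, REGISTERED stands in for the registered filter classes, and choose_filters is a hypothetical helper. The excerpt elides what happens to bad names, so this sketch simply returns them alongside the good classes.

```python
class RamFilter(object):
    """Empty placeholder for a registered filter class."""


class ComputeFilter(object):
    """Empty placeholder for a registered filter class."""


# Stands in for the list of registered filter classes.
REGISTERED = [RamFilter, ComputeFilter]


def choose_filters(names, registered=REGISTERED):
    """Split configured names into registered classes and unknown names."""
    if not isinstance(names, (list, tuple)):
        names = [names]  # accept a single name as well as a list
    by_name = {cls.__name__: cls for cls in registered}
    good = [by_name[name] for name in names if name in by_name]
    bad = [name for name in names if name not in by_name]
    return good, bad


good, bad = choose_filters(['RamFilter', 'NoSuchFilter'])
single_good, single_bad = choose_filters('ComputeFilter')
```

Building a name-to-class dictionary first replaces the nested loop of the original with a single lookup per configured name; the result is the same as long as class names are unique.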

b. The get_filtered_objects method

class BaseFilterHandler(loadables.BaseLoader):
    def get_filtered_objects(self, filter_classes, objs, filter_properties):
        # Iterate over each filter class
        for filter_cls in filter_classes:
            # Call the filter_all method of the filter class
            objs = filter_cls().filter_all(objs, filter_properties)
        return list(objs)

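This method applies the filters selected above to check whether each compute node is usable, and finally returns the list of usable compute nodes.

It calls the filter_all method of each filter in turn. Each call returns an iterator over the hosts that passed that filter. Every filter class inherits from BaseHostFilter, which in turn inherits from BaseFilter. The filter_all method is defined in BaseFilter as follows: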

class BaseFilter(object):
    def filter_all(self, filter_obj_list, filter_properties):
        for obj in filter_obj_list:
            if self._filter_one(obj, filter_properties):
                yield obj

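filter_obj_list is the list of compute nodes to be filtered. filter_all calls _filter_one on each compute node and yields the host if _filter_one returns True. The _filter_one method of BaseFilter always returns True; the subclass BaseHostFilter overrides it to call each filter's own host_passes method. The _filter_one method of BaseHostFilter is defined as follows:

class BaseHostFilter(filters.BaseFilter):
    def _filter_one(self, obj, filter_properties):
        return self.host_passes(obj, filter_properties)

host_passes returns True when a host passes the filter's check. A host is considered available only if it passes the checks of all filters.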
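To see how the chained generators behave, here is a minimal, self-contained Python 3 sketch of the filter chain. The base classes mirror the excerpts above, with the per-filter check named host_passes here. MinRamFilter, EnabledFilter, the dictionary-based hosts, and the ram_mb key are hypothetical and exist only for this illustration.

```python
class BaseFilter(object):
    def _filter_one(self, obj, filter_properties):
        return True  # the base class lets everything through

    def filter_all(self, filter_obj_list, filter_properties):
        # Generator: lazily yields only the objects that pass.
        for obj in filter_obj_list:
            if self._filter_one(obj, filter_properties):
                yield obj


class BaseHostFilter(BaseFilter):
    def _filter_one(self, obj, filter_properties):
        return self.host_passes(obj, filter_properties)


class MinRamFilter(BaseHostFilter):
    """Hypothetical filter: the host must have enough free memory."""

    def host_passes(self, host, filter_properties):
        return host['free_ram_mb'] >= filter_properties['ram_mb']


class EnabledFilter(BaseHostFilter):
    """Hypothetical filter: the host must be enabled."""

    def host_passes(self, host, filter_properties):
        return host['enabled']


def get_filtered_objects(filter_classes, objs, filter_properties):
    # Each filter wraps the previous generator, so a host is only
    # checked by a filter if it passed all earlier ones.
    for filter_cls in filter_classes:
        objs = filter_cls().filter_all(objs, filter_properties)
    return list(objs)


hosts = [
    {'name': 'node1', 'free_ram_mb': 4096, 'enabled': True},
    {'name': 'node2', 'free_ram_mb': 512, 'enabled': True},
    {'name': 'node3', 'free_ram_mb': 8192, 'enabled': False},
]
passed = get_filtered_objects([MinRamFilter, EnabledFilter], hosts,
                              {'ram_mb': 2048})
names = [host['name'] for host in passed]
```

Only node1 survives: node2 fails the memory check and node3 fails the enabled check. Because filter_all is a generator, nothing is evaluated until the final list() call consumes the chain.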

2. Compute the weights of the available compute nodes

The get_weighed_hosts method

class HostManager(object):
    def get_weighed_hosts(self, hosts, weight_properties):
        return self.weight_handler.get_weighed_objects(self.weight_classes,
                                                       hosts, weight_properties)

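get_weighed_hosts is simpler than get_filtered_hosts. It does not take anything like a weight_class_names argument from outside; it directly uses the weigher classes registered in advance (self.weight_classes = self.weight_handler.get_matching_classes(CONF.scheduler_weight_classes)). In the G (Grizzly) release, Nova supports only the RAMWeigher weigher class.

Like get_filtered_hosts, get_weighed_hosts calls the get_weighed_objects method of the weight_handler object to perform the weight calculation. It is defined as follows: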

class BaseWeightHandler(loadables.BaseLoader):
    def get_weighed_objects(self, weigher_classes, obj_list, weighing_properties):
        if not obj_list:
            return []
        # Wrap each host in a WeighedObject
        weighed_objs = [self.object_class(obj, 0.0) for obj in obj_list]
        # Iterate over all weigher classes
        for weigher_cls in weigher_classes:
            # Create the weigher object
            weigher = weigher_cls()
            weigher.weigh_objects(weighed_objs, weighing_properties)
        # Sort the hosts by weight, highest first
        return sorted(weighed_objs, key=lambda x: x.weight, reverse=True)

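The core of the code above is the call to the weigher object's weigh_objects method. Every weigher class inherits from BaseHostWeigher, which in turn inherits from BaseWeigher. weigh_objects is defined as follows: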

class BaseWeigher(object):
    def weigh_objects(self, weighed_obj_list, weight_properties):
        for obj in weighed_obj_list:
            # host weight = previous weight + multiplier * weight this weigher assigns to the host
            obj.weight += (self._weight_multiplier() *
                           self._weigh_object(obj.obj, weight_properties))

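As shown, a host's weight is the weighted sum of the weights that the individual weigher classes assign to it. _weight_multiplier returns the multiplier of the current weigher class, and _weigh_object returns the weight that the current weigher class assigns to the host.

Since Nova currently supports only the RAMWeigher class, let us look at these two methods for that class. Its multiplier is defined by the ram_weight_multiplier option in the nova.conf configuration file and defaults to 1.0. Its _weigh_object method returns the amount of free memory on the host.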
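The weighted sum can be checked with a small, self-contained Python 3 sketch. The handler and base weigher follow the excerpts above; FreeRamWeigher, StackingWeigher, the multiplier class attribute (standing in for the ram_weight_multiplier option), and the dictionary-based hosts are hypothetical names introduced for the example.

```python
class WeighedObject(object):
    def __init__(self, obj, weight):
        self.obj = obj
        self.weight = weight


class BaseWeigher(object):
    def _weight_multiplier(self):
        return 1.0

    def _weigh_object(self, obj, weight_properties):
        return 0.0

    def weigh_objects(self, weighed_obj_list, weight_properties):
        for obj in weighed_obj_list:
            # Accumulate: weight += multiplier * weight from this weigher.
            obj.weight += (self._weight_multiplier() *
                           self._weigh_object(obj.obj, weight_properties))


class FreeRamWeigher(BaseWeigher):
    """Hypothetical weigher: more free memory means a higher weight."""
    multiplier = 1.0  # stands in for the ram_weight_multiplier option

    def _weight_multiplier(self):
        return self.multiplier

    def _weigh_object(self, host, weight_properties):
        return host['free_ram_mb']


class StackingWeigher(FreeRamWeigher):
    """Same weigher with a negative multiplier, for comparison."""
    multiplier = -1.0


def get_weighed_objects(weigher_classes, obj_list, weighing_properties):
    if not obj_list:
        return []
    weighed_objs = [WeighedObject(obj, 0.0) for obj in obj_list]
    for weigher_cls in weigher_classes:
        weigher_cls().weigh_objects(weighed_objs, weighing_properties)
    return sorted(weighed_objs, key=lambda x: x.weight, reverse=True)


hosts = [
    {'name': 'node1', 'free_ram_mb': 2048},
    {'name': 'node2', 'free_ram_mb': 8192},
    {'name': 'node3', 'free_ram_mb': 4096},
]
spread = [w.obj['name'] for w in get_weighed_objects([FreeRamWeigher], hosts, {})]
stacked = [w.obj['name'] for w in get_weighed_objects([StackingWeigher], hosts, {})]
```

With the default multiplier the host with the most free memory comes first, which spreads instances across hosts. In this sketch, flipping the sign of the multiplier reverses the order, so the fullest host is preferred instead.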

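3. Randomly pick one compute node from the scheduler_host_subset_size highest-weighted compute nodes as the node on which to create the virtual machine

4. Update the hardware resource information of the chosen compute node to reserve resources for the virtual machine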
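Steps 3 and 4 can be sketched together in a few lines of Python 3. This is a simplified illustration under stated assumptions, not the scheduler's code: hosts are reduced to a dictionary of free memory per host name, the only filter is a memory check, and pick_host and schedule are hypothetical helper names.

```python
import random


def pick_host(weighed_hosts, subset_size, rng=random):
    """Pick at random among the `subset_size` best hosts.

    `weighed_hosts` must be non-empty and sorted best first.
    """
    # Clamp the subset size into the range [1, len(weighed_hosts)].
    subset_size = max(1, min(subset_size, len(weighed_hosts)))
    return rng.choice(weighed_hosts[:subset_size])


def schedule(free_ram, ram_mb, num_instances, subset_size=1, rng=random):
    """Place instances one by one, reserving memory after each choice.

    `free_ram` maps host name to free memory in MB and is updated in place.
    """
    selected = []
    for _ in range(num_instances):
        # Filter: keep hosts with enough free memory.
        candidates = [name for name, free in free_ram.items() if free >= ram_mb]
        if not candidates:
            break
        # Weigh: most free memory first.
        candidates.sort(key=lambda name: free_ram[name], reverse=True)
        chosen = pick_host(candidates, subset_size, rng)
        selected.append(chosen)
        # Reserve resources so the next round sees the updated state.
        free_ram[chosen] -= ram_mb
    return selected


free_ram = {'node1': 4096, 'node2': 3072}
placement = schedule(free_ram, ram_mb=2048, num_instances=4)
```

With a subset size of 1 the choice is deterministic: node1 wins the first round, the reservation then makes node2 the best host, and the third instance lands back on node1. No host can fit a fourth instance, so only three are placed. A larger subset size trades some placement quality for randomness, which is one way to reduce the chance that several schedulers working from the same state all pick the same host.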
