
kubernetes scheduler

The scheduling algorithm

For given pod:

+---------------------------------------------+
|               Schedulable nodes:            |
|                                             |
| +--------+    +--------+      +--------+    |
| | node 1 |    | node 2 |      | node 3 |    |
| +--------+    +--------+      +--------+    |
|                                             |
+-------------------+-------------------------+
                    |
                    |
                    v
+-------------------+-------------------------+

Pred. filters: node 3 doesn't have enough resource

+-------------------+-------------------------+
                    |
                    |
                    v
+-------------------+-------------------------+
|             remaining nodes:                |
|   +--------+                 +--------+     |
|   | node 1 |                 | node 2 |     |
|   +--------+                 +--------+     |
|                                             |
+-------------------+-------------------------+
                    |
                    |
                    v
+-------------------+-------------------------+

Priority function:    node 1: p=2
                      node 2: p=5

+-------------------+-------------------------+
                    |
                    |
                    v
    select max{node priority} = node 2

The scheduler finds a suitable node for one pod at a time.

  • First, the scheduler runs a series of checks to filter out unsuitable nodes. For example, if pod.spec defines resource requests, the scheduler filters out nodes that do not have enough free resources.
  • Next, the scheduler ranks the remaining nodes with a series of priority functions. This ranking step does not filter out any node. For example, the scheduler prefers nodes with plenty of free resources and, where possible, spreads pods across different zones.
  • Finally, the node with the highest priority is chosen as the scheduling target (if several nodes share the highest priority, one of them is picked at random). See the schedule() function in plugin/pkg/scheduler/generic_scheduler.go for the implementation.

In short, Kubernetes scheduling consists of two parts:
1. Find the nodes that satisfy the pod's requirements (predicates).
2. Among those nodes, choose the best one according to the priority policies (priorities).
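
Below is a minimal sketch of this two-phase flow, with hypothetical Pod, Node, Predicate and Priority types standing in for the real ones; the actual implementation is the schedule() path in plugin/pkg/scheduler/generic_scheduler.go.

package scheduler

import (
	"errors"
	"math/rand"
)

// Hypothetical, simplified stand-ins for the real Pod and Node types.
type Pod struct{ Name string }
type Node struct{ Name string }

// A predicate filters nodes; a priority scores the nodes that survive filtering.
type Predicate func(pod *Pod, node *Node) bool
type Priority func(pod *Pod, node *Node) int

// Schedule mirrors the two phases described above:
//  1. keep only nodes that pass every predicate,
//  2. score the survivors and pick the highest total (random tie-break).
func Schedule(pod *Pod, nodes []*Node, preds []Predicate, prios []Priority) (*Node, error) {
	// Phase 1: predicates filter out nodes that cannot run the pod.
	var feasible []*Node
	for _, n := range nodes {
		fits := true
		for _, pred := range preds {
			if !pred(pod, n) {
				fits = false
				break
			}
		}
		if fits {
			feasible = append(feasible, n)
		}
	}
	if len(feasible) == 0 {
		return nil, errors.New("no node fits the pod")
	}

	// Phase 2: priorities rank the remaining nodes; nothing is filtered here.
	var best []*Node
	bestScore := -1
	for _, n := range feasible {
		score := 0
		for _, prio := range prios {
			score += prio(pod, n)
		}
		if score > bestScore {
			best, bestScore = []*Node{n}, score
		} else if score == bestScore {
			best = append(best, n)
		}
	}
	// If several nodes share the top score, pick one at random.
	return best[rand.Intn(len(best))], nil
}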

predicates

The following descriptions are quoted from the official design document (a sketch of the PodFitsResources check follows the list):

  • NoDiskConflict: Evaluate if a pod can fit due to the volumes it requests, and those that are already mounted. Currently supported volumes are: AWS EBS, GCE PD, ISCSI and Ceph RBD. Only Persistent Volume Claims for those supported types are checked. Persistent Volumes added directly to pods are not evaluated and are not constrained by this policy.
  • NoVolumeZoneConflict: Evaluate if the volumes a pod requests are available on the node, given the Zone restrictions.
  • PodFitsResources: Check if the free resource (CPU and Memory) meets the requirement of the Pod. The free resource is measured by the capacity minus the sum of requests of all Pods on the node. To learn more about the resource QoS in Kubernetes, please check QoS proposal.
  • PodFitsHostPorts: Check if any HostPort required by the Pod is already occupied on the node.
  • HostName: Filter out all nodes except the one specified in the PodSpec’s NodeName field.
  • MatchNodeSelector: Check if the labels of the node match the labels specified in the Pod’s nodeSelector field and, as of Kubernetes v1.2, also match the scheduler.alpha.kubernetes.io/affinity pod annotation if present. See here for more details on both.
  • MaxEBSVolumeCount: Ensure that the number of attached ElasticBlockStore volumes does not exceed a maximum value (by default, 39, since Amazon recommends a maximum of 40 with one of those 40 reserved for the root volume – see Amazon’s documentation). The maximum value can be controlled by setting the KUBE_MAX_PD_VOLS environment variable.
  • MaxGCEPDVolumeCount: Ensure that the number of attached GCE PersistentDisk volumes does not exceed a maximum value (by default, 16, which is the maximum GCE allows – see GCE’s documentation). The maximum value can be controlled by setting the KUBE_MAX_PD_VOLS environment variable.
  • CheckNodeMemoryPressure: Check if a pod can be scheduled on a node reporting memory pressure condition. Currently, no BestEffort should be placed on a node under memory pressure as it gets automatically evicted by kubelet.
  • CheckNodeDiskPressure: Check if a pod can be scheduled on a node reporting disk pressure condition. Currently, no pods should be placed on a node under disk pressure as it gets automatically evicted by kubelet.
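
To make the resource predicate concrete, here is a minimal sketch of a PodFitsResources-style check. The Resources struct and its fields are simplified stand-ins used only for illustration; the real scheduler works with resource.Quantity values.

package scheduler

// Simplified resource vector; the real scheduler uses resource.Quantity.
type Resources struct {
	MilliCPU int64
	Memory   int64
}

// podFitsResources reports whether the node's free capacity, i.e. capacity
// minus the sum of requests of the pods already on the node, can hold the
// new pod's request for both CPU and memory.
func podFitsResources(podRequest, capacity, requestedOnNode Resources) bool {
	freeCPU := capacity.MilliCPU - requestedOnNode.MilliCPU
	freeMem := capacity.Memory - requestedOnNode.Memory
	return podRequest.MilliCPU <= freeCPU && podRequest.Memory <= freeMem
}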

Of these, MatchNodeSelector is what lets a pod be assigned to specific nodes: labels are set on the node and matched against the pod's nodeSelector.
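
A minimal sketch of that matching, assuming plain map[string]string labels (the real predicate also handles the affinity annotation mentioned above): a node fits only if it carries every key/value pair listed in the pod's nodeSelector.

package scheduler

// matchNodeSelector reports whether the node's labels contain every
// key/value pair required by the pod's nodeSelector.
func matchNodeSelector(nodeSelector, nodeLabels map[string]string) bool {
	for key, want := range nodeSelector {
		if got, ok := nodeLabels[key]; !ok || got != want {
			return false
		}
	}
	return true
}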

Affinity goes even further: with pod affinity/anti-affinity you can configure placement policies between pods.

priorities policies

After a set of qualifying nodes has been filtered out, the scheduler computes a weight for each node according to the priority policies, and this decides which node the pod is finally assigned to.

Currently, Kubernetes provides the following priority policies (quoted from the official design document; a sketch of the LeastRequestedPriority formula follows the list):

  • LeastRequestedPriority: The node is prioritized based on the fraction of the node that would be free if the new Pod were scheduled onto the node. (In other words, (capacity - sum of requests of all Pods already on the node - request of Pod that is being scheduled) / capacity). CPU and memory are equally weighted. The node with the highest free fraction is the most preferred. Note that this priority function has the effect of spreading Pods across the nodes with respect to resource consumption.
  • BalancedResourceAllocation: This priority function tries to put the Pod on a node such that the CPU and Memory utilization rate is balanced after the Pod is deployed.
  • SelectorSpreadPriority: Spread Pods by minimizing the number of Pods belonging to the same service, replication controller, or replica set on the same node. If zone information is present on the nodes, the priority will be adjusted so that pods are spread across zones and nodes.
  • CalculateAntiAffinityPriority: Spread Pods by minimizing the number of Pods belonging to the same service on nodes with the same value for a particular label.
  • ImageLocalityPriority: Nodes are prioritized based on locality of images requested by a pod. Nodes with larger size of already-installed packages required by the pod will be preferred over nodes with no already-installed packages required by the pod or a small total size of already-installed packages required by the pod.
  • NodeAffinityPriority: (Kubernetes v1.2) Implements preferredDuringSchedulingIgnoredDuringExecution node affinity; see here for more details.
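
As an example of how such a weight is computed, the sketch below mirrors the LeastRequestedPriority formula quoted above, (capacity - sum of requests of all Pods already on the node - request of the Pod being scheduled) / capacity, with CPU and memory weighted equally. Mapping the free fraction to an integer 0..10 score is an assumption made here purely for illustration.

package scheduler

// leastRequestedScore computes the free fraction of one resource after the
// pod is placed: (capacity - alreadyRequested - podRequest) / capacity,
// mapped to an integer score in 0..10 (assumed scale for illustration).
func leastRequestedScore(podRequest, alreadyRequested, capacity int64) int64 {
	if capacity == 0 {
		return 0
	}
	free := capacity - alreadyRequested - podRequest
	if free < 0 {
		free = 0
	}
	return free * 10 / capacity
}

// leastRequestedPriority weights CPU and memory equally by averaging their
// per-resource scores; the node with the highest score is preferred.
func leastRequestedPriority(cpuReq, cpuUsed, cpuCap, memReq, memUsed, memCap int64) int64 {
	return (leastRequestedScore(cpuReq, cpuUsed, cpuCap) +
		leastRequestedScore(memReq, memUsed, memCap)) / 2
}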