Kubernetes Resource Quota Management
Reference: https://kubernetes.io/docs/concepts/policy/resource-quotas/
When a cluster is shared by several users or teams, fair allocation of resources becomes an important concern.
Resource quotas are the tool administrators use to solve this kind of problem.
By defining a ResourceQuota object, you constrain the total amount of system resources a namespace may consume. It can limit how many objects of a given type may be created in a namespace, and it can limit the total amount of underlying resources the namespace may consume.
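For instance, a minimal ResourceQuota might look like the following sketch (the object name, namespace, and values are illustrative):

```yaml
# Minimal ResourceQuota sketch; name, namespace, and values are illustrative.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: demo-quota       # hypothetical name
  namespace: demo        # hypothetical namespace
spec:
  hard:
    pods: "10"           # at most 10 non-terminal pods in the namespace
    requests.cpu: "4"    # CPU requests across all pods may not exceed 4 cores
```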
Resource quotas work as follows:
- Different teams work in different namespaces. At present this separation relies on voluntary compliance, but it can be made mandatory through ACLs.
- The administrator creates one or more ResourceQuota objects per namespace.
- Users create resources in the namespace, and the quota system tracks usage of each resource to ensure it does not exceed the hard limits defined in the ResourceQuota.
- If creating or updating a resource would violate a quota, the apiserver rejects the request with a 403 response whose body explains the reason.
- If a namespace carries quotas for compute resources such as cpu and memory, users must specify requests or limits for those resources when creating pods; otherwise the system refuses to create the pod. Tip: use the LimitRanger admission controller to set defaults for pods that declare no resource requirements, as sketched below.
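A sketch of such a LimitRange, with illustrative default values:

```yaml
# LimitRange that supplies defaults for containers that omit resource
# requirements; the name and values are illustrative.
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits   # hypothetical name
spec:
  limits:
  - type: Container
    default:             # used as the limit when none is specified
      cpu: 500m
      memory: 256Mi
    defaultRequest:      # used as the request when none is specified
      cpu: 250m
      memory: 128Mi
```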
You might create quota policies like these (the "testing" policy is sketched after the list):
- The cluster has 32 GiB RAM and 16 cores; let team A use 20 GiB RAM and 10 cores, let team B use 10 GiB RAM and 4 cores, and hold the remaining 2 GiB RAM and 2 cores in reserve.
- Limit the "testing" namespace to 1 core and 1 GiB RAM, while leaving the "production" namespace unconstrained.
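A sketch of the "testing" policy above (the object name is illustrative):

```yaml
# Caps the "testing" namespace at 1 core and 1 GiB of requested RAM.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: testing-quota    # hypothetical name
  namespace: testing
spec:
  hard:
    requests.cpu: "1"
    requests.memory: 1Gi
```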
When the cluster's total capacity is less than the sum of all its namespace quotas, contention can occur; this is currently handled on a first-come, first-served basis.
A quota has no effect on resources that were created before it.
Enabling Resource Quota Management
In a default installation, the apiserver flag --enable-admission-plugins includes ResourceQuota, so quota management is enabled by default.
If a namespace contains a ResourceQuota object, quota management is enforced for that namespace regardless of what the apiserver flag says.
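For reference, a sketch of the flag on the apiserver command line (a real invocation carries many other flags):

```sh
# Ensure ResourceQuota appears in the admission plugin list.
kube-apiserver --enable-admission-plugins=ResourceQuota
```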
Compute Resource Quota
You can set compute resource quotas on a namespace. The following entries are supported:
Resource Name | Description
---|---
cpu | Across all pods in a non-terminal state, the sum of CPU requests cannot exceed this value.
limits.cpu | Across all pods in a non-terminal state, the sum of CPU limits cannot exceed this value.
limits.memory | Across all pods in a non-terminal state, the sum of memory limits cannot exceed this value.
memory | Across all pods in a non-terminal state, the sum of memory requests cannot exceed this value.
requests.cpu | Across all pods in a non-terminal state, the sum of CPU requests cannot exceed this value.
requests.memory | Across all pods in a non-terminal state, the sum of memory requests cannot exceed this value.
Extended Resource Quota
Quota management is also allowed for extended resources, though it works slightly differently from the built-in resource types: since overcommit is not allowed for extended resources, only the requests.-prefixed entry is specified, with no separate limits entry. For example, take GPU resources, whose resource name is nvidia.com/gpu; to cap the namespace at a quota of 4 for this resource, define the quota entry as:
requests.nvidia.com/gpu: 4
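A sketch of the complete object around that entry (the object name is illustrative):

```yaml
# Caps the namespace at 4 GPUs; extended resources use only the
# requests.-prefixed entry. The name is illustrative.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: gpu-quota        # hypothetical name
spec:
  hard:
    requests.nvidia.com/gpu: 4
```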
Storage Resource Quota
You can set storage resource quotas on a namespace, and optionally manage quota per storage class:
Resource Name | Description
---|---
requests.storage | Across all persistent volume claims, the sum of storage requests cannot exceed this value.
persistentvolumeclaims | The total number of persistent volume claims that can exist in the namespace.
<storage-class-name>.storageclass.storage.k8s.io/requests.storage | Across all persistent volume claims associated with the storage-class-name, the sum of storage requests cannot exceed this value.
<storage-class-name>.storageclass.storage.k8s.io/persistentvolumeclaims | Across all persistent volume claims associated with the storage-class-name, the total number of persistent volume claims that can exist in the namespace.
For example, to set separate quotas for the gold and bronze storage classes (see the sketch after this list):
- gold.storageclass.storage.k8s.io/requests.storage: 500Gi
- bronze.storageclass.storage.k8s.io/requests.storage: 100Gi
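A sketch of a quota object carrying both entries (the object name is illustrative):

```yaml
# Per-storage-class storage quota; the name is illustrative.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota    # hypothetical name
spec:
  hard:
    gold.storageclass.storage.k8s.io/requests.storage: 500Gi
    bronze.storageclass.storage.k8s.io/requests.storage: 100Gi
```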
Starting with version 1.8, quota support for local ephemeral storage is available as an alpha feature (a sketch follows the table):
Resource Name | Description
---|---
requests.ephemeral-storage | Across all pods in the namespace, the sum of local ephemeral storage requests cannot exceed this value.
limits.ephemeral-storage | Across all pods in the namespace, the sum of local ephemeral storage limits cannot exceed this value.
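A sketch combining both ephemeral-storage entries (name and values are illustrative):

```yaml
# Ephemeral-storage quota sketch; name and values are illustrative.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: ephemeral-quota  # hypothetical name
spec:
  hard:
    requests.ephemeral-storage: 10Gi  # sum of requests across all pods
    limits.ephemeral-storage: 20Gi    # sum of limits across all pods
```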
Object Count Quota
Starting with version 1.9, object count quota is supported for all standard namespaced resource types, using the following syntax:
count/<resource>.<group>
The following are examples of resource types that can be managed with object count quota (see the sketch after this list):
- count/persistentvolumeclaims
- count/services
- count/secrets
- count/configmaps
- count/replicationcontrollers
- count/deployments.apps
- count/replicasets.apps
- count/statefulsets.apps
- count/jobs.batch
- count/cronjobs.batch
- count/deployments.extensions
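A sketch of a quota using this syntax (the object name and counts are illustrative):

```yaml
# Object-count quota using the count/<resource>.<group> syntax;
# name and counts are illustrative.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-count-quota  # hypothetical name
spec:
  hard:
    count/deployments.apps: "2"
    count/secrets: "4"
```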
Object count quota guards against the cluster-wide performance degradation that too many objects of one type can cause. The following quota entries are supported:
Resource Name | Description
---|---
configmaps | The total number of config maps that can exist in the namespace.
persistentvolumeclaims | The total number of persistent volume claims that can exist in the namespace.
pods | The total number of pods in a non-terminal state that can exist in the namespace. A pod is in a terminal state if .status.phase in (Failed, Succeeded) is true.
replicationcontrollers | The total number of replication controllers that can exist in the namespace.
resourcequotas | The total number of resource quotas that can exist in the namespace.
services | The total number of services that can exist in the namespace.
services.loadbalancers | The total number of services of type load balancer that can exist in the namespace.
services.nodeports | The total number of services of type node port that can exist in the namespace.
secrets | The total number of secrets that can exist in the namespace.
The pods quota limits the number of non-terminal pods in a namespace; an administrator might use it to keep users from creating many small pods and exhausting the supply of pod network addresses.
Quota Scopes
Each quota can be associated with a set of scopes. A quota measures usage only for resources that match its scopes, and limits only resources that match its scopes. The supported scopes are:
Scope | Description
---|---
Terminating | Match pods where .spec.activeDeadlineSeconds >= 0
NotTerminating | Match pods where .spec.activeDeadlineSeconds is nil
BestEffort | Match pods that have best effort quality of service.
NotBestEffort | Match pods that do not have best effort quality of service.
The BestEffort scope supports only the pods quota entry.
The Terminating, NotTerminating, and NotBestEffort scopes support the following quota entries (a scoped-quota sketch follows the list):
- cpu
- limits.cpu
- limits.memory
- memory
- pods
- requests.cpu
- requests.memory
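A sketch of a scoped quota, here using the BestEffort scope via spec.scopes (name and count are illustrative):

```yaml
# Counts only BestEffort pods against the quota; name and count are
# illustrative. BestEffort supports only the pods entry.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: besteffort-quota  # hypothetical name
spec:
  hard:
    pods: "5"
  scopes:
  - BestEffort
```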
Priority-Based Resource Quota
A priority can be assigned to a pod at creation time, and pods of different priorities may warrant different quota limits; for instance, low-priority pods could face stricter quotas while high-priority pods get more generous ones. This is achieved through the quota's scopeSelector field. As an example, assume the following scenario:
- Pods come in three priorities: "low", "medium", and "high".
- A separate quota is created for each priority.
The steps are as follows:
1. Save the following to a file named quota.yml:
```yaml
apiVersion: v1
kind: List
items:
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: pods-high
  spec:
    hard:
      cpu: "1000"
      memory: 200Gi
      pods: "10"
    scopeSelector:
      matchExpressions:
      - operator: In
        scopeName: PriorityClass
        values: ["high"]
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: pods-medium
  spec:
    hard:
      cpu: "10"
      memory: 20Gi
      pods: "10"
    scopeSelector:
      matchExpressions:
      - operator: In
        scopeName: PriorityClass
        values: ["medium"]
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: pods-low
  spec:
    hard:
      cpu: "5"
      memory: 10Gi
      pods: "10"
    scopeSelector:
      matchExpressions:
      - operator: In
        scopeName: PriorityClass
        values: ["low"]
```
2. Create the three quota objects:
kubectl create -f ./quota.yml
resourcequota/pods-high created
resourcequota/pods-medium created
resourcequota/pods-low created
3. Confirm that the used quota is "0":
kubectl describe quota
Name: pods-high
Namespace: default
Resource Used Hard
-------- ---- ----
cpu 0 1k
memory 0 200Gi
pods 0 10
Name: pods-low
Namespace: default
Resource Used Hard
-------- ---- ----
cpu 0 5
memory 0 10Gi
pods 0 10
Name: pods-medium
Namespace: default
Resource Used Hard
-------- ---- ----
cpu 0 10
memory 0 20Gi
pods 0 10
4. Create a pod with "high" priority. Save the following to a file named high-priority-pod.yml:
apiVersion: v1
kind: Pod
metadata:
  name: high-priority
spec:
  containers:
  - name: high-priority
    image: ubuntu
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo hello; sleep 10;done"]
    resources:
      requests:
        memory: "10Gi"
        cpu: "500m"
      limits:
        memory: "10Gi"
        cpu: "500m"
  priorityClassName: high
Create the object:
kubectl create -f ./high-priority-pod.yml
5. Check the usage recorded in pods-high, which confirms that the quota applied to the pod is pods-high:
kubectl describe quota
Name: pods-high
Namespace: default
Resource Used Hard
-------- ---- ----
cpu 500m 1k
memory 10Gi 200Gi
pods 1 10
Name: pods-low
Namespace: default
Resource Used Hard
-------- ---- ----
cpu 0 5
memory 0 10Gi
pods 0 10
Name: pods-medium
Namespace: default
Resource Used Hard
-------- ---- ----
cpu 0 10
memory 0 20Gi
pods 0 10
The scopeSelector field supports the following operators:
- In
- NotIn
- Exists
- DoesNotExist
Requests vs Limits
Requests and limits are attributes set on a resource when it is created. A request is the amount of a resource the object needs in order to be scheduled; a limit is the upper bound on what the object may consume, as the fragment below illustrates.
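A container spec fragment illustrating both fields (values are illustrative):

```yaml
# Fragment of a container spec: the container is guaranteed 250m CPU
# and 128Mi memory, and may consume at most 500m CPU and 256Mi memory.
resources:
  requests:
    cpu: 250m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 256Mi
```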
Viewing and Setting Quotas
kubectl supports creating, updating, and viewing quotas:
kubectl create namespace myspace
cat <<EOF > compute-resources.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resources
spec:
  hard:
    pods: "4"
    requests.cpu: "1"
    requests.memory: 1Gi
    limits.cpu: "2"
    limits.memory: 2Gi
    requests.nvidia.com/gpu: 4
EOF
kubectl create -f ./compute-resources.yaml --namespace=myspace
cat <<EOF > object-counts.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-counts
spec:
  hard:
    configmaps: "10"
    persistentvolumeclaims: "4"
    replicationcontrollers: "20"
    secrets: "10"
    services: "10"
    services.loadbalancers: "2"
EOF
kubectl create -f ./object-counts.yaml --namespace=myspace
kubectl get quota --namespace=myspace
NAME AGE
compute-resources 30s
object-counts 32s
kubectl describe quota compute-resources --namespace=myspace
Name: compute-resources
Namespace: myspace
Resource Used Hard
-------- ---- ----
limits.cpu 0 2
limits.memory 0 2Gi
pods 0 4
requests.cpu 0 1
requests.memory 0 1Gi
requests.nvidia.com/gpu 0 4
kubectl describe quota object-counts --namespace=myspace
Name: object-counts
Namespace: myspace
Resource Used Hard
-------- ---- ----
configmaps 0 10
persistentvolumeclaims 0 4
replicationcontrollers 0 20
secrets 1 10
services 0 10
services.loadbalancers 0 2
kubectl also supports object count quota for standard resource types:
kubectl create namespace myspace
kubectl create quota test --hard=count/deployments.extensions=2,count/replicasets.extensions=4,count/pods=3,count/secrets=4 --namespace=myspace
kubectl run nginx --image=nginx --replicas=2 --namespace=myspace
kubectl describe quota --namespace=myspace
Name: test
Namespace: myspace
Resource Used Hard
-------- ---- ----
count/deployments.extensions 1 2
count/pods 2 3
count/replicasets.extensions 1 4
count/secrets 1 4
Quota and Cluster Capacity
Quotas are expressed in absolute terms, unlike cluster capacity. If a node is added to the cluster, the cluster's capacity grows, but namespace quotas are not loosened accordingly. If you want quotas to track cluster capacity, consider writing a controller that watches capacity and adjusts quotas dynamically according to rules you define.
Limit Priority Class consumption by default
For a very high priority class, say "cluster-services", an administrator may want only specific namespaces to be allowed to create pods at that priority. This can be done as follows.
Pass the apiserver the --admission-control-config-file option, whose value is the path to a file with the following content:
$ cat admission_config_file.yml
apiVersion: apiserver.k8s.io/v1alpha1
kind: AdmissionConfiguration
plugins:
- name: "ResourceQuota"
  configuration:
    apiVersion: resourcequota.admission.k8s.io/v1alpha1
    kind: Configuration
    limitedResources:
    - resource: pods
      matchScopes:
      - operator: In
        scopeName: PriorityClass
        values: ["cluster-services"]
With this in place, pods with the cluster-services priority can be created only in a namespace whose quota contains a matching scopeSelector:
scopeSelector:
  matchExpressions:
  - operator: In
    scopeName: PriorityClass
    values: ["cluster-services"]
Note: scopeSelector is an experimental feature; enable the ResourceQuotaScopeSelectors feature gate before using it.
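A sketch of enabling the gate on the apiserver (shown in isolation; a real invocation carries many other flags):

```sh
# Turn on the scopeSelector feature gate.
kube-apiserver --feature-gates=ResourceQuotaScopeSelectors=true
```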