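Design and Implementation of Autoscaling with kubernetes + prometheus (Part 1)
The title of this post sounds like one of my papers: "Design and Implementation of xxxx". The principle is actually simple. The kubernetes HPA uses heapster by default. heapster is built by the k8s folks themselves, and they tend to prefer their own tooling. In the industry, though, there is a better monitoring solution than heapster: prometheus. If this were a paper, I would introduce k8s and prometheus separately at this point, but I really don't have the time, so I'll skip that. I have also analyzed the source code of both in earlier posts.
The figure above shows the overall architecture.
Below I analyze how the adapter is implemented. This builds on the previous post about API aggregation: the adapter registers itself with the apiserver through API aggregation.
First, an example HPA: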
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2alpha1
metadata:
  name: wordpress
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: wordpress
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Pods
    pods:
      metricName: memory_usage_bytes
      targetAverageValue: 100000
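The spec above adds an HPA driven by memory usage: targetAverageValue sets the threshold, and the spec also gives the minimum and maximum replica counts and the deployment being managed. As an aside, the HPA supports three metric types. Object describes a metric of some k8s object, for example hits-per-second on an ingress. Pods is a metric averaged across pods, for example transactions-processed-per-second, the number of transactions each pod handles per second. Resource describes pod resource usage, for example CPU or memory. Below is an Object example that targets the HTTP request count of a Service.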
  - type: Object
    object:
      target:
        kind: Service
        name: sample-metrics-app
      metricName: http_requests
      targetValue: 100
Start with the HPA code in pkg/controller/podautoscaler/horizontal.go. Its computeReplicasForMetrics method fetches the metrics and computes the replica count. It calls a different method for each metric source type:
case autoscalingv2.ObjectMetricSourceType:
	GetObjectMetricReplicas
	...
case autoscalingv2.PodsMetricSourceType:
	GetMetricReplicas
case autoscalingv2.ResourceMetricSourceType:
	GetRawResourceReplicas
Let's look at GetObjectMetricReplicas first:
pkg/controller/podautoscaler/replica_calculator.go
func (c *ReplicaCalculator) GetObjectMetricReplicas(currentReplicas int32, targetUtilization int64, metricName string, namespace string, objectRef *autoscaling.CrossVersionObjectReference) (replicaCount int32, utilization int64, timestamp time.Time, err error) {
	utilization, timestamp, err = c.metricsClient.GetObjectMetric(metricName, namespace, objectRef)
	if err != nil {
		return 0, 0, time.Time{}, fmt.Errorf("unable to get metric %s: %v on %s %s/%s", metricName, objectRef.Kind, namespace, objectRef.Name, err)
	}
	usageRatio := float64(utilization) / float64(targetUtilization)
	if math.Abs(1.0-usageRatio) <= c.tolerance {
		// return the current replicas if the change would be too small
		return currentReplicas, utilization, timestamp, nil
	}
	return int32(math.Ceil(usageRatio * float64(currentReplicas))), utilization, timestamp, nil
}
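To make the arithmetic in GetObjectMetricReplicas concrete, here is a minimal standalone sketch of the same calculation. The helper name desiredReplicas and the tolerance value 0.1 are assumptions for illustration only. In the controller, tolerance is a field on ReplicaCalculator, and the result is still subject to minReplicas and maxReplicas elsewhere.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas is a hypothetical helper that repeats the arithmetic of
// GetObjectMetricReplicas without the metrics client:
// ratio = current metric / target, and the replica count only changes
// when the ratio differs from 1.0 by more than the tolerance.
func desiredReplicas(currentReplicas int32, utilization, targetUtilization int64, tolerance float64) int32 {
	usageRatio := float64(utilization) / float64(targetUtilization)
	if math.Abs(1.0-usageRatio) <= tolerance {
		// Change too small: keep the current replica count.
		return currentReplicas
	}
	return int32(math.Ceil(usageRatio * float64(currentReplicas)))
}

func main() {
	const tolerance = 0.1 // assumed value, for illustration only

	fmt.Println(desiredReplicas(2, 150, 100, tolerance)) // ratio 1.5, scale up
	fmt.Println(desiredReplicas(2, 105, 100, tolerance)) // ratio 1.05, within tolerance
	fmt.Println(desiredReplicas(3, 40, 100, tolerance))  // ratio 0.4, scale down
}
```

Because the result is rounded up with math.Ceil, scaling down is conservative: a ratio of 0.4 on 3 replicas gives 2, not 1.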
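GetObjectMetric is an interface method with two implementations, the heapster one and the custom one shown in the figure above. The heapster implementation calls the heapster API to fetch metrics. This post focuses on custom metrics: start controller-manager with --horizontal-pod-autoscaler-use-rest-clients and custom metrics become available.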
pkg/controller/podautoscaler/metrics/rest_metrics_client.go
func (c *customMetricsClient) GetObjectMetric(metricName string, namespace string, objectRef *autoscaling.CrossVersionObjectReference) (int64, time.Time, error) {
	gvk := schema.FromAPIVersionAndKind(objectRef.APIVersion, objectRef.Kind)
	var metricValue *customapi.MetricValue
	var err error
	if gvk.Kind == "Namespace" && gvk.Group == "" {
		// handle namespace separately
		// NB: we ignore namespace name here, since CrossVersionObjectReference isn't
		// supposed to allow you to escape your namespace
		metricValue, err = c.client.RootScopedMetrics().GetForObject(gvk.GroupKind(), namespace, metricName)
	} else {
		metricValue, err = c.client.NamespacedMetrics(namespace).GetForObject(gvk.GroupKind(), objectRef.Name, metricName)
	}
	if err != nil {
		return 0, time.Time{}, fmt.Errorf("unable to fetch metrics from API: %v", err)
	}
	return metricValue.Value.MilliValue(), metricValue.Timestamp.Time, nil
}
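Note that GetObjectMetric returns metricValue.Value.MilliValue(), so the utilization handed to the replica calculator is in thousandths. The sketch below shows why that does not distort the ratio, under the assumption that the target from the HPA spec is converted with MilliValue() as well before it is compared. toMilli and usageRatio are hypothetical stand-ins written for this post. toMilli imitates resource.Quantity.MilliValue, which returns ceil(q * 1000).

```go
package main

import (
	"fmt"
	"math"
)

// toMilli is a hypothetical stand-in for resource.Quantity.MilliValue:
// it scales a value by 1000 and rounds up.
func toMilli(v float64) int64 {
	return int64(math.Ceil(v * 1000))
}

// usageRatio divides two milli-values, so the factor of 1000 cancels out.
func usageRatio(currentMilli, targetMilli int64) float64 {
	return float64(currentMilli) / float64(targetMilli)
}

func main() {
	target := toMilli(100)  // targetValue: 100 from the Object example
	current := toMilli(150) // assumed sample value returned by the metrics API

	fmt.Println(target, current, usageRatio(current, target))
}
```

Working in milli-units keeps fractional metric values, such as 0.5 requests per second, representable as int64 without losing them to integer truncation.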
For this post, the objectRef above is simply {Kind:Service,Name:wordpress,APIVersion:,}, which is what we defined under metrics in the yaml file.
The call above goes through vendor/k8s.io/metrics/pkg/client/custom_metrics/client.go:
func (m *rootScopedMetrics) GetForObject(groupKind schema.GroupKind, name string, metricName string) (*v1alpha1.MetricValue, error) {
	// handle namespace separately
	if groupKind.Kind == "Namespace" && groupKind.Group == "" {
		return m.getForNamespace(name, metricName)
	}
	resourceName, err := m.client.qualResourceForKind(groupKind)
	if err != nil {
		return nil, err
	}
	res := &v1alpha1.MetricValueList{}
	err = m.client.client.Get().
		Resource(resourceName).
		Name(name).
		SubResource(metricName).
		Do().
		Into(res)
	if err != nil {
		return nil, err
	}
	if len(res.Items) != 1 {
		return nil, fmt.Errorf("the custom metrics API server returned %v results when we asked for exactly one", len(res.Items))
	}
	return &res.Items[0], nil
}
The client sends an HTTPS request to fetch the metric. For an object, the request looks like this:
https://localhost:6443/apis/custom-metrics.metrics.k8s.io/v1alpha1/namespaces/default/services/wordpress/requests-per-second
For pods, the request is:
https://localhost:6443/apis/custom-metrics.metrics.k8s.io/v1alpha1/namespaces/default/pods/%2A/memory_usage_bytes?labelSelector=app%3Dwordpress%2Ctier%3Dfrontend
As for why the group is custom-metrics.metrics.k8s.io, there is nothing mysterious about it: it is hard-coded in vendor/k8s.io/metrics/pkg/apis/custom_metrics/v1alpha1/register.go:
// GroupName is the group name use in this package
const GroupName = "custom-metrics.metrics.k8s.io"
// SchemeGroupVersion is group version used to register these objects
var SchemeGroupVersion = schema.GroupVersion{Group: GroupName, Version: "v1alpha1"}
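To show how the group, the version, and the two request shapes above fit together, here is a rough sketch that rebuilds both request paths with the standard library only. metricPath is a hypothetical helper written for this post. It is not how the k8s REST client assembles URLs internally, and the host part (https://localhost:6443) is left out.

```go
package main

import (
	"fmt"
	"net/url"
)

// Group and version as registered in register.go.
const (
	groupName = "custom-metrics.metrics.k8s.io"
	version   = "v1alpha1"
)

// metricPath is a hypothetical helper that builds the request path for a
// namespaced custom metric. An empty labelSelector means no query string.
func metricPath(namespace, resource, name, metric, labelSelector string) string {
	p := "/apis/" + groupName + "/" + version +
		"/namespaces/" + url.PathEscape(namespace) +
		"/" + url.PathEscape(resource) +
		"/" + url.PathEscape(name) +
		"/" + url.PathEscape(metric)
	if labelSelector != "" {
		q := url.Values{}
		q.Set("labelSelector", labelSelector)
		p += "?" + q.Encode()
	}
	return p
}

func main() {
	// Object metric: a single named Service.
	fmt.Println(metricPath("default", "services", "wordpress", "requests-per-second", ""))

	// Pods metric: "*" as the name selects all pods matching the label selector.
	fmt.Println(metricPath("default", "pods", "*", "memory_usage_bytes", "app=wordpress,tier=frontend"))
}
```

The "*" name is percent-encoded as %2A and the selector's "=" and "," become %3D and %2C, which matches the pod request shown earlier.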
That covers the k8s side. Next up is the adapter.