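The title of this post makes it sound like one of my papers: "The Design and Implementation of xxxx". The idea itself is simple. The kubernetes HPA uses heapster; heapster is built by the k8s folks themselves, and k8s likes its own stuff, so the HPA uses heapster by default. In the industry, though, there is a monitoring solution better than heapster: prometheus. If I were writing a paper I would introduce k8s and prometheus separately here, but I really don't have the spare time, so I'll skip that; I have also done source-code analyses of both in earlier posts.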

[Figure: architecture of the k8s prometheus adapter]


The figure above shows the overall architecture.

Now let's look at how the adapter is implemented. This builds on the previous post about API aggregation: the adapter registers itself with the apiserver through API aggregation.

First, an HPA example:

kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2alpha1
metadata:
  name: wordpress
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: wordpress
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Pods
    pods:
      metricName: memory_usage_bytes
      targetAverageValue: 100000

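The YAML above defines an HPA on memory usage: targetAverageValue sets the threshold, minReplicas and maxReplicas bound the replica count, and scaleTargetRef names the deployment being managed. As an aside, the HPA supports three kinds of metrics. Object: a metric describing some k8s object, for example an ingress's hits-per-second. Pods: a metric averaged across pods, for example transactions-processed-per-second, the number of transactions each pod handles per second. Resource: pod resource usage, for example CPU or memory. Here is an Object example, the HTTP request count of a Service: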

  - type: Object
    object:
      target:
        kind: Service
        name: sample-metrics-app
      metricName: http_requests
      targetValue: 100

Start with the HPA code in pkg/controller/podautoscaler/horizontal.go. Its computeReplicasForMetrics method fetches the metrics and computes the replica count, calling a different method for each metric source type:

case autoscalingv2.ObjectMetricSourceType:
GetObjectMetricReplicas
...

case autoscalingv2.PodsMetricSourceType:
GetMetricReplicas

case autoscalingv2.ResourceMetricSourceType:
GetRawResourceReplicas

First, the GetObjectMetricReplicas method:
pkg/controller/podautoscaler/replica_calculator.go

func (c *ReplicaCalculator) GetObjectMetricReplicas(currentReplicas int32, targetUtilization int64, metricName string, namespace string, objectRef *autoscaling.CrossVersionObjectReference) (replicaCount int32, utilization int64, timestamp time.Time, err error) {
    utilization, timestamp, err = c.metricsClient.GetObjectMetric(metricName, namespace, objectRef)
    if err != nil {
        return 0, 0, time.Time{}, fmt.Errorf("unable to get metric %s: %v on %s %s/%s", metricName, objectRef.Kind, namespace, objectRef.Name, err)
    }

    usageRatio := float64(utilization) / float64(targetUtilization)
    if math.Abs(1.0-usageRatio) <= c.tolerance {
        // return the current replicas if the change would be too small
        return currentReplicas, utilization, timestamp, nil
    }

    return int32(math.Ceil(usageRatio * float64(currentReplicas))), utilization, timestamp, nil
}
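
To make the arithmetic above concrete, here is a minimal standalone sketch of the same calculation. The helper name desiredReplicas and the tolerance of 0.1 are assumptions for illustration only; the real method reads the tolerance from the ReplicaCalculator and fetches the current value from the metrics client.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas repeats the arithmetic of GetObjectMetricReplicas without
// the metrics client. It is a hypothetical helper, not part of Kubernetes.
func desiredReplicas(currentReplicas int32, current, target int64, tolerance float64) int32 {
	usageRatio := float64(current) / float64(target)
	if math.Abs(1.0-usageRatio) <= tolerance {
		// the change would be too small, keep the current replica count
		return currentReplicas
	}
	return int32(math.Ceil(usageRatio * float64(currentReplicas)))
}

func main() {
	// tolerance 0.1 is an assumed value for this example
	fmt.Println(desiredReplicas(2, 150, 100, 0.1)) // ratio 1.5, ceil(3.0)
	fmt.Println(desiredReplicas(2, 105, 100, 0.1)) // ratio 1.05, inside tolerance
	fmt.Println(desiredReplicas(4, 50, 100, 0.1))  // ratio 0.5, ceil(2.0)
}
```

With a target of 100, a measured 150 scales 2 replicas up to 3, a measured 105 leaves them at 2, and a measured 50 scales 4 replicas down to 2.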

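GetObjectMetric is an interface method with two implementations, the heapster one and the custom-metrics one shown in the figure above. The heapster implementation calls the heapster API to fetch metrics; this post focuses on custom metrics. Passing --horizontal-pod-autoscaler-use-rest-clients when starting controller-manager enables custom metrics.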
pkg/controller/podautoscaler/metrics/rest_metrics_client.go

func (c *customMetricsClient) GetObjectMetric(metricName string, namespace string, objectRef *autoscaling.CrossVersionObjectReference) (int64, time.Time, error) {
    gvk := schema.FromAPIVersionAndKind(objectRef.APIVersion, objectRef.Kind)
    var metricValue *customapi.MetricValue
    var err error
    if gvk.Kind == "Namespace" && gvk.Group == "" {
        // handle namespace separately
        // NB: we ignore namespace name here, since CrossVersionObjectReference isn't
        // supposed to allow you to escape your namespace
        metricValue, err = c.client.RootScopedMetrics().GetForObject(gvk.GroupKind(), namespace, metricName)
    } else {
        metricValue, err = c.client.NamespacedMetrics(namespace).GetForObject(gvk.GroupKind(), objectRef.Name, metricName)
    }

    if err != nil {
        return 0, time.Time{}, fmt.Errorf("unable to fetch metrics from API: %v", err)
    }

    return metricValue.Value.MilliValue(), metricValue.Timestamp.Time, nil
}
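
The "one interface, two backends" arrangement can be sketched roughly as follows. Everything here is hypothetical and cut down for illustration: metricsClient, legacyStub, restStub and newMetricsClient are made-up names, the method signature is simplified, and the stubs return fixed values instead of calling heapster or the custom metrics API.

```go
package main

import "fmt"

// metricsClient is a hypothetical, reduced version of the interface the
// replica calculator depends on; the real one has more methods and arguments.
type metricsClient interface {
	GetObjectMetric(metricName, namespace, objectName string) (int64, error)
}

// legacyStub stands in for the heapster-backed implementation.
type legacyStub struct{}

func (legacyStub) GetObjectMetric(metricName, namespace, objectName string) (int64, error) {
	return 1000, nil // fixed value, no network call
}

// restStub stands in for the custom-metrics (REST) implementation.
type restStub struct{}

func (restStub) GetObjectMetric(metricName, namespace, objectName string) (int64, error) {
	return 2000, nil // fixed value, no network call
}

// newMetricsClient picks a backend from a boolean, the way a command-line
// switch would. This selection logic is an assumption, not the real wiring.
func newMetricsClient(useRESTClients bool) metricsClient {
	if useRESTClients {
		return restStub{}
	}
	return legacyStub{}
}

func main() {
	for _, useREST := range []bool{false, true} {
		c := newMetricsClient(useREST)
		v, err := c.GetObjectMetric("http_requests", "default", "sample-metrics-app")
		fmt.Println(useREST, v, err)
	}
}
```

The caller only ever sees the interface, which is why the replica calculator does not care which backend is behind it.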

For this post, the objectRef above is simply
{Kind:Service,Name:wordpress,APIVersion:,}, which is what we defined under metrics in the YAML file.

The call above goes through vendor/k8s.io/metrics/pkg/client/custom_metrics/client.go:

func (m *rootScopedMetrics) GetForObject(groupKind schema.GroupKind, name string, metricName string) (*v1alpha1.MetricValue, error) {
    // handle namespace separately
    if groupKind.Kind == "Namespace" && groupKind.Group == "" {
        return m.getForNamespace(name, metricName)
    }

    resourceName, err := m.client.qualResourceForKind(groupKind)
    if err != nil {
        return nil, err
    }

    res := &v1alpha1.MetricValueList{}
    err = m.client.client.Get().
        Resource(resourceName).
        Name(name).
        SubResource(metricName).
        Do().
        Into(res)

    if err != nil {
        return nil, err
    }

    if len(res.Items) != 1 {
        return nil, fmt.Errorf("the custom metrics API server returned %v results when we asked for exactly one", len(res.Items))
    }

    return &res.Items[0], nil
}

The client sends an HTTPS request to fetch the metric. For an Object metric, the request looks like this:

https://localhost:6443/apis/custom-metrics.metrics.k8s.io/v1alpha1/namespaces/default/services/wordpress/requests-per-second

For a Pods metric, the request is:

https://localhost:6443/apis/custom-metrics.metrics.k8s.io/v1alpha1/namespaces/default/pods/%2A/memory_usage_bytes?labelSelector=app%3Dwordpress%2Ctier%3Dfrontend
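
As a rough illustration of how the two URLs above are put together, the sketch below rebuilds them by hand with net/url. The metricURL helper and its parameters are hypothetical; the real client assembles the request through the Resource/Name/SubResource chain shown earlier rather than by string concatenation.

```go
package main

import (
	"fmt"
	"net/url"
)

const (
	groupName = "custom-metrics.metrics.k8s.io"
	version   = "v1alpha1"
)

// metricURL is a hypothetical helper that assembles a custom-metrics request
// URL. resource is the plural resource name ("services", "pods"); name may be
// "*" to address all objects matched by labelSelector.
func metricURL(host, namespace, resource, name, metric, labelSelector string) string {
	u := "https://" + host +
		"/apis/" + groupName + "/" + version +
		"/namespaces/" + url.PathEscape(namespace) +
		"/" + resource +
		"/" + url.PathEscape(name) + // "*" is escaped as %2A
		"/" + url.PathEscape(metric)
	if labelSelector != "" {
		q := url.Values{}
		q.Set("labelSelector", labelSelector)
		u += "?" + q.Encode() // "=" and "," are percent-encoded
	}
	return u
}

func main() {
	// Object metric on a Service
	fmt.Println(metricURL("localhost:6443", "default", "services", "wordpress", "requests-per-second", ""))
	// Pods metric across every pod matching the selector
	fmt.Println(metricURL("localhost:6443", "default", "pods", "*", "memory_usage_bytes", "app=wordpress,tier=frontend"))
}
```

The wildcard name plus a label selector is what lets a single request cover all pods behind the deployment.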

As for why the group is custom-metrics.metrics.k8s.io: there is nothing special about it, it is hard-coded in vendor/k8s.io/metrics/pkg/apis/custom_metrics/v1alpha1/register.go:

// GroupName is the group name use in this package
const GroupName = "custom-metrics.metrics.k8s.io"

// SchemeGroupVersion is group version used to register these objects
var SchemeGroupVersion = schema.GroupVersion{Group: GroupName, Version: "v1alpha1"}

That covers the k8s side. Next comes the adapter.