diff --git a/docs/DESIGN.md b/docs/DESIGN.md index 156efea7f78c..7a94e9fbb2ce 100644 --- a/docs/DESIGN.md +++ b/docs/DESIGN.md @@ -217,13 +217,13 @@ The Horizontal Pod Autoscaler (HPA) is a metrics driven pod autoscaling solution Unified autoscaling is a powerful concept, as it means that the same implementation can be shared for all autoscaled resources within a cluster. We want to avoid forcing premature alignment, but as long as it doesn’t compromise the design, there is value in keeping these interfaces as similar as possible. Customers need only learn a single architecture for autoscaling, reducing complexity and cognitive load. -There are a couple drawbacks to using the HPA’s API directly. The most obvious is the name, which would be more aptly called HorizontalAutoscaler. Most of its abstractions extend cleanly to Node Groups (e.g. [ScaleTargetRef](https://godoc.org/k8s.io/api/autoscaling/v2beta2#HorizontalPodAutoscalerSpec), [MetricTarget](https://godoc.org/k8s.io/api/autoscaling/v2beta2#MetricTarget), [ScalingPolicy](https://godoc.org/k8s.io/api/autoscaling/v2beta2#HPAScalingPolicy), [MinReplicas](https://godoc.org/k8s.io/api/autoscaling/v2beta2#HorizontalPodAutoscalerSpec), [MaxReplicas](https://godoc.org/k8s.io/api/autoscaling/v2beta2#HorizontalPodAutoscalerSpec), [Behavior](https://godoc.org/k8s.io/api/autoscaling/v2beta2#HorizontalPodAutoscalerBehavior), StabilizationWindowSeconds (https://godoc.org/k8s.io/api/autoscaling/v2beta2#HPAScalingRules)). Others require slight adjustments (e.g. [ScalingPolicyType](https://godoc.org/k8s.io/api/autoscaling/v2beta2#HPAScalingPolicyType) needs to be tweaked to refer to “replicas” instead of “pods”). However, [MetricSpec](https://godoc.org/k8s.io/api/autoscaling/v2beta2#MetricSpec) is specific to pods and requires changes if relied upon. MetricsSpec has four subfields corresponding to different metrics sources. [ResourceMetricSource](https://godoc.org/k8s.io/api/autoscaling/v2beta2#ResourceMetricSource), which uses the [Resource Metrics API](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/resource-metrics-api.md) and provides CPU and memory for pods and nodes. [PodsMetricSource](https://godoc.org/k8s.io/api/autoscaling/v2beta2#PodsMetricSource), which is syntactic sugar for [ObjectMetricSource](https://godoc.org/k8s.io/api/autoscaling/v2beta2#ObjectMetricSource), each of which each retrieve metrics from the [Custom Metrics API](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/custom-metrics-api.md). [ExternalMetricSource](https://godoc.org/k8s.io/api/autoscaling/v2beta2#ExternalMetricSource), which uses the [External Metrics API](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/external-metrics-api.md) to map metric name and namespace to an external object like an AWS SQS Queue. +There are a couple drawbacks to using the HPA’s API directly. The most obvious is the name, which would be more aptly called HorizontalAutoscaler. Most of its abstractions extend cleanly to Node Groups (e.g. [ScaleTargetRef](https://godoc.org/k8s.io/api/autoscaling/v2beta2#HorizontalPodAutoscalerSpec), [MetricTarget](https://godoc.org/k8s.io/api/autoscaling/v2beta2#MetricTarget), [ScalingPolicy](https://godoc.org/k8s.io/api/autoscaling/v2beta2#HPAScalingPolicy), [MinReplicas](https://godoc.org/k8s.io/api/autoscaling/v2beta2#HorizontalPodAutoscalerSpec), [MaxReplicas](https://godoc.org/k8s.io/api/autoscaling/v2beta2#HorizontalPodAutoscalerSpec), [Behavior](https://godoc.org/k8s.io/api/autoscaling/v2beta2#HorizontalPodAutoscalerBehavior), [StabilizationWindowSeconds](https://godoc.org/k8s.io/api/autoscaling/v2beta2#HPAScalingRules)). Others require slight adjustments (e.g. [ScalingPolicyType](https://godoc.org/k8s.io/api/autoscaling/v2beta2#HPAScalingPolicyType) needs to be tweaked to refer to “replicas” instead of “pods”). However, [MetricSpec](https://godoc.org/k8s.io/api/autoscaling/v2beta2#MetricSpec) is specific to pods and requires changes if relied upon. MetricsSpec has four subfields corresponding to different metrics sources. [ResourceMetricSource](https://godoc.org/k8s.io/api/autoscaling/v2beta2#ResourceMetricSource), which uses the [Resource Metrics API](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/resource-metrics-api.md) and provides CPU and memory for pods and nodes. [PodsMetricSource](https://godoc.org/k8s.io/api/autoscaling/v2beta2#PodsMetricSource), which is syntactic sugar for [ObjectMetricSource](https://godoc.org/k8s.io/api/autoscaling/v2beta2#ObjectMetricSource), each of which each retrieve metrics from the [Custom Metrics API](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/custom-metrics-api.md). [ExternalMetricSource](https://godoc.org/k8s.io/api/autoscaling/v2beta2#ExternalMetricSource), which uses the [External Metrics API](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/external-metrics-api.md) to map metric name and namespace to an external object like an AWS SQS Queue. One approach would be to use the MetricsSpec and its four sources as-is. This requires sourcing all metrics from the Kubernetes metrics APIs (see limitations above). It’s also somewhat awkward, as users would likely never use the PodsMetricSpec or ResourceMetricsSpec to scale their node groups. The primary reason to go this route is alignment with the HorizontalPodAutoscaler and existing Kubernetes metrics APIs. The current Kubernetes metrics architecture is arguably too pod specific and could be changed to be more generic, but we consider engagement with SIG Instrumentation to be out of scope for the short term. Another option would be use ObjectMetricsSpec and ExternalMetricsSpec and omit pod-specific metrics APIs. This generically covers metrics for both in-cluster and external objects (i.e. custom.metrics.k8s.io and external.metrics.k8s.io). This approach is cleaner from the perspective of a node autoscaler, but makes future alignment with the HPA more challenging. Pod metrics could still specified, but this removes the syntactic sugar that simplifies the most common use cases for pod autoscaling. -If we choose to integrate with directly with Prometheus metrics (discussed above), there will need to be a new option in the MetricsSpec to specify it as a metrics source (e.g PrometheusMetricSource). Customers would specify a [promql query](https://prometheus.io/docs/prometheus/latest/querying/basics/) to retrieve the metric. The decision to create a PrometheusMetricSource is orthogonal from whether or not we keep existing HPA metrics sources. Either way requires changes to the MetricsSpec; Prometheus support can be built alongside or replace existing metrics sources. +If we choose to integrate directly with Prometheus metrics (discussed above), there will need to be a new option in the MetricsSpec to specify it as a metrics source (e.g PrometheusMetricSource). Customers would specify a [promql query](https://prometheus.io/docs/prometheus/latest/querying/basics/) to retrieve the metric. The decision to create a PrometheusMetricSource is orthogonal from whether or not we keep existing HPA metrics sources. Either way requires changes to the MetricsSpec; Prometheus support can be built alongside or replace existing metrics sources. We could also completely diverge from the HPA and start with a minimal autoscaler definition that covers initial node autoscaling use cases. This avoids premature abstraction of a generic autoscaling definition. However, we’re cautious to start from scratch, as it presumes we can design autoscaling APIs better than the HPA. It also makes alignment more challenging in the future.