diff --git a/vertical-pod-autoscaler/enhancements/3919-customized-recommender-vpa/README.md b/vertical-pod-autoscaler/enhancements/3919-customized-recommender-vpa/README.md
new file mode 100644
index 000000000000..104cba546e30
--- /dev/null
+++ b/vertical-pod-autoscaler/enhancements/3919-customized-recommender-vpa/README.md
@@ -0,0 +1,298 @@
# Support Customized Recommenders for Vertical Pod Autoscalers

- [Summary](#summary)
- [Motivation](#motivation)
  - [Goals](#goals)
  - [Non-Goals](#non-goals)
- [Proposal](#proposal)
  - [User Stories](#user-stories)
    - [Story 1](#story-1)
    - [Story 2](#story-2)
  - [Implementation Details](#implementation-details)
  - [Deployment Details](#deployment-details)
- [Design Details](#design-details)
  - [Test Plan](#test-plan)
  - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
- [Alternatives](#alternatives)
- [Out of Scope](#out-of-scope)
- [Implementation History](#implementation-history)

## Summary

Today, the VPA recommends CPU/memory requests with a single recommender, which predicts
future requests from the historical usage observed in a rolling time window. Since no single
recommendation policy fits all types of workloads, this KEP proposes supporting multiple
customized recommenders in VPA. Users can then run different recommenders for different
workloads, which may exhibit very distinct resource usage behaviors.

## Motivation

A VPA recommends the requested resources of containers in pods when the actual CPU/memory usage
of a container differs significantly from the resources requested. Usage-based recommendation
is the basic approach, implemented in the default VPA recommender: it resizes containers according
to the actual usage observed, and users can configure the time window and the percentile of past
usage that is used as the prediction of future CPU/memory requests and limits.

However, containers running different types of workloads can have very different resource usage
patterns, and no single policy fits them all. The existing VPA recommender may not accurately
predict future resource usage when containers exhibit behaviors such as trends, periodic changes,
or occasional spikes, which leads to significant over-provisioning and OOM kills for microservices.
Learning these behaviors and applying different algorithms to predict CPU and memory utilization
can significantly reduce over-provisioning and OOM kills in VPA.

### Goals

- Allow the VPA object to specify a customized recommender to use.
- Allow the VPA object to use the default recommender when no recommender is specified.

### Non-Goals

- We assume no pod uses two recommenders at the same time.
- We do not resolve conflicts between recommenders.

## Proposal

### User Stories

#### Story 1

- Containers with Cyclic Patterns in Resource Usage

Containers used in monitoring may receive load to process only periodically but need to be
long-running to listen for incoming traffic. These containers usually exhibit cyclic patterns,
alternating between usage spikes and idling. Resizing them according to the usage observed in the
previous time window can lead to under-provisioning for a short period whenever a load spike arrives.
For memory, the problem occurs when the cyclic pattern is longer than 8 days; for CPU, it may be
visible, for example, as lower usage over the weekend. The problem can even lead to frequent pod
evictions when a pod resized down during an idle period finds that the freed host resources have
been taken by other pods.

#### Story 2

- Containers with Different but Recurrent Behaviors in Resource Usage

Containers running Spark or deep learning training workloads are known to show recurring,
repeating patterns in resource usage. Prior research has shown that different but recurrent
behaviors occur for different containerized tasks. These common patterns can be represented by
phases, which display similar usage of computational resources over time. Because similar sequences
of phases recur across executions of a workload, they can be used to proactively predict future
resource usage more accurately. The default recommender in the current VPA takes a reactive
approach, so a more proactive recommender is needed for these types of workloads.

### Implementation Details

The following describes a first-class approach to supporting customized recommenders: a dedicated
field `recommenderName` is added to the VPA CRD definition in `deploy/vpa-v1.crd.yaml`.

```yaml
validation:
  # openAPIV3Schema is the schema for validating custom objects.
  openAPIV3Schema:
    type: object
    properties:
      spec:
        type: object
        required: []
        properties:
          recommenderName:
            type: array
            items:
              type: string
          targetRef:
            type: object
          updatePolicy:
            type: object
```

Correspondingly, the `VerticalPodAutoscalerSpec` in `pkg/apis/autoscaling.k8s.io/v1/types.go`
should be updated to include the `recommenderName` field.

```golang
// VerticalPodAutoscalerSpec is the specification of the behavior of the autoscaler.
type VerticalPodAutoscalerSpec struct {
    // TargetRef points to the controller managing the set of pods for the
    // autoscaler to control - e.g. Deployment, StatefulSet. VerticalPodAutoscaler
    // can be targeted at controller implementing scale subresource (the pod set is
    // retrieved from the controller's ScaleStatus) or some well known controllers
    // (e.g. for DaemonSet the pod set is read from the controller's spec).
    // If VerticalPodAutoscaler cannot use specified target it will report
    // ConfigUnsupported condition.
    // Note that VerticalPodAutoscaler does not require full implementation
    // of scale subresource - it will not use it to modify the replica count.
    // The only thing retrieved is a label selector matching pods grouped by
    // the target resource.
    TargetRef *autoscaling.CrossVersionObjectReference `json:"targetRef" protobuf:"bytes,1,name=targetRef"`

    // Describes the rules on how changes are applied to the pods.
    // If not specified, all fields in the `PodUpdatePolicy` are set to their
    // default values.
    // +optional
    UpdatePolicy *PodUpdatePolicy `json:"updatePolicy,omitempty" protobuf:"bytes,2,opt,name=updatePolicy"`

    // Controls how the autoscaler computes recommended resources.
    // The resource policy may be used to set constraints on the recommendations
    // for individual containers. If not specified, the autoscaler computes recommended
    // resources for all containers in the pod, without additional constraints.
    // +optional
    ResourcePolicy *PodResourcePolicy `json:"resourcePolicy,omitempty" protobuf:"bytes,3,opt,name=resourcePolicy"`

    // Name of the recommender responsible for generating recommendations for this object.
    // The array may contain at most one element; if empty, the default recommender is used.
    // +optional
    RecommenderName []string `json:"recommenderName,omitempty" protobuf:"bytes,4,rep,name=recommenderName"`
}
```

When creating a recommender object, the recommender main routine should initialize itself with a
predefined recommender name, which can be defined as a constant in `pkg/recommender/main.go`:

```golang
const RecommenderName = "default"

recommender := routines.NewRecommender(config, *checkpointsGCInterval, useCheckpoints, RecommenderName, *vpaObjectNamespace)
```

`routines.NewRecommender` then passes the `RecommenderName` to the `ClusterState` object.

```golang
// NewRecommender creates a new recommender instance.
// Dependencies are created automatically.
// Deprecated; use RecommenderFactory instead.
func NewRecommender(config *rest.Config, checkpointsGCInterval time.Duration, useCheckpoints bool, recommenderName string, namespace string) Recommender {
    clusterState := model.NewClusterState(recommenderName)
    return RecommenderFactory{
        ClusterState:           clusterState,
        ClusterStateFeeder:     input.NewClusterStateFeeder(config, clusterState, *memorySaver, namespace),
        CheckpointWriter:       checkpoint.NewCheckpointWriter(clusterState, vpa_clientset.NewForConfigOrDie(config).AutoscalingV1()),
        VpaClient:              vpa_clientset.NewForConfigOrDie(config).AutoscalingV1(),
        PodResourceRecommender: logic.CreatePodResourceRecommender(),
        CheckpointsGCInterval:  checkpointsGCInterval,
        UseCheckpoints:         useCheckpoints,
    }.Make()
}

// NewClusterState returns a new ClusterState with no pods.
func NewClusterState(recommenderName string) *ClusterState {
    return &ClusterState{
        RecommenderName:   recommenderName,
        Pods:              make(map[PodID]*PodState),
        Vpas:              make(map[VpaID]*Vpa),
        EmptyVPAs:         make(map[VpaID]time.Time),
        aggregateStateMap: make(aggregateContainerStatesMap),
        labelSetMap:       make(labelSetMap),
    }
}
```
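
A customized recommender can reuse the same entry point with a different name. The sketch below is a
minimal, hypothetical `main.go` for a custom recommender called `performance-aware`; the binary, the
recommender name, and the flags are illustrative assumptions rather than part of this proposal. It
relies on the modified `routines.NewRecommender` signature shown above and an in-cluster kubeconfig,
and it omits the history initialization and checkpoint maintenance that the default recommender's
main loop also performs.

```golang
package main

import (
    "flag"
    "time"

    "k8s.io/client-go/rest"
    "k8s.io/klog/v2"

    "k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/routines"
)

var (
    // Hypothetical flags: they let one binary be deployed under different recommender names.
    recommenderName       = flag.String("recommender-name", "performance-aware", "Name this recommender registers under; VPA objects select it via spec.recommenderName.")
    vpaObjectNamespace    = flag.String("vpa-object-namespace", "", "Namespace to search for VPA objects; empty means all namespaces.")
    checkpointsGCInterval = flag.Duration("checkpoints-gc-interval", 10*time.Minute, "How often orphaned checkpoints are garbage collected.")
    recommenderInterval   = flag.Duration("recommender-interval", 1*time.Minute, "How often recommendations are recomputed.")
)

func main() {
    flag.Parse()

    // Assumes the recommender runs inside the cluster.
    config, err := rest.InClusterConfig()
    if err != nil {
        klog.Fatalf("Cannot build kubeconfig: %v", err)
    }

    // Assumes the modified NewRecommender signature proposed in this KEP,
    // which forwards the recommender name into the ClusterState.
    recommender := routines.NewRecommender(config, *checkpointsGCInterval, true /* useCheckpoints */, *recommenderName, *vpaObjectNamespace)

    // Simplified main loop: recompute recommendations periodically. The default
    // recommender additionally initializes history and maintains checkpoints.
    for range time.Tick(*recommenderInterval) {
        recommender.RunOnce()
    }
}
```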
Therefore, when loading VPA objects into the `clusterStateFeeder`, the feeder filters the listed VPA
CRDs, keeping only those whose `recommenderName` matches the current `ClusterState`'s
`RecommenderName` (or that leave the field unset).

```golang
// Fetch VPA objects and load them into the cluster state.
func (feeder *clusterStateFeeder) LoadVPAs() {
    // List VPA API objects.
    allVpaCRDs, err := feeder.vpaLister.List(labels.Everything())
    if err != nil {
        klog.Errorf("Cannot list VPAs. Reason: %+v", err)
        return
    }

    // Keep only the VPAs that select this recommender, or that select none.
    var vpaCRDs []*vpa_types.VerticalPodAutoscaler
    for _, vpaCRD := range allVpaCRDs {
        currentRecommenderName := feeder.clusterState.RecommenderName
        // At most one recommender name may be specified (enforced by the admission controller).
        selectedRecommender := ""
        if len(vpaCRD.Spec.RecommenderName) > 0 {
            selectedRecommender = vpaCRD.Spec.RecommenderName[0]
        }
        if selectedRecommender != "" && selectedRecommender != currentRecommenderName {
            klog.V(6).Infof("Ignoring vpaCRD %s as its recommender name %s is not equal to the current recommender's name %s",
                vpaCRD.Name, selectedRecommender, currentRecommenderName)
            continue
        }
        vpaCRDs = append(vpaCRDs, vpaCRD)
    }

    klog.V(3).Infof("Fetched %d VPAs.", len(vpaCRDs))

    // Add or update existing VPAs in the model.
    vpaKeys := make(map[model.VpaID]bool)

    …

    feeder.clusterState.ObservedVpas = vpaCRDs
}
```
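
The selection rule applied above can be summarized as a small predicate. The snippet below is an
illustrative, self-contained sketch; the helper name `vpaMatchesRecommender` is hypothetical and not
part of the proposed patch. It restates the filter under the array-valued `recommenderName`
definition used in this proposal.

```golang
package main

import "fmt"

// vpaMatchesRecommender mirrors the filter in LoadVPAs: a recommender keeps a
// VPA if the VPA names that recommender, or if it names no recommender at all.
// Per the Goals section, a VPA with no recommender name is expected to be
// served by the default recommender.
func vpaMatchesRecommender(vpaRecommenderNames []string, recommenderName string) bool {
    if len(vpaRecommenderNames) == 0 {
        return true
    }
    // This proposal allows at most one element in the array.
    return vpaRecommenderNames[0] == recommenderName
}

func main() {
    fmt.Println(vpaMatchesRecommender(nil, "default"))                                     // true: no recommender named
    fmt.Println(vpaMatchesRecommender([]string{"performance-aware"}, "default"))           // false: the default recommender ignores it
    fmt.Println(vpaMatchesRecommender([]string{"performance-aware"}, "performance-aware")) // true: the custom recommender handles it
}
```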
Accordingly, the example VPA object YAML includes `recommenderName` set to the default recommender's
name:

```yaml
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: hamster-vpa
spec:
  recommenderName:
    - default
  targetRef:
    apiVersion: "apps/v1"
    ...
```

### Deployment Details

A customized recommender is deployed as a separate deployment and is selected by a different set of
VPA objects. Each VPA object is supposed to choose only one recommender at a time. How the default
recommender and customized recommenders run and interact with VPA objects is shown in the following
drawing.

![deployment](images/deployment.png)
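
For illustration, a VPA object that opts into a customized recommender deployed this way might look
like the manifest below; the recommender name `performance-aware` and the target workload are
hypothetical examples, not part of this proposal.

```yaml
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: trainer-vpa
spec:
  # Hypothetical customized recommender; a recommender initialized with this
  # name must be running as its own deployment in the cluster.
  recommenderName:
    - performance-aware
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: trainer
```

VPA objects that omit `recommenderName` continue to be handled by the default recommender, so
existing manifests do not need to change.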
Though this proposal does not allow a VPA object to use multiple recommenders, it leaves room for
supporting them in the future. Namely, we define `recommenderName` as an array instead of a plain
string, but support only one element in this proposal. We modify the admission controller to
validate that the array has at most one element.

We will add the following check in the `func validateVPA(vpa *vpa_types.VerticalPodAutoscaler, isCreate bool)` function.

```golang
    if len(vpa.Spec.RecommenderName) > 1 {
        return fmt.Errorf("VPA object shouldn't specify more than one recommender name")
    }
```

## Design Details

### Test Plan

- Add an e2e test demonstrating that the default recommender ignores a VPA which specifies an
  alternate recommender.

### Upgrade / Downgrade Strategy

For cluster upgrades, a VPA from the previous version will continue working as before. There is no
change in behavior, and no flags have to be enabled or disabled.

## Alternatives

### Develop a plugin framework for customizable recommenders

Add a webhook system for customized recommendations. The default VPA recommender would call any
available recommendation webhooks and, if any of them makes a recommendation, use that
recommendation instead of computing its own; if none does, it would make its own recommendation as
it currently does. The plugin alternative is rejected because it requires far more design and code
changes. It might be reconsidered in the future if more use cases require running multiple
recommenders for the same VPA object.

### Develop a label selector approach

Add a label on the VPA CRD object to denote the recommender's name. When making recommendations,
only the VPA CRDs labeled `recommender=default` would be loaded and updated by the existing
recommender. The label selector approach is rejected because labels are too loosely controlled:
users can easily omit or mistype them and misconfigure their VPA objects.

## Out of Scope

- Although this proposal enables alternate recommenders, no alternate recommender will be created as
  part of this proposal.
- This proposal does not support running multiple recommenders for the same VPA object. Each VPA
  object is supposed to use only one recommender.

## Implementation History

diff --git a/vertical-pod-autoscaler/enhancements/3919-customized-recommender-vpa/images/deployment.png b/vertical-pod-autoscaler/enhancements/3919-customized-recommender-vpa/images/deployment.png
new file mode 100644
index 000000000000..94d79ce96add
Binary files /dev/null and b/vertical-pod-autoscaler/enhancements/3919-customized-recommender-vpa/images/deployment.png differ