round 2 #2

Closed · wants to merge 1 commit
157 changes: 108 additions & 49 deletions docs/proposals/autoscaling.md
@@ -30,6 +30,7 @@ written later that sets thresholds automatically based on past behavior (CPU use
* The auto-scaler must be aware of user defined actions so it does not override them unintentionally (for instance someone
explicitly setting the replica count to 0 should mean that the auto-scaler does not try to scale the application up)
* It should be possible to write and deploy a custom auto-scaler without modifying existing auto-scalers
* Auto-scalers must be able to monitor multiple replication controllers while only targeting a single scalable object

## Use Cases

@@ -77,39 +78,12 @@ use a client/cache implementation to receive watch data from the data aggregator
scaling the application. Auto-scalers are created and defined like other resources via REST endpoints and belong to the
namespace just as a `ReplicationController` or `Service`.

There are two options for implementing the auto-scaler:

1. Annotations on a `ReplicationController`

Pros:

* uses an existing resource, not another component that must be defined separately
* easy to know what the target of the auto-scaler is since the config for the scaler is attached to the target

Cons:

* configuration in annotations is marginally more difficult than plain old json
* rather than watching explicitly for new auto-scaler definitions, the auto-scaler controller must watch all
`ReplicationController`s and create auto-scalers when appropriate. As new auto-scalable resources are defined the
auto-scaler controller must also watch those resources

1. As a new resource

Pros:

* auto-scalers are managed by the user independent of the `ReplicationController`
* flexible by using a selector to the scalable resource (that implements the `resize` verb), future implementations
*may* require no extra work on the auto-scaler side

Cons:

* one more resource to store, manage, and monitor

Since an auto-scaler is a durable object it is best represented as a resource. For this proposal, the auto-scaler is a resource:

```go
//The auto scaler interface
type AutoScalerInterface interface {
//Adjust a resource's replica count. Calls resize endpoint. Args to this are based on what the endpoint
//ScaleApplication adjusts a resource's replica count. Calls resize endpoint. Args to this are based on what the endpoint
//can support. See https://github.com/GoogleCloudPlatform/kubernetes/issues/1629
ScaleApplication(num int) error
}
@@ -128,21 +102,28 @@ For this proposal, the auto-scaler is a resource:
}

type AutoScalerSpec struct {
//AutoScaleThresholds holds a collection of AutoScaleThresholds that drive the auto scaler
AutoScaleThresholds []AutoScaleThreshold

//Enabled turns auto scaling on or off
Enabled bool

//MaxAutoScaleCount defines the max replicas that the auto scaler can use. This value must be greater than
//0 and >= MinAutoScaleCount
MaxAutoScaleCount int

//MinAutoScaleCount defines the minimum number of replicas that the auto scaler can reduce to,
//0 means that the application is allowed to idle
MinAutoScaleCount int

//ScalableTarget provides the resizeable target. Right now this is a ReplicationController;
//in the future it could be a job or any resource that implements resize.
ScalableTarget ObjectReference

//Selector defines a set of capacity that the auto-scaler is monitoring (replication controllers). Generally, the auto-scaler is
//driven by the AutoScaleThresholds, however, this gives visibility of aggregate items like total number of pods
//backing a service without having to aggregate them into a statistic
Selector map[string]string
}

type AutoScalerStatus struct {
@@ -152,29 +133,29 @@ For this proposal, the auto-scaler is a resource:
}


//AutoScaleThresholdInterface abstracts the data analysis from the auto-scaler
//example: scale by 1 (Increment) when RequestsPerSecond (Type) pass comparison (Comparison) of 50 (Value) for 30 seconds (Duration)
type AutoScaleThresholdInterface interface {
//called by the auto-scaler to determine if this threshold is met or not
ShouldScale() bool
}

type StatisticType string

//AutoScaleThreshold is a single statistic used to drive the auto-scaler in scaling decisions
type AutoScaleThreshold struct {
//Increment determines how the auto-scaler should scale up or down (positive number to scale up based on this threshold,
//negative number to scale down by this threshold)
Increment int
//Selector is the statistics selector that determines how the Threshold receives data from aggregated (or single) statistics
Selector map[string]string
//Duration is the time lapse after which this threshold is considered passed
Duration time.Duration
//Value is the number at which, after the duration is passed, this threshold is considered to be triggered
Value float64
//Comparison component to be applied to the value
Comparison string
}
```

Review comment: should this be an enum?

Reply (author): Yes, there is an implication that we will have a defined lexicon for comparisons. It should be something like "gt", "eq", etc. Will define more.

#### Boundary Definitions
The `AutoScaleThreshold` definitions provide the boundaries for the auto-scaler. By defining comparisons that form a range
@@ -197,6 +178,34 @@ resolves to a value that can be checked against a configured threshold.
Of note: If the statistics gathering mechanisms can be initialized with a registry other components storing statistics can
potentially piggyback on this registry.

### Interactions with a deployment

In a deployment it is likely that multiple replication controllers must be monitored. For instance, in a [rolling deployment](https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/replication-controller.md#rolling-updates)
there will be multiple replication controllers, with one scaling up and another scaling down. This means that an
auto-scaler must be aware of the entire set of capacity that backs a service so it does not fight with the deployer. `AutoScalerSpec.Selector`
is what provides this ability. By using a selector that spans the entire service the auto-scaler can monitor capacity
of multiple replication controllers and check that capacity against the `AutoScalerSpec.MaxAutoScaleCount` and
`AutoScalerSpec.MinAutoScaleCount` while still only targeting a specific `ReplicationController`.

In the course of a rolling update it would be up to the deployment orchestration to re-target the auto-scaler.
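The aggregate bounds check described above can be sketched as follows, under the assumption that the auto-scaler simply sums replicas across every controller matched by `Selector` and clamps its own increment so the total stays within `MinAutoScaleCount`/`MaxAutoScaleCount`. All names besides those two concepts are invented for illustration.

```go
package main

import "fmt"

// rc is a minimal stand-in for a ReplicationController.
type rc struct {
	labels   map[string]string
	replicas int
}

// matches reports whether labels satisfy every key/value pair in selector.
func matches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if labels[k] != v {
			return false
		}
	}
	return true
}

// boundedIncrement clamps a proposed increment to the scaled target so the
// aggregate replica count across every controller matched by the selector
// stays within [min, max].
func boundedIncrement(controllers []rc, selector map[string]string, inc, min, max int) int {
	total := 0
	for _, c := range controllers {
		if matches(selector, c.labels) {
			total += c.replicas
		}
	}
	if total+inc > max {
		inc = max - total
	}
	if total+inc < min {
		inc = min - total
	}
	return inc
}

func main() {
	sel := map[string]string{"name": "myTargetedReplicationControllers"}
	controllers := []rc{
		{labels: sel, replicas: 4}, // old controller, scaling down
		{labels: sel, replicas: 2}, // new controller, scaling up
	}
	// Aggregate capacity is 6; with max=7 a requested +3 is clamped to +1,
	// even though the resize verb is only ever called on a single target.
	fmt.Println(boundedIncrement(controllers, sel, 3, 1, 7))
}
```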

Review comment:

This is going to be somewhat problematic - how do you know which auto-scalers target you? How do you transactionally guarantee that? It also means every component that creates and manipulates replication controllers also has to know about auto-scalers. That's not necessarily a blocking issue, but it does create complexity when it comes to meaning. It's possible that the deployment process itself is the thing that should be taking the target of the autoscaler (via the resize verb), rather than the underlying individual replication controllers. In the OpenShift case, we have a deployment target (the deployment config) which can support resize. In the current Kube code, kubectl has no target, so it would have to talk to the controller instead.

It seems like the deployer really needs a way to communicate to the autoscaler that is tied to the replication controllers directly. If you had that, you could mark the replication controllers semantically as you went through a deployment. However, that has some issues too.

Basically, kubectl having to know about auto-scalers to do its job feels wrong. We need to establish whether it's impossible to keep kubectl isolated from the auto-scaler, or whether we just need a better abstraction (some combination of a marker on the RC + the autoscaler selector).

Reply (author):

Ok, so I think I've digested this. My thoughts are: we can solve the issue of knowing what autoscalers target a resizeable object by adding a registration component to the interface. When an autoscaler is initialized it can attempt to register itself with its target.

Transactionally guaranteeing the state of "which scalers target me" seems harder. Since autoscalers can be updated by external components (api calls and deployments) I don't think we can guarantee that the scalers registered to component A are still targeting component A at any point in time.

We could, perhaps, separate the decision making of the autoscaler from something that calls the resize verb. In that use case, the autoscaler would take two object references, one for the expected target of the resize (the replication controller) and one for the resizer object. Any action the autoscaler wants to take for the target produces an event with the semantics "resizer A: take action B on target C". If resizer A no longer targets target C the event is dropped.

We essentially end up with a big circle of communication in the case of 'hey replication controller D, retarget your auto scalers to E'. Repl controller D iterates through its registered autoscalers and calls a method that emits an event of "resizer A: take retarget action, target E if you currently target C". This design implies that operations within the resizer must be atomic.

Finally, I'm not sure where kubectl comes in to play. The use case that stands out to me (which is not currently called out in the proposal) is a scenario where manual changing of the replication controller conflicts with the autoscaler and how policy deals with that. Can you elaborate on what you were thinking there?

/cc @pmorie

Reply (reviewer, via email, Feb 11, 2015):

The two references is reasonable. For deployment configs, we don't have a problem (one target, which hides the fact there are multiple items). For rolling update of RCs, the target has to remain fixed, but maybe we allow the count to be done via selector. Then it's the rolling update's responsibility to handle swapping out the target, i.e. the rolling updater first creates a copy of the current RC, but with no label selector and size 0. Then it clears the label selector of the existing one (or disables it somehow) and updates the old one to have the right pattern. We may not have the primitives we need though.

We need to be having this discussion on the existing pull - can you update and summarize this thread?

Reply (author): Will do, thanks for the initial feedback.

By re-targeting the auto-scaler as part of the deployment process we can ensure that the deployment process and
auto-scaler do not fight over a scenario where `AutoScalerSpec.MinAutoScaleCount` is greater than zero but the deployment
orchestration must scale to zero.

During deployment orchestration the auto-scaler may be making decisions to scale its target up or down. In order to prevent
the scaler from fighting with a deployment process that is scaling one replication controller up and another one down,
the deployment process must assume that the current replica count may be changed by objects other than itself and
account for this in the scale up or down process. Therefore, the deployment process may no longer target an exact number
of instances to be deployed. It must be satisfied that the replica count for the deployment meets or exceeds the number
of requested instances.

Auto-scaling down in a deployment scenario is a special case. In order for the deployment to complete successfully, the
deployment orchestration must ensure that the desired number of deployed instances has been met.
If the auto-scaler is trying to scale the application down (due to no traffic, or other statistics) then the deployment
process and auto-scaler are fighting to increase and decrease the count of the targeted replication controller. In order
to prevent this, deployment orchestration should notify the auto-scaler that a deployment is occurring. This will
temporarily disable negative decrement thresholds until the deployment process is completed. It is more important for
an auto-scaler to be able to grow capacity during a deployment than to shrink the number of instances precisely.
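The "temporarily disable negative decrement thresholds" rule above can be sketched as a filter over triggered thresholds. The types here are invented stand-ins, not proposal API.

```go
package main

import "fmt"

// thresholdDecision records one evaluated threshold.
type thresholdDecision struct {
	id        string
	increment int
	triggered bool
}

// activeIncrements returns the increments the auto-scaler may act on.
// While a deployment is in progress, triggered thresholds with a negative
// increment (scale-down) are ignored, so the scaler can still grow capacity
// but never shrinks it out from under the deployer.
func activeIncrements(decisions []thresholdDecision, deploymentInProgress bool) []int {
	var incs []int
	for _, d := range decisions {
		if !d.triggered {
			continue
		}
		if deploymentInProgress && d.increment < 0 {
			continue
		}
		incs = append(incs, d.increment)
	}
	return incs
}

func main() {
	decisions := []thresholdDecision{
		{id: "myapp-rps-up", increment: 1, triggered: true},
		{id: "myapp-rps-down", increment: -1, triggered: true},
	}
	fmt.Println(activeIncrements(decisions, true))  // only the scale-up increment survives
	fmt.Println(activeIncrements(decisions, false)) // both apply once the deployment completes
}
```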

## Use Case Realization

@@ -210,11 +219,12 @@ potentially piggyback on this registry.
"apiVersion": "v1beta1",
"maxAutoScaleCount": 50,
"minAutoScaleCount": 1,
"selector": {"name": "myTargetedReplicationControllers"},
"thresholds": [
{
"id": "myapp-rps-up",
"kind": "AutoScaleThreshold",
"selector": {"name": "requestPerSecond"},
"durationVal": 30,
"durationInterval": "seconds",
"value": 50,
@@ -224,18 +234,67 @@ potentially piggyback on this registry.
{
"id": "myapp-rps-down",
"kind": "AutoScaleThreshold",
"selector": {"name": "requestPerSecond"},
"durationVal": 30,
"durationInterval": "seconds",
"value": 25,
"comp": "less",
"inc": -1
}
],
"scalableTarget": "myapp-replcontroller"
}

1. The auto-scaler controller watches for new `AutoScaler` definitions and creates the resource
1. Periodically the auto-scaler loops through defined thresholds and determines whether a threshold has been exceeded
1. If the app must be scaled the auto-scaler calls the `resize` endpoint for `myapp-replcontroller`
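The three steps above can be sketched as a single controller tick. Thresholds and the resize call are reduced to plain function values here, since the real wiring (watch, statistics aggregation, REST) is out of scope; none of these names come from the proposal.

```go
package main

import "fmt"

// autoScaler reduces the controller to its periodic decision loop.
type autoScaler struct {
	enabled    bool
	thresholds []func() (increment int, triggered bool)
	resize     func(delta int) error
}

// tick runs one iteration of the loop: evaluate every threshold and call
// resize for each one that has been exceeded.
func (a *autoScaler) tick() {
	if !a.enabled {
		return
	}
	for _, check := range a.thresholds {
		if inc, triggered := check(); triggered {
			if err := a.resize(inc); err != nil {
				fmt.Println("resize failed:", err)
			}
		}
	}
}

func main() {
	scaler := &autoScaler{
		enabled: true,
		thresholds: []func() (int, bool){
			func() (int, bool) { return 1, true },   // e.g. myapp-rps-up exceeded
			func() (int, bool) { return -1, false }, // myapp-rps-down not met
		},
		resize: func(delta int) error {
			fmt.Println("calling resize on myapp-replcontroller, delta:", delta)
			return nil
		},
	}
	scaler.tick() // a real controller would run this on a timer
}
```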


### Rolling deployment

1. User defines the application's auto-scaling resources

{
"id": "myapp-autoscaler",
"kind": "AutoScaler",
"apiVersion": "v1beta1",
"maxAutoScaleCount": 50,
"minAutoScaleCount": 0,
"selector": {"name": "myTargetedReplicationControllers"},
"thresholds": [
{
"id": "myapp-rps-up",
"kind": "AutoScaleThreshold",
"selector": {"name": "requestPerSecond"},
"durationVal": 30,
"durationInterval": "seconds",
"value": 50,
"comp": "greater",
"inc": 1
},
{
"id": "myapp-rps-down",
"kind": "AutoScaleThreshold",
"selector": {"name": "requestPerSecond"},
"durationVal": 30,
"durationInterval": "seconds",
"value": 25,
"comp": "less",
"inc": -1
}
],
"scalableTarget": "myapp-replcontrollerA"
}

1. The existing environment has `ReplicationController` A, with a label of "myTargetedReplicationControllers" with replica count of 1
1. A deployment occurs and creates a new `ReplicationController` B, with a label of "myTargetedReplicationControllers" and a desired
count of 2
1. The deployment orchestration signals that a deployment is in progress, the auto-scaler will ignore "myapp-rps-down" threshold
triggers
1. The deployment orchestration changes the auto-scaler's scalable target to replication controller B
1. The deployment orchestration checks the current replica count of B, sees it is 0, and creates a new pod
1. A burst of traffic comes in, the auto-scaler triggers the "myapp-rps-up" threshold and increases the replica count of
replication controller B to 2
1. The deployment orchestration checks the current replica count of B, sees that it is 2, and understands that the requested
capacity has been met
1. The deployment orchestration signals the deployment is complete, the auto-scaler will no longer ignore the "myapp-rps-down" threshold.