feature(serving): Add --autoscale-window #614
Conversation
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: rhuss. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment. |
Not 100% sure about the option name:
I guess autoscale is the best choice here ... |
No idea about the IT test error:
Let's check whether this is a flake. /retest |
I believe this PR uses this annotation: https://github.com/knative/serving/blob/master/pkg/apis/autoscaling/register.go#L67 right? Note the comment in there says:
If so, does |
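For illustration only, here is a minimal Go sketch of how a value passed via `--autoscale-window` could end up as the autoscaling window annotation on the revision template. The annotation key matches the one linked above; the helper function, validation, and example CLI invocation are hypothetical and not the actual client code.

```go
// Hypothetical sketch: translate an --autoscale-window flag value into the
// autoscaling window annotation on a revision template's annotation map.
package main

import (
	"fmt"
	"time"
)

// Annotation key from knative/serving's pkg/apis/autoscaling/register.go.
const windowAnnotationKey = "autoscaling.knative.dev/window"

// applyAutoscaleWindow validates the flag value and writes it into the given
// annotation map (e.g. the revision template's annotations).
func applyAutoscaleWindow(annotations map[string]string, window string) error {
	d, err := time.ParseDuration(window)
	if err != nil {
		return fmt.Errorf("invalid --autoscale-window value %q: %w", window, err)
	}
	if d <= 0 {
		return fmt.Errorf("--autoscale-window must be positive, got %s", d)
	}
	annotations[windowAnnotationKey] = window
	return nil
}

func main() {
	annotations := map[string]string{}
	// e.g. "kn service update mysvc --autoscale-window 60s" (illustrative only)
	if err := applyAutoscaleWindow(annotations, "60s"); err != nil {
		panic(err)
	}
	fmt.Println(annotations)
}
```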
In fact it's not only for scaling down but for autoscaling in general (let me update that). If I understand correctly (but please correct me @markusthoemmes), this value specifies the window after which the traffic average is calculated and the:
So the shortest possible description is probably "Time window after which to determine whether to autoscale". Does this sound better? |
It's the window over which statistics are averaged. We actually recalculate every two seconds today. The side effect is that we only scale down to 0 after we receive straight 0s for the entire window. That's because of math 😂. The average will only be 0 if all values are 0. |
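A minimal sketch of the averaging behaviour described above (not the actual autoscaler code): the desired scale follows the average concurrency over the window, so it only drops to zero once every sample in the window is zero. The sample values and the per-pod concurrency target below are made up.

```go
// Sketch: desired scale from the average concurrency over the window.
package main

import (
	"fmt"
	"math"
)

// desiredScale averages the concurrency samples collected over the window and
// divides by the per-pod concurrency target, rounding up.
func desiredScale(samples []float64, target float64) int {
	if len(samples) == 0 {
		return 0
	}
	sum := 0.0
	for _, s := range samples {
		sum += s
	}
	avg := sum / float64(len(samples))
	return int(math.Ceil(avg / target))
}

func main() {
	// One non-zero sample anywhere in the window keeps the average above zero,
	// so the revision is not scaled to zero yet.
	fmt.Println(desiredScale([]float64{0, 0, 0, 4, 0, 0}, 10)) // 1
	// Only straight zeros over the whole window yield an average of 0.
	fmt.Println(desiredScale([]float64{0, 0, 0, 0, 0, 0}, 10)) // 0
}
```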
That's better and if we can't come up with anything else then that's ok for now. So, perhaps something more like: |
That's not true. It in fact directly affects when things are scaled to 0 because of the maths behind it. The distinction between checking after and calculating over is super important here. The window simply specifies the amount of time we look in the past for historic data, maybe that's more useful?
|
To harden my understanding: for the first scaling event to happen, you need to wait at least for this time window to pass, correct? I mean, if the window is 30 minutes, you have to wait 30 minutes until you scale up because of increased load. After that, the scaling decision is made every 2 seconds (with the data from the last 30 minutes). Is this correct? (Asking because that is what I actually observed, i.e. that no scale-up happens when I set this window really large and do a load test before that window has been reached for the first time.) |
I like that, maybe adding some hint that scale to zero happens when no request comes in during that time:
|
The following is the coverage report on the affected files.
|
I don't think this is true because I can see things scale up immediately when I set CC=1 and all pods are busy. This is why I said this window is "part" of the logic w.r.t. scaling up. But I'm sure @markusthoemmes will correct me if I'm wrong ;-) |
We don't wait for the window to fill first, no. |
Ok, then I will have to investigate further why upscale didn't work for me when setting the window to 30 minutes: with a concurrency-target of 10 I get no more than 2 containers for 100 concurrent clients. |
Depending on which version you're running, the behavior might be slightly different. As of our latest code, we assume that the entire window is "there". For a fresh revision we're therefore considering for example 29 minutes of straight 0s in your case. That'll heavily bias the scale towards the lower end. |
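A back-of-the-envelope illustration of that bias, using the numbers from this thread (30-minute window, concurrency-target 10, 100 concurrent clients). The assumption that the unobserved part of the window counts as zeros is taken from the comment above; the one minute of actual traffic is an illustrative guess, not measured data.

```go
// Illustration: a fresh revision's short burst of traffic gets averaged over
// the whole (assumed fully populated) window, biasing the scale downwards.
package main

import (
	"fmt"
	"math"
)

func main() {
	window := 30.0 * 60.0 // window length in seconds
	busy := 60.0          // seconds of actual traffic so far (assumed)
	concurrency := 100.0  // observed concurrent clients while busy
	target := 10.0        // per-pod concurrency target

	// The remaining window is treated as zeros, dragging the average down.
	avg := concurrency * busy / window
	fmt.Printf("average concurrency over the window: %.2f\n", avg) // ~3.33
	fmt.Printf("desired pods: %v\n", math.Ceil(avg/target))        // 1
}
```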
Ah, I see. But wouldn't it be better to just take what you have during the first 30 minutes to calculate the average (e.g. average only over 10 minutes if your service has been running for 10 minutes, instead of filling up 20 minutes with 0), maybe with some portion of leeway (e.g. soften it by multiplying with some weighting function or adding a shorter duration of zeros, like 5 minutes)? But that's just an autoscaling greenhorn speaking here :) |
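Purely as a sketch of that suggestion (not how the autoscaler actually works): average only over the elapsed lifetime of the revision, capped at the configured window, instead of padding the missing part with zeros. Function names and numbers here are illustrative.

```go
// Sketch: average concurrency over the elapsed lifetime, capped at the window.
package main

import (
	"fmt"
	"time"
)

// averageOverElapsed divides the accumulated concurrency-seconds by the
// smaller of the elapsed lifetime and the configured window.
func averageOverElapsed(concurrencySeconds float64, elapsed, window time.Duration) float64 {
	effective := elapsed
	if effective > window {
		effective = window
	}
	if effective <= 0 {
		return 0
	}
	return concurrencySeconds / effective.Seconds()
}

func main() {
	// 100 concurrent clients for the last minute, revision is 1 minute old,
	// window is 30 minutes: the average reflects the real load (100), not ~3.3.
	fmt.Println(averageOverElapsed(100*60, time.Minute, 30*time.Minute))
}
```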
/lgtm |
* Create presubmit jobs for knative/client. Also enable tide for the new repo.
* Add the missing coverage job/tabs. This is a workaround for knative#615.
Fixes #613