feat: support dynamic scaling of stable ReplicaSet as inverse of canary weight #1430
Conversation
Codecov Report

@@            Coverage Diff             @@
##           master    #1430      +/-   ##
==========================================
+ Coverage   81.67%   81.73%   +0.06%
==========================================
  Files         110      112       +2
  Lines       14798    15070     +272
==========================================
+ Hits        12086    12318     +232
- Misses       2078     2107      +29
- Partials      634      645      +11

Continue to review full report at Codecov.
@perenesenko can you please review and validate?
I tested different scenarios of promotion and abort with Istio and Nginx. All works fine.
Just have two questions:
Q1: What about using setCanaryScale together with dynamicStableScale? What is the expected behavior in case of steps that set both?
Q2: Is scaleDownDelaySeconds still honored when dynamicStableScale is enabled?
For Q1, it is confirmed in #1382 (comment) that setCanaryScale and dynamicStableScale will work together. I would like to see what the expected behavior is if both values are set. @jessesuen, can you please describe the behavior when both are configured? @perenesenko, when you validated, were 20% of the pods taking 80% of the traffic? And can you provide the stable ReplicaSet count at that time?
@harikrongali
Yes, this is a risk with the setCanaryScale feature: a small fraction of pods (e.g. 20%) can end up receiving a much larger share of traffic (e.g. 80%).
They are independent choices. Both can be configured at the same time. dynamicStableScale can be simply understood as: whatever the current canary traffic weight is, the stable will be scaled to the inverse of that weight, i.e. roughly spec.replicas * (100 - canaryWeight) / 100. So if canaryWeight is set to 80, and there are 10 spec.replicas, then the stable scale will be 2.
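For concreteness, here is a minimal Rollout sketch illustrating this behavior. The metadata name, the VirtualService name, and the omitted selector/template are placeholders for illustration, not taken from this PR:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: example-rollout            # hypothetical name
spec:
  replicas: 10
  # selector and pod template omitted for brevity
  strategy:
    canary:
      dynamicStableScale: true     # flag introduced by this PR
      trafficRouting:
        istio:
          virtualService:
            name: example-vsvc     # hypothetical VirtualService
      steps:
      - setWeight: 80              # canary receives 80% of traffic
      - pause: {}
# With canaryWeight=80 and spec.replicas=10, the stable ReplicaSet
# is scaled down to 10 * (100 - 80) / 100 = 2 pods.
```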
Correct. scaleDownDelaySeconds is not expected to work with dynamicStableScale and is ignored.
Just to clarify: there's no harm in using scaleDownDelaySeconds together with dynamicStableScale, it will simply be ignored?
I added validation to make sure these two options are not used together.
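For example, a spec combining the two options like the following sketch would now be rejected by validation (the value is illustrative; the exact error message is not shown here):

```yaml
strategy:
  canary:
    dynamicStableScale: true
    scaleDownDelaySeconds: 30   # invalid together with dynamicStableScale
```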
Kudos, SonarCloud Quality Gate passed! 0 Bugs. No Coverage information.
LGTM
Fixes #1029

Introduces a canary.dynamicStableScale flag which allows the stable ReplicaSet to dynamically scale down/up as the inverse of the canary traffic weight. This feature makes it possible to run fewer pods during an update, at the cost of slower aborts (since the stable will need to scale back up).

The feature introduces a new weights field into the rollout status, status.canary.weights (see the sketch below). Tracking canary weights in the status is needed to understand the weight that was last set in the traffic router. This allows us to calculate the number of replicas by which we can safely reduce the scale of the stable (or the canary during an abort). For example, it is not safe to scale down the stable until we know traffic has shifted away from it. It also avoids requiring all traffic routers to implement a GetCurrentWeight() function, because we instead remember what we set last.
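A sketch of the shape of the recorded weights; the pod template hashes and service names below are illustrative:

```yaml
status:
  canary:
    weights:
      canary:
        podTemplateHash: 6bfd767c9d   # illustrative hash
        serviceName: canary-service   # illustrative service name
        weight: 20
      stable:
        podTemplateHash: 74d9d5db5b   # illustrative hash
        serviceName: stable-service   # illustrative service name
        weight: 80
```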
We could also use this information in the future to show in the Rollout dashboard, or in the CLI. Currently the actual-weight field in the CLI is not very accurate.
The PR also introduces a new canary.abortScaleDownDelaySeconds field (sketched below).
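A hedged sketch of how the new field might be set alongside dynamicStableScale; the value and the described semantics are illustrative, not quoted from this PR:

```yaml
strategy:
  canary:
    dynamicStableScale: true
    abortScaleDownDelaySeconds: 600   # illustrative: delay before scaling down the canary after an abort
```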
Signed-off-by: Jesse Suen [email protected]