Add feature to pause HPA scaling during an analysis run #1245

Open
rbaybutt opened this issue Jun 2, 2021 · 4 comments
Labels
enhancement New feature or request no-issue-activity

Comments

@rbaybutt

rbaybutt commented Jun 2, 2021

Summary

At present, the HPA can continue to scale while an AnalysisRun is in progress. This behavior makes sense as a default, since it ensures the needs of an application are met, but it can also add complexity. The ask is to add a feature that pauses HPA scaling during the analysis. This would avoid confusion when deploying and would simplify the analysis, which may otherwise need to account for auto-scaling if it compares traffic on the stable against traffic on the canary.
The feature would also reduce the time a rollout takes in cases where the HPA scales up during the analysis and causes the cluster autoscaler to add nodes.

Use Cases

This would be an option specified as part of the Rollout spec, and it could be used whenever a Rollout and an HPA are used in conjunction with each other.
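For illustration only, here is a rough sketch of one way a controller could "pause" an HPA for the duration of an analysis: pin `minReplicas` and `maxReplicas` to the current replica count, then restore the saved bounds afterwards. This is an assumption about a possible mechanism, not an existing Argo Rollouts feature. The annotation key and the function names `freeze_hpa` and `unfreeze_hpa` are hypothetical, and the manifests are plain dicts rather than calls to any real client library.

```python
from copy import deepcopy

# Hypothetical annotation key used to remember the original bounds.
# It is not part of any real Kubernetes or Argo Rollouts API.
SAVED_BOUNDS_KEY = "example.com/saved-hpa-bounds"


def freeze_hpa(hpa: dict, current_replicas: int) -> dict:
    """Return a copy of the HPA manifest with min and max pinned to current_replicas."""
    frozen = deepcopy(hpa)
    spec = frozen["spec"]
    annotations = frozen.setdefault("metadata", {}).setdefault("annotations", {})
    if SAVED_BOUNDS_KEY in annotations:
        # Already frozen: keep the originally saved bounds and pinned values.
        return frozen
    # minReplicas defaults to 1 in the HPA API when it is omitted.
    annotations[SAVED_BOUNDS_KEY] = f'{spec.get("minReplicas", 1)},{spec["maxReplicas"]}'
    spec["minReplicas"] = current_replicas
    spec["maxReplicas"] = current_replicas
    return frozen


def unfreeze_hpa(hpa: dict) -> dict:
    """Return a copy of the HPA manifest with the saved bounds restored, if any."""
    restored = deepcopy(hpa)
    annotations = restored.get("metadata", {}).get("annotations", {})
    saved = annotations.pop(SAVED_BOUNDS_KEY, None)
    if saved is None:
        # Nothing was saved, so the manifest was never frozen.
        return restored
    min_replicas, max_replicas = (int(value) for value in saved.split(","))
    restored["spec"]["minReplicas"] = min_replicas
    restored["spec"]["maxReplicas"] = max_replicas
    return restored
```

In this sketch a controller would call `freeze_hpa` when the AnalysisRun starts and `unfreeze_hpa` when it finishes. With both bounds equal, the HPA cannot change the replica count, which is also why this would only suit workloads with enough headroom.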


Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.

@rbaybutt rbaybutt added the enhancement New feature or request label Jun 2, 2021
@DanTulovsky

Isn't this dangerous? What happens if your load goes up during the Analysis run and you would have needed HPA to scale up? You can't predict these things...

@rbaybutt
Author

rbaybutt commented Jun 9, 2021

I definitely don't see this as the default behavior. For workloads with predictable traffic load, I see this as simplifying the rollout. What we've seen in production is that the HPA will decide to scale up by one or two pods as the canary is being promoted. This particular application runs just one pod per node, so the CA (cluster autoscaler) also scales up by one or two nodes accordingly. Now the canary also has to scale up before it completes the promotion. The additional time needed to add nodes to the cluster has caused a fair amount of confusion, with individuals running deploys assuming their Rollout has failed in some manner.
We run with enough headroom that if the pods had not been placed we would still be able to process all traffic.
I could be convinced that addressing #1029 would make this feature unnecessary.

@github-actions
Contributor

github-actions bot commented Dec 6, 2022

This issue is stale because it has been open 60 days with no activity.

@github-actions
Contributor

github-actions bot commented Feb 6, 2023

This issue is stale because it has been open 60 days with no activity.
