
VPA: Protect against runaway pods #2359

Closed · johanneswuerbach opened this issue Sep 18, 2019 · 8 comments

Labels: lifecycle/stale (Denotes an issue or PR has remained open with no activity and has become stale.)

@johanneswuerbach (Contributor)

After #2049, VPA adjusts the specified limit proportionally to the requested resources.

This becomes problematic when the recommended resources get fairly low. E.g. we ran a StatefulSet with limit: 1000m and request: 500m; due to low usage, the recommended target was eventually lowered to 25m, with a proportional limit of 50m. While this works under normal load, any spike in consumption now means that the pod is immediately throttled, the health check fails and the pod is recreated, yet this never seems to cause an increase of the recommended value.
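To make the arithmetic concrete, a minimal sketch of the container resources before and after such an update; VPA preserves the 2:1 limit-to-request ratio, so the limit shrinks along with the request:

```yaml
# Before: limit is 2x the request.
resources:
  requests:
    cpu: 500m
  limits:
    cpu: 1000m

# After a 25m recommendation: the 2:1 ratio is kept,
# leaving almost no headroom for spikes before throttling.
resources:
  requests:
    cpu: 25m
  limits:
    cpu: 50m
```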

I would see three ways to prevent this:

  1. set a min target for each VPA resource, which seems a bit like anti-pattern as the magic of VPA is that we have to worry less about resources.
  2. remove the limits & requests from the statefulset, which works, but also means a runaway pod could affect the entire node as CPU is no longer throttled.
  3. modify VPA to set the MaxAllowed as pod limits if no other limit is specified, which would allow sudden resource changes in certain boundaries

Overall I like 3) the most, would you accept a PR implementing this change? Is there anything I'm missing?
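A rough sketch of what 3) could look like on the VPA object. The maxAllowed bound already exists in resourcePolicy; the proposed behaviour (hypothetical, not implemented) is that it would also be written as the container limit whenever the pod spec defines none. Object names and resource values are placeholders:

```yaml
apiVersion: autoscaling.k8s.io/v1beta2
kind: VerticalPodAutoscaler
metadata:
  name: example-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: example-statefulset
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        # Proposal 3): when the container spec sets no limit,
        # use this bound as the limit instead of scaling the
        # limit proportionally to the request.
        maxAllowed:
          cpu: 1000m
          memory: 1Gi
```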

@johanneswuerbach (Contributor, Author)

/assign bskiba

@bskiba (Member) commented Sep 20, 2019

3 seems a bit problematic, as it means that VPA would effectively start changing the pod's QoS class, which we are explicitly trying to avoid.

My recommendation is a variation of 2: remove the CPU resource limit on the pod.
In general we don't recommend setting limits on CPU (except perhaps in very specific cases), since with CPU a pod will run away only if there are free CPU resources on the node. If there are other pods requesting CPU on the same node, the pod will be throttled to what it actually requested, so it will never cannibalize the node's resources.
With the limit gone, the pod will be throttled only if the node is very tightly packed; but if the pod has any free room to burst over its request, VPA should notice and bump the resource request up accordingly.

We do recommend limits on memory, but with memory you will get out-of-memory errors in the situation you described, which will cause VPA to bump the memory recommendation up.
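Applied to the container spec, that recommendation would look roughly like this minimal sketch (CPU value from the original report; memory values are placeholders):

```yaml
resources:
  requests:
    cpu: 500m        # still managed by VPA
    memory: 256Mi
  limits:
    # No CPU limit: the pod may burst into free node capacity,
    # while CFS shares still guarantee other pods their requests.
    memory: 512Mi    # kept, so a leak OOM-kills the pod, not the node
```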

@phsiao (Contributor) commented Sep 20, 2019

I have been trying to understand the implications of letting VPA manage limits, and I have come to a similar conclusion: since CPU is "elastic" and cgroups provide effective isolation, removing the CPU limit so it can't be changed in a surprising way seems to make sense. On the other hand, runaway memory usage can cause the entire system to OOM, trigger an expensive reaping process, and cause other issues, so limits on memory must remain, at least to reduce that risk.

However, I also want to suggest that managing limits should be made optional, if it is not already. I think the main use case of VPA is to bring the requested resources closer to actual usage, so that scheduling and load balancing can be efficient. Limits do not play a role there, AFAIK, and carry complexity that users with only the main use case may not want.
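One way such an opt-out could be expressed per container, as a minimal sketch; the controlledValues field below is hypothetical in the context of this discussion and did not exist in the VPA API at the time:

```yaml
spec:
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        # Hypothetical opt-out: VPA updates requests only
        # and leaves any limits in the pod spec untouched.
        controlledValues: RequestsOnly
```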

@johanneswuerbach (Contributor, Author)

Instead of 3), I would also be happy if the proportional scaling of requests & limits could be disabled whenever limit != request.

I see this as definitely useful when both are equal, to remain inside the same QoS class, but what is the benefit of changing the limit proportionally in other cases? I found that fairly surprising.
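To make the QoS point concrete, a minimal sketch assuming a 100m to 200m CPU recommendation (memory values are placeholders): with request == limit the pod is in the Guaranteed QoS class, and scaling both by the same factor keeps it there:

```yaml
# Guaranteed QoS requires requests == limits for every resource.
# Proportional scaling preserves the equality, e.g. after a
# 100m -> 200m recommendation:
resources:
  requests:
    cpu: 200m
    memory: 256Mi
  limits:
    cpu: 200m
    memory: 256Mi
```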

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Dec 19, 2019
@bskiba (Member) commented Jan 17, 2020

Closing this in favour of #2387, which seems to be asking for the same thing (making it configurable whether limits should be modified).

@bskiba (Member) commented Jan 17, 2020

/close

@k8s-ci-robot (Contributor)

@bskiba: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
