
node-installer: remove resource limits #948

Merged
merged 1 commit into main from tom/no-limits on Oct 23, 2024
Conversation

Contributor

@Freax13 Freax13 commented Oct 22, 2024

There's no strong reason for having resource limits on the node-installer, and given the unpredictable nature of the kernel's resource accounting, we should remove them so that we don't run into unexpected OOMs.

Fixes f5da52d
Cc @blenessy

@Freax13 Freax13 added the "no changelog" (PRs not listed in the release notes) label on Oct 22, 2024
Contributor

@burgerdev burgerdev left a comment

I think we should still set a request based on the average case, but removing the limit to accommodate the worst case sgtm.
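
As a rough illustration of this suggestion, here is a minimal sketch of a requests-only resource spec built with the Kubernetes Go client types. It is not taken from the Contrast code base, and the request values are placeholders, not the node-installer's measured footprint.

```go
// Sketch: keep a scheduling request based on the average case, but set no
// limits, so a memory burst during installation is not OOM-killed by the
// cgroup. The values below are placeholders.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

func nodeInstallerResources() corev1.ResourceRequirements {
	return corev1.ResourceRequirements{
		Requests: corev1.ResourceList{
			corev1.ResourceCPU:    resource.MustParse("100m"),
			corev1.ResourceMemory: resource.MustParse("64Mi"),
		},
		// Limits intentionally left empty: with requests set but no limits
		// the pod gets Burstable QoS, and page cache written while copying
		// files cannot push the container over a cgroup memory limit.
	}
}

func main() {
	r := nodeInstallerResources()
	fmt.Printf("requests: cpu=%s, memory=%s; limits set: %v\n",
		r.Requests.Cpu(), r.Requests.Memory(), len(r.Limits) > 0)
}
```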

@katexochen
Member

On Kubernetes, the best practice is to set memory limit=request.

https://home.robusta.dev/blog/kubernetes-memory-limit

cc @3u13r

@3u13r
Member

3u13r commented Oct 23, 2024

On Kubernetes, the best practice is to set memory limit=request.

https://home.robusta.dev/blog/kubernetes-memory-limit

Ah, so I remembered correctly. I just wasn't sure whether to bring it up. Ideally the node-installer is more akin to a (DaemonSet) job than an application deployment that actually does something with the requested memory, right?
The question is: can we set a feasible limit if the needed memory scales with the amount of memory on the host?
Alternatively, we could set a somewhat high limit and then explain the failure mode in the docs, along with how the user can increase it.

@burgerdev
Contributor

On Kubernetes, the best practice is to set memory limit=request.

https://home.robusta.dev/blog/kubernetes-memory-limit

cc @3u13r

This may be true for apps, but it is not for essential system components. We don't want to compete with normal pods for resources.

@katexochen
Member

This may be true for apps, but is not for essential system components. We don't want to compete with normal pods for resources.

I'd say that's a rather subjective judgment of what counts as "essential". There could be other important applications running in the same cluster. Contrast's concept is to run "next to" existing workloads, so we shouldn't assume we're the most critical component.

@burgerdev
Contributor

The question is, can we have a feasible limit if the needed memory scales with the amount of memory on the host? Alternatively, we can add a somewhat high limit and then explain the failure in the docs and how the user can increase it?

The problem is that we need the memory in a single burst and don't have much influence over how it's accounted. Setting request=limit=1G seems a little extreme. I'd say a small request with a large limit may be fair, too, but I don't quite see what the limit would buy us.

@burgerdev
Contributor

Funny thing: the linked blog post even has a section on our issue, I think ("Unintuitive page-cache behaviour can lead to unnecessary OOMs").
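
For background on that page-cache point, here is a minimal sketch, assuming cgroup v2 with the unified hierarchy mounted at /sys/fs/cgroup, that splits a container's charged memory into anonymous memory and page cache. The file pages created while the node-installer copies large files are reclaimable, yet they are still charged against the cgroup's memory limit, which is the kind of accounting the blog post describes.

```go
// Sketch: read the cgroup v2 memory.stat file and report how much of the
// charged memory is anonymous vs. page cache ("file"). Assumes cgroup v2
// is mounted at /sys/fs/cgroup.
package main

import (
	"bufio"
	"fmt"
	"log"
	"os"
	"strconv"
	"strings"
)

func main() {
	f, err := os.Open("/sys/fs/cgroup/memory.stat")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	stats := make(map[string]uint64)
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 {
			continue
		}
		if v, err := strconv.ParseUint(fields[1], 10, 64); err == nil {
			stats[fields[0]] = v
		}
	}
	if err := sc.Err(); err != nil {
		log.Fatal(err)
	}

	fmt.Printf("anon: %d MiB, page cache: %d MiB\n",
		stats["anon"]>>20, stats["file"]>>20)
}
```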


@Freax13 Freax13 merged commit 7f50e0c into main Oct 23, 2024
9 checks passed
@Freax13 Freax13 deleted the tom/no-limits branch October 23, 2024 07:39