Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime-k3s-qemu-snp.yml: node-installer has too little memory #943

Merged
merged 1 commit into from
Oct 21, 2024

Conversation

blenessy
Copy link
Contributor

The OOM Killer kicks in when below 600M on my Milan. Setting to 700M to have some slack.

You can reproduce this with runtime-k3s-qemu-snp.yml v1.1.0, which limits memory to 100M memory.

The OOM Killer kicks in when below 600M on my Milan.
Setting to 700M to have some slack.
@Freax13 Freax13 requested review from Freax13 and removed request for katexochen October 21, 2024 09:01
@Freax13 Freax13 added the bug fix Fixing a user facing bug label Oct 21, 2024
Copy link
Contributor

@Freax13 Freax13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution!

Reproducing this wasn't easy.

We used to use a kernel config based on Ubuntu 22.04 on our AMD server. I didn't see any OOMs on that config. However, after I switched our AMD server to a config based on Ubuntu 24.10, I did get OOMs as well. I didn't look much into what changed, but I strongly suspect that either Canonical changed the kernel config for Ubuntu 24.10 or that there are some new config options that the old Ubuntu 22.04 kernel didn't have and that change the behavior of the OOM killer.
Interestingly, we use a semi-official kernel for TDX based on Ubuntu 24.04, and I also don't get any issues on that server either.

Now that our AMD CI machine is using the 24.10-based config, we should catch this much earlier in the future.

700 MiB (or even just 600 MiB) seems a bit excessive to me, I don't know why the node-installer is using that much memory, but if that's what it takes, that's fine with me.

Again, our CI can't handle outside contributors well, so I tested this manually.

@Freax13 Freax13 merged commit f5da52d into edgelesssys:main Oct 21, 2024
5 of 9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug fix Fixing a user facing bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants