Decrease CPU usage #61
Comments
I agree that could be a good idea. However, the check uses very little CPU already, so I'm not sure if it's worth it. How much CPU usage do you see from earlyoom? |
The situation may be different on embedded systems though. These have much weaker CPUs. |
IMHO such an optimization will not make things worse in any case. Earlyoom uses 0 - 0.7% CPU on my Pentium B960. |
There is a downside: longer reaction time |
The reaction time will be sufficient to prevent OOM because the sleep period shrinks as the available memory decreases. This is proven in practice: nohang uses this algorithm and it works well. Below I show nohang output with the sleep time calculated as follows: I execute
Below is the nohang output when I execute
And
OOM has been prevented without problem.
and
|
Nohang output when I execute
Nohang output when I execute
and
OOM was prevented completely. This is a very important improvement. As you can see, a 0.1s sleep period is very short for me. I would like you to accept this algorithm and add CLI options for changing the mem/swap rates. |
tail /dev/zero without swap? |
Both with swap and without swap. |
Ah, in the other post. Sorry, missed that! |
|
The behavior in your tests looks very good. However, I would like the algorithm to be easier for the user to predict. How about just dropping the poll rate from 10 Hz to 1 Hz when (available RAM + free swap) > 1 GB? |
In control theory terms, your algorithm is a continuous controller, and with those you usually need extra checks to keep the value from going too low or too high. So we have four values to adjust:
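The two-rate proposal above can be sketched roughly as follows. This is an illustration only, not earlyoom's code: the function name `poll_interval_s` is made up, and treating "1 GB" as 1024 * 1024 kB is an assumption.

```python
# Hypothetical sketch of the two-state proposal: poll at 1 Hz while
# (available RAM + free swap) is above a threshold, otherwise at 10 Hz.
# The threshold of 1024 * 1024 kB is an assumed reading of "1 GB".

ONE_GB_KB = 1024 * 1024


def poll_interval_s(mem_available_kb, swap_free_kb,
                    threshold_kb=ONE_GB_KB, slow_s=1.0, fast_s=0.1):
    """Return the sleep time in seconds until the next memory check."""
    if mem_available_kb + swap_free_kb > threshold_kb:
        return slow_s  # plenty of headroom: check once per second
    return fast_s      # memory is getting tight: check ten times per second
```

The appeal of this variant is that the user only has to reason about one switching threshold and two fixed rates.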
On the other hand, a two-state controller needs:
But I have to admit that you could say the switching threshold is actually two values, because it counts RAM and swap with a multiplier of one. Hmm. Did you use an upper and lower limit in nohang? |
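The "extra checks" for a continuous controller could look like the clamp below. This is a sketch under assumptions: the limits of 0.1 s and 1 s and the function name `clamped_sleep_s` are placeholders for illustration, not values taken from earlyoom or nohang.

```python
# Hypothetical sketch: a continuous sleep-time controller with an upper
# and a lower limit. The limits (0.1 s and 1.0 s) are assumed placeholders.

def clamped_sleep_s(mem_available_kb, swap_free_kb,
                    mem_rate_kb_s, swap_rate_kb_s,
                    min_s=0.1, max_s=1.0):
    """Sleep time proportional to free memory, kept within [min_s, max_s]."""
    raw = mem_available_kb / mem_rate_kb_s + swap_free_kb / swap_rate_kb_s
    return max(min_s, min(raw, max_s))
```

With two fill rates and two limits, this form has four tunable values, which is the kind of extra configuration the comment above is weighing against a simple two-state switch.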
rate_mem = 6000000, mem_min_sigterm = 10 %, swap_min_sigterm = 10 %. Nothing more. |
MA/1000000 means that the OOM preventer has time to prevent OOM if MemAvailable decreases at a speed of 1000000 kB/s. It is easy to predict. |
You could also add a monitoring intensity option so that memory can be checked less often on systems with stable memory usage (embedded, servers). |
Maybe better than 0.1s, but worse than
I added a small test tool, membomb, to find out how fast RAM and Swap can be depleted. This is on a Pentium G630 from 2011, with 8 GB DDR3 RAM, and 1 GB of swap on an SSD:
|
With stable 0.1s? |
Yes |
I noticed a problem with swap enabled: sometimes membomb is killed, and 0.1s later a chrome tab is killed, because the memory is still low. Seems like the kernel needs more than 0.1s to clean up the process. |
That's why I use a minimum delay after every SIGKILL in nohang.
|
The delay after sending a signal should apply after any SIGKILL, not only after a failure. |
Mem 5.7 GiB, Zram swap 5.7 GiB
|
Mem 5.7 GiB, Zram swap 5.7 GiB
|
I wonder if we can do better than just wait (for how long?). Wait until we see the memory usage drop? |
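One way to picture "wait until we see the memory usage drop" is a bounded poll like the sketch below. Everything here is hypothetical: the function name, the 1 s timeout, the 50 ms step, and the idea of measuring a gain against a baseline reading are assumptions for illustration, not something either project is known to do.

```python
import time


def wait_for_memory_release(read_available_kb, baseline_kb, min_gain_kb,
                            timeout_s=1.0, step_s=0.05,
                            sleep=time.sleep, clock=time.monotonic):
    """Poll until available memory has risen by min_gain_kb over baseline_kb.

    read_available_kb is a caller-supplied function returning the current
    MemAvailable value in kB. Returns True if the gain was observed before
    timeout_s elapsed, False otherwise.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        if read_available_kb() - baseline_kb >= min_gain_kb:
            return True   # the killed process has released its memory
        sleep(step_s)
    return False          # gave up; the caller falls back to its normal loop
```

The timeout is what keeps this from hanging if the kernel never frees the memory, which makes it a fixed delay with an early exit rather than a replacement for one.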
So far I have not been able to think of anything better. Seems like simple delays are not a bad idea. |
I have added an extra 200ms sleep after a kill, but only if swap is enabled (commit). I have never seen that happen without swap. |
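The extra post-kill sleep described above might look roughly like this in Python. It is a sketch, not the referenced commit: the helper names are invented, and detecting "swap is enabled" as `SwapTotal > 0` is an assumption.

```python
import os
import signal
import time

POST_KILL_DELAY_S = 0.2  # the extra 200 ms mentioned above


def post_kill_delay_s(swap_total_kb, delay_s=POST_KILL_DELAY_S):
    """Return the extra delay after a kill; zero when no swap is configured."""
    # Assumption: "swap enabled" is approximated as SwapTotal > 0.
    return delay_s if swap_total_kb > 0 else 0.0


def kill_and_settle(pid, swap_total_kb, kill=os.kill, sleep=time.sleep):
    """Send SIGKILL to pid, then give the kernel time to free its memory."""
    kill(pid, signal.SIGKILL)
    delay = post_kill_delay_s(swap_total_kb)
    if delay > 0:
        sleep(delay)
    return delay
```

The `kill` and `sleep` parameters exist only so the logic can be exercised without actually signalling a process.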
|
Do you also see this with SIGKILL? |
yes, I saw this with SIGKILL. |
2% SIGTERM, 1% SIGKILL t = MA/6000000
Are you going to accept the new sleep time algorithm? P.S. Nice new output in 1.1! |
Membomb is not the quickest! |
Yes, the new sleep algorithm will go into earlyoom 1.2 |
Stress is better (in terms of speed). https://people.seas.harvard.edu/~apw/stress/ |
Interesting. Maybe because it is 4x parallel (-m 4) |
of course |
The idea is simple: if memory and swap can only fill up so fast, we know how long we can sleep without risking missing a low-memory event. #61
Implemented via b8b3c32 |
There is no point in checking memory 10 times per second when more than one gigabyte of memory is available. You can lower the frequency of memory checks to decrease CPU usage. Currently the sleep period is 0.1 s. You can compute the time until the next check as follows:
t1 = MemAvailable / 4000000
t2 = SwapFree / 1000000
sleep_until_next_mem_check = t1 + t2
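As a rough illustration, the formula above can be written in Python like this. It is a minimal sketch, not the nohang or earlyoom implementation: the function names are made up, and the divisors are assumed to be fill rates in kB/s because `/proc/meminfo` reports values in kB.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and rest.split():
            fields[key.strip()] = int(rest.split()[0])
    return fields


def sleep_until_next_check(mem_available_kb, swap_free_kb,
                           mem_rate_kb_s=4_000_000,
                           swap_rate_kb_s=1_000_000):
    """Seconds to sleep, assuming memory cannot fill faster than the rates."""
    t1 = mem_available_kb / mem_rate_kb_s   # time to exhaust available RAM
    t2 = swap_free_kb / swap_rate_kb_s      # time to exhaust free swap
    return t1 + t2


sample = "MemAvailable:    4000000 kB\nSwapFree:        1000000 kB\n"
info = parse_meminfo(sample)
delay = sleep_until_next_check(info["MemAvailable"], info["SwapFree"])
```

On a Linux system the same parser could be fed `open("/proc/meminfo").read()` instead of the sample string.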
I implemented a similar algorithm in nohang, which is an OOM preventer written in Python. If nohang checked memory 10 times per second, the CPU usage would be too high. With this algorithm for calculating the period between memory checks, the CPU usage is significant only when available memory is low. If there is a lot of memory available, nohang can use even less CPU than earlyoom. Demo https://youtu.be/8vjeolxw7Uo
I think it's possible to improve earlyoom in the same way. In most cases it will reduce the CPU usage by an order of magnitude.