io-wait used 100% cpu after 6.1.39 and 6.5.1 #943
Comments
Isn't IO wait time what you want to see? That is the time the app is blocked waiting on IO (at least that was the case for file IO), but the CPU has nothing else to do, so that time is accounted as IO wait.
Note that io-wait doesn't burn CPU cycles; the task is sleeping, so it's not a problem apart from the reporting. There was a change doing that, and I think it actually makes more sense to report a task waiting for io_uring completions as io-wait.
Yes, this is expected. iowait literally just means "waiting on IO", which is what the task is doing. It does NOT mean the CPU is busy 100% of the time; in fact, 100% iowait means one task is sleeping waiting on IO 100% of the time.
Noticed that when this happens, system load is only 0.0x. We are using io_uring in a storage service that is very sensitive to storage/network load, and io-wait was previously a good indicator to check this. I also calculate a performance score after each test based on total CPU usage, so simply ignoring the io-wait part does not look like a good solution...
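Since that score is computed from total CPU usage, one workaround is to read the raw counters and treat the iowait column as idle time. Here is a minimal sketch (not from this thread; it relies only on the standard field order of /proc/stat: user, nice, system, idle, iowait, ...):

```c
/* Minimal sketch: read per-CPU counters from /proc/stat and report
 * busy time with the iowait column excluded (treated as idle). */
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *f = fopen("/proc/stat", "r");
    char line[256];

    if (!f) {
        perror("fopen");
        return 1;
    }
    while (fgets(line, sizeof(line), f)) {
        char cpu[16];
        unsigned long long user, nice, system, idle, iowait;

        if (sscanf(line, "%15s %llu %llu %llu %llu %llu",
                   cpu, &user, &nice, &system, &idle, &iowait) != 6)
            continue;
        if (strncmp(cpu, "cpu", 3))
            break;  /* past the cpuN lines */
        /* "busy" time with iowait counted as idle, not as load */
        printf("%s: busy=%llu iowait=%llu\n",
               cpu, user + nice + system, iowait);
    }
    fclose(f);
    return 0;
}
```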
Looks like there are no more comments; closed.
@axboe, one of Dragonfly's users also reported this as a behavioural change: Dragonfly, which does not use disk IO, bumps the IOWAIT metric up to 100%. If they run it using the epoll API, IOWAIT is not affected. I am just double-checking whether this change is indeed intended.
emmm... reopen?
IOwait has traditionally been thought of as "waiting for disk IO", which will always complete. Since io_uring can be used to wait on the network, which has unbounded waiting time, it changes the metric considerably. For example, many monitoring systems have alerts for high iowait, correctly or not assuming it to be a proxy for disk contention.
Here's what I think we should do: add a ring register operation with which the application can explicitly opt in to (or out of) iowait accounting for its waits. And that should be it. That gives the app control over whether iowait should be used or not.
Will test when available.
@axboe, I just mentioned it on the mailing list: even though I don't understand why people are taken aback by high iowait from io_uring waiting, I think we should just revert that change; there have been too many reports from different people about this one. We should be able to do the optimisation that was the reason for the change without reporting iowait.
We can't just revert it, as it solved a real problem. I have my doubts that we can separate the cpufreq side from iowait in a way that would make the scheduler side happy. If we can, I'd be all for it, and would love to see a patch. |
A great discussion about this topic on lore.kernel.org. Just for us to understand, once we call ... Another interesting comment I read is about multiple rings. Currently https://github.com/romange/helio has a ring-per-thread architecture. @axboe, are you saying that sometimes it makes sense to have two rings? For what use-cases does it make sense?
The long story: there has been a patch upstream for a while now which does two unrelated things. First, it enables a cpufreq governor optimisation useful for QD1 (and not only); second, it changes the io-wait stat behaviour as per this thread. They're coupled together for implementation reasons; it's much easier going this way. So the optimisation is already in the kernel and always enabled; let's say it's a free lunch. Now, I have to say that it's quite a horrendous approach: side effects from what is seemingly an optimisation, mixed responsibilities, a mismatch between the level at which the feature is enabled and the level at which the iowait stat is observed, and so on. I think the register_iowait patch should never see the light of day, at least as long as it mixes these things together.
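To make the coupling concrete: marking a sleeping task as being in iowait has two effects at once, which is why the stat change and the cpufreq optimisation travel together. Below is a heavily simplified, compilable sketch of that mechanism; it paraphrases the idea and is not kernel source (every name in it is an illustrative stand-in):

```c
/* Heavily simplified sketch of the iowait coupling; NOT kernel code. */
#include <stdio.h>

enum { CPUTIME_IDLE, CPUTIME_IOWAIT };
static unsigned long long cpustat[2];
static int nr_iowait;                 /* sleepers marked as in-iowait */

/* Effect 1: while such a task sleeps, idle ticks on its CPU are charged
 * to iowait instead of idle -- the stat change people noticed. */
static void account_idle_tick(void)
{
    if (nr_iowait)
        cpustat[CPUTIME_IOWAIT]++;
    else
        cpustat[CPUTIME_IDLE]++;
}

/* Effect 2: on wakeup, the same flag tells the cpufreq governor to boost
 * frequency -- the QD1 optimisation that motivated the patch. */
static void wake_iowait_task(void)
{
    nr_iowait--;
    /* cpufreq governor would apply its iowait boost here */
}

int main(void)
{
    nr_iowait++;            /* task blocks waiting for io_uring CQEs */
    account_idle_tick();    /* a tick elapses: accounted as iowait   */
    wake_iowait_task();     /* completion arrives: boost + unmark    */
    printf("iowait ticks: %llu\n", cpustat[CPUTIME_IOWAIT]);
    return 0;
}
```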
IMHO, it doesn't make sense apart from maybe some weird IOPOLL + normal ring cases. However, sometimes it happens (unfortunately). For instance, when a library / framework you use has io_uring support inside, and then the app creates another ring for its own purposes.
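On the ring-per-thread pattern mentioned above, here is a minimal sketch of that architecture (an assumed shape, not taken from helio): every thread owns a private ring, so nothing ring-related is ever shared across threads. Build with `-luring -lpthread`.

```c
/* Minimal ring-per-thread sketch: each worker creates and owns its ring. */
#include <liburing.h>
#include <pthread.h>
#include <stdio.h>

#define NR_THREADS 4

static void *worker(void *arg)
{
    struct io_uring ring;
    int ret;

    (void)arg;
    /* Every thread sets up its own ring; no cross-thread sharing. */
    ret = io_uring_queue_init(64, &ring, 0);
    if (ret < 0) {
        fprintf(stderr, "queue_init: %d\n", ret);
        return NULL;
    }

    /* ... per-thread submit/wait loop would go here ... */

    io_uring_queue_exit(&ring);
    return NULL;
}

int main(void)
{
    pthread_t threads[NR_THREADS];

    for (int i = 0; i < NR_THREADS; i++)
        pthread_create(&threads[i], NULL, worker, NULL);
    for (int i = 0; i < NR_THREADS; i++)
        pthread_join(threads[i], NULL);
    return 0;
}
```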
There will be no register iowait; the current pending fixes are here: https://git.kernel.dk/cgit/linux/log/?h=iowait.2 and will be posted for review soon, so they can get into the 6.10 kernel.
@axboe @isilence |
Ready for testing? Which branch: for-6.10/io_uring, for-6.10/block, or for-next? Thanks.
emmm.... any updates? |
still happening for us too |
Nothing has been merged yet, as it's a low-priority reporting issue. However, there is interest in it for some other reasons, and it's in the backlog. It will hopefully get picked up soon.
Original report:
After this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v6.1.39&id=f32dfc802e8733028088edf54499d5669cb0ef69
a running io_uring instance causes one CPU to show 100% io-wait usage; in my environment, 8 rings/threads make 8 CPUs show 100% io-wait. Besides this, nothing else seems wrong.
Reverting this commit from 6.1.39 makes everything OK again.
On 6.1.51/6.5.1 it is the same, but the commit cannot be reverted directly there, so that has not been tested yet.
All of the following kernels got the same performance result:
6.1.38
6.1.39
6.1.39(reverted)
6.1.51
To reproduce:
Use this testing program: https://github.com/axboe/liburing/files/9571382/issue-643-v2.zip. Build it and just run the server; it will show 100% io-wait on one CPU.
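In case the zip link goes stale, here is a minimal sketch of the same symptom (an assumption, not the linked issue-643-v2 program): a task that blocks in io_uring_wait_cqe() with nothing in flight is exactly what gets accounted as io-wait on affected kernels. Build with `gcc repro.c -luring`, run it, and watch per-CPU iowait in top or vmstat from another shell.

```c
/* Minimal sketch: block in io_uring_wait_cqe() with nothing submitted.
 * On affected kernels, the CPU this task sleeps on reports ~100% io-wait
 * even though it is idle. */
#include <liburing.h>
#include <stdio.h>

int main(void)
{
    struct io_uring ring;
    struct io_uring_cqe *cqe;
    int ret;

    ret = io_uring_queue_init(8, &ring, 0);
    if (ret < 0) {
        fprintf(stderr, "queue_init: %d\n", ret);
        return 1;
    }

    /* Nothing was submitted, so this blocks indefinitely; the sleeping
     * task is what shows up as io-wait in the CPU stats. */
    ret = io_uring_wait_cqe(&ring, &cqe);

    io_uring_queue_exit(&ring);
    return ret ? 1 : 0;
}
```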