
io-wait used 100% cpu after 6.1.39 and 6.5.1 #943

Open
beldzhang opened this issue Sep 4, 2023 · 23 comments

Comments

@beldzhang

After this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v6.1.39&id=f32dfc802e8733028088edf54499d5669cb0ef69
a running io_uring instance causes one CPU to show 100% io-wait usage; in my environment, 8 rings/threads put 8 CPUs at 100% io-wait. Apart from that, nothing else looks wrong.

Reverting this commit from 6.1.39 makes everything OK again.
6.1.51/6.5.1 behave the same, but the commit does not revert cleanly there, so I have not tested the revert on those yet.

All of the following tests gave the same performance result:
6.1.38
6.1.39
6.1.39(reverted)
6.1.51

To reproduce:
Build the test program from https://github.com/axboe/liburing/files/9571382/issue-643-v2.zip and just run the server; it will show 100% io-wait on one CPU.

@redbaron

redbaron commented Sep 5, 2023

Isn't IO wait time exactly what you want to see? That is the time the app is blocked waiting on I/O (at least that was the case for file I/O) while the CPU has nothing else to do, so that time is accounted as IO wait.

@isilence
Collaborator

isilence commented Sep 5, 2023

Note that io-wait doesn't burn CPU cycles; the task is sleeping, so it's not a problem apart from the reporting. There was a change that did this, and I think it actually makes more sense to report a task waiting for io_uring completions as io-wait.

@axboe
Owner

axboe commented Sep 5, 2023

Yes, this is expected. iowait literally just means "waiting on IO", which is what the task is doing. It does NOT mean the CPU is busy 100% of the time; in fact, 100% iowait means one task is sleeping waiting on IO 100% of the time.
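This definition can be checked directly against the kernel's accounting. A minimal sketch in Python that parses a /proc/stat-style sample (the field layout is from proc(5); the sample numbers are made up for illustration):

```python
# Per-CPU counters live in /proc/stat. Each "cpu" line lists ticks as:
# user nice system idle iowait irq softirq steal guest guest_nice,
# so iowait is the 5th numeric field. A task sleeping in io_uring_enter()
# with iowait accounting enabled bumps that counter even though the CPU
# itself is idle.
def iowait_ticks(stat_text):
    """Map each 'cpu*' line in /proc/stat-style text to its iowait ticks."""
    ticks = {}
    for line in stat_text.splitlines():
        if line.startswith("cpu"):
            name, *fields = line.split()
            ticks[name] = int(fields[4])  # iowait column
    return ticks

sample = (
    "cpu  1000 0 500 90000 30000 0 10 0 0 0\n"
    "cpu0 1000 0 500 90000 30000 0 10 0 0 0\n"
    "intr 12345\n"
)
print(iowait_ticks(sample))  # {'cpu': 30000, 'cpu0': 30000}
```

Tools like top and mpstat report the delta of this counter over an interval as %wa, which is why a single sleeping waiter can show as a CPU "at 100% io-wait" while the load average stays near zero.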

@beldzhang
Author

beldzhang commented Sep 5, 2023

> Yes this is expected. iowait literally just means "waiting on IO", which is what the task is doing. It does NOT mean that it's busy 100% of the time, in fact if you have 100% iowait it means 1 task is sleeping waiting on IO 100% of the time.

I noticed that when this happens, the system load is only 0.0x.

We use io_uring in a storage service, so we are very sensitive to storage/network load, and previously io-wait was a good indicator for that.
Before kernel 5.0, iostat's %util value could be used as well, but since 5.0 that number easily hits 100% even under a small load; many articles also say it is no longer reliable.
Is there any other way to check the disk load?

Also, I calculate a performance score after each test based on total CPU usage; just ignoring the io-wait part does not look like a good solution...
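For context on why %util became unreliable: iostat derives it from /proc/diskstats, where one field is the cumulative milliseconds the device spent with at least one I/O in flight. A minimal sketch of that calculation (the helper name and sample numbers are made up for illustration):

```python
# Sketch of where iostat's %util comes from: /proc/diskstats exposes, per
# device, a cumulative "time spent doing I/Os (ms)" counter. %util is the
# delta of that counter over the sampling interval. Because the counter
# only measures wall time with >=1 request in flight, a device serving
# many requests in parallel pins at 100% long before it is saturated,
# which is why the thread calls the metric unreliable on modern devices.
def util_percent(ms_io_before, ms_io_after, interval_ms):
    busy = ms_io_after - ms_io_before
    return min(100.0, 100.0 * busy / interval_ms)

# A device busy for 250 ms of a 1000 ms window reads as 25% utilised...
print(util_percent(1000, 1250, 1000))  # 25.0
# ...while a device kept busy by a steady trickle of overlapping
# requests reads as 100% regardless of its remaining headroom.
print(util_percent(1000, 2000, 1000))  # 100.0
```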

@beldzhang
Author

Looks like there are no more comments; closing.

@beldzhang beldzhang closed this as not planned Sep 13, 2023
@romange
Contributor

romange commented Nov 16, 2023

@axboe, one of Dragonfly's users also reported this as a behavioural change: Dragonfly, which does not use disk IO, bumps the IOWAIT metric up to 100%. If they run it using the epoll API, IOWAIT is not affected. I am just double-checking whether this change is indeed intended.

@rickytato

I see the same issue with kernel 6.5.11 (on Proxmox).

@beldzhang
Author

emmm... reopen?

@RX14

RX14 commented Feb 22, 2024

iowait has traditionally been thought of as "waiting for disk IO", which will always complete. Since io_uring can be used to wait on the network, which has unbounded waiting times, this changes the metric considerably. For example, many monitoring systems alert on high iowait, assuming (correctly or not) that it is a proxy for disk contention.

@axboe axboe reopened this Feb 22, 2024
@axboe
Owner

axboe commented Feb 22, 2024

Here's what I think we should do:

  1. Default to not using iowait, as it is indeed somewhat confusing for networked or mixed network/storage workloads. I do think iowait is an awful metric that makes very little sense for async workloads, even just pure storage based ones. Lots of consumers will assume it's busy time, or has a direct correlation with disk usage, which is just wrong.
  2. Add an IORING_ENTER_IOWAIT flag that can be used in conjunction with IORING_ENTER_GETEVENTS. If set, iowait will be used. Storage can use this, if they so wish.
  3. Add an IORING_FEAT_IOWAIT flag, which tells the app/liburing that this feature is available.
  4. Add liburing helpers, e.g. io_uring_set_iowait() and io_uring_clear_iowait(), which can be used to toggle this flag if IORING_FEAT_IOWAIT is set. Storage-based workloads can set this.

And that should be it. That gives the app control over whether iowait should be used or not.

@beldzhang
Author

Will test when available.

@isilence
Collaborator

@axboe, I just mentioned it on the mailing list: even though I don't understand why people are taken aback by high iowait from io_uring waits, I think we should just revert that change; there have been too many reports from different people about this one. We should be able to keep the optimisation that motivated the change without reporting iowait.

@axboe
Owner

axboe commented Feb 24, 2024

We can't just revert it, as it solved a real problem. I have my doubts that we can separate the cpufreq side from iowait in a way that would make the scheduler side happy. If we can, I'd be all for it, and would love to see a patch.

@romange
Contributor

romange commented Mar 15, 2024

A great discussion about this topic on lore.kernel.org.

Just so we understand: once we call io_uring_register_iowait, it will flag networking I/O as iowait, but io_uring will run more efficiently?

Another interesting comment I read is about multiple rings. Currently https://github.com/romange/helio has a ring-per-thread architecture. @axboe, are you saying that it sometimes makes sense to have two rings? For which use cases?

@isilence
Collaborator

> A great discussion about this topic on lore.kernel.org.
>
> Just for us to understand, once we call io_uring_register_iowait, it will flag networking I/O as iowait but iouring will run in more efficient manner?

The long story: there has been a patch upstream for a while that does two unrelated things. First, it enables a cpufreq-governor optimisation that is useful for QD1 (and not only); second, it changes the io-wait stat behaviour as described in this thread. They are coupled together for implementation reasons, because it's much easier that way. So the optimisation is already in the kernel and always enabled; call it a free lunch. The io_uring_register_iowait() patch would disable the optimisation by default and turn it back on only when you call the function.

I have to say it's quite a horrendous approach: an optimisation with user-visible side effects, mixing responsibilities and the levels at which the feature is enabled and the iowait stat is observed, and so on. I think the register_iowait patch should never see the light of day, at least as long as it mixes these things together.

@isilence
Collaborator

> Another interesting comment I read is about multiple rings. Currently https://github.com/romange/helio has ring-per-thread architecture. @axboe are you saying that sometimes it makes sense to have two rings? For what use-cases it makes sense?

IMHO, it doesn't make sense, apart maybe from some weird IOPOLL + normal ring cases. However, sometimes it happens (unfortunately): for instance, when a library or framework you use has io_uring support inside, and the app then creates another ring for its own purposes.

@axboe
Owner

axboe commented Mar 15, 2024

There will be no register_iowait; the current pending fixes are here:

https://git.kernel.dk/cgit/linux/log/?h=iowait.2

and will be posted for review soon, so they can get into the 6.10 kernel.

@beldzhang
Author

> https://git.kernel.dk/cgit/linux/log/?h=iowait.2

@axboe
Briefly tested: the io-wait is gone; will keep following up. Also following the mailing list thread.

@isilence
End users are sensitive to latency and high server response times, and for sysadmins, io-wait and load directly show what is going on. The storage parts are generally the slowest in the whole system; users/admins don't care whether the waiting on an I/O read/write is sync or async, they just want to know how loaded the entire server is.
I already removed the iostat %util display because it's meaningless now, but 100% io-wait on a CPU from io_uring terrifies a lot of users/admins.

@beldzhang
Author

Ready for testing; should I test for-6.10/io_uring, for-6.10/block, or for-next? Thanks.

@beldzhang
Author

emmm.... any updates?

@solarvm

solarvm commented Jul 22, 2024

Still happening for us too.

@isilence
Collaborator

Nothing has been merged yet, as it's a low-priority reporting issue. However, there is interest in it for some other reasons, and it's in the backlog; it will hopefully get picked up soon.

@beldzhang
Author

Tested the iowait.4 branch: no iowait usage. Detailed testing is pending; no regressions so far.


8 participants