rt: Tolerate slow task polls #6315
It may be worth trying to further define what "well-behaved application behavior" actually is, as it's probably not immediately obvious to some developers (or me). I don't know how long tasks are reasonably allowed to execute, or what set of circumstances may lead to degraded performance. Anecdotally, I come from working on JVM-based webservices, and have some implicit fear of GC. For the sake of argument, let's say a typical GC pause may be 20ms. Thus, I'd draw the line at something like a task taking 20ms to complete. Does this mean that due to the pathological nature of the scheduler, accidentally running a task that takes 20ms could result in my tokio/rust application pausing for longer than GC pauses we typically see in the JVM world?
Last thought for now: the problem is very apparent if you run a very lengthy task (something that takes 1000ms), as demoed in the past GitHub issues. However, is the problem still present, just at a harder-to-measure level, with tasks that take 10us? The point being that if this actually impacts tasks both long and short, then this issue does represent a performance issue for reasonably well-behaved applications. That is, what is the overall system throughput penalty resulting from running a spurious task that takes 1us, or 10us, or 100us, or 1ms, or 10ms, or 100ms, or 1000ms? And what is the throughput penalty of the various fixes posed in the previous GitHub issues?
Yes, 20ms is definitely too long. In my article on blocking, I recommend 10μs to 100μs as an upper limit. In general, there are three ways that long polls can cause degraded performance:
In the first case, there is nothing we can do. The second case is tracked by #4941. This issue tracks the third case. In all other cases, work stealing will kick in and move tasks to a different worker so that they do not have to wait for the blocking future. The way that this issue gets triggered is as follows:
An important distinction here is that work-stealing is working correctly. There's only a single ready-to-run task on the entire runtime, so there is no work stealing that could happen. The problem is that the tasks will not get marked as ready until the runtime next asks the OS for IO events. None of this happens if there are other active workers, because they will take over IO events when they next become idle. This is why you can make your own monitor thread by regularly spawning tasks on the runtime. A silver lining of the above is that the issue can only really happen when all threads are idle and a single incoming IO event arrives, so it will not happen while the runtime is under load.
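As a rough illustration of the monitor-thread workaround described above (a sketch under assumptions, not a Tokio API): a plain OS thread periodically spawns a no-op task onto the runtime, and waking an idle worker this way gives it a chance to take over the IO driver even while another worker is stuck in a long poll. The helper name `spawn_heartbeat` and the 10ms interval are made up for the example.

```rust
use std::{thread, time::Duration};
use tokio::runtime::Handle;

/// Illustrative only: periodically spawn a trivial task onto the runtime so
/// that an idle worker wakes up and gets a chance to poll the IO driver.
fn spawn_heartbeat(handle: Handle) -> thread::JoinHandle<()> {
    thread::spawn(move || loop {
        // The task itself does nothing; spawning it is what wakes a worker.
        handle.spawn(async {});
        thread::sleep(Duration::from_millis(10));
    })
}

#[tokio::main]
async fn main() {
    // Detach the monitor thread for the lifetime of the process.
    let _monitor = spawn_heartbeat(Handle::current());
    // ... the rest of the application runs as usual ...
}
```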
Hey -- I just ended up being affected by #4730 -- in my case, only 2 out of the 8 Tokio threads I had available were in deadlock, but the remaining 6 were not making progress either.
Idea: in addition to having this "monitor thread" automatically spawning tasks on the runtime, would it make sense to allow manually signaling the runtime to wake up or reshuffle tasks/workers so things can move again when it gets stuck like this? This wouldn't solve the issue definitively, but it would allow testing whether a currently not-well-behaved program is only partially deadlocked or entirely deadlocked. I can see something like this being valuable while debugging.
If I remember the discussion correctly, there was more than one solution being considered. Eventually, as implementations come about, they would just be gated behind
Does the current-thread runtime with a LocalSet suffer from issues 2 and 3, @Darksonn?
@inevity The current-thread runtime can only suffer from case 1. After all, if you are blocking one thread, you are blocking all of them. (The current-thread runtime has no LIFO slot.)
You can call But I want to clarify that there is no shuffling of tasks that can happen here. It's not like there are a bunch of tasks on those two threads that could run if they were just moved to a different worker thread. Other than your blocking tasks, all other tasks on your runtime are considered idle. It's just that if you polled the IO driver, then that would result in more tasks being woken. But since the IO driver isn't being polled, those tasks are considered idle, and that is why the runtime isn't polling them.
I had a thought on this: what if we had a mode (or a way to mark tasks) as poorly behaved? Each poll would run as if the user had wrapped the code with block_in_place. There are none of the normal block_in_place risks here: we know nothing else is going to make progress on that task anyway while a single poll is in progress. This obviously would degrade performance for some applications, but it seems like a potential thing to try for a more tolerant Tokio. In some experiments, even 200us polls benefited from this behavior.
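For reference, this is roughly what doing it manually at a call site looks like today with `tokio::task::block_in_place`; the proposal above would in effect apply this automatically to marked tasks. The names `expensive_step` and `handle_request` are hypothetical, and `block_in_place` only works on the multi-threaded runtime.

```rust
use tokio::task;

// Hypothetical CPU-heavy step that can take hundreds of microseconds or more.
fn expensive_step(input: &[u8]) -> u64 {
    input.iter().map(|&b| b as u64).sum()
}

async fn handle_request(payload: Vec<u8>) -> u64 {
    // block_in_place hands this worker's queued tasks off to other workers
    // before running the closure, so the long synchronous section does not
    // hold up unrelated tasks. It panics on the current-thread runtime.
    task::block_in_place(|| expensive_step(&payload))
}
```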
Currently, Tokio relies on well-behaved application behavior, including tasks completing their polls in a reasonable time. When user tasks are not well-behaved, the application can suffer high latencies due to the Tokio scheduler being blocked by the user task and unable to process other tasks.
Fundamentally, because Tokio is an asynchronous task scheduler, it requires well-behaved user tasks. Any attempt to mitigate poorly behaved tasks will most likely be less effective than an OS's thread scheduler. That said, it is still valuable to be able to detect and mitigate these poorly behaved tasks.
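For context, a minimal sketch of what "well-behaved" usually means in practice with the tools that already exist: long synchronous work goes to the blocking pool via `spawn_blocking`, and long loops yield periodically with `yield_now`. The function names and the yield interval here are illustrative, not recommendations from this issue.

```rust
use tokio::task;

// Hypothetical CPU-bound work that would otherwise block a worker thread.
fn compress(data: Vec<u8>) -> Vec<u8> {
    data
}

async fn process(data: Vec<u8>) -> Vec<u8> {
    // Run the heavy work on the blocking pool so scheduler workers stay free.
    task::spawn_blocking(move || compress(data))
        .await
        .expect("blocking task panicked")
}

async fn sum_all(items: Vec<u64>) -> u64 {
    let mut total = 0;
    for (i, item) in items.into_iter().enumerate() {
        total += item;
        // Yield every so often so the loop does not become one long poll.
        if i % 1024 == 0 {
            task::yield_now().await;
        }
    }
    total
}
```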
I don't know what the best strategy for this is. My gut is it will be a background process, either in a dedicated background thread or handled by idle workers, that checks that active workers are making consistent progress.
Any solution cannot impact the existing performance of well-behaved apps, so if the best path forward comes with additional overhead, it needs to be opt-in.
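As one hedged sketch of the kind of opt-in, out-of-band check described above (not a proposed design): a watchdog on a plain OS thread periodically spawns a tiny probe task and measures how long the probe takes to run; a large delay suggests that every worker was stuck in a long poll or that the runtime is saturated. The name `watchdog`, the threshold, and the probe interval are arbitrary.

```rust
use std::{thread, time::{Duration, Instant}};
use tokio::{runtime::Handle, sync::oneshot};

/// Illustrative watchdog: report whenever a trivial probe task takes longer
/// than `threshold` to start running on the runtime.
fn watchdog(handle: Handle, threshold: Duration) -> thread::JoinHandle<()> {
    thread::spawn(move || loop {
        let (tx, rx) = oneshot::channel();
        let start = Instant::now();
        handle.spawn(async move {
            let _ = tx.send(());
        });
        // Block this OS thread (not a runtime worker) until the probe ran.
        let _ = rx.blocking_recv();
        let elapsed = start.elapsed();
        if elapsed > threshold {
            eprintln!("runtime workers were unresponsive for {elapsed:?}");
        }
        thread::sleep(Duration::from_millis(100));
    })
}
```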
Moving this issue forward will require the following:
Refs: #4730