Condition variables (again) #3892
Comments
This is most likely a bad assumption. For example, most web servers will cancel a request handler if the remote connection is closed, and I'm guessing this also applies to Rocket. I do not think we can just ignore cancellation.
Well, I think that yak is mostly shaved now. The result so far is here:
Here is how I dealt with the cancellation issue:
I haven't yet revisited #3741, which would make the use rather more convenient. I think Tokio could do with a native condvar which uses the same Baton technique for handling cancellation, but which is not necessarily fair. My runtime-agnostic crate can't optimise which task to wake, but Tokio's condvars could. I think the ability to use different mutexes is important. There are good reasons for choosing both sync and async ones.
I agree that the cancellation problems are avoidable if you allow spurious wakeups. I do like the idea of the
I think we will run into trouble if we attempt to implement a
I think spurious wakeups are part of the spec :-). They don't cause trouble in practical uses of condvars, provided that they aren't frequent enough to be a perf issue.
I didn't find this a problem for
Just because they are part of the specification for
I have poked my tests some more and it turned out that all my test cases were using single-threaded executors. I added a
This is rather disappointing because actually the
Anyway, it can be made to work with
I've verified that with
Hi all, I thought about this some more. As far as I understand, using
It is a little more tricky with multiple waiters. The problem here is that after checking the condition and releasing the lock, the other task might call
Additionally, it might be good to have some kind of helper function/macro to build a waiting loop. I usually write something like this, which seems very verbose:

```rust
loop {
    // Scope MutexGuard to avoid deadlock
    {
        let lock = mutex.lock().await;
        if check_condition(&*lock) {
            break;
        }
    }
    // Note: this is only safe if the other task calls notify_one, not notify_all
    notify.notified().await;
}
```
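For completeness, here is a sketch of a variant that should also cope with broadcast (`notify_waiters`): it registers the waiter with `Notified::enable` (available in recent Tokio versions) before checking the condition, so a notification sent between the check and the await is not lost. `mutex`, `notify` and `check_condition` are the same placeholder names as above:

```rust
use tokio::sync::{Mutex, Notify};

// Sketch only: wait until `check_condition` holds, without losing a
// notification (including notify_waiters) that races with the check.
async fn wait_for_condition<T>(
    mutex: &Mutex<T>,
    notify: &Notify,
    check_condition: impl Fn(&T) -> bool,
) {
    loop {
        // Register this waiter *before* checking the condition, so a
        // notification sent after the check is not missed.
        let notified = notify.notified();
        tokio::pin!(notified);
        notified.as_mut().enable();

        {
            let lock = mutex.lock().await;
            if check_condition(&*lock) {
                return;
            }
        } // guard dropped here, before awaiting

        notified.await;
    }
}
```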
Unrelated: I am also not super happy with the current terminology in
Any thoughts or progress on this? I can help create a new pull request if needed.
I think this is a really important feature for Tokio; at least for my purposes it seems so. It would be wonderful if there were a Tokio-compatible async Condvar. Thanks to @ijackson for showing how this can even be done.
It's probably worth noting there is an MIT/Apache-licensed async Condvar in the async-std crate. It lacks a lot of the wonderful functionality of @ijackson's Condvar, but it could be re-worked and included in Tokio without affecting Tokio's licensing.
I have been using a slightly modified Notify as a condition variable for a while now (similar to the approach I described before). I don't have time right now to make a properly documented pull request, but I opened a draft here so it can be used for future reference: #4668
FYI: I put a very basic implementation that has been working fine for me here.
That's unfortunately a big no-no for me due to the license.
Summary
(Hi. This is my first issue/MR against Tokio, so firstly: Hello and thanks for all the hard work making an impressive system!)
I need a condition variable. (I am updating a program from Rocket 0.4 to 0.5, so it is becoming async. I was using a condvar, and restructuring my program around some other primitive would be awkward and un-idiomatic.)
Condvars are a very general and powerful synchronisation primitive, with relatively programmer-friendly properties. Implementing them by hand is complex and error-prone, and often achieving the same results by another method is too. I think the lack of condvars in the Tokio ecosystem is a serious omission.
Requested solution
A synchronisation primitive with the standard semantics of a condition variable (condvar), as found for example in pthreads or `std::sync::Condvar`.
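For concreteness, the blocking pattern whose async analogue is being requested looks like this (a self-contained illustration using the real `std::sync::Condvar`; the queue is just an example):

```rust
use std::collections::VecDeque;
use std::sync::{Condvar, Mutex};

// The standard wait loop: re-check the condition, because wakeups may be spurious.
fn consume(pair: &(Mutex<VecDeque<u32>>, Condvar)) -> u32 {
    let (queue, condvar) = pair;
    let mut guard = queue.lock().unwrap();
    while guard.is_empty() {
        // wait() atomically releases the lock and blocks; it re-locks before returning.
        guard = condvar.wait(guard).unwrap();
    }
    guard.pop_front().unwrap()
}

fn produce(pair: &(Mutex<VecDeque<u32>>, Condvar), item: u32) {
    let (queue, condvar) = pair;
    queue.lock().unwrap().push_back(item);
    // Wake one waiting consumer; notify_all() would wake every waiter.
    condvar.notify_one();
}
```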
Alternatives and prior history
None of the alternatives seem palatable. I doubt that I can replace a `std::sync::Condvar` with something from `tokio::sync` other than by effectively implementing condvars myself. Even if this is possible, it would involve me doing a complex multithreaded programming proof to convince myself that my approach was correct.

In general, a condvar might be needed because an existing program is being converted to async and uses a `std::sync::Condvar`, or because a program is being converted from C where condvars are available, or because it's just the most convenient primitive for the situation. Expecting every programmer to implement (and prove) condvars individually, perhaps per-project, whenever a condvar is needed, is a really bad idea.
I reviewed the history of this feature request in Tokio. I found:

- `Notify` to notify all waiters

Task cancellation
I don't agree that the issues with task cancellation need to be a blocker. Let me consider `notify_all` and `notify_one` separately.

notify_all
`notify_all` does not need to worry about task cancellation. The guarantee we want is this:

Consider a task W, which holds the mutex and calls `wait`. If any other task N, which has acquired the mutex after W released it as a part of `wait`, calls `notify_broadcast` (i.e. `notify_all`), then W will stop waiting - either because it was cancelled, or because it wakes up with the mutex held and continues.

This applies to all qualifying tasks W.
If there are no waiters, then `notify_all` is a no-op. If all the waiters end up cancelled then it is a no-op. This is all fine.

notify_one
I think in an async context, it only makes sense to `notify_one` if you know that the tasks which are waiting will not be cancelled. After all, in practice, task cancellation is something there are few good ways to defend against in the task itself, and which is difficult to code around in a way that you can be sure is correct. Whereas the whole idea behind `notify_one` is to be able to pass a "baton" or some such, and that depends on the recipient carrying on properly: not only completing the baton exchange, but actually carrying on with the underlying activities.

(In pthreads, a thread might be cancelled after being woken due to pthread's condvar notify. The pthreads docs say that in this case the thread won't eat a condvar notification. This is different to what I am proposing for Tokio. NB that pthread thread cancellation is a nightmare, quite unlike Rust async task cancellation, so people with any sense use it hardly ever, if at all.)
So the guarantee we want is this:

Suppose a task N acquires (and perhaps releases) the mutex, and calls `notify_one`. The candidate tasks W are those which, at the time of N's mutex acquisition, were blocked on `wait`. Not all candidate tasks will remain blocked on `wait`. That is, at least one candidate task will either wake up or have been cancelled.

(If there are no candidate tasks, `notify_one` is a no-op and this is not detected, although it is probably a mishap.)

Or to put it another way, we resolve the cancellation issue like this: if the candidate task which absorbed the notification has been cancelled, the `notify_one` is simply ineffective.

I think the same rule ought to apply to timeouts. The `std` docs are not clear about this, but `pthread_cond_timedwait` can "eat" a notification from `pthread_cond_notify` even if it returns timed out.
Spurious notifications
In #3742 (comment) a qualm was discussed, including this comment:
Spurious wakeups are of course allowed. This is even stated in the docs for `std::sync::Condvar`. (They're not desirable for perf reasons.)

Stupid design sketch
ISTM that it might be possible to cook up a cardboard cutout of a condvar implementation based on `tokio::sync::watch`. Rust's async system makes it possible for a task to obtain the `Receiver::changed` future before disposing of the mutex guard.

The resulting implementation would not have `notify_one`, just `notify_all`. That would be enough for my application. (Of course, implementing `notify_one` as a call to `notify_all` would be correct - it would uphold all the guarantees - but it would not be very useful; I think it would be better to leave it missing than provide such a poor implementation.)
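A rough, untested sketch of what that cardboard cutout might look like (a hypothetical `Condvar` type, not part of `tokio::sync`; broadcast-only, spurious wakeups allowed):

```rust
use tokio::sync::{watch, MutexGuard};

/// Hypothetical broadcast-only condvar built on tokio::sync::watch.
pub struct Condvar {
    tx: watch::Sender<()>,
}

impl Condvar {
    pub fn new() -> Self {
        let (tx, _rx) = watch::channel(());
        Condvar { tx }
    }

    /// Wake every task currently blocked in `wait`.
    pub fn notify_all(&self) {
        // Each send bumps the channel's version, so every receiver created
        // before this call observes a change. The error case (no receivers)
        // just means there were no waiters.
        let _ = self.tx.send(());
    }

    /// Release `guard` and wait for a `notify_all`.
    ///
    /// The receiver is subscribed *before* the guard is dropped, so any
    /// notifier that acquires the mutex after we release it, and then calls
    /// notify_all, will wake us. Callers must re-acquire the mutex and
    /// re-check their condition in a loop.
    pub async fn wait<T>(&self, guard: MutexGuard<'_, T>) {
        let mut rx = self.tx.subscribe();
        drop(guard);
        // changed() only errors if the sender is dropped, which cannot
        // happen while &self is alive.
        let _ = rx.changed().await;
    }
}
```

Because the receiver is created inside `wait`, a `notify_all` with no waiters simply finds no receivers and does nothing, which matches the no-op behaviour described above.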
To have the desired API (ie to avoid needing a separate `&Mutex` passed into `Condvar::wait`) I think something like #3741 will be needed.

Way forward
I need to see this fixed. One way or another I guess I am going to have to find (and double-check) or write an implementation of at least condvar broadcast for Tokio.
Currently I am about half a bottle of wine down and not in any state to think very hard about concurrent code :-).
I will look at this again tomorrow, in particular at the code in #3742, and at the other primitives available in Tokio, and decide what to do next.