Prevent deadlock if sender/receiver is forgotten #49

Merged: 3 commits merged into smol-rs:master on Sep 27, 2022

Conversation

@sbarral (Contributor) commented Aug 9, 2022

Fixes #45.

This is mostly similar to PR #16 by another author, but its motivation is correctness rather than the purported performance advantage. Differences with PR #16:

  • it adds correctness tests for the case of forgotten sender and forgotten receiver,
  • the commit is based on 1.7.0,
  • the listener is not discarded when the future completes: as far as I understand, this would only help in the rare case where a sender/receiver completes after a spurious wake-up (without consuming its notification); if that understanding is correct, the fixed cost of the extra conditional does not carry its weight.

I benchmarked various scenarios and did not observe a meaningful difference in performance.

That being said, we cannot rule out degraded performance in extreme cases. In theory it could happen with a very large number of fast senders combined with a fast receiver loop: a receiver that rapidly empties the channel would generate many notifications, but if the first sender to be polled consumes all the capacity before the remaining senders run, most notifications would be wasted. A symmetric situation could occur with many receivers and a fast sender loop.
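
To make the failure mode concrete, here is a rough sketch of the forgotten-sender scenario (illustrative only, not the PR's actual test code; it assumes the async-channel and futures-lite crates):

use std::{mem, thread, time::Duration};

use futures_lite::future::{block_on, poll_once};

fn main() {
    let (s, r) = async_channel::bounded::<i32>(1);
    block_on(s.send(1)).unwrap(); // the channel is now full

    // A send blocks on the full channel and is then forgotten. Before this PR,
    // the single notification produced by the next recv() is delivered to this
    // forgotten future and then lost.
    let mut forgotten = Box::pin(s.send(2));
    assert!(block_on(poll_once(&mut forgotten)).is_none());
    mem::forget(forgotten);

    // Another sender blocks on the same full channel in a separate thread.
    let sender = thread::spawn(move || block_on(s.send(3)).unwrap());

    // The receiver frees a slot. Before this PR the blocked sender above may
    // never be woken, so the join below can hang; with this PR it completes.
    thread::sleep(Duration::from_millis(100));
    assert_eq!(block_on(r.recv()).unwrap(), 1);
    sender.join().unwrap();
}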

@sbarral (Contributor, Author) commented Aug 9, 2022

Here are the results of the only benchmark that showed significant differences.

This benchmark is a beast, so I can't easily share it (I made it for my own purposes). In a nutshell, it is a "funnel" benchmark: each receiver is connected to 13 senders, each of which sends a very small payload (a usize) in a tight loop, and the receiver does basically nothing except receive these payloads in a tight loop. The benchmark runs 61 such channels simultaneously to ensure that all executor threads are busy at all times. The figures below were obtained with tokio using 4 threads.
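
(For readers who want to try something similar: the following is not the author's benchmark, just a minimal sketch of the same "funnel" shape, assuming tokio with 4 worker threads and async-channel; the constants and output are illustrative only.)

use std::time::Instant;

const CHANNELS: usize = 61;
const SENDERS_PER_CHANNEL: usize = 13;
const MSGS_PER_SENDER: usize = 100_000;

fn main() {
    let rt = tokio::runtime::Builder::new_multi_thread()
        .worker_threads(4)
        .build()
        .unwrap();

    let capacity = 100; // channel capacity; the figures below sweep this from 1 to 10000
    let start = Instant::now();
    rt.block_on(async {
        let mut receivers = Vec::new();
        for _ in 0..CHANNELS {
            let (s, r) = async_channel::bounded::<usize>(capacity);
            // 13 senders per channel, each sending small payloads in a tight loop.
            for _ in 0..SENDERS_PER_CHANNEL {
                let s = s.clone();
                tokio::spawn(async move {
                    for i in 0..MSGS_PER_SENDER {
                        s.send(i).await.unwrap();
                    }
                });
            }
            drop(s); // only the spawned senders keep the channel open
            // One receiver per channel, draining it in a tight loop.
            receivers.push(tokio::spawn(async move {
                while r.recv().await.is_ok() {}
            }));
        }
        for r in receivers {
            r.await.unwrap();
        }
    });
    let total = (CHANNELS * SENDERS_PER_CHANNEL * MSGS_PER_SENDER) as f64;
    println!("{:.3} msg/µs", total / start.elapsed().as_micros() as f64);
}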

Number of messages per microsecond as a function of channel capacity before the commit (higher is better):

        capacity=1        2.390 msg/µs
        capacity=10      13.094 msg/µs
        capacity=100     24.756 msg/µs
        capacity=1000    28.006 msg/µs
        capacity=10000   28.008 msg/µs

The same after the commit:

        capacity=1        2.342 msg/µs
        capacity=10       4.964 msg/µs
        capacity=100     30.280 msg/µs
        capacity=1000    30.839 msg/µs
        capacity=10000   33.092 msg/µs

So it's a mixed bag, really; overall one could even argue that performance is better after the change. But in my opinion "funnel-type" benchmarks are a bit artificial anyway (it is unusual to have both a very fast receiver and very fast senders), so I would not put too much weight on these results.

@unixpickle

I hit the bug that this fixes. It drove me nuts for days and was tough to find. Please merge this!

tests/bounded.rs Outdated

fn ms(ms: u64) -> Duration {
    Duration::from_millis(ms)
}

fn poll_now<F: Future + Unpin>(f: &mut F) -> Poll<F::Output> {
Member:

You could probably just use poll_once from futures_lite here.

Contributor Author:

You are right. I discarded this solution because the future must be kept alive, so it can't be consumed; it didn't occur to me that passing a mutable reference would work just as well, since it implements Future too...
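
(For reference, a small sketch of the point being made; illustrative, not the test's actual code, and it assumes async-channel and futures-lite. poll_once consumes the future it is handed, but since &mut F implements Future whenever F: Future + Unpin, passing a mutable reference polls the future once while keeping it alive for later polls.)

use futures_lite::future::{block_on, poll_once};

fn main() {
    block_on(async {
        let (s, r) = async_channel::bounded::<i32>(1);
        s.send(1).await.unwrap();

        // The channel is full, so this send stays pending after one poll...
        let mut pending = Box::pin(s.send(2));
        assert!(poll_once(&mut pending).await.is_none());

        // ...but the future itself is still alive and can be polled again
        // once the receiver has freed a slot.
        assert_eq!(r.recv().await.unwrap(), 1);
        assert!(poll_once(&mut pending).await.is_some());
    });
}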

@@ -966,14 +965,7 @@ impl<'a, T> Send<'a, T> {
        let msg = self.msg.take().unwrap();
        // Attempt to send a message.
        match self.sender.try_send(msg) {
            Ok(()) => {
Member:

Why did you get rid of this one?

Contributor Author:

Hmm, I'm not totally sure what your question is about.
The same thing is done for the Recv future. The original mechanism was meant to implement a "slow" notification chain in order to limit the number of simultaneous in-flight notifications: only one notification was sent at a time, which made it necessary to unconditionally send another notification when an operation completed, whether that notification was needed or not.
This is no longer necessary since all notifications are now sent, whether or not one is already in flight.
Note that PR #16 did exactly the same thing and the original author seemed to agree it was correct. And all tests pass, which is another indication that it should be OK :-)
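
(Editorial note: the crux here is event-listener's notification semantics. The sketch below is illustrative, not the crate's actual code; it assumes event-listener 2.x, where notify(n) caps the number of in-flight notifications at n, while notify_additional(n) always wakes n more listeners.)

use event_listener::Event;

fn main() {
    let event = Event::new(); // stands in for the channel's send_ops/recv_ops

    let listener_a = event.listen(); // e.g. a blocked (and possibly forgotten) sender
    let listener_b = event.listen(); // another blocked sender

    // Old scheme: at most one notification in flight. A second notify(1) is a
    // no-op while the first is unconsumed, so the woken operation had to
    // forward a notification itself; a forgotten future breaks that chain.
    event.notify(1);
    event.notify(1); // no effect: one notification is already pending

    // New scheme: every freed slot produces its own notification, whether or
    // not one is already in flight, so no forwarding is needed.
    event.notify_additional(1);

    drop((listener_a, listener_b));
}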

unixpickle added a commit to unixpickle/learning-rust that referenced this pull request Sep 24, 2022
I was getting burned by this issue, hanging the scraper:
smol-rs/async-channel#49
@notgull (Member) left a comment

Thanks!

@notgull (Member) commented Sep 27, 2022

@sbarral Do you have any other benchmarks, e.g. basic SPSC or MPSC? I'd like to make sure this PR doesn't regress the common case before I merge it.

@sbarral (Contributor, Author) commented Sep 27, 2022

> @sbarral Do you have any other benchmarks, e.g. basic SPSC or MPSC? I'd like to make sure this PR doesn't regress the common case before I merge it.

The "funnel" benchmark mentioned above is what I would think of as the archetypical MPSC benchmark, and is indeed the one that shows that there is some trade-off.

Truth be told, I didn't really expect any difference in the SPSC case, since there can then be only one registered listener on either side (1 sender / 1 receiver). But just to be sure, I changed the parameters of the "funnel" benchmark to use a single sender, making it effectively SPSC. This test spawns 61 sender/receiver pairs on tokio and measures the throughput.

Number of messages per microsecond as a function of channel capacity on v1.7.1, before the commit (higher is better):

        capacity=1        4.257 msg/µs
        capacity=10      15.226 msg/µs
        capacity=100     22.409 msg/µs
        capacity=1000    24.989 msg/µs
        capacity=10000   24.930 msg/µs

The same after the commit:

        capacity=1        4.353 msg/µs
        capacity=10      18.416 msg/µs
        capacity=100     30.980 msg/µs
        capacity=1000    32.917 msg/µs
        capacity=10000   32.954 msg/µs

So that's a nice surprise... My best guess is that this comes from the deletion of the additional notification you asked about. In the SPSC case this extra notification is never needed, but even when there is no interested listener it introduces a full fence (see here). This would also explain why there is no difference for capacity=1: that case was special-cased before the commit so as to avoid the additional notification.

@notgull (Member) commented Sep 27, 2022

I don't see any reason not to merge this, thanks!

@notgull notgull merged commit 736bc72 into smol-rs:master Sep 27, 2022
@sbarral (Contributor, Author) commented Oct 28, 2022

While there is still time to revert this commit, I wanted to highlight a caveat which I initially missed so that the decision to proceed with this PR (or not) is fully informed.

While I continue to believe that this PR fixes a real flaw, I realize now that the behaviour before this PR was superior in one respect: it guaranteed strict fairness (edit: not totally true, see below) by making sure that senders and receivers were unblocked in the same order they were blocked. After this PR they are still notified in fair order, but nothing guarantees that they will be scheduled fairly. Unfortunately, this strict fairness was also the direct cause of issue #45.

A nice idea is tokio's MPSC strategy, which I only became aware of recently. It sends all notifications right away, as in this PR, and does not strictly enforce fairness, but it only lets a sender push a value if doing so leaves enough free slots for the senders that were notified earlier but not yet scheduled. This prevents deadlocks like the one in issue #45 while still providing fairness for all practical purposes.
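
(Editorial sketch of the permit idea described above, not tokio's actual implementation: the reservation is approximated with tokio's FIFO Semaphore so that a late sender cannot steal a slot from one that was already woken but not yet scheduled; all names here are made up.)

use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use tokio::sync::Semaphore;

struct PermitChannel<T> {
    queue: Mutex<VecDeque<T>>,
    slots: Semaphore, // one permit per free slot, handed out in FIFO order
}

impl<T> PermitChannel<T> {
    fn new(capacity: usize) -> Arc<Self> {
        Arc::new(Self {
            queue: Mutex::new(VecDeque::new()),
            slots: Semaphore::new(capacity),
        })
    }

    async fn send(&self, msg: T) {
        // Reserve a slot before pushing; the permit is "forgotten" so the slot
        // stays reserved until the receiver explicitly frees it.
        self.slots.acquire().await.unwrap().forget();
        self.queue.lock().unwrap().push_back(msg);
    }

    fn try_recv(&self) -> Option<T> {
        let msg = self.queue.lock().unwrap().pop_front()?;
        // Freeing a slot hands the permit to the longest-waiting sender.
        self.slots.add_permits(1);
        Some(msg)
    }
}

Because the semaphore hands out permits in FIFO order, a sender that calls send() later cannot overtake one that is already queued for a permit, which keeps things fair in practice without relying on a notification chain.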

Now, I don't really see a way to implement tokio's strategy with the current architecture, so in practice we will probably have to choose between accepting easy deadlocks and giving up strict fairness.

Edit: worth noting, however, that the behaviour before this PR was not actually strictly fair either: it would still let a non-notified sender/receiver steal a slot from a notified sender/receiver.

Successfully merging this pull request may close these issues: Bounded channel fully deadlocked when sender is forgotten (#45).