Speed-up waker by using uninitialized array #4055
Conversation
Just dropping by since I'm subscribed to notifications. I'm not sure if `wake` could ever panic, but nonetheless this feels very dangerous.
This looks like a potentially very nice optimization, thanks for the PR!
There is some very similar code in Tokio's synchronization primitives, where an array of wakers is initialized with a bunch of `Option`s (in fact, I believe the current code in `scheduled_io` was based on the code in `tokio::sync`). I know this pattern is used in `Notify`, here:

tokio/tokio/src/sync/notify.rs, lines 394 to 397 in 8198ef3:

    const NUM_WAKERS: usize = 32;

    let mut wakers: [Option<Waker>; NUM_WAKERS] = Default::default();
    let mut curr_waker = 0;

and in `batch_semaphore`, here:

tokio/tokio/src/sync/batch_semaphore.rs, line 242 in 8198ef3:

    let mut wakers: [Option<Waker>; 8] = Default::default();

It seems like it should be possible to use the `WakeList` type in `tokio::sync` as well. That way, the synchronization primitives could potentially avoid the overhead of initializing a large array of wakers. It would be fine to make this change in a follow-up branch once this merges, though --- I just thought it would be worth pointing out.
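As a rough illustration of that follow-up, here is a sketch of how a `Notify`-style batch wake loop might look with `WakeList`, assuming the `new`/`can_push`/`push`/`wake_all` API added in this PR. The `notify_all` function, the plain `Vec<Waker>` waiter queue, and the import path are simplifications for illustration, not the actual `Notify` internals:

```rust
use std::task::Waker;

// `WakeList` is the type added in this PR; it lives in tokio's internal
// utilities, so the exact import path here is an assumption.
use crate::util::WakeList;

// Hypothetical batch-waking loop: no `[Option<Waker>; 32]` array is
// initialized up front; wakers are pushed into uninitialized storage
// only as they are dequeued.
fn notify_all(waiters: &mut Vec<Waker>) {
    let mut wakers = WakeList::new();
    loop {
        let mut pushed = false;
        while wakers.can_push() {
            match waiters.pop() {
                Some(waker) => {
                    wakers.push(waker);
                    pushed = true;
                }
                None => break,
            }
        }
        if !pushed {
            // Nothing left to wake.
            break;
        }
        // In the real primitives the waiter-list lock would be released
        // before waking; omitted here for brevity.
        wakers.wake_all();
    }
}
```

Batching like this keeps the buffer bounded (32 entries in the existing code) while still draining an arbitrary number of waiters.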
 fn wake0(&self, ready: Ready, shutdown: bool) {
-    const NUM_WAKERS: usize = 32;
-
-    let mut wakers: [Option<Waker>; NUM_WAKERS] = Default::default();
-    let mut curr = 0;
+    let mut wakers = WakeList::new();
Here's a thought: Are we not protecting against panics here in the wrong way? Should we really be panicking and not waking the other wakers just because someone gave us a waker that emitted a panic? Maybe we should just catch all panics that happen when calling `wake` and ignore them.
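A minimal sketch of what that suggestion could look like, using `std::panic::catch_unwind` with `AssertUnwindSafe` (purely illustrative; this is not what the PR currently does):

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::task::Waker;

// Wake every waker, swallowing any panic a misbehaving user-supplied
// waker might raise, so that the remaining tasks are still notified.
fn wake_all_ignoring_panics(wakers: impl IntoIterator<Item = Waker>) {
    for waker in wakers {
        let _ = catch_unwind(AssertUnwindSafe(move || waker.wake()));
    }
}
```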
What is the typical scenario for a waker panic? Does it invoke polling directly? I'm not familiar with all the internals, but it seems like there should be some validation of the solution. Writing proper test cases that exercise a panicking waker would definitely help.
These wakers are user-supplied and could run literally any code in the `wake` call. That said, a well-behaved waker should never panic.
> Should we really be panicking and not waking the other wakers just because someone gave us a waker that emitted a panic?

Hmm, that's a good point. If one waker panics, failing to wake the others could result in those tasks never being notified.

On the other hand, what's the overhead of adding a `catch_unwind` in this fairly hot loop? Is that worth introducing to handle a case which can only happen if a user-supplied waker is not "well-behaved"?
We can still have the destructor call `wake` on the others without a `catch_unwind`.
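For illustration, a destructor-based approach might look like the sketch below: a guard wakes whatever is left if a panic unwinds out of the loop. The `WakeOnDrop` type is hypothetical, and, as the next comment points out, a panic inside the guard's `Drop` would escalate to an abort:

```rust
use std::task::Waker;

// Hypothetical guard: wakes any remaining wakers when dropped, which also
// runs while unwinding after a waker has panicked.
struct WakeOnDrop(Vec<Waker>);

impl Drop for WakeOnDrop {
    fn drop(&mut self) {
        for waker in self.0.drain(..) {
            // If this `wake` panics while we are already unwinding, the
            // process aborts (double panic); that is the caveat raised below.
            waker.wake();
        }
    }
}

fn wake_all(wakers: Vec<Waker>) {
    let mut guard = WakeOnDrop(wakers);
    while let Some(waker) = guard.0.pop() {
        // A panic here still wakes the rest via the guard's destructor.
        waker.wake();
    }
}
```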
> We can still have the destructor call `wake` on the others without a `catch_unwind`.

A destructor panicking would be even worse, since it could cause a double panic, which would result in an abort.
Can't a user-supplied waker's `Drop` implementation panic as well? That would cause a double panic in the `Drop` implementation even without waking.
Yes, if the wakers panic in the destructor, you can get a double panic and an abort.
It looks like it's absolutely critical not to panic in any of the waker functions. The alternative approach is not to try to fix the erroneous implementation, but to forcefully abort execution.
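As a sketch of that alternative (illustrative only, not something this PR implements), waking could be wrapped so that a panic escaping a user-supplied waker terminates the process instead of unwinding with tasks left unwoken:

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::task::Waker;

// Treat a panicking waker as a fatal bug: abort rather than unwind.
fn wake_or_abort(waker: Waker) {
    if catch_unwind(AssertUnwindSafe(move || waker.wake())).is_err() {
        std::process::abort();
    }
}
```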
I will approve this for now and open a new issue to discuss the question about catching panics.
This commit updates `tokio::sync::Notify` to use the `WakeList` type added in PR #4055. This may improve performance somewhat, as it will avoid initializing a bunch of empty `Option`s when waking. I'd like to make similar changes to `BatchSemaphore`, but this is a somewhat larger change, as the wakers stored in the array are not `Waker`s but an internal type, and permit-assignment operations are performed prior to waking.
This commit updates the internal semaphore implementation (`batch_semaphore.rs`) to use the new `WakeList` type added in PR #4055.
## Motivation

PR #4055 added a new `WakeList` type, to manage a potentially uninitialized array when waking batches of wakers. This has the advantage of not initializing a bunch of empty `Option`s when only a small number of tasks are being woken, potentially improving performance in these cases. Currently, `WakeList` is used only in the IO driver. However, `tokio::sync` contains some code that's almost identical to the code in the IO driver that was replaced with `WakeList`, so we can apply the same optimizations there.

## Solution

This branch changes `tokio::sync::Notify` and `tokio::sync::batch_semaphore::Semaphore` to use `WakeList` when waking batches of wakers. This was a pretty straightforward drop-in replacement.

Signed-off-by: Eliza Weisman <[email protected]>
This release features some performance improvements: tokio has been updated to pick up tokio-rs/tokio#4055, and link-time optimizations have been enabled in release builds. These changes reduce CPU and memory overhead in benchmarks.

Inbound policy enforcement has been updated so that TCP forwarding is interrupted if a policy update revokes a previously-established authorization. New metrics are exposed to reflect how policies are used by the proxy: `inbound_http_authz_{allow,deny}_total` and `inbound_tcp_authz_{allow,deny,terminate}_total`. The proxy's error metrics, `{inbound,outbound}_{http,tcp}_errors_total`, have been updated to include the traffic target. And the `traffic_addr` metric label is augmented by `target_ip` and `target_port` labels to support more flexible Prometheus queries. Inbound TCP metrics now only include a `srv_name` label, as it can't be expected that all inbound connections include authorization labels (hence the new authz metrics). However, all inbound HTTP metrics--except for the HTTP errors metric, which includes only a `srv_name` label--include both `srv_name` and `saz_name` labels. Finally, the inbound and outbound proxies now only export Route-oriented metrics when a ServiceProfile is enabled, preventing redundant metrics from being exported with no differentiating labels.

---

* profiles: Avoid creating a default route stack (linkerd/linkerd2-proxy#1223)
* build(deps): bump arbitrary from 1.0.1 to 1.0.2 (linkerd/linkerd2-proxy#1224)
* build(deps): bump trust-dns-resolver from `f08860c` to `3d0667a` (linkerd/linkerd2-proxy#1225)
* build(deps): bump libc from 0.2.100 to 0.2.101 (linkerd/linkerd2-proxy#1226)
* Enable link-time optimizations (linkerd/linkerd2-proxy#1227)
* build(deps): bump serde_json from 1.0.66 to 1.0.67 (linkerd/linkerd2-proxy#1228)
* build(deps): bump flate2 from 1.0.20 to 1.0.21 (linkerd/linkerd2-proxy#1230)
* build(deps): bump thiserror from 1.0.26 to 1.0.28 (linkerd/linkerd2-proxy#1231)
* build(deps): bump futures from 0.3.16 to 0.3.17 (linkerd/linkerd2-proxy#1232)
* build(deps): bump parking_lot from 0.11.1 to 0.11.2 (linkerd/linkerd2-proxy#1234)
* build(deps): bump trust-dns-resolver from `3d0667a` to `v0.21.0-alpha.2` (linkerd/linkerd2-proxy#1233)
* Rename push_on_response to push_on_service (linkerd/linkerd2-proxy#1235)
* build(deps): bump tokio from 1.10.1 to 1.11.0 (linkerd/linkerd2-proxy#1236)
* metrics: Add `target_ip` and `target_port` labels (linkerd/linkerd2-proxy#1238)
* inbound: Improve policy metrics (linkerd/linkerd2-proxy#1237)
* inbound: Include server labels in tap responses (linkerd/linkerd2-proxy#1239)
* Revert rustc update for release builds
## Motivation

I was exploring the flamegraph of my simple warp benchmark and found that a lot of time was spent in the `wake0` function. After playing with the size of the `wakers` array, I realized that reducing the size increases performance. The array doesn't allocate, so the only remaining cost is the overhead of copying the `Option<Waker>`s during initialization and dropping them on exiting the function.

## Solution

Using `MaybeUninit` removes the need for initialization and avoids the dropping.

On my simple warp plaintext benchmark, the performance increased from

to
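For reference, here is a simplified sketch of the `MaybeUninit`-based idea described above. It is a stripped-down illustration, not the exact `WakeList` implementation added by this PR; in particular, the panic-safety details are only hinted at, and for simplicity it wakes in reverse push order:

```rust
use std::mem::MaybeUninit;
use std::task::Waker;

const NUM_WAKERS: usize = 32;

struct WakeList {
    // Storage is never initialized up front, so constructing a `WakeList`
    // is essentially free compared to building a `[Option<Waker>; 32]`
    // with `Default::default()`.
    inner: [MaybeUninit<Waker>; NUM_WAKERS],
    curr: usize,
}

impl WakeList {
    fn new() -> Self {
        Self {
            // An array of `MaybeUninit` is itself valid while uninitialized.
            inner: unsafe { MaybeUninit::uninit().assume_init() },
            curr: 0,
        }
    }

    fn can_push(&self) -> bool {
        self.curr < NUM_WAKERS
    }

    fn push(&mut self, waker: Waker) {
        debug_assert!(self.can_push());
        self.inner[self.curr] = MaybeUninit::new(waker);
        self.curr += 1;
    }

    fn wake_all(&mut self) {
        // Pop from the end so `curr` stays accurate even if a waker panics;
        // `Drop` then only sees the slots that are still initialized.
        while self.curr > 0 {
            self.curr -= 1;
            let waker = unsafe { self.inner[self.curr].as_ptr().read() };
            waker.wake();
        }
    }
}

impl Drop for WakeList {
    fn drop(&mut self) {
        // Drop only the slots that were actually filled and not yet woken.
        for i in 0..self.curr {
            unsafe { self.inner[i].as_mut_ptr().drop_in_place() };
        }
    }
}
```

The key point is that constructing the list touches no memory, and only the slots that were actually filled are ever dropped.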