Spawn VID with `spawn_blocking` #2369

bfish713 · 2024-01-09T21:54:25Z

This PR:

Spawn the VID calculation on a thread separate from the async executor so that it does not block other tasks. According to this blog post by a Tokio maintainer, https://ryhl.io/blog/async-what-is-blocking/, a task should not go more than ~100 microseconds between awaits. VID disperse takes at least an order of magnitude longer than this, so we should run it on a dedicated thread for blocking work. The blog recommends rayon for CPU-bound tasks, but also says that `spawn_blocking` is OK if we don't have a lot of CPU work.
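For illustration, here is a minimal std-only sketch of the offloading idea. It is not the code in this PR: `expensive_checksum` is a hypothetical stand-in for VID disperse, and `std::thread::spawn` plays the role that `tokio::task::spawn_blocking` plays in an async context, where the returned handle would be awaited instead of joined.

```rust
use std::thread;

/// Hypothetical stand-in for a CPU-heavy step such as VID disperse.
fn expensive_checksum(data: &[u8]) -> u64 {
    data.iter()
        .fold(0u64, |acc, &b| acc.wrapping_mul(31).wrapping_add(u64::from(b)))
}

/// Run the heavy step on its own thread while the caller keeps doing light work.
fn offload_and_continue(data: Vec<u8>) -> (u64, usize) {
    // The closure takes ownership of `data`, just as a `move` closure
    // passed to `spawn_blocking` would.
    let worker = thread::spawn(move || expensive_checksum(&data));

    // The calling thread is free to make progress in the meantime.
    let mut light_tasks_done = 0;
    for _ in 0..3 {
        light_tasks_done += 1;
    }

    // Joining surfaces a panic from the worker, if there was one.
    let checksum = worker.join().expect("worker thread panicked");
    (checksum, light_tasks_done)
}

fn main() {
    let (checksum, done) = offload_and_continue(vec![1, 2, 3]);
    println!("checksum = {checksum}, light tasks completed = {done}");
}
```

The point of the sketch is only that the caller is not stalled while the heavy computation runs; in the executor case, the "light work" is every other task scheduled on the same thread.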

I also increased the async threads to two in the justfile because I believe the CI infra has 2 threads (except maybe for self-hosted). Personally, I think we should remove this restriction entirely.

This PR does not:

Look at other places where we block async tasks with CPU-heavy work. We should probably audit the code for other places where we might go a long time between awaits. Candidates include signature aggregation, QC validation, and serialization.

It also doesn't make the VID disperse run on multiple threads. As mentioned in the task, we should not be dividing up the block on the consensus side. Further parallelization will be done in jellyfish.

Key places to review:

`vid.rs`, where we now spawn a task. Note that the unwrap for the tokio version just passes on any panics that happen in the spawned thread, so it is "safe", i.e., it's not adding a new place where we can panic.
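To make the point about the unwrap concrete, here is a small std-only sketch (an analogue, not the code in `vid.rs`). `run_on_thread` is a hypothetical helper. With `std::thread`, `join()` returns `Err` only if the closure itself panicked; tokio's `spawn_blocking` handle behaves similarly, resolving to a `Result` whose error reports a panic in the closure. Unwrapping that result therefore re-raises a panic that already happened rather than introducing a new one.

```rust
use std::thread;

/// Run `work` on a dedicated OS thread and hand back the join result.
/// `Err` means the closure panicked; joining adds no failure mode of its own.
fn run_on_thread<T, F>(work: F) -> thread::Result<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::spawn(work).join()
}

fn main() {
    // A closure that completes normally yields `Ok`, so `unwrap` is harmless.
    let ok = run_on_thread(|| 2 + 2);
    assert_eq!(ok.unwrap(), 4);

    // A closure that panics yields `Err`; calling `unwrap` here would simply
    // propagate that panic to the caller.
    let failed = run_on_thread(|| -> u32 { panic!("simulated failure inside the worker") });
    assert!(failed.is_err());
}
```

Running this prints the simulated panic message to stderr from the worker thread, but `main` itself finishes normally because the error is only inspected, not unwrapped.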

This should improve the consistency of our CI, since the tests should be able to run faster. I think a key issue with our tests was that all nodes blocked each other from running while the leader calculated the VID disperse and blocked the main executor. In a single-node setting it's probably not so bad, since a leader would probably not have other work to do while it was blocked on the VID calculation.

shenkeyao

LGTM, just a minor comment.

shenkeyao · 2024-01-10T18:34:26Z

crates/task-impls/src/vid.rs

-                let vid_disperse = vid.disperse(encoded_transactions.clone()).unwrap();
-
+                let vid_disperse = spawn_blocking(move || {
+                    let vid = VidScheme::new(chunk_size, num_quorum_committee, &srs).unwrap();


Can we add the note below (copied from the PR description) as a comment here?

Note that the unwrap for the tokio version just passes on any panics that happen in the spawned thread, so it is "safe", i.e., it's not adding a new place where we can panic.

bfish713 added 4 commits January 9, 2024 16:45

spawn vid in new task

27cac14

run on two threads

c18244f

fix tokio

6249b2c

ignore network_task test

2bfbd48

bfish713 requested review from DieracDelta, shenkeyao and rob-maron January 10, 2024 15:45

bfish713 self-assigned this Jan 10, 2024

bfish713 added the optimize-vid label Jan 10, 2024

bfish713 added this to the Sprint 7 milestone Jan 10, 2024

Merge remote-tracking branch 'origin/main' into bf/v/vid-spawn-blocking

0c77163

bfish713 marked this pull request as ready for review January 10, 2024 18:28

bfish713 requested a review from elliedavidson as a code owner January 10, 2024 18:28

shenkeyao previously approved these changes Jan 10, 2024

comment about unwrap

6424215

bfish713 dismissed shenkeyao’s stale review via 6424215 January 10, 2024 18:46

bfish713 requested a review from shenkeyao January 10, 2024 18:46

shenkeyao approved these changes Jan 10, 2024

bfish713 merged commit 1d03326 into main Jan 10, 2024
13 checks passed

bfish713 deleted the bf/v/vid-spawn-blocking branch January 10, 2024 19:15

bfish713 changed the title ~~Spawn VID with spawn_blocking WIP~~ Spawn VID with spawn_blocking Jan 11, 2024
