perf: dynamically batch tx sender recovery #1834
Conversation
82b89a9 to ffe7c3b
Codecov Report
@@ Coverage Diff @@
## main #1834 +/- ##
=======================================
Coverage 73.50% 73.51%
=======================================
Files 410 410
Lines 50515 50527 +12
=======================================
+ Hits 37131 37143 +12
Misses 13384 13384
... and 6 files with indirect coverage changes
gg
.for_each(|result: Result<_, StageError>| {
    let _ = tx.send(result);
});
sending them one by one is totally fine
for chunk in
    &tx_walker.chunks(self.commit_threshold as usize / rayon::current_num_threads())
👍
in hindsight this is kinda obvious
yeah...
The performance regression in the sender recovery stage was caused by effectively queuing ~5000 relatively "fast" jobs, which meant Rayon's worker threads lost a lot of time trying to steal more work.
The solution is to reintroduce batching. For now, batch sizes are derived from the number of worker threads in the Rayon thread pool. This works because we are memory-bound and can't crank the commit threshold very high, so a separate config option for batch sizes doesn't make much sense here. A sketch of the pattern follows.
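To illustrate the idea (this is only a sketch, not the actual reth code: recover_batched, recover_sender, and the transaction tuple shape are hypothetical stand-ins), each Rayon worker gets roughly one chunk per commit window instead of thousands of one-item jobs. With a commit threshold of 5000 and, say, a 16-thread pool, that is 16 chunks of ~312 transactions rather than 5000 tiny jobs, so the work-stealing overhead is paid once per chunk instead of once per transaction.

use std::sync::mpsc;

use itertools::Itertools;

// Sketch of dynamic batching; `recover_sender` and the input shape are placeholders.
fn recover_batched(
    transactions: Vec<(u64, Vec<u8>)>, // hypothetical (tx id, raw tx bytes) pairs
    commit_threshold: u64,
) -> Vec<Result<(u64, [u8; 20]), String>> {
    let (tx, rx) = mpsc::channel();

    // One chunk per Rayon worker per commit window, instead of ~5000 tiny jobs.
    let chunk_size = (commit_threshold as usize / rayon::current_num_threads()).max(1);

    for chunk in &transactions.into_iter().chunks(chunk_size) {
        let chunk: Vec<_> = chunk.collect();
        let tx = tx.clone();
        // One Rayon job per chunk; results are sent back one by one, which is cheap.
        rayon::spawn(move || {
            for (id, raw) in chunk {
                let _ = tx.send(recover_sender(id, &raw));
            }
        });
    }
    drop(tx);

    // Collect results as the workers finish.
    rx.iter().collect()
}

// Placeholder recovery function standing in for the real signature recovery.
fn recover_sender(id: u64, _raw: &[u8]) -> Result<(u64, [u8; 20]), String> {
    Ok((id, [0u8; 20]))
}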
This is a perf grab of the current sender recovery stage:
As we can see here (on the top right), almost 50%(!) of the time is spent trying to get more work.
Compare this with this PR:
We spend almost no time trying to get more work.
In practice, sender recovery now feels snappy again: before, it took roughly 20-30s per 5k blocks; now it takes about 3-4s.