
Make ReplayStage own the threadpool for tx replay #190

Merged
merged 4 commits into anza-xyz:master from rs_own_tx_replay_tpool on Mar 12, 2024

Conversation

@steviez steviez commented Mar 11, 2024

Problem

The threadpool used to replay multiple transactions in parallel is
currently global state via a lazy_static definition. Making this pool
owned by ReplayStage will enable subsequent work to make the pool
size configurable on the CLI.

More context in #105

Summary of Changes

Make ReplayStage create the threadpool and pass it down to blockstore_processor
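At a high level, the ownership change looks roughly like the sketch below. The names and signatures are illustrative rather than the exact agave code, but the idea is that ReplayStage builds a rayon pool once and passes a reference down, instead of confirm_slot() reaching for a lazy_static global.

use rayon::{ThreadPool, ThreadPoolBuilder};

struct ReplayStage {
    replay_tx_thread_pool: ThreadPool,
}

impl ReplayStage {
    fn new(num_threads: usize) -> Self {
        let replay_tx_thread_pool = ThreadPoolBuilder::new()
            // Follow-up work can plumb this value in from a CLI argument.
            .num_threads(num_threads)
            .build()
            .expect("new rayon threadpool");
        Self { replay_tx_thread_pool }
    }

    fn replay_active_banks(&self) {
        // The pool is now an explicit argument instead of a global.
        confirm_slot_sketch(&self.replay_tx_thread_pool);
    }
}

// Stand-in for blockstore_processor::confirm_slot(), which now takes the pool.
fn confirm_slot_sketch(replay_tx_thread_pool: &ThreadPool) {
    replay_tx_thread_pool.install(|| {
        // ... replay the slot's transactions in parallel here ...
    });
}

fn main() {
    let replay_stage = ReplayStage::new(8);
    replay_stage.replay_active_banks();
}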

steviez added 3 commits March 11, 2024 14:26
This item is currently global state via a lazy_static in
blockstore_processor.rs. Making it owned by ReplayStage sets up later
work to allow configuring the size of the pool via the CLI.
@steviez steviez (Author) left a comment

This PR maintains the existing thread-pool size for non-test cases. That being said, I noticed this line, which will probably need to be evaluated when the change to reduce this threadpool size comes up:

let target_batch_count = get_thread_count() as u64;

@@ -978,9 +991,11 @@ fn verify_ticks(
Ok(())
}

#[allow(clippy::too_many_arguments)]
@steviez steviez (Author) commented Mar 11, 2024

These functions as well as ReplayStage are in need of some cleanup. One easy win I think would be packing all the stuff that only gets accessed by the bottom layer of the call-stack (execute_batch()) into a struct and unpacking the struct in the bottom layer instead of passing each individual piece in every function.
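
Roughly something like the sketch below; the field names and types are placeholders I'm using for illustration, not the actual execute_batch() signature.

// Placeholder types standing in for the pieces that currently get passed
// through every layer of the call-stack individually.
#[allow(dead_code)]
struct StatusSender;
#[allow(dead_code)]
struct VoteSender;

#[allow(dead_code)]
struct ExecuteBatchContext<'a> {
    status_sender: Option<&'a StatusSender>,
    vote_sender: Option<&'a VoteSender>,
    log_messages_bytes_limit: Option<usize>,
}

struct TransactionBatch {
    id: u64,
}

// The intermediate layers just thread the context through unchanged...
fn process_batches(ctx: &ExecuteBatchContext<'_>, batches: &[TransactionBatch]) {
    for batch in batches {
        execute_batch(ctx, batch);
    }
}

// ...and only the bottom layer unpacks the individual pieces it needs.
fn execute_batch(ctx: &ExecuteBatchContext<'_>, batch: &TransactionBatch) {
    if ctx.log_messages_bytes_limit.is_some() {
        // e.g. truncate log output for this batch
    }
    println!("executing batch {}", batch.id);
}

fn main() {
    let ctx = ExecuteBatchContext {
        status_sender: None,
        vote_sender: None,
        log_messages_bytes_limit: Some(10_000),
    };
    process_batches(&ctx, &[TransactionBatch { id: 0 }, TransactionBatch { id: 1 }]);
}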

That being said, I would like to defer that work until after this PR. Soon enough, work on this project will be mostly experiments and data collection. At that point, I'll have more "idle" time to come back and perform this cleanup if nothing more pressing comes up.


I'll have more "idle" time

you wish

@@ -871,10 +882,12 @@ pub fn process_blockstore_from_root(
.meta(start_slot)
.unwrap_or_else(|_| panic!("Failed to get meta for slot {start_slot}"))
{
let replay_tx_thread_pool = create_thread_pool(get_max_thread_count());
@steviez steviez (Author) commented

Another pro of this change is that we'll be able to configure this value to be a different size for startup and steady-state. That's good because startup involves replaying entire blocks that are already in the ledger, so a higher amount of parallelism may be warranted versus steady-state, when blocks are gradually streamed in.


Makes sense. Looks like we're using 1 thread per core already, so the opportunity seems to be dialing this down for steady state.

@steviez steviez (Author) commented

so opportunity seems to be dialing this down for steady state

Exactly - we already have a good bit of data in #25 to support this being the correct decision.

@steviez steviez requested review from t-nelson and bw-solana March 11, 2024 20:45
@bw-solana bw-solana left a comment

LGTM

@t-nelson

should we not loft this even higher so banking stage can use the same pool while replay is idle?

@steviez steviez (Author) commented Mar 12, 2024

should we not loft this even higher so banking stage can use the same pool while replay is idle?

Maybe? If I recall correctly, I thought BankingStage started doing some work prior to its leader slots actually starting, something like organizing transactions so that things are primed when the leader slots actually start. If that is correct, sharing the pool could have some adverse effects, i.e. slowing down replay just prior to leader slots / slowing down the BankingStage "priming".

I can follow up with someone in the morning. I also already plan on attempting to use the same pool for entry verification and tx replay; that should be a pretty good gain. Comparatively, BankingStage isn't as abusive on thread counts (can't say the same for the tx ingestion pipeline), so we could also keep things separate for now to claim the low-hanging fruit, and then circle back to condense things further.

@steviez steviez (Author) commented Mar 12, 2024

should we not loft this even higher so banking stage can use the same pool while replay is idle?

I can follow up with someone in the morning...

I talked to @apfitzge, and the short answer is yes: there appears to be some work that BankingStage does prior to leader slots actually starting. I will defer to him for more detail, but given that, I'm inclined to push this PR as-is and pursue any sharing of resources between ReplayStage and BankingStage in the future.

@apfitzge

I think it could probably be done, but we will need to be careful with how we do it. BankingStage still does work before becoming leader:

  1. receiving packets from sigverify
  2. sanitizing transactions
  3. buffering transactions (priority ordered, an O(n log n) operation!)
  4. clean buffer - removes txs that were already processed or expired

We could probably make the ConsumeWorkers in the new central-scheduler mode of banking stage use a shared thread-pool, but not the scheduler thread itself.
Workers should be sleeping (blocking on a crossbeam receive) except when we are leader and the scheduler thread has sent them work.
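
To illustrate the pattern (hypothetical names, not the actual BankingStage code): a worker that blocks on a crossbeam receive, sleeping until the scheduler sends it work, might look roughly like this.

use crossbeam_channel::{unbounded, Receiver};
use std::thread;

// Hypothetical unit of work handed from the scheduler thread to a consume worker.
struct ConsumeWork {
    batch_id: u64,
}

fn spawn_consume_worker(work_receiver: Receiver<ConsumeWork>) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        // recv() blocks, so the worker sleeps and burns no CPU until the
        // scheduler sends work (i.e. while we are not leader).
        while let Ok(work) = work_receiver.recv() {
            println!("consuming batch {}", work.batch_id);
        }
        // Channel disconnected: the scheduler is gone, so the worker exits.
    })
}

fn main() {
    let (work_sender, work_receiver) = unbounded();
    let worker = spawn_consume_worker(work_receiver);

    // The scheduler only sends work during leader slots.
    for batch_id in 0..3 {
        work_sender.send(ConsumeWork { batch_id }).unwrap();
    }
    drop(work_sender); // closing the channel lets the worker shut down
    worker.join().unwrap();
}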

@steviez steviez merged commit 7a144e2 into anza-xyz:master Mar 12, 2024
35 checks passed
@steviez steviez deleted the rs_own_tx_replay_tpool branch March 12, 2024 18:21
codebender828 pushed a commit to codebender828/agave that referenced this pull request Oct 3, 2024
The threadpool used to replay multiple transactions in parallel is
currently global state via a lazy_static definition. Making this pool
owned by ReplayStage will enable subsequent work to make the pool
size configurable on the CLI.

This makes `ReplayStage` create and hold the threadpool which is passed
down to blockstore_processor::confirm_slot().

blockstore_processor::process_blockstore_from_root() now creates its
own threadpool as well; however, this pool is only alive for the scope
of that function and does not persist for the lifetime of the process.
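
A minimal sketch of that scoping, with illustrative names and assuming a rayon pool as elsewhere in the change:

use rayon::ThreadPoolBuilder;

// Hypothetical stand-in for the get_max_thread_count() helper used in the diff.
fn get_max_thread_count() -> usize {
    std::thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
}

fn process_blockstore_from_root_sketch() {
    // The pool is created when startup processing begins...
    let replay_tx_thread_pool = ThreadPoolBuilder::new()
        .num_threads(get_max_thread_count())
        .build()
        .expect("new rayon threadpool");

    replay_tx_thread_pool.install(|| {
        // ... replay the ledger forward from the root slot ...
    });
}   // ...and dropped when the function returns, so it does not outlive startup.

fn main() {
    process_blockstore_from_root_sketch();
}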