Add benchmark for execute_batch #34717
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅
Additional details and impacted files:
@@ Coverage Diff @@
##           master   #34717   +/-  ##
========================================
  Coverage    81.8%    81.8%
========================================
  Files         824      824
  Lines      222394   222394
========================================
+ Hits       181957   182057   +100
+ Misses      40437    40337   -100
let mut timing = ExecuteTimings::default();
bencher.iter({
    let bank = bank.clone();
nit: a single Arc::clone() won't skew the results, but I'd like to remove the .clone() here, mainly for code simplicity:
$ git diff apfitzge/bench_execute_batch
diff --git a/ledger/benches/blockstore_processor.rs b/ledger/benches/blockstore_processor.rs
index b5d83144a6..e0d19853fe 100644
--- a/ledger/benches/blockstore_processor.rs
+++ b/ledger/benches/blockstore_processor.rs
@@ -113,11 +113,12 @@ fn bench_execute_batch(
prioritization_fee_cache,
} = setup(apply_cost_tracker_during_replay);
let transactions = create_transactions(&bank, 2_usize.pow(20));
+ let bank2 = bank.clone();
let batches: Vec<_> = transactions
.chunks(batch_size)
.map(|txs| {
let mut batch =
- TransactionBatch::new(vec![Ok(()); txs.len()], &bank, Cow::Borrowed(txs));
+ TransactionBatch::new(vec![Ok(()); txs.len()], &bank2, Cow::Borrowed(txs));
batch.set_needs_unlock(false);
TransactionBatchWithIndexes {
batch,
@@ -128,20 +129,17 @@ fn bench_execute_batch(
let mut batches_iter = batches.into_iter();
let mut timing = ExecuteTimings::default();
- bencher.iter({
- let bank = bank.clone();
- move || {
- let batch = batches_iter.next().unwrap();
- execute_batch(
- &batch,
- &bank,
- None,
- None,
- &mut timing,
- None,
- &prioritization_fee_cache,
- )
- }
+ bencher.iter(|| {
+ let batch = batches_iter.next().unwrap();
+ execute_batch(
+ &batch,
+ &bank,
+ None,
+ None,
+ &mut timing,
+ None,
+ &prioritization_fee_cache,
+ )
});
}
The Arc::clone wasn't actually happening per iteration, but just once at the creation of the closure, because I marked the closure with move ...which I'm not sure was necessary.
It wasn't! d2dfb14
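As an aside, the refcount behavior discussed here is easy to verify in isolation: a `move` closure clones the Arc once when the closure is created, not on every call, and a borrowing closure needs no clone at all. A minimal sketch (the Vec is just a stand-in for the Arc<Bank>):

```rust
use std::sync::Arc;

fn main() {
    // Stand-in for the Arc<Bank> in the benchmark.
    let bank = Arc::new(vec![1u64, 2, 3]);
    assert_eq!(Arc::strong_count(&bank), 1);

    // The clone for a `move` closure happens once, at closure creation...
    let bank2 = Arc::clone(&bank);
    let closure = move || bank2.len();
    assert_eq!(Arc::strong_count(&bank), 2);

    // ...not per call: repeated invocations leave the refcount unchanged.
    for _ in 0..100 {
        closure();
    }
    assert_eq!(Arc::strong_count(&bank), 2);

    // Without `move`, the closure just borrows `bank`: no clone needed.
    drop(closure);
    let borrowing = || bank.len();
    assert_eq!(borrowing(), 3);
    assert_eq!(Arc::strong_count(&bank), 1);
}
```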
let bank = bank.clone();
move || {
    let batch = batches_iter.next().unwrap();
    execute_batch(
I'd prefer to process the same total number of transactions regardless of batch_size to show the overhead more clearly:
diff --git a/ledger/benches/blockstore_processor.rs b/ledger/benches/blockstore_processor.rs
index e0d19853fe..7ec2b17d97 100644
--- a/ledger/benches/blockstore_processor.rs
+++ b/ledger/benches/blockstore_processor.rs
@@ -130,16 +130,18 @@ fn bench_execute_batch(
let mut timing = ExecuteTimings::default();
bencher.iter(|| {
- let batch = batches_iter.next().unwrap();
- execute_batch(
- &batch,
- &bank,
- None,
- None,
- &mut timing,
- None,
- &prioritization_fee_cache,
- )
+ for _ in 0..(64/batch_size) { // EDIT: well, using `.take()` is preferred...
+ let batch = batches_iter.next().unwrap();
+ execute_batch(
+ &batch,
+ &bank,
+ None,
+ None,
+ &mut timing,
+ None,
+ &prioritization_fee_cache,
+ ).unwrap();
+ }
});
}
result:
running 6 tests
test bench_execute_batch_full_batch ... bench: 740,169 ns/iter (+/- 39,800)
test bench_execute_batch_full_batch_disable_tx_cost_update ... bench: 774,346 ns/iter (+/- 28,495)
test bench_execute_batch_half_batch ... bench: 824,189 ns/iter (+/- 29,661)
test bench_execute_batch_half_batch_disable_tx_cost_update ... bench: 811,608 ns/iter (+/- 20,936)
test bench_execute_batch_unbatched ... bench: 1,381,782 ns/iter (+/- 49,475)
test bench_execute_batch_unbatched_disable_tx_cost_update ... bench: 1,334,157 ns/iter (+/- 88,541)
test result: ok. 0 passed; 0 failed; 0 ignored; 6 measured; 0 filtered out; finished in 42.13s
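Since every bencher iteration now executes the same 64 transactions, the ns/iter figures divide directly into per-transaction cost; a quick back-of-the-envelope over the numbers above (integer division, so only approximate):

```rust
fn main() {
    // ns/iter figures from the bench run above; each iteration now
    // processes 64 transactions regardless of batch size.
    let results = [
        ("full_batch (batch_size = 64)", 740_169u64),
        ("half_batch (batch_size = 32)", 824_189u64),
        ("unbatched  (batch_size = 1)", 1_381_782u64),
    ];
    for (name, ns_per_iter) in results {
        // Integer division is good enough for a rough per-tx comparison.
        println!("{name}: ~{} ns/tx", ns_per_iter / 64);
    }
}
```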
Yeah that's a great idea!
Didn't use take since that will consume the iterator and not let us use it in the next iteration of the benchmark.
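For what it's worth, `Iterator::by_ref` sidesteps the consumption issue: `iter.by_ref().take(n)` borrows the iterator instead of moving it into the adaptor, so the same iterator can be drawn from again on the next bencher iteration. A small sketch with a plain range iterator standing in for batches_iter:

```rust
fn main() {
    let mut iter = 0..10;

    // `by_ref().take(n)` borrows the iterator, so `iter` survives the take:
    let first: Vec<_> = iter.by_ref().take(3).collect();
    assert_eq!(first, vec![0, 1, 2]);

    // The same iterator continues where it left off on the next pass,
    // mirroring what each bencher iteration needs.
    let second: Vec<_> = iter.by_ref().take(3).collect();
    assert_eq!(second, vec![3, 4, 5]);
}
```

The counted loop in the merged version works just as well; this is only the `.take()`-shaped alternative the earlier EDIT alluded to.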
I made a similar change for the consumer benchmarks: #34752
Thanks for the suggestion; no more math needed to compare the throughput.
lgtm; thanks for writing this for unified scheduler.
Problem
Summary of Changes
Add a benchmark for execute_batch with varying batch sizes: 1, 32, 64.
Sample result
Fixes #