Add benchmark for execute_batch #34717
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅
Additional details and impacted files:
@@ Coverage Diff @@
##           master   #34717   +/-  ##
========================================
  Coverage    81.8%    81.8%
========================================
  Files         824      824
  Lines      222394   222394
========================================
+ Hits       181957   182057   +100
+ Misses      40437    40337   -100
let mut timing = ExecuteTimings::default();
bencher.iter({
    let bank = bank.clone();
nit: a single Arc::clone() won't skew the results, but I'd like to remove the .clone() here, mainly for code simplicity:
$ git diff apfitzge/bench_execute_batch
diff --git a/ledger/benches/blockstore_processor.rs b/ledger/benches/blockstore_processor.rs
index b5d83144a6..e0d19853fe 100644
--- a/ledger/benches/blockstore_processor.rs
+++ b/ledger/benches/blockstore_processor.rs
@@ -113,11 +113,12 @@ fn bench_execute_batch(
prioritization_fee_cache,
} = setup(apply_cost_tracker_during_replay);
let transactions = create_transactions(&bank, 2_usize.pow(20));
+ let bank2 = bank.clone();
let batches: Vec<_> = transactions
.chunks(batch_size)
.map(|txs| {
let mut batch =
- TransactionBatch::new(vec![Ok(()); txs.len()], &bank, Cow::Borrowed(txs));
+ TransactionBatch::new(vec![Ok(()); txs.len()], &bank2, Cow::Borrowed(txs));
batch.set_needs_unlock(false);
TransactionBatchWithIndexes {
batch,
@@ -128,20 +129,17 @@ fn bench_execute_batch(
let mut batches_iter = batches.into_iter();
let mut timing = ExecuteTimings::default();
- bencher.iter({
- let bank = bank.clone();
- move || {
- let batch = batches_iter.next().unwrap();
- execute_batch(
- &batch,
- &bank,
- None,
- None,
- &mut timing,
- None,
- &prioritization_fee_cache,
- )
- }
+ bencher.iter(|| {
+ let batch = batches_iter.next().unwrap();
+ execute_batch(
+ &batch,
+ &bank,
+ None,
+ None,
+ &mut timing,
+ None,
+ &prioritization_fee_cache,
+ )
});
}
The Arc::clone wasn't actually happening per iteration, but just once at the creation of the closure, because I marked the closure with move ...which I'm not sure was necessary.
It wasn't! d2dfb14
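As an aside, the refcount behavior discussed here is easy to verify in isolation: a `move` closure clones the Arc once when the closure is created, not on every call, and a borrowing closure needs no clone at all. A minimal sketch (the Vec is just a stand-in for the Arc<Bank>):

```rust
use std::sync::Arc;

fn main() {
    // Stand-in for the Arc<Bank> in the benchmark.
    let bank = Arc::new(vec![1u64, 2, 3]);
    assert_eq!(Arc::strong_count(&bank), 1);

    // The clone for a `move` closure happens once, at closure creation...
    let bank2 = Arc::clone(&bank);
    let closure = move || bank2.len();
    assert_eq!(Arc::strong_count(&bank), 2);

    // ...not per call: repeated invocations leave the refcount unchanged.
    for _ in 0..100 {
        closure();
    }
    assert_eq!(Arc::strong_count(&bank), 2);

    // Without `move`, the closure just borrows `bank`: no clone needed.
    drop(closure);
    let borrowing = || bank.len();
    assert_eq!(borrowing(), 3);
    assert_eq!(Arc::strong_count(&bank), 1);
}
```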
let bank = bank.clone();
move || {
    let batch = batches_iter.next().unwrap();
    execute_batch(
I'd prefer to process the same total number of transactions regardless of batch_size to show the overhead more clearly:
diff --git a/ledger/benches/blockstore_processor.rs b/ledger/benches/blockstore_processor.rs
index e0d19853fe..7ec2b17d97 100644
--- a/ledger/benches/blockstore_processor.rs
+++ b/ledger/benches/blockstore_processor.rs
@@ -130,16 +130,18 @@ fn bench_execute_batch(
let mut timing = ExecuteTimings::default();
bencher.iter(|| {
- let batch = batches_iter.next().unwrap();
- execute_batch(
- &batch,
- &bank,
- None,
- None,
- &mut timing,
- None,
- &prioritization_fee_cache,
- )
+ for _ in 0..(64/batch_size) { // EDIT: well, using `.take()` is preferred...
+ let batch = batches_iter.next().unwrap();
+ execute_batch(
+ &batch,
+ &bank,
+ None,
+ None,
+ &mut timing,
+ None,
+ &prioritization_fee_cache,
+ ).unwrap();
+ }
});
}
result:
running 6 tests
test bench_execute_batch_full_batch ... bench: 740,169 ns/iter (+/- 39,800)
test bench_execute_batch_full_batch_disable_tx_cost_update ... bench: 774,346 ns/iter (+/- 28,495)
test bench_execute_batch_half_batch ... bench: 824,189 ns/iter (+/- 29,661)
test bench_execute_batch_half_batch_disable_tx_cost_update ... bench: 811,608 ns/iter (+/- 20,936)
test bench_execute_batch_unbatched ... bench: 1,381,782 ns/iter (+/- 49,475)
test bench_execute_batch_unbatched_disable_tx_cost_update ... bench: 1,334,157 ns/iter (+/- 88,541)
test result: ok. 0 passed; 0 failed; 0 ignored; 6 measured; 0 filtered out; finished in 42.13s
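Since every bencher iteration now executes the same 64 transactions, the ns/iter figures divide directly into per-transaction cost; a quick back-of-the-envelope over the numbers above (integer division, so only approximate):

```rust
fn main() {
    // ns/iter figures from the bench run above; each iteration now
    // processes 64 transactions regardless of batch size.
    let results = [
        ("full_batch (batch_size = 64)", 740_169u64),
        ("half_batch (batch_size = 32)", 824_189u64),
        ("unbatched  (batch_size = 1)", 1_381_782u64),
    ];
    for (name, ns_per_iter) in results {
        // Integer division is good enough for a rough per-tx comparison.
        println!("{name}: ~{} ns/tx", ns_per_iter / 64);
    }
}
```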
Yeah that's a great idea!
Didn't use take since that will consume the iterator and not let us use it in the next iteration of the benchmark.
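For what it's worth, `Iterator::by_ref` sidesteps the consumption issue: `iter.by_ref().take(n)` borrows the iterator instead of moving it into the adaptor, so the same iterator can be drawn from again on the next bencher iteration. A small sketch with a plain range iterator standing in for batches_iter:

```rust
fn main() {
    let mut iter = 0..10;

    // `by_ref().take(n)` borrows the iterator, so `iter` survives the take:
    let first: Vec<_> = iter.by_ref().take(3).collect();
    assert_eq!(first, vec![0, 1, 2]);

    // The same iterator continues where it left off on the next pass,
    // mirroring what each bencher iteration needs.
    let second: Vec<_> = iter.by_ref().take(3).collect();
    assert_eq!(second, vec![3, 4, 5]);
}
```

The counted loop in the merged version works just as well; this is only the `.take()`-shaped alternative the earlier EDIT alluded to.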
I made a similar change for the consumer benchmarks: #34752
Thanks for the suggestion; no more math needed to compare the throughput.
lgtm; thanks for writing this for unified scheduler.
Problem
Summary of Changes
Add a benchmark for execute_batch with varying batch sizes: 1, 32, 64.
Sample result
Fixes #