Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use parallel iterator for executing transactions in an entry #22717

Closed
wants to merge 3 commits into from

Conversation

pgarg66
Copy link
Contributor

@pgarg66 pgarg66 commented Jan 24, 2022

Problem

Replay stage is taking long to process transactions in a given entry. The transactions can be executed in parallel to take advantage of multiple CPU cores. Currently, all the transactions in a given entry are being processed on a single thread.

Summary of Changes

Use par_iter to parallelize execution of transactions in a given entry. Collect and return transaction results, timings and error counters. Timings and counters are coalesced to match the current design.

Fixes #

@pgarg66 pgarg66 changed the title Use parallel iterator for executing transactions Use parallel iterator for executing transactions in an entry Jan 24, 2022
@mergify mergify bot added the community Community contribution label Jan 24, 2022
@mergify mergify bot requested a review from a team January 24, 2022 21:43
@jeffwashington
Copy link
Contributor

@pgarg66, do you have any metrics that show the effects of this change? I certainly like the concept it in general.

@codecov
Copy link

codecov bot commented Jan 25, 2022

Codecov Report

Merging #22717 (39c7748) into master (fc21af4) will increase coverage by 0.0%.
The diff coverage is 94.7%.

@@           Coverage Diff           @@
##           master   #22717   +/-   ##
=======================================
  Coverage    81.5%    81.5%           
=======================================
  Files         558      558           
  Lines      149605   149667   +62     
=======================================
+ Hits       121989   122041   +52     
- Misses      27616    27626   +10     

@pgarg66
Copy link
Contributor Author

pgarg66 commented Jan 25, 2022

@pgarg66, do you have any metrics that show the effects of this change? I certainly like the concept it in general.

I ran ledger-tool on a snapshot (captured on the mainnet-beta). The following were the execution timings for the snapshot on master vs this PR.

execute_us: 8,292,802,837 vs 6,741,731,920

The timings for load, store, and update_stakes were mostly unchanged, as would be expected.

load_us: 366815912, 342002517
store_us: 852623861 vs 879605693
update_stakes_cache_us: 713662859 vs 741641407

Concept was proposed by @sakridge. So the credits go to him.

@jstarry
Copy link
Member

jstarry commented Jan 26, 2022

@pgarg66, do you have any metrics that show the effects of this change? I certainly like the concept it in general.

I ran ledger-tool on a snapshot (captured on the mainnet-beta). The following were the execution timings for the snapshot on master vs this PR.

Did you apply this change to a v1.8 ledger-tool? It'd be interesting to see if the improvements are the same there vs master

@pgarg66
Copy link
Contributor Author

pgarg66 commented Jan 26, 2022

@pgarg66, do you have any metrics that show the effects of this change? I certainly like the concept it in general.

I ran ledger-tool on a snapshot (captured on the mainnet-beta). The following were the execution timings for the snapshot on master vs this PR.

Did you apply this change to a v1.8 ledger-tool? It'd be interesting to see if the improvements are the same there vs master

No, but I can cherry-pick it v1.8 and run the numbers. I'll post the numbers in this PR.

@pgarg66
Copy link
Contributor Author

pgarg66 commented Jan 26, 2022

I recalculated the numbers by rebasing the PR on master, and using release build variant this time. The table below has the results. (The previous numbers were with debug build, so there is some difference in the numbers)

The code on V1.8 branch is significantly different than master, for this specific function. Are we planning to back-port the changes on V1.8 to master? If not, this PR needs to be reimplemented for V1.8.

Step PR on master master
check_us 1,652,168 1,654,005
load_us 54,868,629 49,629,530
execute_us 182,977,851 186,716,214
store_us 54,833.800 54,394,580

@jstarry
Copy link
Member

jstarry commented Jan 27, 2022

@pgarg66 Hmm it's pretty unlikely to get backported if there aren't significant perf improvements. Looks like your last results don't look great

@pgarg66
Copy link
Contributor Author

pgarg66 commented Jan 27, 2022

@pgarg66 Hmm it's pretty unlikely to get backported if there aren't significant perf improvements. Looks like your last results don't look great

execute_us does show some performance improvement (that's what is targeted by this change). I agree that it's not enough to backport.

What's your opinion on merging it to master branch?

@sakridge
Copy link
Member

@pgarg66 Hmm it's pretty unlikely to get backported if there aren't significant perf improvements. Looks like your last results don't look great

execute_us does show some performance improvement (that's what is targeted by this change). I agree that it's not enough to backport.

What's your opinion on merging it to master branch?

What's the comparison of the total wall clock time of the implementations? I think ledger-tool prints this at the end.

@pgarg66
Copy link
Contributor Author

pgarg66 commented Jan 28, 2022

@pgarg66 Hmm it's pretty unlikely to get backported if there aren't significant perf improvements. Looks like your last results don't look great

execute_us does show some performance improvement (that's what is targeted by this change). I agree that it's not enough to backport.
What's your opinion on merging it to master branch?

What's the comparison of the total wall clock time of the implementations? I think ledger-tool prints this at the end.

With PR

ledger processed in 5 minutes, 26 seconds, 601 ms, 804 µs and 432 ns. root slot is 114174058, 2 forks at 114174099, 114174100, with 39 frozen banks

Master

ledger processed in 5 minutes, 28 seconds, 811 ms, 967 µs and 408 ns. root slot is 114174058, 2 forks at 114174099, 114174100, with 39 frozen banks

@buffalu
Copy link
Contributor

buffalu commented Jan 30, 2022

wonder how many improvements we'd actually expect to see running this on ledger snapshot given the way transactions are packed is extremely suboptimal? on average, we're seeing an extremely low number of transactions per entry, so my guess is that this probably won't help looking backwards. i am working on a PR for issue #22096 and am working to pull this in there to see how much it helps. during testing, constantly getting 128 txs per entry (there's no limit too, can pack as many as you want :))

imo would be more useful to run some benchmarks with transfers from multiple signers to try to get fuller entries and see how much it helps there. still might not make sense to get in now, but have a feeling this will be extremely important in the future.

image

@pgarg66
Copy link
Contributor Author

pgarg66 commented Jan 31, 2022

wonder how many improvements we'd actually expect to see running this on ledger snapshot given the way transactions are packed is extremely suboptimal? on average, we're seeing an extremely low number of transactions per entry, so my guess is that this probably won't help looking backwards. i am working on a PR for issue #22096 and am working to pull this in there to see how much it helps. during testing, constantly getting 128 txs per entry (there's no limit too, can pack as many as you want :))

imo would be more useful to run some benchmarks with transfers from multiple signers to try to get fuller entries and see how much it helps there. still might not make sense to get in now, but have a feeling this will be extremely important in the future.

image

@buffalu that makes sense. Could you post the results here as well?

@sakridge
Copy link
Member

wonder how many improvements we'd actually expect to see running this on ledger snapshot given the way transactions are packed is extremely suboptimal? on average, we're seeing an extremely low number of transactions per entry, so my guess is that this probably won't help looking backwards. i am working on a PR for issue #22096 and am working to pull this in there to see how much it helps. during testing, constantly getting 128 txs per entry (there's no limit too, can pack as many as you want :))

imo would be more useful to run some benchmarks with transfers from multiple signers to try to get fuller entries and see how much it helps there. still might not make sense to get in now, but have a feeling this will be extremely important in the future.

Why would low transactions per entry limit the parallelism on execution?

@buffalu
Copy link
Contributor

buffalu commented Jan 31, 2022

wonder how many improvements we'd actually expect to see running this on ledger snapshot given the way transactions are packed is extremely suboptimal? on average, we're seeing an extremely low number of transactions per entry, so my guess is that this probably won't help looking backwards. i am working on a PR for issue #22096 and am working to pull this in there to see how much it helps. during testing, constantly getting 128 txs per entry (there's no limit too, can pack as many as you want :))
imo would be more useful to run some benchmarks with transfers from multiple signers to try to get fuller entries and see how much it helps there. still might not make sense to get in now, but have a feeling this will be extremely important in the future.

Why would low transactions per entry limit the parallelism on execution?

how can you execute in parallel if you’re throwing small number of entries at it?

@stale
Copy link

stale bot commented Mar 2, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale [bot only] Added to stale content; results in auto-close after a week. label Mar 2, 2022
@pgarg66 pgarg66 closed this Mar 2, 2022
@ryoqun
Copy link
Member

ryoqun commented Oct 5, 2022

(note to self: seems this is prior effort for tx-level parallism of mine: #23548)

@pgarg66 pgarg66 deleted the bank_par_iter branch April 9, 2023 00:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community Community contribution stale [bot only] Added to stale content; results in auto-close after a week.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants