Re-apply qs backpressure increase with buffer latency increase #13961
Conversation
As Quorum Store batches are bucketed, and we are looking to increase block limits, now is the time to reduce Quorum Store backpressure. We now allow 36K transactions outstanding. At 12K TPS, this is approximately 3 seconds worth of batches. For forge tests, a lot of the queuing shifts from mempool to POS-to-Proposal, so the limits need to be adjusted accordingly.
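For intuition on where the 3-second figure comes from, here is a back-of-the-envelope sketch in Rust (the constant names are illustrative, not the actual Quorum Store config fields):

```rust
// Back-of-the-envelope check only; constant names are hypothetical and do not
// correspond to the real Quorum Store backpressure config fields.
const OUTSTANDING_TXN_LIMIT: u64 = 36_000; // transactions allowed outstanding
const SUSTAINED_TPS: u64 = 12_000;         // assumed sustained throughput

fn main() {
    // 36_000 outstanding / 12_000 TPS ≈ 3 seconds worth of batches queued
    // before Quorum Store backpressure kicks in.
    let backlog_window_secs = OUTSTANDING_TXN_LIMIT as f64 / SUSTAINED_TPS as f64;
    println!("backlog window ≈ {backlog_window_secs:.1} s");
}
```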
As the graceful overload test failed last time, I ran it again here. It seems to be working.
https://github.com/aptos-labs/aptos-core/actions/runs/9865912053
```diff
@@ -85,8 +85,8 @@ impl Default for MempoolConfig {
             system_transaction_timeout_secs: 600,
             system_transaction_gc_interval_ms: 60_000,
             broadcast_buckets: DEFAULT_BUCKETS.to_vec(),
-            eager_expire_threshold_ms: Some(10_000),
-            eager_expire_time_ms: 3_000,
+            eager_expire_threshold_ms: Some(15_000),
```
Can you explain the downside of increasing this?
The new constants make the logic work like this:
- If there is any transaction in the mempool (that never entered the parking lot) that has been there longer than eager_expire_threshold_ms (increased from 10s to 15s here), that means we have a significant backlog.
- If that is the case, we pull into batches only transactions that have at least eager_expire_time_ms of expiration time left (increased from 3s to 6s).

We've increased both because, with reduced QS backpressure, we see it taking more time from batch creation to the batch being included in a block.
So the side-consequence of this change is that, during a backlog, transactions submitted with an expiration under 6s will be ignored (previously under 3s).
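A minimal sketch of that rule, with hypothetical names and simplified types (the real mempool code is organized differently):

```rust
// Hypothetical, simplified sketch of the eager-expire rule described above;
// not the actual aptos-core mempool implementation.
struct Txn {
    expiration_time_ms: u64,
}

fn should_pull_into_batch(
    txn: &Txn,
    now_ms: u64,
    oldest_non_parked_age_ms: u64,  // age of the oldest txn that never entered the parking lot
    eager_expire_threshold_ms: u64, // 10_000 -> 15_000 in this PR
    eager_expire_time_ms: u64,      // 3_000 -> 6_000 in this PR
) -> bool {
    // No significant backlog: pull everything as usual.
    if oldest_non_parked_age_ms <= eager_expire_threshold_ms {
        return true;
    }
    // Backlogged: only pull txns that still have at least `eager_expire_time_ms`
    // of life left, so they don't expire before the batch makes it into a block.
    txn.expiration_time_ms >= now_ms + eager_expire_time_ms
}

fn main() {
    let now_ms = 1_000_000;
    let txn = Txn { expiration_time_ms: now_ms + 4_000 };
    // With a 16s-old backlog and a 6s buffer, a txn expiring in 4s is skipped.
    assert!(!should_pull_into_batch(&txn, now_ms, 16_000, 15_000, 6_000));
}
```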
Sounds good to me. Slightly orthogonal, but do we really need this logic after @vusirikala's change to exclude expired transactions from the QS backpressure calculation in https://github.com/aptos-labs/aptos-core/pull/13850/files? cc @bchocho
Creating useless batches, and then requiring everyone to fetch them just to throw them out, probably has a negative effect on our overall throughput.
@sitalkedia This is the run for the graceful overload test with my changes and the increased QS backpressure limits, without changing the eager_expire_threshold_ms config:
#13964 (comment)
Let's revert the forge changes before landing.
✅ Forge suite
✅ Forge suite
✅ Forge suite
* [quorum store] reduce backpressure significantly for more TPS (#13558)

  As Quorum Store batches are bucketed, and we are looking to increase block limits, now is the time to reduce Quorum Store backpressure. We now allow 36K transactions outstanding. At 12K TPS, this is approximately 3 seconds worth of batches. For forge tests, a lot of the queuing shifts from mempool to POS-to-Proposal, so the limits need to be adjusted accordingly.

* increase buffer for expiration in batch creation
* adding buffers on inner traffic as well

---------

Co-authored-by: Brian (Sunghoon) Cho <[email protected]>
… (#13997)

* [quorum store] reduce backpressure significantly for more TPS (#13558)

  As Quorum Store batches are bucketed, and we are looking to increase block limits, now is the time to reduce Quorum Store backpressure. We now allow 36K transactions outstanding. At 12K TPS, this is approximately 3 seconds worth of batches. For forge tests, a lot of the queuing shifts from mempool to POS-to-Proposal, so the limits need to be adjusted accordingly.

* increase buffer for expiration in batch creation
* adding buffers on inner traffic as well

---------
Description
As Quorum Store batches are bucketed, and we are looking to increase block limits, now is the time to reduce Quorum Store backpressure.
We now allow 36K transactions outstanding. At 12K TPS, this is approximately 3 seconds worth of batches.
For forge tests, a lot of the queuing shifts from mempool to POS-to-Proposal, so the limits need to be adjusted accordingly.
Additionally, this increases the time batches can take to be included in a block, so the eager expiration buffers are increased as well.
Type of Change
Which Components or Systems Does This Change Impact?
How Has This Been Tested?
Key Areas to Review
Checklist