Skip to content
This repository has been archived by the owner on Jan 13, 2025. It is now read-only.

Use fee and sender_stake prioritization to add new packet_batch to ba… #23841

Closed
wants to merge 3 commits into from

Conversation

tao-stones
Copy link
Contributor

Problem

Banking_stage currently using FIFO to drop buffered packet_batch to make room for received new batch. It could drop highly prioritized Transactions in old batch. It should instead to drop Transactions with lowest priority.

Priority is determined by Transaction's fee-per-cu, and sender_stake if fee-per-cu is same.

Summary of Changes

  • update banking_stage::push_unprocessed() to UnprocessedPacketBatches::insert_or_swap_batch() instead of fifo;
  • UnprocessedPacketBatches::prioritize_by_fee_per_cu() to index Transaction by its fee/CU, which is calculated with working_bank
  • Weight-shuffle locators with send_stakes
  • swap_packet_by_priority implements the logic of dropping lower prioritized Transaction, indexed by prioritize_by_fee_then_stakes, find an empty patch_batch to swap with received new packet_batch.

Fixes #23369

@tao-stones tao-stones requested review from carllin, sakridge and jstarry and removed request for carllin March 22, 2022 16:32
@tao-stones tao-stones force-pushed the leader-qos-part-4 branch 2 times, most recently from 12b0b3c to cdb9cd8 Compare March 22, 2022 16:39
@tao-stones
Copy link
Contributor Author

tao-stones commented Mar 23, 2022

Ran 1-hour 5 nodes perf test, this PR reported 16K, while the tip of master is 15k-17k. It is basically inline with master.

A side note of performance, since each banking thread is doing more for prioritization, i will be increasing banking_stage thread number from 4 to 6. The test I conducted before indicates adding 2 more threads yields good benefit on performance (it also reduce each thread's buffer size).

@codecov
Copy link

codecov bot commented Mar 23, 2022

Codecov Report

Merging #23841 (cdb9cd8) into master (2da4e3e) will decrease coverage by 0.0%.
The diff coverage is 81.1%.

❗ Current head cdb9cd8 differs from pull request most recent head b040999. Consider uploading reports for the commit b040999 to get more accurate results

@@            Coverage Diff            @@
##           master   #23841     +/-   ##
=========================================
- Coverage    81.8%    81.7%   -0.1%     
=========================================
  Files         581      583      +2     
  Lines      158312   159086    +774     
=========================================
+ Hits       129518   130119    +601     
- Misses      28794    28967    +173     

core/src/banking_stage.rs Outdated Show resolved Hide resolved
core/src/banking_stage.rs Outdated Show resolved Hide resolved
core/src/unprocessed_packet_batches.rs Outdated Show resolved Hide resolved
core/src/unprocessed_packet_batches.rs Outdated Show resolved Hide resolved
core/src/unprocessed_packet_batches.rs Outdated Show resolved Hide resolved
core/src/unprocessed_packet_batches.rs Outdated Show resolved Hide resolved
core/src/unprocessed_packet_batches.rs Outdated Show resolved Hide resolved
core/src/unprocessed_packet_batches.rs Outdated Show resolved Hide resolved
core/src/unprocessed_packet_batches.rs Outdated Show resolved Hide resolved
core/src/unprocessed_packet_batches.rs Outdated Show resolved Hide resolved
Comment on lines 119 to 121
if self.len() >= batch_limit {
let (_, dropped_batches_count, dropped_packets_count) =
self.swap_packet_by_priority(deserialized_packet_batch);
Copy link
Contributor

@carllin carllin Mar 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When there are a lot of incoming packets during spikes, I see this pathway being hit repeatedly in the metrics, meaning it can't be expensive to run.

This currently a pretty expensive operation of figuring out the fees for every transaction in the queue, sorting them, dropping the unnecessary ones.

Maybe it would be better to change the queue to support ordering by the fee-per-cu on insertion, maybe a min-max-heap: https://docs.rs/min-max-heap/latest/min_max_heap/?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point, I paid attention on bench test and perf test. Tests didn't show noticeable degrading, but the concern is valid.

If the buffered entry is packet_batch (which I'd like to avoid changing it to minimize the change scope), I'm not sure how BTreeMap could help here.

Another constrain is fee-per-cu is bank dependent, it is not simply derived from transaction itself (at least at the moment), rather it depends on current bank's fee_structure and lamports_per_signature.

Amount the added overhead in order to be able to prioritize by fee, I think compute fee per transaction perhaps the most expensive. Sorting by fee (utilized a BTreeMap) might be second; dropping should cost very little (drop a packet is just remove its index from unprocessed vec, drop a batch is a HashMap::Swap op).

Mentioned somewhere early, I intended to increase banking_stage non-vote threads from 2 to 4 to compensate added overhead, which also automatically reduce each thread's buffer size (= TOTAL_BUFFERED_PACKETS/number_of_thredas), that'd help on reduce sorting cost a bit. (I tested it before, the benching show significant improvement with 6 threads, more test should be done ofc)

Having said that, I think to cache computed fee-per-cu for transactions for up to X slots would help. Ideally X is 1, so it computes fee-per-cu for each transaction once per bank, still would save a lot during spikes. Or we can increase X if we don't care to be super precise, or performance is prove be too bad during spikes. I'll start work on this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commit adds cache for fee-per-cu for up to 1 slot. Going to add some metrics

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests didn't show noticeable degrading, but the concern is valid.

Hmm I think to test this drop path properly, we would have to spam packets such that each thread hits the buffer limit, i.e. somewhere close to 500k tps, which I don't think any of the tests currently do

Another constrain is fee-per-cu is bank dependent, it is not simply derived from transaction itself (at least at the moment), rather it depends on current bank's fee_structure and lamports_per_signature.

I think if we follow your suggestion here: #23841 (comment), it won't matter too much, we just need to pick one consistent bank so that everyone is relatively equal to one another.

If the buffered entry is packet_batch (which I'd like to avoid changing it to minimize the change scope), I'm not sure how BTreeMap could help here.

Yeah this is a pain, we really want to be able to weigh each packet separately in a min-max-heap. This will help too when we need to find the highest priority transaction to process, which my 🔮 is telling me is going to be the follow-up to this 😉

With our current grouping of multiple packets into PacketBatch there's also a correctness problem when a high priority transaction is grouped into the same batch as a low priority batch. This is exemplified by compromises like this:

warn!("Cannot find eviction candidate from buffer in single iteration. New packet batch is dropped.");
where a potentially high fee transaction doesn't make it into the buffer because a bunch of low transaction transactions were grouped into the same batch as high priority transactions in a PacketBatch

I think we should consider ripping off the bandaid and switch to operating on individual Packet objects here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should consider ripping off the bandaid and switch to operating on individual Packet objects here.

Yeah, the benefits of dealing with individual packets is obvious, particularly from the angle we are in right now 😃 ; Just concerned if there are downsides are not apparent. Better to do this separately, imo.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh right right, I got the part about the fee-per-cu. I was just thinking in the case where fee-prioritization is a constant --- it would put the smaller validators at that inherent disadvantage when queue exceeds block size. The only way for them to get any transaction inclusion would then be to submit transactions with higher fee-priority than the larger validators. I guess I just wasn't quite sure what the goal was of the stake weight prioritization if Quic happens to work. I think for both Quic and stake-weight priority it'll be something that validators end up tossing aside at some point anyway if MEV or block rewards makes them more $ by not following those rules.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I can see your point. The original idea behind stake-weight is "validator has x% stake should have x% say on what transactions to be included in block". At what stage it to be implemented is more flexible. If Quic works, we can easily toss this out from banking_stage, i think.

Copy link

@nikhayes nikhayes Apr 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm maybe the other thing that could be possible is that you give precedence to a certain sized set of immediately prior leaders (i.e. last 8 leaders), this would still be somewhat stake-weighted (because higher staked validators appear more often in schedule), but would allow for lower staked validators to appear in the ordering (since they'll pop up in the set too). Also, in the case that the prior leaders' transactions may have not all gone through, they still have some lasting prioritization beyond their leader slots (this is more loose than direct forwarding, but maybe helps?? dunno, lol).

Copy link

@nikhayes nikhayes May 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, but just want to point it out stake weight here plays a small part in prioritization. Transactions in buffer are prioritized by their fee-per-cu. Sender stakes are weighted to transactions have same fee-per-cu. Arguably, it is optional to using sender stake in leader prioritization, just an extra.

Coming back to this, given fee priority rate will be quite fine-grained as of now, I think any type of stake weight ordering is a bit pointless. I think you'll see a wide variety of priority except for maybe at the base level, but if I'm a bot I'll never send at that level, I'll add a few micro lamports and send transactions across a range of priority which doesn't perceptibly cost me anything, then some other bot will add a few micro lamports in response until you never see priority fees at the exact same level and things will be too dynamic to ever fall at the same exact priority rate. There won't be any beneficial stake weighted ordering that can occur at that point anymore. I think ordering by stake weight priority might just waste valuable time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, but just want to point it out stake weight here plays a small part in prioritization. Transactions in buffer are prioritized by their fee-per-cu. Sender stakes are weighted to transactions have same fee-per-cu. Arguably, it is optional to using sender stake in leader prioritization, just an extra.

Coming back to this, given fee priority rate will be quite fine-grained as of now, I think any type of stake weight ordering is a bit pointless. I think you'll see a wide variety of priority except for maybe at the base level, but if I'm a bot I'll never send at that level, I'll add a few micro lamports and send transactions across a range of priority which doesn't perceptibly cost me anything, then some other bot will add a few micro lamports in response until you never see priority fees at the exact same level and things will be too dynamic to ever fall at the same exact priority rate. There won't be any beneficial stake weighted ordering that can occur at that point anymore. I think ordering by stake weight priority might just waste valuable time.

agree. Perhaps worth to see how it plays out.

…e fee-per-cu computing;

avoid read-lock blockhash_queue when getting lamports-per-signature;
@carllin carllin force-pushed the leader-qos-part-4 branch from 2bce1c6 to b040999 Compare April 1, 2022 02:35
@tao-stones tao-stones marked this pull request as draft April 1, 2022 21:14
@tao-stones
Copy link
Contributor Author

opened issue #24069 to discuss/document design of indexing. turn this PR to Draft until this issue is solved.

@stale
Copy link

stale bot commented Apr 16, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale [bot only] Added to stale content; results in auto-close after a week. label Apr 16, 2022
@stale
Copy link

stale bot commented Apr 27, 2022

This stale pull request has been automatically closed. Thank you for your contributions.

@stale stale bot closed this Apr 27, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
stale [bot only] Added to stale content; results in auto-close after a week.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants