Use fee and sender_stake prioritization to add new packet_batch to ba… #23841
Conversation
Force-pushed from 12b0b3c to cdb9cd8.
Ran a 1-hour, 5-node perf test: this PR reported 16K TPS, while the tip of master is 15K-17K, so it is basically in line with master. A side note on performance: since each banking thread is doing more for prioritization, I will be increasing the banking_stage thread count from 4 to 6. The test I conducted before indicates adding 2 more threads yields a good performance benefit (it also reduces each thread's buffer size).
Codecov Report
@@ Coverage Diff @@
## master #23841 +/- ##
=========================================
- Coverage 81.8% 81.7% -0.1%
=========================================
Files 581 583 +2
Lines 158312 159086 +774
=========================================
+ Hits 129518 130119 +601
- Misses 28794 28967 +173
if self.len() >= batch_limit {
    let (_, dropped_batches_count, dropped_packets_count) =
        self.swap_packet_by_priority(deserialized_packet_batch);
When there are a lot of incoming packets during spikes, I see this pathway being hit repeatedly in the metrics, meaning it can't afford to be expensive to run.
This is currently a pretty expensive operation: figuring out the fees for every transaction in the queue, sorting them, and dropping the unnecessary ones.
Maybe it would be better to change the queue to support ordering by fee-per-cu on insertion, maybe a min-max-heap: https://docs.rs/min-max-heap/latest/min_max_heap/?
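To make the suggestion concrete, here is a minimal sketch (not from this PR) of a buffer backed by the min-max-heap crate, keyed by fee-per-cu with sender stake as a tie breaker, so the lowest-priority entry can be evicted in O(log n) when the buffer is full. `PrioritizedPacket` and `PacketPriorityQueue` are hypothetical illustration types, not Solana's.

```rust
use min_max_heap::MinMaxHeap;

// Hypothetical priority key: the derived Ord compares fee_per_cu first, then sender_stake.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
struct PrioritizedPacket {
    fee_per_cu: u64,     // primary sort key
    sender_stake: u64,   // tie breaker when fee-per-cu is equal
    packet_index: usize, // position of the packet in its owning batch
}

struct PacketPriorityQueue {
    heap: MinMaxHeap<PrioritizedPacket>,
    capacity: usize,
}

impl PacketPriorityQueue {
    fn new(capacity: usize) -> Self {
        Self { heap: MinMaxHeap::new(), capacity }
    }

    /// Insert a packet, returning whichever packet had to be dropped (if any).
    fn insert(&mut self, packet: PrioritizedPacket) -> Option<PrioritizedPacket> {
        if self.heap.len() < self.capacity {
            self.heap.push(packet);
            return None;
        }
        // Buffer is full: only replace the current minimum if the new packet outranks it.
        let outranks_min = self.heap.peek_min().map_or(false, |min| packet > *min);
        if outranks_min {
            let dropped = self.heap.pop_min();
            self.heap.push(packet);
            dropped
        } else {
            Some(packet) // the incoming packet is the lowest priority; drop it instead
        }
    }
}
```

On the consuming side, pop_max would hand back the highest-priority packet to process next.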
This is a good point; I paid attention to it in the bench and perf tests. The tests didn't show noticeable degradation, but the concern is valid.
If the buffered entry stays a packet_batch (which I'd like to avoid changing to keep the change scope small), I'm not sure how a BTreeMap could help here.
Another constraint is that fee-per-cu is bank dependent; it is not simply derived from the transaction itself (at least at the moment), rather it depends on the current bank's fee_structure and lamports_per_signature.
Among the overhead added in order to prioritize by fee, I think computing the fee per transaction is perhaps the most expensive. Sorting by fee (using a BTreeMap) might be second; dropping should cost very little (dropping a packet is just removing its index from the unprocessed vec, and dropping a batch is a HashMap swap op).
As mentioned somewhere earlier, I intend to increase banking_stage non-vote threads from 2 to 4 to compensate for the added overhead, which also automatically reduces each thread's buffer size (= TOTAL_BUFFERED_PACKETS / number_of_threads); that'd help reduce sorting cost a bit. (I tested this before; the benching showed significant improvement with 6 threads, more tests should be done of course.)
Having said that, I think caching the computed fee-per-cu for transactions for up to X slots would help. Ideally X is 1, so it computes fee-per-cu for each transaction once per bank, which would still save a lot during spikes. Or we can increase X if we don't need to be super precise, or if performance proves to be too bad during spikes. I'll start working on this.
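As an illustration of that caching idea (the names and structure here are assumptions, not the actual commit's code): since fee-per-cu depends on the bank's fee_structure and lamports_per_signature, the computed value could be cached per message hash and invalidated whenever the working slot changes, which is the X = 1 case.

```rust
use std::collections::HashMap;

type MessageHash = [u8; 32]; // stand-in for a transaction's message hash

// Hypothetical "X = 1" fee-per-cu cache: values are only reused within one slot.
#[derive(Default)]
struct FeePerCuCache {
    slot: u64,                            // slot the cached values were computed against
    fee_per_cu: HashMap<MessageHash, u64>,
}

impl FeePerCuCache {
    /// Return the cached fee-per-cu, or compute it (bank-dependently) and cache it.
    fn get_or_compute(
        &mut self,
        current_slot: u64,
        message_hash: MessageHash,
        compute: impl FnOnce() -> u64, // the fee_structure / lamports_per_signature math
    ) -> u64 {
        // Invalidate everything when the working bank moves to a new slot.
        if current_slot != self.slot {
            self.slot = current_slot;
            self.fee_per_cu.clear();
        }
        *self.fee_per_cu.entry(message_hash).or_insert_with(compute)
    }
}
```

A larger X would just mean keeping entries for a few slots (e.g. keyed by slot as well), trading precision for fewer recomputations during spikes.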
This commit adds a cache for fee-per-cu for up to 1 slot. Going to add some metrics.
> The tests didn't show noticeable degradation, but the concern is valid.

Hmm, I think to test this drop path properly we would have to spam packets such that each thread hits the buffer limit, i.e. somewhere close to 500k TPS, which I don't think any of the tests currently do.

> Another constraint is that fee-per-cu is bank dependent; it is not simply derived from the transaction itself (at least at the moment), rather it depends on the current bank's fee_structure and lamports_per_signature.

I think if we follow your suggestion here: #23841 (comment), it won't matter too much; we just need to pick one consistent bank so that everyone is relatively equal to one another.

> If the buffered entry stays a packet_batch (which I'd like to avoid changing to keep the change scope small), I'm not sure how a BTreeMap could help here.

Yeah, this is a pain; we really want to be able to weigh each packet separately in a min-max-heap. This will also help when we need to find the highest-priority transaction to process, which my 🔮 is telling me is going to be the follow-up to this 😉
With our current grouping of multiple packets into PacketBatch there's also a correctness problem when a high-priority transaction is grouped into the same batch as low-priority transactions. This is exemplified by compromises like this:
solana/core/src/unprocessed_packet_batches.rs, line 400 in d0a0bf6:
warn!("Cannot find eviction candidate from buffer in single iteration. New packet batch is dropped.");

I think we should consider ripping off the bandaid and switching to operating on individual Packet objects here.
> I think we should consider ripping off the bandaid and switching to operating on individual Packet objects here.

Yeah, the benefits of dealing with individual packets are obvious, particularly from the angle we are in right now 😃; just concerned there may be downsides that aren't apparent. Better to do this separately, imo.
Ahh right right, I got the part about the fee-per-cu. I was just thinking in the case where fee-prioritization is a constant: it would put the smaller validators at an inherent disadvantage when the queue exceeds block size. The only way for them to get any transaction inclusion would then be to submit transactions with higher fee-priority than the larger validators. I guess I just wasn't quite sure what the goal of the stake weight prioritization was if Quic happens to work. I think for both Quic and stake-weight priority it'll be something that validators end up tossing aside at some point anyway if MEV or block rewards make them more $ by not following those rules.
Yea, I can see your point. The original idea behind stake-weight is that a validator with x% stake should have x% say on which transactions are included in a block. At what stage it is implemented is more flexible. If Quic works, we can easily toss this out from banking_stage, I think.
Hmm, maybe the other thing that could be possible is to give precedence to a certain sized set of immediately prior leaders (i.e. the last 8 leaders). This would still be somewhat stake-weighted (because higher-staked validators appear more often in the schedule), but would allow lower-staked validators to appear in the ordering (since they'll pop up in the set too). Also, in the case that the prior leaders' transactions may not have all gone through, they still get some lasting prioritization beyond their leader slots (this is looser than direct forwarding, but maybe helps?? dunno, lol).
Good question, but just want to point out that stake weight here plays a small part in prioritization. Transactions in the buffer are prioritized by their fee-per-cu; sender stakes are used to weight transactions that have the same fee-per-cu. Arguably, using sender stake in leader prioritization is optional, just an extra.
Coming back to this: given that the fee priority rate will be quite fine-grained as of now, I think any type of stake weight ordering is a bit pointless. I think you'll see a wide variety of priorities except maybe at the base level, but if I'm a bot I'll never send at that level; I'll add a few micro-lamports and send transactions across a range of priorities, which doesn't perceptibly cost me anything, then some other bot will add a few micro-lamports in response, until you never see priority fees at the exact same level and things will be too dynamic to ever fall at the same exact priority rate. There won't be any beneficial stake weighted ordering that can occur at that point anymore. I think ordering by stake weight priority might just waste valuable time.
Agree. Perhaps worth seeing how it plays out.
Force-pushed from d445908 to dfa5235.
…nking thread buffer
…e fee-per-cu computing; avoid read-lock blockhash_queue when getting lamports-per-signature;
Force-pushed from dfa5235 to b040999.
Force-pushed from 2bce1c6 to b040999.
Opened issue #24069 to discuss/document the design of indexing. Turning this PR to Draft until that issue is resolved.
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
This stale pull request has been automatically closed. Thank you for your contributions.
Problem
Banking_stage currently uses FIFO to drop buffered packet_batches to make room for newly received batches. This can drop highly prioritized transactions that happen to be in an old batch; it should instead drop the transactions with the lowest priority.
Priority is determined by a transaction's fee-per-cu, and by sender_stake when the fee-per-cu is the same.
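A minimal sketch of that ordering rule (illustrative only, not the PR's code): compare fee-per-cu first and fall back to sender stake on ties.

```rust
use std::cmp::Ordering;

// Higher fee-per-cu wins; sender stake breaks ties.
fn compare_priority(a: (u64, u64), b: (u64, u64)) -> Ordering {
    let (fee_per_cu_a, stake_a) = a;
    let (fee_per_cu_b, stake_b) = b;
    fee_per_cu_a.cmp(&fee_per_cu_b).then(stake_a.cmp(&stake_b))
}
```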
Summary of Changes
- banking_stage::push_unprocessed() now uses UnprocessedPacketBatches::insert_or_swap_batch() instead of FIFO;
- added UnprocessedPacketBatches::prioritize_by_fee_per_cu() to index transactions by their fee/CU, which is calculated with working_bank;
- swap_packet_by_priority implements the logic of dropping lower-prioritized transactions: indexed by prioritize_by_fee_then_stakes, it finds an empty packet_batch to swap with the newly received packet_batch (a rough sketch of this flow follows below).

Fixes #23369
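A rough, self-contained sketch of the insertion flow described in the summary, using stand-in types rather than the actual UnprocessedPacketBatches API (the real implementation indexes buffered packets by fee-per-cu rather than scanning, and operates on DeserializedPacketBatch):

```rust
use std::collections::VecDeque;

// Stand-in for a buffered batch; priority is (fee_per_cu, sender_stake).
#[derive(Debug)]
struct Batch {
    fee_per_cu: u64,   // would be computed per transaction against the working bank
    sender_stake: u64, // tie breaker when fee-per-cu is equal
}

fn priority(b: &Batch) -> (u64, u64) {
    (b.fee_per_cu, b.sender_stake)
}

/// Insert `new_batch`, evicting the lowest-priority buffered batch when full.
/// Returns the dropped batch, if any.
fn insert_or_swap(buffer: &mut VecDeque<Batch>, new_batch: Batch, limit: usize) -> Option<Batch> {
    if buffer.len() < limit {
        buffer.push_back(new_batch); // below the limit: plain append, nothing dropped
        return None;
    }
    // Find the currently lowest-priority entry (a linear scan here, for simplicity).
    let min_idx = (0..buffer.len()).min_by_key(|&i| priority(&buffer[i]))?;
    if priority(&new_batch) > priority(&buffer[min_idx]) {
        // Swap the low-priority batch out and put the new one in its place.
        Some(std::mem::replace(&mut buffer[min_idx], new_batch))
    } else {
        Some(new_batch) // the new batch itself is the lowest priority; drop it
    }
}

fn main() {
    let mut buffer: VecDeque<Batch> = VecDeque::new();
    buffer.push_back(Batch { fee_per_cu: 5, sender_stake: 10 });
    buffer.push_back(Batch { fee_per_cu: 1, sender_stake: 0 });
    let dropped = insert_or_swap(&mut buffer, Batch { fee_per_cu: 3, sender_stake: 7 }, 2);
    println!("dropped: {:?}", dropped); // the fee_per_cu = 1 batch is evicted
}
```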