Consider larger lookback window in BankingStage for producing entries #22096
Comments
Each batch of packets in banking stage can contain up to 128 transactions, and we greedily select as many non-conflicting transactions as possible from each batch. In theory that should lead to a decent-sized entry, but I do think a fair number of transactions get locked out by other banking threads. I think it would be helpful to first get some data on how banking stage is currently packing entries, to see what low-hanging fruit there is here.
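As a rough sketch of that greedy selection (hypothetical types and names, not the actual BankingStage code): take a batch of up to 128 packets, keep every transaction whose writable accounts don't collide with one already taken, and leave the rest to be retried later.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for a sanitized transaction; the real code tracks
// read and write locks per account, not just writable accounts.
struct Tx {
    writable_accounts: Vec<String>,
}

/// Greedily select a non-conflicting subset of a single <=128-packet batch.
/// Anything that conflicts is left behind for a later retry, which is how a
/// heavily contended batch can end up producing a very small entry.
fn select_non_conflicting(batch: &[Tx]) -> Vec<&Tx> {
    let mut locked: HashSet<&str> = HashSet::new();
    let mut selected = Vec::new();
    for tx in batch {
        let conflicts = tx
            .writable_accounts
            .iter()
            .any(|a| locked.contains(a.as_str()));
        if !conflicts {
            locked.extend(tx.writable_accounts.iter().map(|a| a.as_str()));
            selected.push(tx);
        }
    }
    selected
}
```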
Some more data from snapshot 115343448. The averages exclude entries that don't contain any type of transaction, so we're not getting skewed by ticks that have 0 transactions. I had a feeling the non-voting txs weren't being batched in a perfect manner in banking stage, but I didn't think it was this bad! I think it might be a combo of:
Hey Starry, my understanding is that during a Raydium IDO or NFT drop, some 98% of transactions (just a guess) originate from these sources (even after dedupe), so with greedy packing, batches end up nearly completely full of non-parallelizable transactions? It doesn't seem like maxing out the number of banking threads will help much given the way batches are packed, and I don't think adding fee prioritization will resolve the issue of batches containing lots of non-parallelizable transactions either, since I imagine a lot of jpeg-hungry people will submit all of their transactions with priority at the same time. It would be great to see data for this as well... but even just based on the greedy-batch packing design, I think one can predict the outcome of these situations. I believe a well-behaved, QUIC-filtered 10000 NFT drop will run into similar issues due to the same greedy packing, with batches overpacked with non-parallelizable transactions. I worry that QUIC will band-aid the situation for a little while, but it may just hide the underlying issue with the batch design and delay effort on improving it.

Btw, @buffalu is your script to track entry data available? Maybe someone could compare congestion vs. non-congestion situations. If the hypothesis is that duplicates are still causing these small entries, maybe experiments could be run on mainnet with candymachine using more tame, controlled amounts of spam in the meantime? @carllin @taozhu-chicago @ryoqun @aeyakovenko
For bullet point 1, that seems helpful, but wouldn't the small transactions in an entry still get slowed down by the presence of large transactions requiring execution? I feel like an "easy" speedup, before a more complex transaction scheduler, would be to separate the transaction queue into pools of different transaction sizes (based on compute, instruction count, etc.). Threads would be dedicated to certain pools and wouldn't need to keep searching through a queue that might contain many, many Raydium transactions, and an entry would include a lot of smaller transactions, so those chunks of transactions could probably execute much more quickly. For example, for now you could have a thread/batch that processes transactions of 2 instructions or less (Pyth, transfers, etc.), one for 3 to 5 (Serum, etc.), one for 6 to 10, and one for 11+. I forget if chunks are combined across threads to generate an entry or if each thread creates its own entry; if it's the latter, this might be easier to implement (?). Even if they are timed together, and the small-transaction thread is rate limited by the big-transaction thread/entry, you would still get better transaction distribution and more parallelization (since it doesn't seem that other transactions typically conflict with Raydium transactions?).
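A minimal sketch of that pooling idea (hypothetical types, and the cutoffs are just the ones suggested above):

```rust
use std::collections::VecDeque;

// Hypothetical transaction type; only the instruction count matters here.
struct Tx {
    instruction_count: usize,
}

/// Buckets: <=2 instructions (transfers, Pyth), 3-5 (Serum, etc.), 6-10, 11+.
/// Each banking thread would be pinned to one bucket, so small transactions
/// never have to queue behind a pile of large, contended ones.
struct SizePools {
    pools: [VecDeque<Tx>; 4],
}

impl SizePools {
    fn new() -> Self {
        Self { pools: Default::default() }
    }

    fn push(&mut self, tx: Tx) {
        let idx = match tx.instruction_count {
            0..=2 => 0,
            3..=5 => 1,
            6..=10 => 2,
            _ => 3,
        };
        self.pools[idx].push_back(tx);
    }

    /// A thread dedicated to pool `pool_idx` pulls work only from its own pool.
    fn pop_for_thread(&mut self, pool_idx: usize) -> Option<Tx> {
        self.pools[pool_idx].pop_front()
    }
}
```

The bucketing could just as easily key on requested compute instead of instruction count; the point is only that small transactions stop sharing a queue with large ones.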
#23066 partially fixes this, but doesn't address the entire issue.
Note: I haven't gathered data on the below yet, but from my understanding of how the validator works, I believe there could be a better way to handle batching entries.
Problem
It seems that under high load with lots of account contention, you can end up with many small entries in a block, including chains of small entries that all access the same accounts (i.e. entry n = only Mango, entry n+1 = only Mango, ...).
More, smaller entries mean less parallelization and longer block execution time. The longer it takes a block to execute, the more replay_stage blows up attempting to choose a fork.
For instance, take a look at the example below, which illustrates this problem:
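(A hypothetical sketch of the shape of the problem, assuming a queue dominated by transactions that all write-lock the same two accounts A and B:)

```
buffered packets:  [A+B, A+B, A+B, A+B, A+B, ..., C+D, E+F]

entries produced from 128-packet batches:
  entry n   : [A+B]         <- everything else in the batch conflicts
  entry n+1 : [A+B]
  entry n+2 : [A+B, C+D]
  entry n+3 : [A+B]
  ...
```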
Proposed Solution
Instead of examining packets in batches of 128, you massively extend the lookback to a much larger window (potentially the entire buffered queue) when packing entries. This would prevent high load from impacting unrelated transactions (NFT minting, transfers, etc.) and let the system pack larger entries, which means more parallelization, less replay_stage blowup, and happier users!
What this would look like is:
In this particular example there's 1 less entry, but the lookback is much longer, so transactions that don't have contention with A & B can at least be processed, instead of draining the entire buffered packets queue batch-by-batch, where you might end up with lots of smaller entries.
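A rough sketch of the wider lookback (again with hypothetical types, not the actual buffered-packets structures): scan the whole buffered queue for the next entry, take everything that doesn't conflict, and keep the conflicting remainder buffered to seed the following entry.

```rust
use std::collections::HashSet;

// Hypothetical transaction type, as in the earlier sketch.
struct Tx {
    writable_accounts: Vec<String>,
}

/// Pack one entry by scanning the *entire* buffered queue instead of a fixed
/// 128-packet batch. Conflicting transactions stay buffered for the next
/// entry, so contended (e.g. A & B) traffic no longer starves unrelated
/// transactions sitting further back in the queue.
fn pack_entry(buffered: &mut Vec<Tx>, max_entry_size: usize) -> Vec<Tx> {
    let mut locked: HashSet<String> = HashSet::new();
    let mut entry = Vec::new();
    let mut remaining = Vec::new();

    for tx in buffered.drain(..) {
        let conflicts = tx
            .writable_accounts
            .iter()
            .any(|a| locked.contains(a));
        if !conflicts && entry.len() < max_entry_size {
            locked.extend(tx.writable_accounts.iter().cloned());
            entry.push(tx);
        } else {
            remaining.push(tx);
        }
    }
    *buffered = remaining;
    entry
}
```

The obvious trade-off is that each entry now requires a scan proportional to the queue length rather than a fixed 128 packets.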
Challenges: