Problem
We recently discovered that a potential contributor to #24163 is transactions that exceed the cost model getting stuck in buffered_packet_batches and being continually looped over, causing BankingStage starvation because it would not try to read in more packets.
Fixed that with #25236 and #25245
However, we're still wasting time running the packet -> SanitizedTransaction -> QoS pipeline on these packets that exceeded the cost model for the same slot, even though we know they won't get scheduled.
We need to remove those pending packets from the buffer, put them somewhere else, and re-insert them when a new bank arrives.
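A minimal sketch of one way the side buffer could look, assuming a hypothetical BufferedPacket type and a buffer keyed by the slot for which the packets were rejected. The names and structure here are illustrative, not the actual BankingStage types:

```rust
use std::collections::VecDeque;

// Hypothetical stand-in for a deserialized packet held in BankingStage's buffer.
#[derive(Clone)]
struct BufferedPacket {
    /* packet bytes, metadata, ... */
}

/// Side buffer for packets that exceeded the cost model for a given slot.
/// They are held here instead of being re-run through the
/// packet -> SanitizedTransaction -> QoS pipeline for that same slot.
struct ExceededCostBuffer {
    /// Slot for which these packets were rejected by the cost model.
    rejected_slot: u64,
    packets: VecDeque<BufferedPacket>,
}

impl ExceededCostBuffer {
    fn new(rejected_slot: u64) -> Self {
        Self {
            rejected_slot,
            packets: VecDeque::new(),
        }
    }

    /// Park a packet that exceeded the cost model for `rejected_slot`.
    fn park(&mut self, packet: BufferedPacket) {
        self.packets.push_back(packet);
    }

    /// When a bank for a newer slot shows up, drain everything back so the
    /// caller can re-insert the packets into buffered_packet_batches.
    fn drain_if_new_slot(&mut self, current_slot: u64) -> Vec<BufferedPacket> {
        if current_slot > self.rejected_slot {
            self.rejected_slot = current_slot;
            self.packets.drain(..).collect()
        } else {
            Vec::new()
        }
    }
}
```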
Proposed Solution
Perhaps any transaction that exceeds the cost model could be put into a separate queue? On a new bank we could stick them all at the front of the buffer, but there's a good chance there's a lot of account contention among them.
It might be better to distribute those packets across all the batches until a better scheduling algorithm is built (a rough sketch follows below).
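A rough sketch of the round-robin distribution idea, reusing the hypothetical BufferedPacket type from the sketch above; `batches` stands in for buffered_packet_batches and is purely illustrative:

```rust
use std::collections::VecDeque;

/// Distribute parked packets across the existing packet batches round-robin
/// instead of pushing them all to the front, to spread out potential account
/// contention. `BufferedPacket` is the hypothetical type from the sketch above.
fn distribute_across_batches(
    parked: Vec<BufferedPacket>,
    batches: &mut [VecDeque<BufferedPacket>],
) {
    if batches.is_empty() {
        return;
    }
    for (i, packet) in parked.into_iter().enumerate() {
        // Round-robin over the batches so no single batch is saturated with
        // previously-rejected (and possibly contending) transactions.
        batches[i % batches.len()].push_back(packet);
    }
}
```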
@sakridge @carllin @aeyakovenko