rpc: fix possible deadlock in rpc #26051
Codecov Report
@@ Coverage Diff @@
## master #26051 +/- ##
=========================================
- Coverage 82.1% 81.8% -0.3%
=========================================
Files 628 631 +3
Lines 171471 174123 +2652
=========================================
+ Hits 140878 142595 +1717
- Misses 30593 31528 +935
@BurtonQin, to be clear, there is only a risk of deadlock with write-prioritized
Yes.
Thanks for the contribution
* add three gossip metrics measuring gossip loop times
* add 5 metrics
* rm space
* rm space
* Update SECURITY.md - fix nav link - add bounty split policy for duplicate reports
* Add transaction index in slot to geyser plugin TransactionInfo (#25688)
* Define shuffle to prep using same shuffle for multiple slices
* Determine transaction indexes and plumb to execute_batch
* Pair transaction_index with transaction in TransactionStatusService
* Add new ReplicaTransactionInfoVersion
* Plumb transaction_indexes through BankingStage
* Prepare BankingStage to receive transaction indexes from PohRecorder
* Determine transaction indexes in PohRecorder; add field to WorkingBank
* Add PohRecorder::record unit test
* Only pass starting_transaction_index around PohRecorder
* Add helper structs to simplify test DashMap
* Pass entry and starting-index into process_entries_with_callback together
* Add tx-index checks to test_rebatch_transactions
* Revert shuffle definition and use zip/unzip
* Only zip/unzip if randomize
* Add confirm_slot_entries test
* Review nits
* Add type alias to make sender docs more clear
* Update SECURITY.md finish filling out the table....
* rpc: fix possible deadlock in rpc (#26051)
* Add StatusCache::root_slot_deltas() and use it (#26170)
* Remove InMemAccountsIndex::map() and use map_internal directly (#26189)
* [quic]Decrement total_streams correctly (#26158)
* remove comment
* alphabetical metrics. no abbreviations
* remove trailing white space
* cargo fmt to update code format/readability

Co-authored-by: Trent Nelson <[email protected]>
Co-authored-by: Tyera Eulberg <[email protected]>
Co-authored-by: Boqin Qin(秦 伯钦) <[email protected]>
Co-authored-by: Brooks Prumo <[email protected]>
Co-authored-by: Miles Obare <[email protected]>
Problem
There is a possible deadlock caused by a double read lock in fn `get_transaction_status`. `block_commitment_cache` is an `Arc<RwLock<...>>`.
The first read lock is on L1434:
solana/rpc/src/rpc.rs
Lines 1434 to 1436 in fbf7143
The second read lock is on L253 in fn `bank`:
solana/rpc/src/rpc.rs
Lines 250 to 254 in fbf7143
For more details on this kind of deadlock, see
https://www.reddit.com/r/rust/comments/urnqz8/different_behaviors_of_recursive_read_locks_in/
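To make the hazard concrete, here is a minimal sketch using `std::sync::RwLock`. The `BlockCommitmentCache` struct and the `writer_admission` helper are hypothetical stand-ins, not the code in `rpc.rs`. The sketch only shows that a writer cannot be admitted while the first read guard is alive. The deadlock itself is deliberately not executed: it needs a blocking `write()` from another thread to queue at that moment, plus a lock policy that makes the second `read()` wait behind queued writers.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for the real cache type.
#[derive(Default)]
pub struct BlockCommitmentCache {
    pub root: u64,
}

/// Reports whether a writer could take the lock
/// (a) while a read guard is held and (b) after that guard is released.
pub fn writer_admission(cache: &Arc<RwLock<BlockCommitmentCache>>) -> (bool, bool) {
    // First read lock, analogous to the one taken in get_transaction_status.
    let first_read = cache.read().unwrap();

    // A writer arriving now is refused. A blocking `write()` from another
    // thread would queue here instead of returning.
    let while_held = cache.try_write().is_ok();

    // Hazard (not executed): a second `cache.read()` on this thread at this
    // point, as happens inside fn bank, may wait behind that queued writer
    // when the lock prioritizes writers, while the writer waits for
    // `first_read` to be released. Neither side can make progress.
    let _root = first_read.root;

    drop(first_read);
    // With no read guard alive, the writer is admitted.
    let after_release = cache.try_write().is_ok();

    (while_held, after_release)
}
```

Calling `writer_admission` on a fresh `Arc<RwLock<BlockCommitmentCache>>` shows the window in which a writer would have to wait, which is exactly the window the second read lock falls into.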
Summary of Changes
The fix is to move the first read lock after the call to fn `bank`.
But I wonder if this is correct or optimal. Should `optimistically_confirmed` be protected by `block_commitment_cache`? Or should `r_block_commitment_cache` be dropped immediately after the creation of `confirmations` on L1440-1448?
Fixes #