Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: prevent recursive Replica.mu.RLock #97381

Merged
merged 1 commit into from
Feb 22, 2023

Conversation

pav-kv
Copy link
Collaborator

@pav-kv pav-kv commented Feb 21, 2023

Previously, we were passing an RLocked Replica to addToReplicasByKeyLocked.
That method internally visits the descriptors in replicasByKey, which calls
Desc() that may cause a recursive call to RLock if the added replica intersects
itself. This is only possible in tests which try to reinsert the Replica to the
map; in prod code each Replica is inserted to replicasByKey only once.

This PR avoids the possibility of this deadlock by shifting the responsibility
of locking to the caller.

Fixes #96931

Release note: none
Epic: none

@pav-kv pav-kv requested review from tbg and a team February 21, 2023 14:27
@pav-kv pav-kv requested a review from a team as a code owner February 21, 2023 14:27
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit message would be helpful:

Previously, we were passing an RLocked *Replica to (*Store).addToReplicasByKeyLocked. That method internally visits the descriptors and called Desc(), which would constitute a recursive call to RLock (caught by the deadlock detector due to the risk of a deadlock in case of an interleaving exclusive lock attempt to the mutex). <explain here how it can happen that while inserting a replica we're visiting the same replica? Was this a split? Or just a test doing funky things?>

pkg/kv/kvserver/store_create_replica.go Show resolved Hide resolved
Previously, we were passing an RLocked Replica to addToReplicasByKeyLocked.
That method internally visits the descriptors in replicasByKey, which calls
Desc() that may cause a recursive call to RLock if the added replica intersects
itself. This is only possible in tests which try to reinsert the Replica to the
map; in prod code each Replica is inserted to replicasByKey only once.

Release note: none
Epic: none
@pav-kv pav-kv force-pushed the prevent-double-lock branch from 064f42c to a908137 Compare February 21, 2023 16:28
@pav-kv pav-kv requested a review from tbg February 21, 2023 16:30
@pav-kv
Copy link
Collaborator Author

pav-kv commented Feb 21, 2023

@tbg Could you take another look? The previous version of this PR was deadlock-full. I fixed the deadlock at the cost of some duplication: the caller now passes the replica's Desc() or descRLocked() depending on the locking state.

Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏽
also 😬 at this locking in general

@pav-kv
Copy link
Collaborator Author

pav-kv commented Feb 22, 2023

bors r=tbg

@pav-kv
Copy link
Collaborator Author

pav-kv commented Feb 22, 2023

bors cancel

@craig
Copy link
Contributor

craig bot commented Feb 22, 2023

Canceled.

@pav-kv
Copy link
Collaborator Author

pav-kv commented Feb 22, 2023

bors r=tbg

@craig
Copy link
Contributor

craig bot commented Feb 22, 2023

Build succeeded:

@craig craig bot merged commit e9e513d into cockroachdb:master Feb 22, 2023
@pav-kv pav-kv deleted the prevent-double-lock branch February 22, 2023 14:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kv/kvserver: TestStoreAddRemoveRanges failed
3 participants