storage: Avoid holding raftMu during evaluation #15935
This should give a performance bump in the hotspot case (when the hotspot isn't sufficiently split). @tamird, I would be interested in seeing whether your in-the-making benchmark work can measure the improvement that no doubt exists.
Reviewed 9 of 9 files at r1.
pkg/storage/replica_raftstorage.go, line 54 at r1 (raw file):
While you're here: "the replica lock" is awfully unspecific. From looking at the body, it looks like we need both locks.
pkg/storage/replica_raftstorage.go, line 236 at r1 (raw file):
Ditto.
pkg/storage/replica_raftstorage.go, line 435 at r1 (raw file):
A lock comment here also wouldn't hurt. The comment applies elsewhere as well (I'm not going to repeat it).
pkg/storage/store.go, line 3039 at r1 (raw file):
Not having looked at this code in a while, I kept scrolling up to learn about
Comments from Reviewable
Review status: all files reviewed at latest revision, 5 unresolved discussions, all commit checks successful.
pkg/storage/replica_raftstorage.go, line 243 at r1 (raw file):
Should this be
Review status: all files reviewed at latest revision, 5 unresolved discussions, all commit checks successful.
pkg/storage/replica_raftstorage.go, line 54 at r1 (raw file):
Previously, tschottdorf (Tobias Schottdorf) wrote…
Changed this method to only need
pkg/storage/replica_raftstorage.go, line 243 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Methods in the
pkg/storage/store.go, line 3039 at r1 (raw file):
Previously, tschottdorf (Tobias Schottdorf) wrote…
There's a
Review status: 8 of 9 files reviewed at latest revision, 5 unresolved discussions, all commit checks successful.
pkg/storage/replica_raftstorage.go, line 243 at r1 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
I might be missing something (I often am), but I can't see where we're holding
diff --git a/pkg/storage/replica_state.go b/pkg/storage/replica_state.go
index 47910f8..9269996 100644
--- a/pkg/storage/replica_state.go
+++ b/pkg/storage/replica_state.go
@@ -619,6 +619,8 @@ func (rec ReplicaEvalContext) pushTxnQueue() *pushTxnQueue {
 // FirstIndex returns the oldest index in the raft log.
 func (rec ReplicaEvalContext) FirstIndex() (uint64, error) {
+    rec.repl.mu.Lock()
+    defer rec.repl.mu.Unlock()
     return rec.repl.FirstIndex()
 }
Review status: 8 of 9 files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.
pkg/storage/replica_raftstorage.go, line 243 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
We erroneously don't grab
Now that the command-queue issues in cockroachdb#10084 have been fixed, the only reason to hold raftMu during evaluation was because it controlled access to the replicaStateLoader (and lock-ordering requirements forced us to give the lock a broad scope). By forking the replicaStateLoader into several copies (one protected by Replica.mu and one by Replica.raftMu, plus a few temporary loaders created off the critical path), we can reduce the scope of this coarse-grained lock (which in turn facilitates cockroachdb#15802).
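The forking described in this commit message can be pictured with a minimal Go sketch. Everything here is a hypothetical simplification, not the actual CockroachDB code: `stateLoader` stands in for replicaStateLoader (modelled as a helper with a scratch buffer, so it is not safe for concurrent use), and `replica`, `newReplica`, `evalKey`, and `applyKey` are illustrative names. The point is only that each copy of the loader lives behind exactly one mutex, so the evaluation path never needs raftMu.

```go
package main

import (
	"strconv"
	"sync"
)

// stateLoader is a hypothetical stand-in for replicaStateLoader. It reuses a
// scratch buffer between calls, so a single instance must not be shared
// across goroutines without a lock.
type stateLoader struct {
	rangeID int64
	scratch []byte
}

// key builds "<rangeID>/<suffix>" using the scratch buffer.
func (sl *stateLoader) key(suffix string) string {
	sl.scratch = strconv.AppendInt(sl.scratch[:0], sl.rangeID, 10)
	sl.scratch = append(sl.scratch, '/')
	sl.scratch = append(sl.scratch, suffix...)
	return string(sl.scratch)
}

// replica holds two independent copies of the loader, one per mutex.
type replica struct {
	raftMu struct {
		sync.Mutex
		stateLoader stateLoader // only touched with raftMu held
	}
	mu struct {
		sync.Mutex
		stateLoader stateLoader // only touched with mu held
	}
}

func newReplica(rangeID int64) *replica {
	r := &replica{}
	r.raftMu.stateLoader = stateLoader{rangeID: rangeID}
	r.mu.stateLoader = stateLoader{rangeID: rangeID}
	return r
}

// evalKey models the evaluation path: it takes only mu, so it does not wait
// for whoever currently holds raftMu.
func (r *replica) evalKey(suffix string) string {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.mu.stateLoader.key(suffix)
}

// applyKey models the raft processing path, which keeps using raftMu.
func (r *replica) applyKey(suffix string) string {
	r.raftMu.Lock()
	defer r.raftMu.Unlock()
	return r.raftMu.stateLoader.key(suffix)
}
```

With this layout, a goroutine holding raftMu for a long time no longer blocks callers of `evalKey`. The cost is that the loader state is duplicated, which is only acceptable when the copies carry no mutable state that has to stay in sync (an assumption of this sketch).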
Review status: 8 of 9 files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.
pkg/storage/replica_raftstorage.go, line 243 at r1 (raw file):
Previously, tschottdorf (Tobias Schottdorf) wrote…
Yep, that's a bug. It was fine before because all evaluation was under
Review status: 7 of 9 files reviewed at latest revision, 4 unresolved discussions, some commit checks pending.
pkg/storage/replica_raftstorage.go, line 243 at r1 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Can we rename
Review status: 7 of 9 files reviewed at latest revision, 4 unresolved discussions, some commit checks pending.
pkg/storage/replica_raftstorage.go, line 243 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
No, the name
Review status: 7 of 9 files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.
pkg/storage/replica_raftstorage.go, line 243 at r1 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Doh, I forgot that
Reviewed 2 of 2 files at r3, 1 of 1 files at r4.
pkg/storage/replica_raftstorage.go, line 54 at r2 (raw file):
my understanding now is that this method in itself doesn't require
pkg/storage/replica_raftstorage.go, line 73 at r2 (raw file):
"replica lock"
pkg/storage/replica_raftstorage.go, line 236 at r2 (raw file):
"replica lock"
Review status: all files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.
pkg/storage/replica_raftstorage.go, line 54 at r2 (raw file):
Previously, tschottdorf (Tobias Schottdorf) wrote…
Yeah, in practice both locks will be held because this has no callers but raft. If it were called anywhere else, I'm not sure that holding both locks would be sufficient; I think this also relies on the fact that it is called only at raft group initialization time.
pkg/storage/replica_raftstorage.go, line 73 at r2 (raw file):
Previously, tschottdorf (Tobias Schottdorf) wrote…
You're looking at an older revision; that comment is now removed.
pkg/storage/replica_raftstorage.go, line 236 at r2 (raw file):
Previously, tschottdorf (Tobias Schottdorf) wrote…
You're looking at an older revision; it now says
The call to assertStateLocked may fail if called without the raft lock, as the on-disk state may be changed concurrently. This fixes a regression introduced in cockroachdb#15935.
Fixes cockroachdb#15975
Fixes cockroachdb#15979