concurrency: correctly handle lock updates that decrease its strength #112732
Conversation
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @nvanbenschoten)
pkg/kv/kvserver/concurrency/lock_table.go
line 3378 at r1 (raw file):
// REQUIRES: kl.mu to be locked.
func (kl *keyLocks) onMaybeDecreasedLockStr(st *cluster.Settings) {
I like the overall logic in this function but I wonder if you're concerned at all about the efficiency of looping through all lock holders, all waiting readers, and potentially many queued locking requests. Are there cases where we will end up doing this over and over? E.g. an intent and a bunch of queued shared locking requests where the intent's ts is bumped again and again. Every time we update the intent lock, we remove the replicated lock info, the strength decreases, we iterate over all of the above, and each of the queued requests encounters the intent again when they try to evaluate. Maybe this is not possible, but I wonder if there are other similar situations.
There's a leaked goroutine in the concurrency manager test that I added because of #111596. We'll need to get to it before merging this patch.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @miraradeva and @nvanbenschoten)
pkg/kv/kvserver/concurrency/lock_table.go
line 3378 at r1 (raw file):
Every time we update the intent lock, we remove the replicated lock info, the strength decreases, we iterate over all of the above, and each of the queued requests encounters the intent again when they try to evaluate.
This is a good call out. It's not great that we're potentially in this scheme of discover -> update -> forget -> re-discover. However, it's no different than what would happen right now if there was only an intent on this key (as opposed to there being both an intent and an unreplicated shared lock).
Outside of KVNemesis, I'm also not sure how we can end up with both an unreplicated shared lock and an intent on a key -- so maybe that helps assuage some of the concerns here?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @nvanbenschoten)
pkg/kv/kvserver/concurrency/lock_table.go
line 3378 at r1 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
Every time we update the intent lock, we remove the replicated lock info, the strength decreases, we iterate over all of the above, and each of the queued requests encounters the intent again when they try to evaluate.
This is a good call out. It's not great that we're potentially in this scheme of discover -> update -> forget -> re-discover. However, it's no different than what would happen right now if there was only an intent on this key (as opposed to there being both an intent and an unreplicated shared lock).
Outside of KVNemesis, I'm also not sure how we can end up with both an unreplicated shared lock and an intent on a key -- so maybe that helps assuage some of the concerns here?
The discover -> update -> forget -> re-discover cycle seems reasonable; my concern is only around adding another dimension of iterating over all waiting requests. As long as we're convinced it's rare, it should be ok. A bad case would need not only an unreplicated shared lock and an intent on a key, but also a long queue of waiters, which would be even less likely to occur.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/kv/kvserver/concurrency/lock_table.go
line 3439 at r3 (raw file):
}
if distinguishedRemoved {
	kl.tryMakeNewDistinguished()
Now that this function is a bit more general, this needs to call informActiveWaiters instead.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/kv/kvserver/concurrency/lock_table.go
line 3439 at r3 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
Now that this function is a bit more general, this needs to call informActiveWaiters instead.
Done.
Reviewed 4 of 6 files at r1, 1 of 2 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani)
pkg/kv/kvserver/concurrency/lock_table.go
line 3359 at r2 (raw file):
e = e.Next()
if g.ts.Less(newTs) {
	// TODO(arul): Flip this condition; as written, this short circuits --
Nice catch!
pkg/kv/kvserver/concurrency/lock_table.go
line 2963 at r4 (raw file):
// a low watermark timestamp across multiple lock holders by being cute
// here.
if tl.getLockMode().Strength == lock.Exclusive ||
We decided to remove this and replace it with a call to recomputeWaitQueues.
pkg/kv/kvserver/concurrency/lock_table.go
line 3254 at r4 (raw file):
beforeTs := tl.writeTS()
beforeStr := tl.getLockMode().Strength
advancedTs := beforeTs.Less(ts)
nit that's not for this PR: the way you're computing beforeStr and then afterStr below, without pre-computing whether it is expected to change, feels less fragile to me than what we're doing with timestamps. For timestamps, we precompute advancedTs and then assume that the state transition we plan to make if this flag is true will be reflected in the writeTS() method. That feels backwards.
The following pattern (like you have with the strength) would be easier to understand and more obviously correct:
beforeTs := tl.writeTS()
...
if tl.isHeldUnreplicated() {
tl.unreplicatedInfo.ts.Forward(up.Txn.WriteTimestamp)
}
...
afterTs := tl.writeTS()
advancedTs := beforeTs.Less(afterTs)
pkg/kv/kvserver/concurrency/lock_table.go
line 3372 at r4 (raw file):
// locks holders and other waiting requests. Such computation is necessary when
// a lock's strength has decreased[1] or locking requests have dropped out of
// wait queue's[2] without actually acquiring the lock.
Or the lock's timestamp has advanced.
pkg/kv/kvserver/concurrency/lock_table.go
line 3391 at r4 (raw file):
for e := kl.holders.Front(); e != nil; e = e.Next() {
	holder := e.Value
	if strongestMode.Less(holder.getLockMode()) {
nit: avoid the double call to holder.getLockMode():
mode := e.Value.getLockMode()
pkg/kv/kvserver/concurrency/lock_table.go
line 3404 at r4 (raw file):
	}
}
for e := kl.queuedLockingRequests.Front(); e != nil; {
Let's give each of these loops comments explaining what their role is.
pkg/kv/kvserver/concurrency/lock_table.go
line 3408 at r4 (raw file):
curr := e
e = e.Next()
if !strongestMode.Empty() && lock.Conflicts(qlr.mode, strongestMode, &st.SV) {
Should we make lock.Conflicts return false if strongestMode.Empty()? Is it panicking right now?
pkg/kv/kvserver/concurrency/lock_table.go
line 3424 at r4 (raw file):
} else {
	qlr.active = false // mark as inactive
	if qlr.guard == kl.distinguishedWaiter {
Pointing out for future simplification that there is a lot of overlap between this function and maybeReleaseCompatibleLockingRequests.
pkg/kv/kvserver/concurrency/lock_table.go
line 3425 at r4 (raw file):
qlr.active = false // mark as inactive
if qlr.guard == kl.distinguishedWaiter {
	kl.distinguishedWaiter = nil
// A new distinguished waiter will be selected by informActiveWaiters.
pkg/kv/kvserver/concurrency/lock_table.go
line 3432 at r4 (raw file):
	}
}
// NB: Non-transactional locking requests race with transactional locking
I found this comment and the logic below to be unintuitive. Is it trying to say that we only update strongestMode if qlr is still in the queue (i.e. we didn't hit the removeLockingRequest path above)? If so, it would be easier to understand if that's how it was structured. You could either look at curr or maintain a removed bool to make the determination.
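For illustration, here is a rough, self-contained Go sketch of the loop shape under discussion, using toy stand-in types rather than the actual lock table code. It shows the "removed bool" pattern: compute the strongest held mode, release queued locking requests that no longer conflict with it, and only let requests that stay in the queue contribute to the mode later waiters are checked against. It elides waiting readers, the fact that transactional requests are marked inactive rather than removed, and distinguished-waiter bookkeeping.

```go
package main

import "fmt"

// Toy stand-ins for lock.Mode and lock.Conflicts; the real definitions live in
// pkg/kv/kvserver/concurrency/lock. Strength 0 = none, 1 = shared, 2 = exclusive.
type mode struct{ strength int }

func (m mode) empty() bool        { return m.strength == 0 }
func (m mode) weaker(o mode) bool { return m.strength < o.strength }
func conflicts(a, b mode) bool    { return !a.empty() && !b.empty() && a.strength+b.strength > 2 }

// recomputeSketch mirrors the loop shape discussed above: find the strongest
// mode among the lock holders, then walk the queued locking requests in order,
// releasing the ones that no longer conflict and folding the modes of requests
// that stay in the queue back into strongestMode -- the "removed bool" pattern.
func recomputeSketch(holders, queued []mode) (remaining []mode) {
	var strongestMode mode
	for _, h := range holders { // strongest mode among all lock holders
		if strongestMode.weaker(h) {
			strongestMode = h
		}
	}
	for _, q := range queued {
		removed := false
		if !conflicts(q, strongestMode) {
			removed = true // no conflict with holders or earlier waiters: release it
		}
		if !removed {
			remaining = append(remaining, q)
			if strongestMode.weaker(q) { // only requests still in the queue gate later waiters
				strongestMode = q
			}
		}
	}
	return remaining
}

func main() {
	shared, excl := mode{strength: 1}, mode{strength: 2}
	// One shared holder; queued: shared, exclusive, shared.
	remaining := recomputeSketch([]mode{shared}, []mode{shared, excl, shared})
	fmt.Println(len(remaining)) // 2: the first shared waiter is released, the rest still conflict
}
```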
pkg/kv/kvserver/concurrency/lock/locking.go
line 132 at r4 (raw file):
// Less returns true if the receiver conflicts with fewer requests than the Mode
// supplied.
func (m Mode) Less(o Mode) bool {
nit: I feel like we usually use the term "weaker" or "stronger" instead of "less" or "more", so consider naming this Weaker.
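As a rough illustration of the vocabulary being suggested here (a toy sketch: the ordering mirrors lock.Strength, but the real lock.Mode comparison also accounts for timestamps on strengths that carry one):

```go
package main

import "fmt"

// Toy illustration of "weaker"/"stronger": a mode that conflicts with fewer
// requests is weaker. The ordering below mirrors lock.Strength
// (None < Shared < Update < Exclusive < Intent).
type strength int

const (
	none strength = iota
	shared
	update
	exclusive
	intent
)

// weaker reports whether s conflicts with fewer requests than o.
func (s strength) weaker(o strength) bool { return s < o }

func main() {
	fmt.Println(shared.weaker(exclusive)) // true: shared blocks fewer requests
	fmt.Println(intent.weaker(shared))    // false
}
```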
pkg/kv/kvserver/concurrency/testdata/concurrency_manager/shared_locks
line 245 at r4 (raw file):
holder: txn: 00000006-0000-0000-0000-000000000000 epoch: 0, iso: Serializable, info: unrepl [(str: Shared seq: 0)]
new-request name=req8 txn=txn7 ts=10,1
# txn7 is read-committed, so it can push the timestamp of intents as if it was high priority.
TFTR, RFAL!
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/kv/kvserver/concurrency/lock_table.go
line 3359 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Nice catch!
One way to address the TODO was to delete the function entirely 😂
pkg/kv/kvserver/concurrency/lock_table.go
line 3254 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit that's not for this PR: the way you're computing beforeStr and then afterStr below, without pre-computing whether it is expected to change, feels less fragile to me than what we're doing with timestamps. For timestamps, we precompute advancedTs and then assume that the state transition we plan to make if this flag is true will be reflected in the writeTS() method. That feels backwards. The following pattern (like you have with the strength) would be easier to understand and more obviously correct:
beforeTs := tl.writeTS()
...
if tl.isHeldUnreplicated() {
	tl.unreplicatedInfo.ts.Forward(up.Txn.WriteTimestamp)
}
...
afterTs := tl.writeTS()
advancedTs := beforeTs.Less(afterTs)
I can follow this up with another PR -- I'm assuming you said "not for this PR" because we want to reduce the amount of code we're backporting to 23.2, which makes sense.
pkg/kv/kvserver/concurrency/lock_table.go
line 3408 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should we make lock.Conflicts return false if strongestMode.Empty()? Is it panicking right now?
Yeah, it was panicking before. Made the switch.
pkg/kv/kvserver/concurrency/lock_table.go
line 3424 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Pointing out for future simplification that there is a lot of overlap between this function and maybeReleaseCompatibleLockingRequests.
Added a TODO.
pkg/kv/kvserver/concurrency/lock_table.go
line 3432 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I found this comment and the logic below to be unintuitive. Is it trying to say that we only update strongestMode if qlr is still in the queue (i.e. we didn't hit the removeLockingRequest path above)? If so, it would be easier to understand if that's how it was structured. You could either look at curr or maintain a removed bool to make the determination.
Good call out. Switched to using a removed boolean as that's a bit more readable than looking at curr and comparing it with e.Prev().
pkg/kv/kvserver/concurrency/lock/locking.go
line 132 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: I feel like we usually use the term "weaker" or "stronger" instead of "less" or "more", so consider naming this Weaker.
Done.
Reviewed 4 of 4 files at r5.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @arulajmani)
pkg/kv/kvserver/concurrency/lock_table.go
line 2963 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
We decided to remove this and replace it with a call to recomputeWaitQueues.
Don't we want to remove the tl.getLockMode().Strength == lock.Exclusive || ...?
pkg/kv/kvserver/concurrency/lock_table.go
line 3254 at r4 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
I can follow this up with another PR -- I'm assuming you said "not for this PR" because we want to reduce the amount of code we're backporting to 23.2, which makes sense.
Yeah, we don't want to backport refactors that aren't directly fixing issues where we can help it.
pkg/kv/kvserver/concurrency/lock/locking.go
line 83 at r5 (raw file):
func Conflicts(m1, m2 Mode, sv *settings.Values) bool {
	if m1.Empty() || m2.Empty() {
		return false // no conflict with empty lock modes
Can we add this to a unit test?
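For illustration, a minimal sketch of what such a unit test could look like, assuming the Conflicts signature quoted above and that the zero Mode value is empty. The MakeModeShared constructor, test name, and package layout here are illustrative assumptions, not necessarily the test that was added.

```go
package lock_test

import (
	"testing"

	"github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency/lock"
	"github.com/cockroachdb/cockroach/pkg/settings/cluster"
	"github.com/stretchr/testify/require"
)

// TestConflictsEmptyMode (sketch) exercises the early return for empty lock
// modes: an empty mode never conflicts, regardless of what it is compared to.
func TestConflictsEmptyMode(t *testing.T) {
	st := cluster.MakeTestingClusterSettings()
	empty := lock.Mode{}              // assumed: the zero value is the empty mode
	nonEmpty := lock.MakeModeShared() // illustrative: any non-empty mode works here

	require.False(t, lock.Conflicts(empty, empty, &st.SV))
	require.False(t, lock.Conflicts(empty, nonEmpty, &st.SV))
	require.False(t, lock.Conflicts(nonEmpty, empty, &st.SV))
}
```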
When a lock is updated on behalf of a non-finalized transaction, the lock table simply forgets any replicated lock information it was previously tracking. Now, if a lock is held with both replicated and unreplicated durability, and the replicated lock's strength is stronger than the unreplicated lock's strength, forgetting the replicated lock means there may be requests waiting on a key that no longer conflict with what's being tracked in memory[1]. In such cases, requests that no longer conflict with the in-memory tracking should no longer wait on this key -- they should be allowed to proceed with their scan. It's not guaranteed that they no longer conflict with the lock -- they might rediscover the replicated lock again and start waiting. However, if they no longer conflict with the lock, blocking them indefinitely can lead to undetected deadlocks in rare cases[2].

[1] Concretely, consider the following construction:
- key a: [txn 1, (repl intent ts@10), (unrepl shared)]
- wait-queue readers: [none ts@12], [none ts@15]
- wait-queue locking requests: [shared], [shared], [exclusive], [shared]

In this case, updating the intent's timestamp to ts@14 (and forgetting it) should allow both the non-locking readers and the first two shared locking requests to become compatible with what's being tracked in memory for this key.

[2] This can happen if the lock update corresponds to a successful PUSH_TIMESTAMP. The pusher will be blocked in the lock table waiter waiting for a `waitingState` update that never comes. Now, if there was a dependency cycle between the pusher and pushee (had the push timestamp been blocking), we would find ourselves in an undetected deadlock situation.

Fixes cockroachdb#112608

Release note: None
Thanks!
bors r=nvanbenschoten
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten)
pkg/kv/kvserver/concurrency/lock_table.go
line 2963 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Don't we want to remove the tl.getLockMode().Strength == lock.Exclusive || ...?
Yeah, we did -- it's implied if we're in this block.
pkg/kv/kvserver/concurrency/lock/locking.go
line 83 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Can we add this to a unit test?
Added.
Build succeeded.
blathers backport 23.2
112078: roachtest: clean up command-line flags r=RaduBerinde a=RaduBerinde

**This PR is only for the last commit. The rest are #111811**

---

#### roachtest: clean up command-line flags

The code around command-line flags is pretty messy: they're defined in many places; the name and description of a flag are far away from the variable; the variable names look like local variables and in many cases it's not obvious we're accessing a global.

This commit moves all flags to a separate subpackage, `roachtestflags`, making all uses of global flags obvious. We also add a bit of infrastructure to allow defining all information about a flag right next to where the variable is declared. We also provide a `Changed()` function that determines if a flag value was changed (without the caller having to use the Command or even the flag name).

There should be no functional changes (just some cosmetic improvements to the flag usage texts).

Epic: none
Release note: None

113245: kvserver: add metric and log when raft.Storage returns an error r=erikgrinaker a=sumeerbhola

The raft.storage.error metric is incremented on an error, and the error is logged every 30s (across all replicas).

This was motivated by a test cluster that slowed to a crawl because of deliberate data loss, but was hard to diagnose. The metric could be used for alerting, since we don't expect to see transient errors.

Informs #113053
Epic: none
Release note: None

113335: kvpb: delete ErrorDetail message r=nvanbenschoten a=nvanbenschoten

This was unused, so delete it. The message has been unused since 0c12f6c.

Epic: None
Release note: None

113636: concurrency: recompute wait queues when locking requests drop out r=nvanbenschoten a=arulajmani

First commit from #112732

----

A locking request must actively wait in a lock's wait queues if:
- it conflicts with any of the lock holders.
- or it conflicts with a lower sequence numbered request already in the lock's wait queue.

As a result, if a locking request exits a lock's wait queue without actually acquiring the lock, it may allow other locking requests to proceed. This patch recomputes wait queues whenever a locking request exits a lock's wait queues to detect such scenarios and unblock requests which were actively waiting previously but no longer need to.

Fixes #111144
Release note: None

Co-authored-by: Radu Berinde <[email protected]>
Co-authored-by: sumeerbhola <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Arul Ajmani <[email protected]>
When a lock is updated on behalf of a non-finalized transaction, the lock table simply forgets any replicated lock information it was previously tracking. Now, if a lock is held with both replicated and unreplicated durability, and the replicated lock's strength is stronger than the unreplicated lock's strength, forgetting the replicated lock means there may be requests waiting on a key that no longer conflict with what's being tracked in memory[*]. In such cases, requests that no longer conflict with the in-memory tracking should no longer wait on this key -- they should be allowed to proceed with their scan. It's not guaranteed that they no longer conflict with the lock -- they might rediscover the replicated lock again and start waiting. However, if they no longer conflict with the lock, blocking them indefinitely can lead to deadlocks.
[*] Concretely, consider the following construction:
- key a: [txn 1, (repl intent ts@10), (unrepl shared)]
- wait-queue readers: [none ts@12], [none ts@15]
- wait-queue locking requests: [shared], [shared], [exclusive], [shared]

In this case, updating the intent's timestamp to ts@14 (and forgetting it) should allow both the non-locking readers and the first two shared locking requests to become compatible with what's being tracked in memory for this key.
Fixes #112608
Release note: None