-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
concurrency: use lock.Modes to determine conflicts in tryActiveWait #104261
concurrency: use lock.Modes to determine conflicts in tryActiveWait #104261
Conversation
We'll make use of these when constructing lock modes. Epic: none Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r1, 1 of 2 files at r2, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani)
pkg/kv/kvserver/concurrency/lock_table.go
line 1640 at r2 (raw file):
var ok bool replicatedFinalizedTxn, ok = g.lt.finalizedTxnCache.get(l.holder.holder[lock.Replicated].txn.ID) assert(ok, "expected txn to be finalized")
Is the cache guaranteed to still contain the transaction at this point? It's a small LRU cache that can be manipulated concurrently.
pkg/kv/kvserver/concurrency/lock_table.go
line 1660 at r2 (raw file):
// a claim like normal. That is, it should add itself to the lock's wait // queue as an inactive writer. if !l.holder.locked && l.queuedWriters.Len() == 0 {
Is this the same condition as if transitionedToFree { return false, true }
?
pkg/kv/kvserver/concurrency/lock_table.go
line 1683 at r2 (raw file):
// prepareActiveWait and returning true here; there won't really be any // wait here. l.prepareActiveWait(g, notify, clock)
Doesn't this also call into adjustWaitingWritersWaitQueue
? I'm surprised that we need this second handling of isMaxQueueLengthExceededError
here.
pkg/kv/kvserver/concurrency/lock_table.go
line 1712 at r2 (raw file):
// it is). guardLockMode := lock.MakeModeIntent(g.ts) for e := l.queuedWriters.Front(); e != nil; e = e.Next() {
I wonder whether there's room to unify this logic with the incremental handling of lock releasing. There's some duplicated logic between this function and maybeReleaseFirstTransactionalWriter
.
I think one of the biggest sources of complexity in this code is the multiple, non-idempotent code paths that attempt to maintain the same invariants while handling different state transitions. That's only warranted if the redundancy is necessary for efficiency. But is it in this case? Could we treat queuing as the combination of:
- adding to the end of the wait queue, assuming we're going to have to wait
- releasing the first transactional writer, which may be us
pkg/kv/kvserver/concurrency/lock_table.go
line 1714 at r2 (raw file):
for e := l.queuedWriters.Front(); e != nil; e = e.Next() { qqg := e.Value.(*queuedGuard) if qqg.guard == g {
Let's call out what's happened here:
// We found our request while scanning from the front before finding any conflicting waiters. There is no one for us to wait on.
pkg/kv/kvserver/concurrency/lock_table.go
line 1721 at r2 (raw file):
} // Transactional writer. return false, transitionedToFree
Is it safe to return transitionedToFree = true
here if we just added our own waiter?
pkg/kv/kvserver/concurrency/lock_table.go
line 1725 at r2 (raw file):
waiterLockMode := lock.MakeModeIntent(qg.guard.ts) if lock.Conflicts(waiterLockMode, guardLockMode, &g.lt.settings.SV) { l.prepareActiveWait(g, notify, clock)
Same question. Are we inserting into the queue twice by first calling adjustWaitingWritersWaitQueue
directly and then calling prepareActiveWait
?
pkg/kv/kvserver/concurrency/lock_table.go
line 1850 at r2 (raw file):
} // updateWaitingState updates the waiting state for the supplied request to wait
The comment on this function is incorrect.
pkg/kv/kvserver/concurrency/lock_table.go
line 3323 at r2 (raw file):
} type maxQueueLengthExceededError struct{}
Let's give this a comment. I'm also not sure how useful it is to have an error return path that can only return one kind of error. Up to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Flushing replies to a few comments.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/kv/kvserver/concurrency/lock_table.go
line 1583 at r2 (raw file):
// resolution (or done it), so this request does not need to do the resolution. // // TODO(XXX): The paragraph above is now stale. We're letting writers accumulate
Wanted to call out this part for discussion.
pkg/kv/kvserver/concurrency/lock_table.go
line 1640 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Is the cache guaranteed to still contain the transaction at this point? It's a small LRU cache that can be manipulated concurrently.
Good point. The only reason I needed this was because I didn't want to I need a roachpb.Transaction
here. I could just modify conflictsWithLockHolder
to return the replicatedFinalizedTxn
instead of belongsToFinalizedTxn
.
pkg/kv/kvserver/concurrency/lock_table.go
line 1660 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Is this the same condition as
if transitionedToFree { return false, true }
?
Yeah, it is. I can make the switch if we feel that's easier to reason about.
pkg/kv/kvserver/concurrency/lock_table.go
line 1683 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Doesn't this also call into
adjustWaitingWritersWaitQueue
? I'm surprised that we need this second handling ofisMaxQueueLengthExceededError
here.
Yeah, but handling the the isMaxQueueLengthExceededError
involves the same changes prepareForActiveWait
makes when it hits this error. I wasn't sure what's clearer -- duplicating those changes here, or reusing prepareActiveWait
. We could instead handle with this as well:
--- a/pkg/kv/kvserver/concurrency/lock_table.go
+++ b/pkg/kv/kvserver/concurrency/lock_table.go
@@ -1680,7 +1680,15 @@ func (l *lockState) tryActiveWait(
// based on their waiting state. The mechanism by which this happens is
// prepareActiveWait and returning true here; there won't really be any
// wait here.
- l.prepareActiveWait(g, notify, clock)
+ g.key = l.key
+ g.mu.startWait = true
+ g.mu.curLockWaitStart = clock.PhysicalTime()
+ ws := l.constructWaitingState(g)
+ ws.kind = waitQueueMaxLengthExceeded
+ g.updateWaitingStateLocked(ws)
+ if notify {
+ g.notify()
+ }
return true, transitionedToFree
}
panic(err) // should not happen
Is this clearer?
pkg/kv/kvserver/concurrency/lock_table.go
line 1712 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I wonder whether there's room to unify this logic with the incremental handling of lock releasing. There's some duplicated logic between this function and
maybeReleaseFirstTransactionalWriter
.I think one of the biggest sources of complexity in this code is the multiple, non-idempotent code paths that attempt to maintain the same invariants while handling different state transitions. That's only warranted if the redundancy is necessary for efficiency. But is it in this case? Could we treat queuing as the combination of:
- adding to the end of the wait queue, assuming we're going to have to wait
- releasing the first transactional writer, which may be us
Yeah. Now that you mention it, I think we need a call to informActiveWaiters
here in the future -- we may have inserted this guy at the front of the queue, and the claimant txn may have changed as a result.
pkg/kv/kvserver/concurrency/lock_table.go
line 1721 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Is it safe to return
transitionedToFree = true
here if we just added our own waiter?
It should be safe, because we check if a lock is non-empty before GC-ing it. It might result in an unnecessary call to tryGCLocks
, but it shouldn't be a correctness issue. I could also see why we should just return false instead. I could go either way here -- I actually did switch this locally right before pushing. Let me know if you have a preference either way.
pkg/kv/kvserver/concurrency/lock_table.go
line 1725 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Same question. Are we inserting into the queue twice by first calling
adjustWaitingWritersWaitQueue
directly and then callingprepareActiveWait
?
adjustWaitingWritersWaitQueue
has logic to handle the case where a request already exists. In this case, it'll effectively mark the request as an active waiter (we tentatively inserted it as an inactive waiter above).
Previously, `informActiveWaiters` would unconditionally signal the new state channel, even if the waiting state for the active waiter hadn't changed. This is fine, as the contract for signaling the new state channel makes no guarantees that there has been a state change. With this patch, `informActiveWaiters` detects such no-op cases and elides signaling the channel. This will prove handy when we expand the use of `informActiveWaiters`. Release note: None
This patch refactors `tryActiveWait` to make use of lock.Modes to determine conflicts. It's done so with the imminent introduction of more locking strengths in mind. To that effect, a lot of things are already generalized and the structure is such that it'll be easy to extend to multiple locks on a key. At a high level, the journey of a request through tryActiveWait is as follows: - First, it checks if its transaction already holds a lock at a equal or higher lock strength (read: it's good enough for its purpose). If this is the case, it can proceed without bothering with any bookkeeping. - It then checks if there's a lock holder with which it conflicts. It decides to actively wait if that is the case. - Finalized transactions/lock cleanup is handled next, if required. - At this point, a non-locking request can proceed. - A locking request first inserts itself in the correct position (as dictated by its sequence number) in the lock's wait-queue. It then iterates through the head of the wait-queue until it finds itself. While doing so, it checks for conflicts with requests waiting in front of it. If there is a conflict, it decides to actively wait. Otherwise, it can proceed with its scan. - If the request is of the transactional kind, it establishes a (possibly joint) claim on the lock before proceeding with its scan. A non-transactional request does not. Closes cockroachdb#102210 Release note: None
6b6e205
to
0c637b8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, RFAL.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/kv/kvserver/concurrency/lock_table.go
line 1640 at r2 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
Good point. The only reason I needed this was because I didn't want to I need a
roachpb.Transaction
here. I could just modifyconflictsWithLockHolder
to return thereplicatedFinalizedTxn
instead ofbelongsToFinalizedTxn
.
I meant finalizedTxn
here. Updated.
pkg/kv/kvserver/concurrency/lock_table.go
line 1660 at r2 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
Yeah, it is. I can make the switch if we feel that's easier to reason about.
(haven't done this)
pkg/kv/kvserver/concurrency/lock_table.go
line 1683 at r2 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
Yeah, but handling the the
isMaxQueueLengthExceededError
involves the same changesprepareForActiveWait
makes when it hits this error. I wasn't sure what's clearer -- duplicating those changes here, or reusingprepareActiveWait
. We could instead handle with this as well:--- a/pkg/kv/kvserver/concurrency/lock_table.go +++ b/pkg/kv/kvserver/concurrency/lock_table.go @@ -1680,7 +1680,15 @@ func (l *lockState) tryActiveWait( // based on their waiting state. The mechanism by which this happens is // prepareActiveWait and returning true here; there won't really be any // wait here. - l.prepareActiveWait(g, notify, clock) + g.key = l.key + g.mu.startWait = true + g.mu.curLockWaitStart = clock.PhysicalTime() + ws := l.constructWaitingState(g) + ws.kind = waitQueueMaxLengthExceeded + g.updateWaitingStateLocked(ws) + if notify { + g.notify() + } return true, transitionedToFree } panic(err) // should not happen
Is this clearer?
Made this change for now. Let me know how you feel about this.
pkg/kv/kvserver/concurrency/lock_table.go
line 1712 at r2 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
Yeah. Now that you mention it, I think we need a call to
informActiveWaiters
here in the future -- we may have inserted this guy at the front of the queue, and the claimant txn may have changed as a result.
I sat with this for a bit, and I'm leaning towards not making the change. A source of complexity in tryActiveWait
is the somewhat surprising side effects of this function. For example, the calls to clearLockHolder
and lockIsFree
which can uncork unrelated requests, like I was talking to you about on Slack. Longer term, I think it'll be easier to reason about tryActiveWait
if it only performs state changes that directly fall out of whether the supplied request should wait or not.
In the future, maybeReleaseFirstTransactionalWriter
will probably be replaced by maybeReleaseAllCompatibleRequestsFromHeadOfTheQueue
[1]. Such a function is more general than what we need here. It begs the same question as above, why is tryActiveWait
uncorking other requests?
What's even more subtle is that it'll actually never uncork any other request at all! It'll either mark itself as inactive or be a no-op [2]. So making this change feels like we're needlessly expanding the scope of states we can be in at this point.
I did update this to call informActiveWaiters
though. This motivated adding a new commit (2 of 3).
[1] maybeReleaseAllCompatibleRequestsFromHeadOfTheQueue
will traverse the head of the queued locking requests list, removing any non-transactional write requests it encounters, and marking all compatible requests as inactive waiters and nudging them to proceed with their scan. It'll stop when it discovers an incompatible request.
[2] That's because if there were a request that was compatible with the head of the lock's wait queue (and potentially the lock holder), it shouldn't have been an active waiter to begin with.
pkg/kv/kvserver/concurrency/lock_table.go
line 1850 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
The comment on this function is incorrect.
Thanks, fixed.
pkg/kv/kvserver/concurrency/lock_table.go
line 3323 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Let's give this a comment. I'm also not sure how useful it is to have an error return path that can only return one kind of error. Up to you.
Added some comments. Mild preference to keep these using errors to indicate cases where we didn't modify the wait queues instead of returning a boolean, so keeping this as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani)
-- commits
line 11 at r3:
What do you think about keeping this "elide signaling" change separate? It feels self-contained and probably deserves more scrutiny than it will get if grouped into this PR. A few questions to ask ourselves about it:
- what kind of bugs can occur from under-eliding? What kind of bugs can occur from over-eliding?
- is the logic in
canElideWaitingStateUpdate
prone to rot as new fields are added towaitingState
? - does the logic in
canElideWaitingStateUpdate
need to consider every field? Even observability-related fields likequeuedWriters
? - can we push this into
updateWaitingStateLocked
so that we don't add complexity to existing methods?
pkg/kv/kvserver/concurrency/lock_table.go
line 1660 at r2 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
(haven't done this)
Yeah, I think it is easier to reason about. It makes it clear to readers why this lock is now unheld and with an empty wait queue. In fact, maybe move it up to where transitionedToFree
is set so that it doesn't mix with other paths through this function.
+1 to removing the special case (in a future PR), by the way.
pkg/kv/kvserver/concurrency/lock_table.go
line 1721 at r2 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
It should be safe, because we check if a lock is non-empty before GC-ing it. It might result in an unnecessary call to
tryGCLocks
, but it shouldn't be a correctness issue. I could also see why we should just return false instead. I could go either way here -- I actually did switch this locally right before pushing. Let me know if you have a preference either way.
Related to the thread above, it's mildly confusing that we set transitionedToFree
way up above and then keep using the variable throughout. It raises questions about whether any of the state transitions that we've performed have invalidated this stale piece of information. I'd suggest only using transitionedToFree
on the one path that sets it and returning false
(the literal) everywhere else.
Closing this one in favour of #104620. None of the outstanding comments are relevant to the new structure, so a fresh slate will probably be easier. Hopefully 3rd time is the charm! Btw, thanks for the collaboration on this one @nvanbenschoten. I won't jinx it by calling it fun before we actually land this, but I'm mostly there. |
104620: concurrency: use lock modes to find conflicts during lock table scans r=nvanbenschoten a=arulajmani Previous attempt over at: #104261 First commit from: #104537 Second commit from: #104600 This patch majorly refactors the lock table scan codepath, all in the name of shared locks. At its core is a desire to use lock modes to perform conflict resolution between an incoming request and locks held on one particular key. In doing so, we rip out tryActiveWait. At a high level (for a particular key), a request's journey looks like the following: - It first checks if the transaction already holds a lock at a equal or higher lock strength (read: It's good enough for its use). If this is the case, it can proceed without any bookkeeping. - It then checks if any finalized transactions hold locks on the key. Such locks do not conflict, but need to be resolved before the transaction can evaluate. They're thus accumulated for later. - The request then enqueues itself in the appropriate wait queue. - It then determines if it needs to actively wait at this lock because of a conflict. If that's the case, the lock table scan short circuits. - Otherwise, the request lays a claim (if it can) before proceeding with its scan. Closes #102210 Release note: None 105482: sqltelemetry: add missing schema telemetry r=postamar a=rafiss CREATE [ SCHEMA | INDEX | FUNCTION | TYPE ] and ALTER FUNCTION did not have any telemetry, but they should. Epic: None Release note: None 105579: sql: disallow cross-database type references in CTAS r=chengxiong-ruan a=chengxiong-ruan Fixes: #105393 Release note (bug fix): reviously, cross-database type references could sneaked in through `CREATE TABLE...AS` statements if the source table is from another database and any of its columns is of a user defined type. This introduced bug where the source table can be dropped and type could not be found for the CTAS table. This commit disallow such CTAS as a fix. 105581: optbuilder: reset annotations when building CREATE FUNCTION r=rafiss a=rafiss In 22dabb0 we started overriding the annotations for each statement in the UDF body. We should reset them to the original values, so we don't accidentally leave the old reference. Epic: None Release note: None 105596: storage: make `storage.max_sync_duration` public and `TenantReadOnly` r=erikgrinaker a=erikgrinaker Users have asked why this setting is not public, this patch makes it so. Furthermore, these settings were `TenantWritable`. We do not want these to be writable by tenants, where they can potentially cause problems on SQL nodes, considering e.g. SQL disk spilling uses Pebble. Epic: none Release note: None Co-authored-by: Arul Ajmani <[email protected]> Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: Chengxiong Ruan <[email protected]> Co-authored-by: Erik Grinaker <[email protected]>
This patch refactors
tryActiveWait
to make use of lock.Modes todetermine conflicts. It's done so with the imminent introduction of
more locking strengths in mind. To that effect, a lot of things are
already generalized and the structure is such that it'll be easy to
extend to multiple locks on a key.
At a high level, the journey of a request through tryActiveWait is
as follows:
or higher lock strength (read: it's good enough for its purpose). If
this is the case, it can proceed without bothering with any bookkeeping.
decides to actively wait if that is the case.
dictated by its sequence number) in the lock's wait-queue. It then
iterates through the head of the wait-queue until it finds itself.
While doing so, it checks for conflicts with requests waiting in front
of it. If there is a conflict, it decides to actively wait. Otherwise,
it can proceed with its scan.
(possibly joint) claim on the lock before proceeding with its scan.
A non-transactional request does not.
Closes #102210
Release note: None