Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

concurrency: use lock.Modes to determine conflicts in tryActiveWait #104261

Conversation

arulajmani
Copy link
Collaborator

This patch refactors tryActiveWait to make use of lock.Modes to
determine conflicts. It's done so with the imminent introduction of
more locking strengths in mind. To that effect, a lot of things are
already generalized and the structure is such that it'll be easy to
extend to multiple locks on a key.

At a high level, the journey of a request through tryActiveWait is
as follows:

  • First, it checks if its transaction already holds a lock at a equal
    or higher lock strength (read: it's good enough for its purpose). If
    this is the case, it can proceed without bothering with any bookkeeping.
  • It then checks if there's a lock holder with which it conflicts. It
    decides to actively wait if that is the case.
  • Finalized transactions/lock cleanup is handled next, if required.
  • At this point, a non-locking request can proceed.
  • A locking request first inserts itself in the correct position (as
    dictated by its sequence number) in the lock's wait-queue. It then
    iterates through the head of the wait-queue until it finds itself.
    While doing so, it checks for conflicts with requests waiting in front
    of it. If there is a conflict, it decides to actively wait. Otherwise,
    it can proceed with its scan.
  • If the request is of the transactional kind, it establishes a
    (possibly joint) claim on the lock before proceeding with its scan.
    A non-transactional request does not.

Closes #102210

Release note: None

We'll make use of these when constructing lock modes.

Epic: none

Release note: None
@arulajmani arulajmani requested a review from nvanbenschoten June 2, 2023 15:04
@arulajmani arulajmani requested a review from a team as a code owner June 2, 2023 15:04
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@arulajmani arulajmani changed the title concurrency: use of lock.Modes to determine conflicts in tryActiveWait concurrency: use lock.Modes to determine conflicts in tryActiveWait Jun 2, 2023
Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r1, 1 of 2 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani)


pkg/kv/kvserver/concurrency/lock_table.go line 1640 at r2 (raw file):

			var ok bool
			replicatedFinalizedTxn, ok = g.lt.finalizedTxnCache.get(l.holder.holder[lock.Replicated].txn.ID)
			assert(ok, "expected txn to be finalized")

Is the cache guaranteed to still contain the transaction at this point? It's a small LRU cache that can be manipulated concurrently.


pkg/kv/kvserver/concurrency/lock_table.go line 1660 at r2 (raw file):

	// a claim like normal. That is, it should add itself to the lock's wait
	// queue as an inactive writer.
	if !l.holder.locked && l.queuedWriters.Len() == 0 {

Is this the same condition as if transitionedToFree { return false, true }?


pkg/kv/kvserver/concurrency/lock_table.go line 1683 at r2 (raw file):

			// prepareActiveWait and returning true here; there won't really be any
			// wait here.
			l.prepareActiveWait(g, notify, clock)

Doesn't this also call into adjustWaitingWritersWaitQueue? I'm surprised that we need this second handling of isMaxQueueLengthExceededError here.


pkg/kv/kvserver/concurrency/lock_table.go line 1712 at r2 (raw file):

	// it is).
	guardLockMode := lock.MakeModeIntent(g.ts)
	for e := l.queuedWriters.Front(); e != nil; e = e.Next() {

I wonder whether there's room to unify this logic with the incremental handling of lock releasing. There's some duplicated logic between this function and maybeReleaseFirstTransactionalWriter.

I think one of the biggest sources of complexity in this code is the multiple, non-idempotent code paths that attempt to maintain the same invariants while handling different state transitions. That's only warranted if the redundancy is necessary for efficiency. But is it in this case? Could we treat queuing as the combination of:

  1. adding to the end of the wait queue, assuming we're going to have to wait
  2. releasing the first transactional writer, which may be us

pkg/kv/kvserver/concurrency/lock_table.go line 1714 at r2 (raw file):

	for e := l.queuedWriters.Front(); e != nil; e = e.Next() {
		qqg := e.Value.(*queuedGuard)
		if qqg.guard == g {

Let's call out what's happened here:

// We found our request while scanning from the front before finding any conflicting waiters. There is no one for us to wait on.

pkg/kv/kvserver/concurrency/lock_table.go line 1721 at r2 (raw file):

			}
			// Transactional writer.
			return false, transitionedToFree

Is it safe to return transitionedToFree = true here if we just added our own waiter?


pkg/kv/kvserver/concurrency/lock_table.go line 1725 at r2 (raw file):

		waiterLockMode := lock.MakeModeIntent(qg.guard.ts)
		if lock.Conflicts(waiterLockMode, guardLockMode, &g.lt.settings.SV) {
			l.prepareActiveWait(g, notify, clock)

Same question. Are we inserting into the queue twice by first calling adjustWaitingWritersWaitQueue directly and then calling prepareActiveWait?


pkg/kv/kvserver/concurrency/lock_table.go line 1850 at r2 (raw file):

}

// updateWaitingState updates the waiting state for the supplied request to wait

The comment on this function is incorrect.


pkg/kv/kvserver/concurrency/lock_table.go line 3323 at r2 (raw file):

}

type maxQueueLengthExceededError struct{}

Let's give this a comment. I'm also not sure how useful it is to have an error return path that can only return one kind of error. Up to you.

Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flushing replies to a few comments.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)


pkg/kv/kvserver/concurrency/lock_table.go line 1583 at r2 (raw file):

// resolution (or done it), so this request does not need to do the resolution.
//
// TODO(XXX): The paragraph above is now stale. We're letting writers accumulate

Wanted to call out this part for discussion.


pkg/kv/kvserver/concurrency/lock_table.go line 1640 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Is the cache guaranteed to still contain the transaction at this point? It's a small LRU cache that can be manipulated concurrently.

Good point. The only reason I needed this was because I didn't want to I need a roachpb.Transaction here. I could just modify conflictsWithLockHolder to return the replicatedFinalizedTxn instead of belongsToFinalizedTxn.


pkg/kv/kvserver/concurrency/lock_table.go line 1660 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Is this the same condition as if transitionedToFree { return false, true }?

Yeah, it is. I can make the switch if we feel that's easier to reason about.


pkg/kv/kvserver/concurrency/lock_table.go line 1683 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Doesn't this also call into adjustWaitingWritersWaitQueue? I'm surprised that we need this second handling of isMaxQueueLengthExceededError here.

Yeah, but handling the the isMaxQueueLengthExceededError involves the same changes prepareForActiveWait makes when it hits this error. I wasn't sure what's clearer -- duplicating those changes here, or reusing prepareActiveWait. We could instead handle with this as well:

--- a/pkg/kv/kvserver/concurrency/lock_table.go
+++ b/pkg/kv/kvserver/concurrency/lock_table.go
@@ -1680,7 +1680,15 @@ func (l *lockState) tryActiveWait(
                        // based on their waiting state. The mechanism by which this happens is
                        // prepareActiveWait and returning true here; there won't really be any
                        // wait here.
-                       l.prepareActiveWait(g, notify, clock)
+                       g.key = l.key
+                       g.mu.startWait = true
+                       g.mu.curLockWaitStart = clock.PhysicalTime()
+                       ws := l.constructWaitingState(g)
+                       ws.kind = waitQueueMaxLengthExceeded
+                       g.updateWaitingStateLocked(ws)
+                       if notify {
+                               g.notify()
+                       }
                        return true, transitionedToFree
                }
                panic(err) // should not happen

Is this clearer?


pkg/kv/kvserver/concurrency/lock_table.go line 1712 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I wonder whether there's room to unify this logic with the incremental handling of lock releasing. There's some duplicated logic between this function and maybeReleaseFirstTransactionalWriter.

I think one of the biggest sources of complexity in this code is the multiple, non-idempotent code paths that attempt to maintain the same invariants while handling different state transitions. That's only warranted if the redundancy is necessary for efficiency. But is it in this case? Could we treat queuing as the combination of:

  1. adding to the end of the wait queue, assuming we're going to have to wait
  2. releasing the first transactional writer, which may be us

Yeah. Now that you mention it, I think we need a call to informActiveWaiters here in the future -- we may have inserted this guy at the front of the queue, and the claimant txn may have changed as a result.


pkg/kv/kvserver/concurrency/lock_table.go line 1721 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Is it safe to return transitionedToFree = true here if we just added our own waiter?

It should be safe, because we check if a lock is non-empty before GC-ing it. It might result in an unnecessary call to tryGCLocks, but it shouldn't be a correctness issue. I could also see why we should just return false instead. I could go either way here -- I actually did switch this locally right before pushing. Let me know if you have a preference either way.


pkg/kv/kvserver/concurrency/lock_table.go line 1725 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Same question. Are we inserting into the queue twice by first calling adjustWaitingWritersWaitQueue directly and then calling prepareActiveWait?

adjustWaitingWritersWaitQueue has logic to handle the case where a request already exists. In this case, it'll effectively mark the request as an active waiter (we tentatively inserted it as an inactive waiter above).

Previously, `informActiveWaiters` would unconditionally signal the
new state channel, even if the waiting state for the active waiter
hadn't changed. This is fine, as the contract for signaling the new
state channel makes no guarantees that there has been a state change.
With this patch, `informActiveWaiters` detects such no-op cases and
elides signaling the channel. This will prove handy when we expand the
use of `informActiveWaiters`.

Release note: None
This patch refactors `tryActiveWait` to make use of lock.Modes to
determine conflicts. It's done so with the imminent introduction of
more locking strengths in mind. To that effect, a lot of things are
already generalized and the structure is such that it'll be easy to
extend to multiple locks on a key.

At a high level, the journey of a request through tryActiveWait is
as follows:

- First, it checks if its transaction already holds a lock at a equal
or higher lock strength (read: it's good enough for its purpose). If
this is the case, it can proceed without bothering with any bookkeeping.
- It then checks if there's a lock holder with which it conflicts. It
decides to actively wait if that is the case.
- Finalized transactions/lock cleanup is handled next, if required.
- At this point, a non-locking request can proceed.
- A locking request first inserts itself in the correct position (as
dictated by its sequence number) in the lock's wait-queue. It then
iterates through the head of the wait-queue until it finds itself.
While doing so, it checks for conflicts with requests waiting in front
of it. If there is a conflict, it decides to actively wait. Otherwise,
it can proceed with its scan.
- If the request is of the transactional kind, it establishes a
(possibly joint) claim on the lock before proceeding with its scan.
A non-transactional request does not.

Closes cockroachdb#102210

Release note: None
@arulajmani arulajmani force-pushed the integrate-conflicts-try-active-wait-v2 branch from 6b6e205 to 0c637b8 Compare June 2, 2023 20:42
Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, RFAL.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)


pkg/kv/kvserver/concurrency/lock_table.go line 1640 at r2 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Good point. The only reason I needed this was because I didn't want to I need a roachpb.Transaction here. I could just modify conflictsWithLockHolder to return the replicatedFinalizedTxn instead of belongsToFinalizedTxn.

I meant finalizedTxn here. Updated.


pkg/kv/kvserver/concurrency/lock_table.go line 1660 at r2 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Yeah, it is. I can make the switch if we feel that's easier to reason about.

(haven't done this)


pkg/kv/kvserver/concurrency/lock_table.go line 1683 at r2 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Yeah, but handling the the isMaxQueueLengthExceededError involves the same changes prepareForActiveWait makes when it hits this error. I wasn't sure what's clearer -- duplicating those changes here, or reusing prepareActiveWait. We could instead handle with this as well:

--- a/pkg/kv/kvserver/concurrency/lock_table.go
+++ b/pkg/kv/kvserver/concurrency/lock_table.go
@@ -1680,7 +1680,15 @@ func (l *lockState) tryActiveWait(
                        // based on their waiting state. The mechanism by which this happens is
                        // prepareActiveWait and returning true here; there won't really be any
                        // wait here.
-                       l.prepareActiveWait(g, notify, clock)
+                       g.key = l.key
+                       g.mu.startWait = true
+                       g.mu.curLockWaitStart = clock.PhysicalTime()
+                       ws := l.constructWaitingState(g)
+                       ws.kind = waitQueueMaxLengthExceeded
+                       g.updateWaitingStateLocked(ws)
+                       if notify {
+                               g.notify()
+                       }
                        return true, transitionedToFree
                }
                panic(err) // should not happen

Is this clearer?

Made this change for now. Let me know how you feel about this.


pkg/kv/kvserver/concurrency/lock_table.go line 1712 at r2 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Yeah. Now that you mention it, I think we need a call to informActiveWaiters here in the future -- we may have inserted this guy at the front of the queue, and the claimant txn may have changed as a result.

I sat with this for a bit, and I'm leaning towards not making the change. A source of complexity in tryActiveWait is the somewhat surprising side effects of this function. For example, the calls to clearLockHolder and lockIsFree which can uncork unrelated requests, like I was talking to you about on Slack. Longer term, I think it'll be easier to reason about tryActiveWait if it only performs state changes that directly fall out of whether the supplied request should wait or not.

In the future, maybeReleaseFirstTransactionalWriter will probably be replaced by maybeReleaseAllCompatibleRequestsFromHeadOfTheQueue[1]. Such a function is more general than what we need here. It begs the same question as above, why is tryActiveWait uncorking other requests?

What's even more subtle is that it'll actually never uncork any other request at all! It'll either mark itself as inactive or be a no-op [2]. So making this change feels like we're needlessly expanding the scope of states we can be in at this point.

I did update this to call informActiveWaiters though. This motivated adding a new commit (2 of 3).


[1] maybeReleaseAllCompatibleRequestsFromHeadOfTheQueue will traverse the head of the queued locking requests list, removing any non-transactional write requests it encounters, and marking all compatible requests as inactive waiters and nudging them to proceed with their scan. It'll stop when it discovers an incompatible request.

[2] That's because if there were a request that was compatible with the head of the lock's wait queue (and potentially the lock holder), it shouldn't have been an active waiter to begin with.


pkg/kv/kvserver/concurrency/lock_table.go line 1850 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

The comment on this function is incorrect.

Thanks, fixed.


pkg/kv/kvserver/concurrency/lock_table.go line 3323 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Let's give this a comment. I'm also not sure how useful it is to have an error return path that can only return one kind of error. Up to you.

Added some comments. Mild preference to keep these using errors to indicate cases where we didn't modify the wait queues instead of returning a boolean, so keeping this as is.

@arulajmani arulajmani requested a review from nvanbenschoten June 2, 2023 20:42
Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani)


-- commits line 11 at r3:
What do you think about keeping this "elide signaling" change separate? It feels self-contained and probably deserves more scrutiny than it will get if grouped into this PR. A few questions to ask ourselves about it:

  • what kind of bugs can occur from under-eliding? What kind of bugs can occur from over-eliding?
  • is the logic in canElideWaitingStateUpdate prone to rot as new fields are added to waitingState?
  • does the logic in canElideWaitingStateUpdate need to consider every field? Even observability-related fields like queuedWriters?
  • can we push this into updateWaitingStateLocked so that we don't add complexity to existing methods?

pkg/kv/kvserver/concurrency/lock_table.go line 1660 at r2 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

(haven't done this)

Yeah, I think it is easier to reason about. It makes it clear to readers why this lock is now unheld and with an empty wait queue. In fact, maybe move it up to where transitionedToFree is set so that it doesn't mix with other paths through this function.

+1 to removing the special case (in a future PR), by the way.


pkg/kv/kvserver/concurrency/lock_table.go line 1721 at r2 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

It should be safe, because we check if a lock is non-empty before GC-ing it. It might result in an unnecessary call to tryGCLocks, but it shouldn't be a correctness issue. I could also see why we should just return false instead. I could go either way here -- I actually did switch this locally right before pushing. Let me know if you have a preference either way.

Related to the thread above, it's mildly confusing that we set transitionedToFree way up above and then keep using the variable throughout. It raises questions about whether any of the state transitions that we've performed have invalidated this stale piece of information. I'd suggest only using transitionedToFree on the one path that sets it and returning false (the literal) everywhere else.

@arulajmani
Copy link
Collaborator Author

Closing this one in favour of #104620. None of the outstanding comments are relevant to the new structure, so a fresh slate will probably be easier. Hopefully 3rd time is the charm!

Btw, thanks for the collaboration on this one @nvanbenschoten. I won't jinx it by calling it fun before we actually land this, but I'm mostly there.

@arulajmani arulajmani closed this Jun 8, 2023
craig bot pushed a commit that referenced this pull request Jun 27, 2023
104620: concurrency: use lock modes to find conflicts during lock table scans r=nvanbenschoten a=arulajmani

Previous attempt over at: #104261

First commit from: #104537
Second commit from: #104600

This patch majorly refactors the lock table scan codepath, all in the
name of shared locks. At its core is a desire to use lock modes to
perform conflict resolution between an incoming request and locks held
on one particular key. In doing so, we rip out tryActiveWait.

At a high level (for a particular key), a request's journey looks like
the following:

- It first checks if the transaction already holds a lock at a equal or
higher lock strength (read: It's good enough for its use). If this is
the case, it can proceed without any bookkeeping.
- It then checks if any finalized transactions hold locks on the key.
Such locks do not conflict, but need to be resolved before the
transaction can evaluate. They're thus accumulated for later.
- The request then enqueues itself in the appropriate wait queue.
- It then determines if it needs to actively wait at this lock because
of a conflict. If that's the case, the lock table scan short circuits.
- Otherwise, the request lays a claim (if it can) before proceeding with
its scan.

Closes #102210

Release note: None

105482: sqltelemetry: add missing schema telemetry r=postamar a=rafiss

CREATE [ SCHEMA | INDEX | FUNCTION | TYPE ] and ALTER FUNCTION did not
have any telemetry, but they should.

Epic: None
Release note: None

105579: sql: disallow cross-database type references in CTAS r=chengxiong-ruan a=chengxiong-ruan

Fixes: #105393

Release note (bug fix): reviously, cross-database type references could sneaked in through `CREATE TABLE...AS` statements if the source table is from another database and any of its columns is of a user defined type. This introduced bug where the source table can be dropped and type could not be found for the CTAS table. This commit disallow such CTAS as a fix.

105581: optbuilder: reset annotations when building CREATE FUNCTION r=rafiss a=rafiss

In 22dabb0 we started overriding the annotations for each statement in the UDF body. We should reset them to the original values, so we don't accidentally leave the old reference.

Epic: None
Release note: None

105596: storage: make `storage.max_sync_duration` public and `TenantReadOnly` r=erikgrinaker a=erikgrinaker

Users have asked why this setting is not public, this patch makes it so.

Furthermore, these settings were `TenantWritable`. We do not want these to be writable by tenants, where they can potentially cause problems on SQL nodes, considering e.g. SQL disk spilling uses Pebble.

Epic: none
Release note: None

Co-authored-by: Arul Ajmani <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Chengxiong Ruan <[email protected]>
Co-authored-by: Erik Grinaker <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

concurrency: integrate lock Modes in the lock table
3 participants