Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

concurrency: use of lock.Modes to determine confilcts in tryActiveWait #102973

Conversation

arulajmani
Copy link
Collaborator

This patch refactors tryActiveWait to make use of lock.Modes to determine conflicts. In doing so, it splits out the logic into 3 different methods:

  1. To determine a conflict with a lock holder.
  2. To determine a conflict with a reservation.
  3. To modify datastructures in preparation for active wait.

In doing so, we no longer make the assumption that there will only ever be a single reservation or lock holder on a key. In the future, we'll be able to further extend tryActiveWait to account for multiple locks/reservations on a key.

Informs #102210

Release note: None

@arulajmani arulajmani requested a review from nvanbenschoten May 9, 2023 18:58
@arulajmani arulajmani requested a review from a team as a code owner May 9, 2023 18:58
@blathers-crl
Copy link

blathers-crl bot commented May 9, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@arulajmani
Copy link
Collaborator Author

@nvanbenschoten still a bit rough, but the overall structure is there. I'll ping when it's ready for a detailed look.

@arulajmani arulajmani force-pushed the integrate-conflicts-try-active-wait branch from 4d8b392 to 33b75c7 Compare May 11, 2023 16:30
This patch refactors tryActiveWait to make use of lock.Modes to
determine conflicts. In doing so, it splits out the logic into 3
different methods:

1. To determine a conflict with a lock holder.
2. To determine a conflict with a reservation.
3. To modify datastructures in preparation for active wait.

In doing so, we no longer make the assumption that there will only
ever be a single reservation or lock holder on a key. In the future,
we'll be able to further extend tryActiveWait to account for multiple
locks/reservations on a key.

Informs cockroachdb#102210

Release note: None
@arulajmani arulajmani force-pushed the integrate-conflicts-try-active-wait branch from 33b75c7 to 5a39195 Compare May 11, 2023 16:34
@arulajmani
Copy link
Collaborator Author

@nvanbenschoten I reworked this a bit after our pair programming session the other day -- I'm quite happy with how things ended up! I still plan on adding a commit here that threads in cluster settings into the lock table, and using them when constructing lock modes, but that shouldn't preclude you from having a look.

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani)


-- commits line 2 at r2:
s/confilcts/conflicts/


-- commits line 9 at r2:
Do we need to call out the method to determine a conflict with a waiter? Or were we planning to consider reservations to be a form of waiters? In which case, s/reservation/waiter/.


pkg/kv/kvserver/concurrency/lock_table.go line 1515 at r2 (raw file):

// tryActiveWait decides whether the request, g, should actively wait at this
// key or not. It adjusts the data-structures appropriately if the request
// needs. If the request needs to wait at this key and the supplied notify

"if the request needs"

Consider giving a hint at what this adjustment might look like. You don't need to be precise or exhaustive, but some indication that this function may add g to the lock's wait-queue would be helpful.


pkg/kv/kvserver/concurrency/lock_table.go line 1521 at r2 (raw file):

// with any locks or reservations that may be present at this key. A request
// is only allowed to proceed if it is compatible with all locks and
// reservations.

"all locks and reservations"

And also all other waiters, right?


pkg/kv/kvserver/concurrency/lock_table.go line 1524 at r2 (raw file):

//
// There is one tricky case with tryActiveWait -- when the key is only locked by
// finalized transactions. Such cases are determined by consulting the

"by finalized transactions"

This implies that we always know with confidence whether a transaction is finalized or not. Consider rephrasing to something like "when a key is only locked by transactions that are known to be finalized".


pkg/kv/kvserver/concurrency/lock_table.go line 1528 at r2 (raw file):

// are a few considerations here:
//
// - For unreplicated locks, this method will silently clear the lock. The

Consider generalizing this sentence to work with multiple lock holders. Something like "the method will remove the finalized transaction(s) from the lock, which may leave it empty".


pkg/kv/kvserver/concurrency/lock_table.go line 1569 at r2 (raw file):

// traces of them via a round of replication. This is discussed in more detail
// in #41720. Specifically, see mention of "contention footprint" and
// COMMITTED_BUT_NOT_REMOVABLE. 	Also, resolving these locks/intents would

nit: there's a tab in here. In fact, there's a bit of messed up spacing in these lines.


pkg/kv/kvserver/concurrency/lock_table.go line 1607 at r2 (raw file):

	// after nudging the lock, there's no conflict -- this lock can be gc-ed at
	// the caller.
	if !l.isHeldOrReserved() {

We only hit this on the clearLockHolder path of conflictsWithLockHolder, right? If so, we're not doing any nudging here. It's also not clear that this benefits from being separated from that logic. The split raises questions about whether there are other reasons why we'd find a lock that's not held or reserved.


pkg/kv/kvserver/concurrency/lock_table.go line 1608 at r2 (raw file):

	// the caller.
	if !l.isHeldOrReserved() {
		// TODO(arul): Should a locking request acquire a reservation instead of

It probably should. The current behavior is surprising. This would also allow us to simplify the code, right? We could eliminate the transitionedToFree logic?


pkg/kv/kvserver/concurrency/lock_table.go line 1666 at r2 (raw file):

	var ws waitingState
	// TODO(arul): try and improve the structure of this logic.
	// TODO(arul): What if the request is already in the queue?

Do we need to address this TODO now? Would it help to push the waitQueueMaxLengthExceeded handling into adjustWaitQueues?


pkg/kv/kvserver/concurrency/lock_table.go line 1765 at r2 (raw file):

		waitForState.kind = waitSelf
	} else if l.distinguishedWaiter == nil {
		l.distinguishedWaiter = g

It's surprising to see that constructWaitingState is mutating lockState. Is there a better place for this? For instance, should we be setting the l.distinguishedWaiter in adjustWaitQueues and then just checking whether g is the distinguished waiter here?


pkg/kv/kvserver/concurrency/lock_table.go line 1870 at r2 (raw file):

While doing so, it should mark all active waiters it doesn't conflict with as inactive (and nudge them to no longer wait at this lock).

Why should it do this? Is that because we know we'd only reach this point if we don't conflict with the lock holder or any reservations? Does that necessarily mean that waiters that we don't conflict with also don't conflict with the lock holder or any reservation?


pkg/kv/kvserver/concurrency/lock_table.go line 1888 at r2 (raw file):

		return // non-{locking,transactional} requests cannot acquire reservations.
	}
	if l.reservation != nil && l.reservation == g {

Again, didn't we already check this condition?


pkg/kv/kvserver/concurrency/lock_table.go line 1923 at r2 (raw file):

		return false, nil // the lock isn't held; no conflict to speak of
	}
	if g.isSameTxn(lockHolderTxn) {

Didn't we already check this in the call to alreadyLockedOrReservedByRequest?


pkg/kv/kvserver/concurrency/lock_table.go line 1937 at r2 (raw file):

	}

	if g.str == lock.None {

It's subtle that g.str is mutable over the lifetime of a lockTableGuardImpl and yet it looks to the user like it is an immutable attribute of the guard. That's enforced by comments like "if the request, referenced by the supplied lockTableGuardImpl". I think this is why the previous code passed around a SpanAccess alongside the guard. I don't think we need to go that far, but maybe push g.str behind a well-named, well-commented accessor method which references the iteration in stepToNextSpan.

Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nvanbenschoten I've replied in-line to all your comments. Even though a few of them weren't directly applicable in the new structure, I think the spirit still holds -- so thanks for the detailed look on this one! Like we spoke about offline, I opened a fresh slate PR over at #104261.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)


pkg/kv/kvserver/concurrency/lock_table.go line 1515 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

"if the request needs"

Consider giving a hint at what this adjustment might look like. You don't need to be precise or exhaustive, but some indication that this function may add g to the lock's wait-queue would be helpful.

Done.


pkg/kv/kvserver/concurrency/lock_table.go line 1521 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

"all locks and reservations"

And also all other waiters, right?

Fixed.


pkg/kv/kvserver/concurrency/lock_table.go line 1524 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

"by finalized transactions"

This implies that we always know with confidence whether a transaction is finalized or not. Consider rephrasing to something like "when a key is only locked by transactions that are known to be finalized".

Fixed.


pkg/kv/kvserver/concurrency/lock_table.go line 1569 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: there's a tab in here. In fact, there's a bit of messed up spacing in these lines.

Ugh I think the line wrapping to 80 chars did this. Lazy question -- is there an easy way to wrap with tabs?


pkg/kv/kvserver/concurrency/lock_table.go line 1607 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

We only hit this on the clearLockHolder path of conflictsWithLockHolder, right? If so, we're not doing any nudging here. It's also not clear that this benefits from being separated from that logic. The split raises questions about whether there are other reasons why we'd find a lock that's not held or reserved.

I ended up reworking this considerably. We can iterate on the new PR, but I think the new approach is more in line with what I think you're asking for in this comment.


pkg/kv/kvserver/concurrency/lock_table.go line 1608 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

It probably should. The current behavior is surprising. This would also allow us to simplify the code, right? We could eliminate the transitionedToFree logic?

I've replaced this with a more opinionated TODO, but I'll probably get to it in a future patch -- mostly because there'll be some test churn as a result.

As for the transitionedToFree logic, I think we might need to keep it unless we do something about non-locking read requests as well.


pkg/kv/kvserver/concurrency/lock_table.go line 1666 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Do we need to address this TODO now? Would it help to push the waitQueueMaxLengthExceeded handling into adjustWaitQueues?

This is addressed now.


pkg/kv/kvserver/concurrency/lock_table.go line 1765 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

It's surprising to see that constructWaitingState is mutating lockState. Is there a better place for this? For instance, should we be setting the l.distinguishedWaiter in adjustWaitQueues and then just checking whether g is the distinguished waiter here?

I was binge watching Rust videos over the weekend, like one does, so coming back and seeing what I'd written here was jarring.
The good news is I ended up discovering your suggestion for myself, so I feel a bit better.


pkg/kv/kvserver/concurrency/lock_table.go line 1870 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

While doing so, it should mark all active waiters it doesn't conflict with as inactive (and nudge them to no longer wait at this lock).

Why should it do this? Is that because we know we'd only reach this point if we don't conflict with the lock holder or any reservations? Does that necessarily mean that waiters that we don't conflict with also don't conflict with the lock holder or any reservation?

It's been a while, but I think I wrote this comment without upgrade locks in mind. But you're right, in the general case, where upgrade locks are a thing this won't work.

This function is getting removed now, and with it, so is the TODO. We'll need something like this for lock transitions though (I think one place is lockIsFree). Let's make sure we consider UPGRADE locks when we do implement this.


pkg/kv/kvserver/concurrency/lock_table.go line 1888 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Again, didn't we already check this condition?

Not applicable in the new structure, but I've tried to apply the spirit of this comment and be very deliberate about which conditions are possible and which ones are not.


pkg/kv/kvserver/concurrency/lock_table.go line 1937 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

It's subtle that g.str is mutable over the lifetime of a lockTableGuardImpl and yet it looks to the user like it is an immutable attribute of the guard. That's enforced by comments like "if the request, referenced by the supplied lockTableGuardImpl". I think this is why the previous code passed around a SpanAccess alongside the guard. I don't think we need to go that far, but maybe push g.str behind a well-named, well-commented accessor method which references the iteration in stepToNextSpan.

Noting that I haven't done this yet, but will do before we merge the new PR.

@arulajmani arulajmani closed this Jun 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants