
concurrency: generalize lock promotion from Exclusive to Intent #118484

Merged (3 commits into cockroachdb:master on Feb 24, 2024)

Conversation

@arulajmani (Collaborator) commented Jan 30, 2024

Transactions commonly promote locks from Exclusive to Intent lock
strengths. Previously, this was handled as a special case -- requests
trying to promote would simply skip the lock and not add themselves in
the lock's wait queue.

This patch removes this special casing. Instead, requests trying to
promote are added as inactive waiters in the lock's wait queue.
Moreover, they sort before any other waiting requests, regardless of
arrival order. This isn't a functional change. However, it will
generalize better to Shared -> Exclusive/Intent promotion in the future
compared to our current structure.

Informs #110435

Release note: None

@arulajmani arulajmani requested a review from a team as a code owner January 30, 2024 16:59

@nvanbenschoten (Member) left a comment


This is less invasive than I was imagining, which is a nice sign that the concepts we're introducing are slotting into place.

We should take a look at whether this has any impact on YCSB-A performance. It shouldn't, but it's difficult to predict the emergent behavior of some of these tweaks to locking interactions.

Reviewed 1 of 1 files at r1, 1 of 1 files at r2, 3 of 3 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani)


pkg/kv/kvserver/concurrency/lock_table.go line 654 at r1 (raw file):

	// Note that we don't need to check newState.guardStrength as it's
	// automatically assigned when updating the state.
	return g.mu.state.kind == newState.kind && g.mu.state.txn.ID == newState.txn.ID &&

Can newState.txn be nil? The comment on the field indicates that it can be for certain waitKinds. Also, we have a nil check in canPushWithPriority (which maybe isn't necessary).


pkg/kv/kvserver/concurrency/lock_table.go line 1018 at r3 (raw file):

func makeQueueOrder(g *lockTableGuardImpl, kl *keyLocks) queueOrder {
	isPromoting := false
	if g.txn != nil && kl.isLockedBy(g.txn.ID) {

ignorable nit: isPromoting := g.txn != nil && kl.isLockedBy(g.txn.ID)


pkg/kv/kvserver/concurrency/lock_table.go line 1039 at r3 (raw file):

	}
	switch {
	case (o1.isPromoting && o2.isPromoting) || (!o1.isPromoting && !o2.isPromoting):

o1.isPromoting == o2.isPromoting

But also, this entire function could be more succinctly written as:

if o1.isPromoting != o2.isPromoting {
    return o2.isPromoting
}
return o1.reqSeqNum > o2.reqSeqNum

Without the need for the "same request" branch, which is already handled below and just raises questions of whether (o1.isPromoting != o2.isPromoting) && (o1.reqSeqNum == o2.reqSeqNum) would ever be possible.

This more closely follows the usual pattern for a multi-field sort order.


pkg/kv/kvserver/concurrency/lock_table.go line 2614 at r3 (raw file):

		if g.isSameTxn(lockHolderTxn) {
			// We should never get here if the lock is already held by another request

Is this comment now stale?


pkg/kv/kvserver/concurrency/lock_table.go line 2877 at r3 (raw file):

			return false
		}
		if g.txn != nil && qqg.guard.isSameTxn(g.txnMeta()) {

This is checking whether a queued request conflicts with another queued request from the same transaction, right? Was it needed for this PR? Not saying we shouldn't make the change, just trying to understand the motivation and effect of the code change.


pkg/kv/kvserver/concurrency/lock_table.go line 3305 at r3 (raw file):
I recall we landed on the change you have here, but I don't recall exactly why.

Clearing the lock holder impacts the sort order of the lock's wait queue, as requests that belong to a holder's transaction sort before requests that do not.

This doesn't feel entirely correct, as we're not actually modifying the queueOrder. But I seem to remember there being a hazard if we didn't do so. What was that?


pkg/kv/kvserver/concurrency/lock_table.go line 3311 at r3 (raw file):

			// doing so. Instead, we could just remove the request from the wait
			// queue. This works in both the cases described above.
			kl.releaseLockingRequestsFromTxn(&up.Txn)

Do we need similar logic in the other two places where we call clearLockHeldBy?


pkg/kv/kvserver/concurrency/testdata/lock_table/lock_promotion line 17 at r3 (raw file):


# ------------------------------------------------------------------------------
# Basic test form Exclusive -> Intent lock promotion when there are no waiters

s/form/for/ here and below?


pkg/kv/kvserver/concurrency/testdata/lock_table/lock_promotion line 132 at r3 (raw file):

# Ensure a request trying to promote from Exclusive -> Intent does not conflict
# with any requests in the wait queue that belong to its own transaction. (If
# they're already there they'll sort before it).

This relates to the qqg.guard.isSameTxn(g.txnMeta()) case from above? Is it important that we handle this case this way?

@arulajmani arulajmani force-pushed the sl-lock-promo-refactors branch from 2a46759 to 531d7b3 Compare February 21, 2024 16:45
@arulajmani (Collaborator, Author) left a comment


Addressed review comments. I'll give YCSB-A a spin on this and report back.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)


pkg/kv/kvserver/concurrency/lock_table.go line 654 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Can newState.txn be nil? The comment on the field indicates that it can be for certain waitKinds. Also, we have a nil check in canPushWithPriority (which maybe isn't necessary).

The comment on waitKind doesn't cover two cases -- waitQueueMaxLengthExceeded and doneWaiting. Looks like it's mistaken about the former -- waitingState.txn is also non-nil for waitQueueMaxLengthExceeded. I'll update the comment.

We handle state transitions to doneWaiting separately, using updateStateToDoneWaitingLocked. That means newState.txn can't be nil here. I'll add an assert with a comment explaining all this, so that we don't have to rediscover it in the future.


I think all this also means the nil check in canPushWithPriority isn't needed, as it isn't called for the doneWaiting state.


pkg/kv/kvserver/concurrency/lock_table.go line 1039 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

o1.isPromoting == o2.isPromoting

But also, this entire function could be more succinctly written as:

if o1.isPromoting != o2.isPromoting {
    return o2.isPromoting
}
return o1.reqSeqNum > o2.reqSeqNum

Without the need for the "same request" branch, which is already handled below and just raises questions of whether (o1.isPromoting != o2.isPromoting) && (o1.reqSeqNum == o2.reqSeqNum) would ever be possible.

This more closely follows the usual pattern for a multi-field sort order.

This is much nicer, thanks!


pkg/kv/kvserver/concurrency/lock_table.go line 2614 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Is this comment now stale?

I think this comment still applies, though it was worded slightly incorrectly -- instead of saying ".. sufficient strength: read: less than or equal to ...)" it should have said "greater than or equal to ..."). Updated the comment.


pkg/kv/kvserver/concurrency/lock_table.go line 2877 at r3 (raw file):

This is checking whether a queued request conflicts with another queued request from the same transaction, right?

Yeah, we're making sure we don't consider requests from the same transaction conflicting. None of the lock table tests break without this change, but that's only because we were lacking coverage for when this would be needed. I added a test case below -- it might be easier to explain why we need this change while referencing an example, so I'll just do it below.

All this being said, our randomized tests do fail if I comment this bit out.
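A minimal sketch of the idea discussed in this thread, in Go (hypothetical names and a deliberately simplified conflict rule; the real lock table also weighs lock strengths, queue state, and more):

```go
package main

import "fmt"

type txnID string

// queuedGuard stands in for a locking request waiting in a lock's queue.
type queuedGuard struct {
	txn txnID // empty for non-transactional requests
}

// conflictsWith reports whether request g must wait behind queued request
// qqg. Requests from the same transaction are never considered to conflict
// with each other, mirroring the isSameTxn check discussed above.
func conflictsWith(g, qqg queuedGuard) bool {
	if g.txn != "" && g.txn == qqg.txn {
		return false // same transaction: never a conflict
	}
	// Simplified rule: any two locking requests from different
	// transactions conflict.
	return true
}

func main() {
	a := queuedGuard{txn: "txn-1"}
	b := queuedGuard{txn: "txn-1"}
	c := queuedGuard{txn: "txn-2"}
	fmt.Println(conflictsWith(b, a)) // false: same transaction
	fmt.Println(conflictsWith(b, c)) // true: different transactions
}
```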


pkg/kv/kvserver/concurrency/lock_table.go line 3305 at r3 (raw file):

This doesn't feel entirely correct, as we're not actually modifying the queueOrder. But I seem to remember there being a hazard if we didn't do so. What was that?

Sorry for the confusion here. This entire diff wasn't changed after switching to the queueOrder approach (and our discussion). The commentary (and the call to releaseLockingRequestsFromTxn) is outdated.

I recall we landed on the change you have here, but I don't recall exactly why.

We discussed this offline, and we don't actually need this anymore -- switching to the queueOrder approach prevents the validation failures this was initially trying to address. There's nothing wrong with calling releaseLockingRequestsFromTxn here, but there's no good reason to either. We decided to remove this call entirely.


pkg/kv/kvserver/concurrency/lock_table.go line 3311 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Do we need similar logic in the other two places where we call clearLockHeldBy?

Superseded by the comment above.


pkg/kv/kvserver/concurrency/testdata/lock_table/lock_promotion line 132 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

This relates to the qqg.guard.isSameTxn(g.txnMeta()) case from above? Is it important that we handle this case this way?

We discussed this offline, and decided to keep this as is. Doing this ensures the behaviour stays the same for this case before and after this patch. We discussed that there's no good reason not to handle it this way, and there may be some minor added concurrency benefits of doing this. For example, if the first request that's inactively waiting at this lock starts blocking on a different key, but the second request could proceed to evaluation, this approach allows it to do so.

@nvanbenschoten (Member) left a comment


:lgtm: assuming the YCSB results look good. Otherwise, we'll need to eliminate any regression first.

Reviewed 3 of 3 files at r4, 1 of 1 files at r5, 3 of 3 files at r6, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @arulajmani)


pkg/kv/kvserver/concurrency/lock_table.go line 1039 at r3 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

This is much nicer, thanks!

nit: you also don't need the o1.reqSeqNum == o2.reqSeqNum // same request branch above, but I don't feel strongly about it being removed, so up to you.


pkg/kv/kvserver/concurrency/lock_table.go line 2877 at r3 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

This is checking whether a queued request conflicts with another queued request from the same transaction, right?

Yeah, we're making sure we don't consider requests from the same transaction conflicting. None of the lock table tests break without this change, but that's only because we were lacking coverage for when this would be needed. I added a test case below -- it might be easier to explain why we need this change while referencing an example, so I'll just do it below.

All this being said, our randomized tests do fail if I comment this bit out.

Thanks for explaining here and during our 1:1. This makes sense to me now.


pkg/kv/kvserver/concurrency/lock_table.go line 660 at r4 (raw file):

	// invariants hold by asserting them.
	assert(newState.kind != doneWaiting, "unexpected waiting state kind")
	assert(newState.txn != nil, "txn cannot be unset if waiting state isn't DoneWaiting")

nit: s/DoneWaiting/doneWaiting/

@arulajmani (Collaborator, Author)

It seems like this improved YCSB --

                │ before_summarized.txt │        after_summarized.txt         │
                │         ops/s         │   ops/s     vs base                 │
YCSB-throughput             123.0k ± 2%   128.3k ± ?  +4.31% (p=0.023 n=26+6)

         │ before_summarized.txt │        after_summarized.txt         │
         │         ms/s          │    ms/s     vs base                 │
YCSB-P50              3.100 ± 6%    2.900 ± ?  -6.45% (p=0.003 n=26+6)
YCSB-Avg             900.0m ± 0%   900.0m ± ?       ~ (p=0.274 n=26+6)
geomean               1.670         1.616      -3.28%

Do note that before_summarized has 25 samples, whereas after_summarized has only 5.

We only care about the transaction we were previously pushing (as
identified by its ID) when determining whether a waiting state update
is meaningful or not. The other fields on the TxnMeta don't matter.

This will remove a useless (re-)push in a subsequent commit.

Epic: none

Release note: None

Soon, we'll be introducing lock promotion, which makes this validation
obsolete.

References cockroachdb#110435

Release note: None

Transactions commonly promote locks from Exclusive to Intent lock
strengths. Previously, this was handled as a special case -- requests
trying to promote would simply skip the lock and not add themselves in
the lock's wait queue.

This patch removes this special casing. Instead, requests trying to
promote are added as inactive waiters in the lock's wait queue.
Moreover, they sort before any other waiting requests, regardless of
arrival order. This isn't a functional change. However, it will
generalize better to Shared -> Exclusive/Intent promotion in the future
compared to our current structure.

Informs cockroachdb#110435

Release note: None
@arulajmani arulajmani force-pushed the sl-lock-promo-refactors branch from 531d7b3 to e68ee89 Compare February 22, 2024 18:57
@arulajmani (Collaborator, Author) left a comment


The logic test failures pointed to a need to update tryFreeOnReplicatedAcquire to handle lock promotions correctly. That's because locking requests in the wait queue can now belong to the lock holder's transaction, which changes how "uncontended" is determined. @nvanbenschoten could you take a quick look?

Separately, while working through the logic test failures, I realized there's something weird going on with crdb_internal.cluster_locks. The output in that file includes uncontended replicated locks. I'll investigate that separately.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten)


pkg/kv/kvserver/concurrency/lock_table.go line 1039 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: you also don't need the o1.reqSeqNum == o2.reqSeqNum // same request branch above, but I don't feel strongly about it being removed, so up to you.

I'll leave it, if only to explicitly document how identical requests are handled. That's because we use this function in a few places while looping up to a request that's already in the queue.

@arulajmani (Collaborator, Author)

@nvanbenschoten given we discussed the changes to tryFreeOnReplicatedAcquire on Thursday, and you were on board, I'll optimistically bors this assuming the LGTM from before stands.

bors r=nvanbenschoten


craig bot commented Feb 24, 2024

Build succeeded:

@craig craig bot merged commit 114f7c4 into cockroachdb:master Feb 24, 2024
8 of 15 checks passed
arulajmani added a commit to arulajmani/cockroach that referenced this pull request Feb 27, 2024
Previously, if a request had acquired a shared lock, it wasn't able to
promote it to an Exclusive or Intent (by writing to the same key) lock.
This was because the lock table could not detect deadlock scenarios
where two transactions that both held shared locks were trying to
promote. Moreover, it also couldn't detect wait queue local deadlocks
that involved non-transactional requests.

These two limitations have now been lifted. For the former, we're able
to leverage our existing deadlock detection algorithm by performing the
correct set of pushes. This is done by changing the claimantTxn concept
slightly. Previously, there could only be one claimant for a key. This
is no longer true -- now, the claimant may be different, depending on
who is asking for it.

For the latter, we reorder the wait queue to avoid preventable
deadlocks. This is done by preferring lock promoters over other
requests. The bulk of this was already done in
cockroachdb#118484.

Closes cockroachdb#110435

Release note: shared locks can now be re-acquired with higher strength.
arulajmani added a commit to arulajmani/cockroach that referenced this pull request Mar 14, 2024
arulajmani added a commit to arulajmani/cockroach that referenced this pull request Mar 15, 2024
arulajmani added a commit to arulajmani/cockroach that referenced this pull request Mar 16, 2024
arulajmani added a commit to arulajmani/cockroach that referenced this pull request Mar 22, 2024
craig bot pushed a commit that referenced this pull request Mar 22, 2024
119671: concurrency: implement generalized lock promotion  r=nvanbenschoten a=arulajmani

First two commits are from: #119587

-----

Previously, if a request had acquired a shared lock, it wasn't able to
promote it to an Exclusive or Intent (by writing to the same key) lock.
This was because the lock table could not detect deadlock scenarios
where two transactions that both held shared locks were trying to
promote. Moreover, it also couldn't detect wait queue local deadlocks
that involved non-transactional requests.

These two limitations have now been lifted. For the former, we're able
to leverage our existing deadlock detection algorithm by performing the
correct set of pushes. This is done by changing the claimantTxn concept
slightly. Previously, there could only be one claimant for a key. This
is no longer true -- now, the claimant may be different, depending on
who is asking for it.

For the latter, we reorder the wait queue to avoid preventable
deadlocks. This is done by preferring lock promoters over other
requests. The bulk of this was already done in
#118484.

Closes #110435

Release note: shared locks can now be re-acquired with higher strength.

120347: rangefeed: improve memory accounting for event queue r=erikgrinaker a=wenyihu6

To prevent OOMs, we previously introduced memory budget to limit event buffering
to processor.eventC channel. These events could point to large underlying data
structure and keeping them in the channel prevent them from being garbage
collected. However, the current memory accounting did not have a good way to
predict the memory footprint of events. It uses calculateDateSize which 1. does
not account for memory of the new RangeFeedEvents that could be spawned (such as
checkpoint events) 2. inaccurately uses protobuf-encoded wire size for data 3.
does not account for memory of the base structs.

This patch improves the memory estimation by resolving the three points above.
It makes a best effort to account for the base structs' memory, predict the
generation of new events, and use actual data size rather than compressed
protobuf size.
Since it is challenging to predict whether there would be a new checkpoint
event, our strategy is:

Instead of accounting memory for both the current event and future rangefeed
events, it will always pick the maximum memory between the two. It also allows
additional shared event overhead (should be very small) and checkpoint events
(caused by logical ops) (should be rare) to slip away.

- For logical ops, we always take current event memory. Current event memory is
always larger if we assume no checkpoint events will be published for any
logical ops. (same underlying data)
- For checkpoint events (ct and initRTS), we always take checkpoint rangefeed
event. Future events are larger. (curr: 80, future: 152) (span points to the
same underlying data)
- For sst events, we always take future rangefeed event (future base structs is
larger than current base structs by 2) (same underlying data).
- For sync events, we always take current events. (Current memory usage is
larger, as no future events will be created.)

We got this data from the test benchmarks documented in
#120582.
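The "take the maximum" strategy described above can be sketched as follows (hypothetical helper, not the actual rangefeed accounting code; the byte figures are the ones quoted in the benchmarks above):

```go
package main

import "fmt"

// memToAccount returns the number of bytes to reserve for an event, given
// an estimate of the current event's footprint and of the future
// RangeFeedEvent it may spawn. Rather than summing the two (which would
// over-reserve, since they share underlying data), we reserve the larger.
func memToAccount(currBytes, futureBytes int64) int64 {
	if currBytes > futureBytes {
		return currBytes
	}
	return futureBytes
}

func main() {
	// Checkpoint events: the future RangeFeedEvent is larger (152 vs 80
	// bytes in the benchmarks quoted above), so it wins.
	fmt.Println(memToAccount(80, 152)) // prints 152
	// Sync events: no future event is created, so the current event wins.
	fmt.Println(memToAccount(96, 0)) // prints 96
}
```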

Resolves: #114190
Release note: None

120718: sql: fix minor procedure bugs and add UDF/SP mixed tests r=mgartner a=mgartner

#### sql: fix bugs with procedures passing arguments to other procedures

This commit fixes a limitation where a procedure could not pass an
argument through to another procedure, like:

    CREATE PROCEDURE p(n INT) LANGUAGE SQL AS $$
      CALL other_proc(n);
    $$

To lift this restriction, the body of a procedure must be type-checked
with respect to the current scope, which contains the procedure's named
parameters. This change required a few follow-up changes to prevent
regressions:

  1. Copying some logic that ensures that `CALL` arguments are not
     subqueries to `optbuilder.subquery.TypeCheck`.
  2. Using `FuncExpr.InCall` instead of `CallAncestor` to determine the
     correct verbiage for error messages of procedures. `CallAncestor`
     is no longer used and has been removed.

Epic: None

Release note: None

#### sql/logictest: add tests for UDFs and SPs calling each other

Release note: None


Co-authored-by: Arul Ajmani <[email protected]>
Co-authored-by: Wenyi Hu <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>