Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: requests that acquire repl locks should use read-write path #110279

Merged
merged 3 commits into from
Sep 26, 2023

Conversation

arulajmani
Copy link
Collaborator

First commit from #110201

Previously, {Get,Scan,ReverseScan}Requests all used the read-only
execution path. Things aren't so simple anymore, now that we want these
requests to be able to acquire replicated locks, which means they need
to go through raft (and therefore the read-write execution path). This
patch achieves exactly that.

Informs #100193

Release note: None

@blathers-crl
Copy link

blathers-crl bot commented Sep 8, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

pkg/kv/kvpb/api.go Outdated Show resolved Hide resolved
Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nvanbenschoten this should be good for a look now. The client level test for shared locks which prompted the latching changes. I'll spend a bit more time thinking about other tests we may want to add, either in this patch or as followups, but if any come to mind let me know!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)

@arulajmani arulajmani changed the title [WIP, DNM] kvserver: requests that acquire repl locks should use read-write path kvserver: requests that acquire repl locks should use read-write path Sep 16, 2023
@arulajmani arulajmani requested a review from a team September 21, 2023 18:36
@arulajmani arulajmani requested review from a team as code owners September 21, 2023 18:36
@arulajmani arulajmani requested review from cucaroach and miretskiy and removed request for a team September 21, 2023 18:36
Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nvanbenschoten I've updated this patch with everything we spoke about over call/on slack. Should be good for another look!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach, @miretskiy, and @nvanbenschoten)

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 8 of 9 files at r3, 6 of 6 files at r4, 29 of 29 files at r5, 1 of 1 files at r6, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @cucaroach, and @miretskiy)


pkg/kv/db.go line 521 at r5 (raw file):

// key can be either a byte slice or a string.
func (db *DB) Scan(ctx context.Context, begin, end interface{}, maxRows int64) ([]KeyValue, error) {
	return db.scan(ctx, begin, end, maxRows, false /* isReverse */, kvpb.NonLocking, kvpb.CONSISTENT, kvpb.Invalid)

The direct use of kvpb.Invalid seems fine. There's also similar cases where we just pass in 0 in these cases. No preference from me.


pkg/kv/kvclient/kvcoord/txn_interceptor_seq_num_allocator.go line 87 at r4 (raw file):

) (*kvpb.BatchResponse, *kvpb.Error) {
	for _, ru := range ba.Requests {

nit: stray line?


pkg/kv/kvclient/kvcoord/txn_interceptor_seq_num_allocator.go line 101 at r4 (raw file):

		// Requests that do not perform intent writes use the read sequence number.
		// Notably, this includes Get/Scan/ReverseScan requests that acquire
		// replicated locks, even though thye go through raft.

s/thye/they/


pkg/kv/kvnemesis/applier.go line 289 at r5 (raw file):

			dur := kvpb.BestEffort
			if o.ForUpdate {
				(*kv.Batch).GetForUpdate(b, o.Key, dur)

Here and below, should we just be calling the method on b directly, like b.GetForUpdate(o.Key, dur)?


pkg/kv/kvpb/api.go line 1386 at r4 (raw file):

func flagForIsWrite(l lock.Strength, d lock.Durability) flag {
	if l == lock.None || d == lock.Unreplicated {

Given that the KeyLockingDurability can only be set if KeyLockingStrength is non-None, it doesn't feel like this function should need to take a strength argument at all. It could be as simple as:

func flagForLockDurability(d lock.Durability) flag {
    if d == lock.Unreplicated {
        return isWrite
    }
    return 0
}

pkg/kv/kvpb/api.go line 1208 at r5 (raw file):

// NewGet returns a Request initialized to get the value at key. No lock is
// acquired A lock on the key, even if it exists.

"No lock is acquired A lock on the key".


pkg/kv/kvpb/api.go line 1348 at r5 (raw file):

// NewScan returns a Request initialized to scan from start to end keys. No
// locks will be acquired on each of the resulting keys.

nit here and below: s/on each of the/on the/


pkg/kv/kvpb/api.go line 1410 at r5 (raw file):

// KeyLockingDurabilityType is used to describe the durability goals of per-key
// locks acquired by Get, Scan, and ReverseScan requests.

"by locking Get, ..."


pkg/kv/kvpb/api.go line 1431 at r5 (raw file):

	// transactions that require locks for correctness (read: read-committed,
	// snapshot isolation).
	GuaranteedDurability

Do we want to mention that these locks also prevent writes at or below the commit timestamp of the holder after the lock has been released? So the read cannot skew between the mvcc time that it is originally performed and the mvcc time that the lock holder commits. There's likely a more formal way to express this.


pkg/kv/kvpb/api.go line 1448 at r5 (raw file):

func scanLockDurability(dur KeyLockingDurabilityType) lock.Durability {
	switch dur {

Do we want to add a case for Invalid with a better panic message?


pkg/kv/kvserver/replica_command.go line 3280 at r5 (raw file):

// on the locking mode of the read being respected, remove the forUpdate param
// and perform locking reads for all callers.
// TODO(XXX): Should we address this TODO? I'm not sure what the transaction

Addressing this TODO doesn't seem particularly urgent, but if XXX is feeling motivated, XXX should feel free to.


pkg/kv/kvserver/batcheval/declare.go line 69 at r3 (raw file):

	// on the request and the evaluation path they take. We handle lock/latch span
	// declaration for them along with other read-only requests.
	if ok || kvpb.IsReadOnly(req) {

There are other calls to IsReadOnly that also need to change. For example, in txnSeqNumAllocator.


pkg/kv/kvserver/batcheval/declare.go line 90 at r4 (raw file):

		timestamp.Forward(in.GlobalLimit)
	} else {
		str = lock.Intent

nit: consider putting this in an else branch of the LockingReadRequest check. Up to your coding style preferences.


pkg/kv/kvserver/batcheval/intent.go line 109 at r4 (raw file):

// acquireLocksOnKeys acquires locks on each of the keys in the result of a
// {,Reverse}ScanRequest. The locks are held by the specified transaction with
// the supplied locks strength. Best-effort locks are held in unreplicated

"in an unreplicated"


pkg/kv/kvserver/batcheval/intent.go line 115 at r4 (raw file):

// lock on one of the keys. In such cases, a LockConflictError is returned to
// the caller. However, if locks have been successfully acquired on each of the
// keys, the provided result.Result is mutated accordingly.

This is kind of an awkward contract. Should we just return the []roachpb.LockAcquisition? That would parallel the contract of acquireLockOnKey.


pkg/kv/kvserver/batcheval/intent.go line 121 at r4 (raw file):

	res *result.Result,
	txn *roachpb.Transaction,
	keyLocking lock.Strength,

Out of curiosity, why the s/str/keyLocking/? Should we do the same with dur?


pkg/kv/kvserver/batcheval/intent.go line 150 at r4 (raw file):

			acq, err := acquireLockOnKey(ctx, readWriter, txn, keyLocking, dur, k)
			if err != nil {
				res.Local.AcquiredLocks = res.Local.AcquiredLocks[:]

Was this supposed to be res.Local.AcquiredLocks[:i]?


pkg/kv/kvserver/batcheval/intent.go line 166 at r4 (raw file):

// acquireLockOnKey acquires a lock on the specified key. The lock is acquired
// with the supplied lock strength and are held by the specified transaction.
// Best-effort locks are held in unreplicated fashion; locks that need

"in an unreplicated"


pkg/kv/kvserver/batcheval/intent.go line 168 at r4 (raw file):

// Best-effort locks are held in unreplicated fashion; locks that need
// guaranteed durability are replicated. The resultant lock acquisition struct
// is returned, which the caller may accumulate in its result set.

s/may/must/


pkg/kv/kvserver/batcheval/intent.go line 194 at r4 (raw file):

		dur = lock.Unreplicated
	}
	acq := roachpb.MakeLockAcquisition(txn, key, dur, str)

nit: should this call and the return acq, nil be moved after the switch statement?


pkg/kv/db_test.go line 115 at r5 (raw file):

	s, db := setup(t)
	defer s.Stopper().Stop(context.Background())
	testutils.RunTrueAndFalse(t, "durability-guaranteed", func(t *testing.T, durabilityGuaranteed bool) {

nit: add a new line above this to match TestDB_GetForShare.


pkg/kv/db_test.go line 381 at r5 (raw file):

	defer log.Scope(t).Close(t)
	s, db := setup(t)
	ctx := context.Background()

Want to pull out a ctx in the tests above as well?


pkg/kv/kvserver/client_replica_test.go line 4992 at r6 (raw file):

		var mu struct {
			syncutil.Mutex
			numFinalized int

atomic.Int64?


pkg/kv/kvserver/client_replica_test.go line 5007 at r6 (raw file):

		}()

		mu.Lock()

Should we sleep a bit here to give the GetForUpdate request time to get blocked? Even as small as 10ms? Kind of like in TestTxnWaitPolicies.

Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR, RFAL.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach, @miretskiy, and @nvanbenschoten)


pkg/kv/db.go line 521 at r5 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

The direct use of kvpb.Invalid seems fine. There's also similar cases where we just pass in 0 in these cases. No preference from me.

Yeah, I'm not entirely sure what I prefer here either. On one hand, Invalid makes things explicit at the caller. On the other hand, I've to explain what "Invalid" means at the enum definition. I'll defer to inertia, I guess.


pkg/kv/kvnemesis/applier.go line 289 at r5 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Here and below, should we just be calling the method on b directly, like b.GetForUpdate(o.Key, dur)?

Done; not entirely sure why it was written like this in the first place.


pkg/kv/kvpb/api.go line 1386 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Given that the KeyLockingDurability can only be set if KeyLockingStrength is non-None, it doesn't feel like this function should need to take a strength argument at all. It could be as simple as:

func flagForLockDurability(d lock.Durability) flag {
    if d == lock.Unreplicated {
        return isWrite
    }
    return 0
}

Good point. Done.

PS: (I think you inverted the condition above)


pkg/kv/kvpb/api.go line 1431 at r5 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Do we want to mention that these locks also prevent writes at or below the commit timestamp of the holder after the lock has been released? So the read cannot skew between the mvcc time that it is originally performed and the mvcc time that the lock holder commits. There's likely a more formal way to express this.

Added some words. PTAL.


pkg/kv/kvserver/replica_command.go line 3280 at r5 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Addressing this TODO doesn't seem particularly urgent, but if XXX is feeling motivated, XXX should feel free to.

XXX is going to see some natural salt mines tomorrow, so maybe in October :)


pkg/kv/kvserver/batcheval/declare.go line 69 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

There are other calls to IsReadOnly that also need to change. For example, in txnSeqNumAllocator.

(from before the recent push -- we ended up discussing this offline).


pkg/kv/kvserver/batcheval/declare.go line 90 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: consider putting this in an else branch of the LockingReadRequest check. Up to your coding style preferences.

Minor preference for this, so I'll keep.


pkg/kv/kvserver/batcheval/intent.go line 115 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

This is kind of an awkward contract. Should we just return the []roachpb.LockAcquisition? That would parallel the contract of acquireLockOnKey.

Yeah, makes sense. I'd rewritten these a few different ways before I finally settled on the contract for acquireLockOnKey. Should've unified it with acquireLocksOnKeys. Done now.


pkg/kv/kvserver/batcheval/intent.go line 121 at r4 (raw file):

Out of curiosity, why the s/str/keyLocking/? Should we do the same with dur?

I built this on top of my first PR, which hadn't renamed s/keyLocking/keyLockingStrength on batch requests. It also didn't prefix durability with keyLocking. s/str/keyLocking/ was prompted by the fields we were passing in at the caller.

I'm actually going to walk back this change. keyLockingStrength and keyLockingDurability feels needlessly verbose here.


pkg/kv/kvserver/batcheval/intent.go line 150 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Was this supposed to be res.Local.AcquiredLocks[:i]?

I was just clearing the entire thing because if there's a LockConflictError then none of the locks are acquired. Is that not so?


pkg/kv/kvserver/batcheval/intent.go line 166 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

"in an unreplicated"

Ended up removing this entirely -- this comment was referencing "best effort" and "guaranteed", which we've since walked back on.


pkg/kv/kvserver/batcheval/intent.go line 194 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: should this call and the return acq, nil be moved after the switch statement?

Yeah, done.


pkg/kv/db_test.go line 381 at r5 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Want to pull out a ctx in the tests above as well?

Done.


pkg/kv/kvserver/client_replica_test.go line 4992 at r6 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

atomic.Int64?

I thought we'd need to wrap both the increments and txn1.Commit() and txn2.Rollback() inside of the mutex, because otherwise the GetForUpdate might get unblocked and read the wrong value for numFinalized. But that also makes this checking numFinalized bit kind of stupid. Should we just remove it?


pkg/kv/kvserver/client_replica_test.go line 5007 at r6 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Should we sleep a bit here to give the GetForUpdate request time to get blocked? Even as small as 10ms? Kind of like in TestTxnWaitPolicies.

Good call. Done.

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 29 of 30 files at r11, 1 of 1 files at r12, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach and @miretskiy)

@arulajmani
Copy link
Collaborator Author

TFTR!

bors r=nvanbenschoten

craig bot pushed a commit that referenced this pull request Sep 23, 2023
110279: kvserver: requests that acquire repl locks should use read-write path r=nvanbenschoten a=arulajmani

First commit from #110201

Previously, {Get,Scan,ReverseScan}Requests all used the read-only
execution path. Things aren't so simple anymore, now that we want these
requests to be able to acquire replicated locks, which means they need
to go through raft (and therefore the read-write execution path). This
patch achieves exactly that.

Informs #100193

Release note: None

Co-authored-by: Arul Ajmani <[email protected]>
@craig
Copy link
Contributor

craig bot commented Sep 23, 2023

Build failed:

@arulajmani
Copy link
Collaborator Author

bors r=nvanbenschoten

@craig
Copy link
Contributor

craig bot commented Sep 24, 2023

Build failed:

@arulajmani
Copy link
Collaborator Author

TestTransactionUnexpectedlyCommitted seems to have flaked on both these runs. I've confirmed its a flake, and I'm gonna retry.

bors r=nvanbenschoten

@craig
Copy link
Contributor

craig bot commented Sep 24, 2023

Build failed:

@arulajmani
Copy link
Collaborator Author

I'm going to skip the test. #111205.

@arulajmani
Copy link
Collaborator Author

Test is skipped now. Gonna merge.

bors r=nvanbenschoten

@craig
Copy link
Contributor

craig bot commented Sep 25, 2023

Timed out.

@arulajmani
Copy link
Collaborator Author

Things just timed out. I can't glean anything from the build logs that points to something with my patch, so I'll give it another go.

bors r=nvanbenschoten

@arulajmani
Copy link
Collaborator Author

bors r-

@craig
Copy link
Contributor

craig bot commented Sep 25, 2023

Canceled.

@arulajmani
Copy link
Collaborator Author

bors r+ p=99

@craig
Copy link
Contributor

craig bot commented Sep 25, 2023

Timed out.

Previously, {Get,Scan,ReverseScan}Requests all used the read-only
execution path. Things aren't so simple anymore, now that we want these
requests to be able to acquire replicated locks, which means they need
to go through raft (and therefore the read-write execution path). This
patch achieves exactly that.

Informs cockroachdb#100193

Release note: None
This patch adds lock durability flags to GetFor{Update,Share},
ScanFor{Update,Share}, and ReverseScanFor{Update,Share}, using which
users of the KV client API can express lock durability goals. These
then map to either replicated or unreplicated locks.

This patch then modifies existing tests to make use of replicated locks
in a few places.

Epic: none

Release note: None
This patch adds a very basic shared locks test using the KV client
API. While here, we also take the opportunity to extend this test
for replicated locks.

Epic: none

Release note: None
@arulajmani
Copy link
Collaborator Author

Day 4, we go again!

bors r=nvanbenschoten

@craig
Copy link
Contributor

craig bot commented Sep 26, 2023

Build succeeded:

@craig craig bot merged commit 617f581 into cockroachdb:master Sep 26, 2023
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants