Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv/kvserver: TestSharedLocksBasic failed during spanset assertion #111409

Closed
cockroach-teamcity opened this issue Sep 28, 2023 · 5 comments · Fixed by #111526
Closed

kv/kvserver: TestSharedLocksBasic failed during spanset assertion #111409

cockroach-teamcity opened this issue Sep 28, 2023 · 5 comments · Fixed by #111526
Assignees
Labels
A-kv-transactions Relating to MVCC and the transactional model. branch-master Failures and bugs on the master branch. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-kv KV Team
Milestone

Comments

@cockroach-teamcity
Copy link
Member

cockroach-teamcity commented Sep 28, 2023

kv/kvserver.TestSharedLocksBasic failed with artifacts on master @ 2111b61b2d7c789bc03b1e9392062df80c779075:

        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*DistSender).sendPartialBatch(0xc00b789080, {0xde298b0?, 0xc00caec030}, 0xc00de2c900, {{0xc0073cc6b8, 0x1, 0x8}, {0xc0073cc6c8, 0x1, 0x8}}, ...)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go:1875 +0x9ac
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*DistSender).divideAndSendBatchToRanges(0xc00b789080, {0xde298b0, 0xc00caec030}, 0xc00de2c900, {{0xc0073cc6b8, 0x1, 0x8}, {0xc0073cc6c8, 0x1, 0x8}}, ...)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go:1446 +0x573
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*DistSender).Send(0xc00b789080, {0xde298b0, 0xc00b08ff80}, 0xc00de2c900)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go:1066 +0xa25
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnLockGatekeeper).SendLocked(0xc00d66f5b0, {0xde298b0, 0xc00b08ff80}, 0xc00de2c900)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_lock_gatekeeper.go:82 +0x22a
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnMetricRecorder).SendLocked(0xc00d66f578, {0xde298b0, 0xc00b08ff80}, 0x4ace95?)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_metric_recorder.go:47 +0x1b9
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnSpanRefresher).sendLockedWithRefreshAttempts(0xc00d66f498, {0xde298b0, 0xc00b08ff80}, 0xc00de2c900, 0x5)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go:222 +0x2b6
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnSpanRefresher).SendLocked(0xc00d66f498, {0xde298b0, 0xc00b08ff80}, 0xc00de2c900)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go:150 +0x1e7
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnCommitter).SendLocked(0xc00d66f460, {0xde298b0, 0xc00b08ff80}, 0xc00de2c900)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_committer.go:146 +0xa06
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnPipeliner).SendLocked(0xc00d66f330, {0xde298b0, 0xc00b08ff80}, 0x1?)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_pipeliner.go:295 +0x246
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnSeqNumAllocator).SendLocked(0xc00d66f310, {0xde298b0, 0xc00b08ff80}, 0xc00de2c900)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_seq_num_allocator.go:114 +0x3a6
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnHeartbeater).SendLocked(0xc00d66f260, {0xde298b0, 0xc00b08ff80}, 0xc00de2c900)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_heartbeater.go:246 +0x7d4
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*TxnCoordSender).Send(0xc00d66f080, {0xde29840, 0xc0001bc008}, 0xc00de2c900)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_coord_sender.go:533 +0xa33
        	            	  | github.com/cockroachdb/cockroach/pkg/kv.(*DB).sendUsingSender(0xc00cdcb360, {0xde29840, 0xc0001bc008}, 0xc00de2c900, {0x7f1799b530b0, 0xc00d66f080})
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/db.go:1090 +0x191
        	            	  | github.com/cockroachdb/cockroach/pkg/kv.(*Txn).Send(0xc018ed5600, {0xde29840, 0xc0001bc008}, 0xc00de2c900)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/txn.go:1218 +0x2e9
        	            	  | github.com/cockroachdb/cockroach/pkg/kv.sendAndFill({0xde29840, 0xc0001bc008}, 0xc00c473b70, 0xc00d66fb80)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/db.go:922 +0x253
        	            	  | github.com/cockroachdb/cockroach/pkg/kv.(*Txn).Run(0xc018ed5600, {0xde29840, 0xc0001bc008}, 0x8f31940?)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/txn.go:764 +0xa5
        	            	  | github.com/cockroachdb/cockroach/pkg/kv.(*Txn).scan(0xc018ed5600, {0xde29840, 0xc0001bc008}, {0x8f31940, 0xddc6750}, {0x8f31940, 0xddc6760}, 0x0, 0x9a?, 0x2, ...)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/txn.go:610 +0x22d
        	            	  | github.com/cockroachdb/cockroach/pkg/kv.(*Txn).ScanForShare(...)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/txn.go:652
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvserver_test.TestSharedLocksBasic.func1(0xc00df54000, 0x1)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvserver_test/pkg/kv/kvserver/client_replica_test.go:4991 +0x111
        	            	  | github.com/cockroachdb/cockroach/pkg/testutils.RunValues[...].func1()
        	            	  | 	github.com/cockroachdb/cockroach/pkg/testutils/subtest.go:31 +0x56
        	            	  | testing.tRunner(0xc00df54000, 0xc006e641c0)
        	            	  | 	GOROOT/src/testing/testing.go:1576 +0x217
        	            	  | created by testing.(*T).Run
        	            	  | 	GOROOT/src/testing/testing.go:1629 +0x806
        	            	Error types: (1) *errbase.opaqueWrapper (2) *errutil.leafError
        	Test:       	TestSharedLocksBasic/guaranteed-durability=true
=== NAME  TestSharedLocksBasic
    client_replica_test.go:5030: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/33e1d369c27b9c01b2b6009c561815a3/logTestSharedLocksBasic2243548617
    --- FAIL: TestSharedLocksBasic/guaranteed-durability=true (0.13s)

Parameters: TAGS=bazel,gss , stress=true

Help

See also: How To Investigate a Go Test Failure (internal)

/cc @cockroachdb/kv

This test on roachdash | Improve this report!

Jira issue: CRDB-31888

@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-kv KV Team labels Sep 28, 2023
@cockroach-teamcity cockroach-teamcity added this to the 23.2 milestone Sep 28, 2023
@pav-kv pav-kv added the A-kv-transactions Relating to MVCC and the transactional model. label Sep 28, 2023
@pav-kv
Copy link
Collaborator

pav-kv commented Sep 28, 2023

Wraps: (2) cannot write undeclared span /Local/Lock"a"
  | declared:
  | read global: {a-c} at 9223372036.854775807,2147483647
  | read local: /Local/RangeID/5/r/AbortSpan/"7433d599-876b-46db-ae31-bc5a84a33e07" at 0,0
  | read local: /Local/Lock"{a"-c"} at 0,0
  | read local: /Local/Lock/Local/RangeID/5/r/AbortSpan/"7433d599-876b-46db-ae31-bc5a84a33e07" at 0,0

@pav-kv pav-kv added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Sep 28, 2023
@pav-kv
Copy link
Collaborator

pav-kv commented Sep 28, 2023

Stressed this 10k times on master, couldn't reproduce on Mac or gceworker. Was this fixed recently?
Upd: Couldn't repro on 2111b61 either under which this failure is reported. Seems like it's just a rare flake.

@nvanbenschoten
Copy link
Member

I think this has to do with how we declare latches for replicated shared lock acquisition. We're declaring a read latch at MaxTimestamp, not a write latch. This is safe because the writes from different transaction IDs are to different keys and the writes to the same transaction ID are protected by the write latch added in #111388. But the spanset assertions don't know this.

So I think we'll want a fix like:

diff --git a/pkg/kv/kvserver/spanset/batch.go b/pkg/kv/kvserver/spanset/batch.go
index 3f5efb08d57..3900e97d36c 100644
--- a/pkg/kv/kvserver/spanset/batch.go
+++ b/pkg/kv/kvserver/spanset/batch.go
@@ -891,6 +891,10 @@ func addLockTableSpans(spans *SpanSet) *SpanSet {
                if span.EndKey != nil {
                        ltEndKey, _ = keys.LockTableSingleKey(span.EndKey, nil)
                }
+               if sa == SpanReadOnly && span.Timestamp == hlc.MaxTimestamp {
+                       // TODO: explain.
+                       sa = SpanReadWrite
+               }
                withLocks.AddNonMVCC(sa, roachpb.Span{Key: ltKey, EndKey: ltEndKey})
        })
        return withLocks

@nvanbenschoten nvanbenschoten changed the title kv/kvserver: TestSharedLocksBasic failed kv/kvserver: TestSharedLocksBasic failed during spanset assertion Sep 28, 2023
@arulajmani
Copy link
Collaborator

Thanks for tracking this down @nvanbenschoten before I even got to my desk! I'll end out a patch today.

@cockroach-teamcity
Copy link
Member Author

kv/kvserver.TestSharedLocksBasic failed with artifacts on master @ d89e0f438dfd8fdd1e891bd222f88205458966f4:

        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*DistSender).sendPartialBatch(0xc019eab600, {0xde49770?, 0xc00bed1680}, 0xc00db49200, {{0xc0071d9e28, 0x1, 0x8}, {0xc0071d9e38, 0x1, 0x8}}, ...)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go:1875 +0x9ac
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*DistSender).divideAndSendBatchToRanges(0xc019eab600, {0xde49770, 0xc00bed1680}, 0xc00db49200, {{0xc0071d9e28, 0x1, 0x8}, {0xc0071d9e38, 0x1, 0x8}}, ...)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go:1446 +0x573
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*DistSender).Send(0xc019eab600, {0xde49770, 0xc00bed15f0}, 0xc00db49200)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go:1066 +0xa25
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnLockGatekeeper).SendLocked(0xc00b4576b0, {0xde49770, 0xc00bed15f0}, 0xc00db49200)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_lock_gatekeeper.go:82 +0x22a
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnMetricRecorder).SendLocked(0xc00b457678, {0xde49770, 0xc00bed15f0}, 0x4ace95?)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_metric_recorder.go:47 +0x1b9
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnSpanRefresher).sendLockedWithRefreshAttempts(0xc00b457598, {0xde49770, 0xc00bed15f0}, 0xc00db49200, 0x5)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go:222 +0x2b6
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnSpanRefresher).SendLocked(0xc00b457598, {0xde49770, 0xc00bed15f0}, 0xc00db49200)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go:150 +0x1e7
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnCommitter).SendLocked(0xc00b457560, {0xde49770, 0xc00bed15f0}, 0xc00db49200)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_committer.go:146 +0xa06
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnPipeliner).SendLocked(0xc00b457430, {0xde49770, 0xc00bed15f0}, 0x1?)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_pipeliner.go:295 +0x246
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnSeqNumAllocator).SendLocked(0xc00b457410, {0xde49770, 0xc00bed15f0}, 0xc00db49200)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_seq_num_allocator.go:114 +0x3a6
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnHeartbeater).SendLocked(0xc00b457360, {0xde49770, 0xc00bed15f0}, 0xc00db49200)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_heartbeater.go:246 +0x7d4
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*TxnCoordSender).Send(0xc00b457180, {0xde49700, 0xc0001bc008}, 0xc00db49200)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_coord_sender.go:533 +0xa33
        	            	  | github.com/cockroachdb/cockroach/pkg/kv.(*DB).sendUsingSender(0xc004df2fa0, {0xde49700, 0xc0001bc008}, 0xc00db49200, {0x7f1c419c01f0, 0xc00b457180})
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/db.go:1090 +0x191
        	            	  | github.com/cockroachdb/cockroach/pkg/kv.(*Txn).Send(0xc002f02210, {0xde49700, 0xc0001bc008}, 0xc00db49200)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/txn.go:1218 +0x2e9
        	            	  | github.com/cockroachdb/cockroach/pkg/kv.sendAndFill({0xde49700, 0xc0001bc008}, 0xc009f2fb70, 0xc00db6a000)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/db.go:922 +0x253
        	            	  | github.com/cockroachdb/cockroach/pkg/kv.(*Txn).Run(0xc002f02210, {0xde49700, 0xc0001bc008}, 0x8f485e0?)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/txn.go:764 +0xa5
        	            	  | github.com/cockroachdb/cockroach/pkg/kv.(*Txn).scan(0xc002f02210, {0xde49700, 0xc0001bc008}, {0x8f485e0, 0xdde65e0}, {0x8f485e0, 0xdde65f0}, 0x0, 0x9a?, 0x2, ...)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/txn.go:610 +0x22d
        	            	  | github.com/cockroachdb/cockroach/pkg/kv.(*Txn).ScanForShare(...)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/txn.go:652
        	            	  | github.com/cockroachdb/cockroach/pkg/kv/kvserver_test.TestSharedLocksBasic.func1(0xc00a9d1ba0, 0x1)
        	            	  | 	github.com/cockroachdb/cockroach/pkg/kv/kvserver_test/pkg/kv/kvserver/client_replica_test.go:4994 +0x111
        	            	  | github.com/cockroachdb/cockroach/pkg/testutils.RunValues[...].func1()
        	            	  | 	github.com/cockroachdb/cockroach/pkg/testutils/subtest.go:31 +0x56
        	            	  | testing.tRunner(0xc00a9d1ba0, 0xc008935480)
        	            	  | 	GOROOT/src/testing/testing.go:1576 +0x217
        	            	  | created by testing.(*T).Run
        	            	  | 	GOROOT/src/testing/testing.go:1629 +0x806
        	            	Error types: (1) *errbase.opaqueWrapper (2) *errutil.leafError
        	Test:       	TestSharedLocksBasic/guaranteed-durability=true
=== NAME  TestSharedLocksBasic
    client_replica_test.go:5033: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/33e1d369c27b9c01b2b6009c561815a3/logTestSharedLocksBasic3599415893
    --- FAIL: TestSharedLocksBasic/guaranteed-durability=true (0.15s)

Parameters: TAGS=bazel,gss , stress=true

Help

See also: How To Investigate a Go Test Failure (internal)

This test on roachdash | Improve this report!

craig bot pushed a commit that referenced this issue Sep 29, 2023
109638: sql: use the correct locking strength for FOR SHARE clause r=michae2 a=arulajmani

Previously, FOR SHARE and FOR KEY SHARE would use non-locking KV scans.
Now that the lock table supports shared locks, we can use lock.Shared as
the locking strength for KV scans. This patch does that, and in doing
so, wires up SHARED locks end to end.

By default, we turn of this functionality for serializable transactions.
Instead, it's gated behind a session setting called
`enable_shared_locking_for_serializable`.

Informs #91545

Release note (sql change): SELECT FOR SHARE and SELECT FOR UPDATE
previously did not acquire any locks. Users issuing these statements
would expect them to acquire shared locks (multiple readers allowed,
no writers though). This patch switches over the behavior to acquire
such read locks.

For serializable transactions, we default to the old behaviour, unless
the `enable_shared_locking_for_serializable` session setting is set
to true. We'll probably switch this behavior in the future, but for
now, given shared locks are a preview feature, we gate things behind
a session setting.

111441: goschedstats: reduce underloadedRunnablePerProcThreshold r=aadityasondhi a=sumeerbhola

By reducing this threshold, we more often sample the stats at 1ms intervals, which is desirable for admission control slot adjustment.

Fixes #111125

Epic: none

Release note: None

111466: ui: update default timescale to 1h r=maryliag a=maryliag

On the Metrics page, the default value was 10min, which was too small. This commit updates it to 1h to match the most selected value on the SQL Activity.

Fixes #96479

Release note: None

111521: roachprod: fix createTenantCertBundle script r=andy-kimball a=herkolategan

The `createTenantCertBundle` function has a script in it that has erroneous "+" characters which results in an error when trying to start a secure cluster. This change removes those characters.

Epic: None
Release Note: None

111526: kv/spanset: permit writes to lock table by shared lock latching configuration r=nvanbenschoten a=nvanbenschoten

Fixes #111409.
Fixes #111492.

This commit updates addLockTableSpans to account for the way that shared locking requests declare their latches. Even though they declare a "read" latch, they do so at max timestamp, which gives them sufficient isolation to write to the lock table without having to declare a write latch and be serialized with other shared lock acquisitions.

Release note: None

Co-authored-by: Arul Ajmani <[email protected]>
Co-authored-by: sumeerbhola <[email protected]>
Co-authored-by: maryliag <[email protected]>
Co-authored-by: Herko Lategan <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
@craig craig bot closed this as completed in 2c17fb7 Sep 29, 2023
THardy98 pushed a commit to THardy98/cockroach that referenced this issue Oct 6, 2023
…guration

Fixes cockroachdb#111409.
Fixes cockroachdb#111492.

This commit updates addLockTableSpans to account for the way that shared locking
requests declare their latches. Even though they declare a "read" latch, they do
so at max timestamp, which gives them sufficient isolation to write to the lock
table without having to declare a write latch and be serialized with other
shared lock acquisitions.

Release note: None
@github-project-automation github-project-automation bot moved this to Closed in KV Aug 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-transactions Relating to MVCC and the transactional model. branch-master Failures and bugs on the master branch. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-kv KV Team
Projects
No open projects
Archived in project
4 participants