
roachtest: transfer-leases/drain-other-node failed #91801

Closed
cockroach-teamcity opened this issue Nov 13, 2022 · 5 comments · Fixed by #92313
Assignees
Labels
A-testing Testing tools and infrastructure C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot.
Milestone

Comments


cockroach-teamcity commented Nov 13, 2022

roachtest.transfer-leases/drain-other-node failed with artifacts on release-22.2.0 @ 234c9295cc02150f919cfa96b09ee2fa07b68ace:

test artifacts and logs in: /artifacts/transfer-leases/drain-other-node/run_1
	quit.go:72,quit.go:324,soon.go:69,retry.go:208,soon.go:75,soon.go:48,quit.go:228,quit.go:95,quit.go:154,context.go:91,quit.go:153,quit.go:95,quit.go:54,quit.go:361,test_runner.go:930: (1) ranges with no lease outside of node 3: []string{"37"}

Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=4, ROACHTEST_encrypted=false, ROACHTEST_ssd=0

Help

See: roachtest README

See: How To Investigate (internal)

/cc @cockroachdb/kv-triage


Jira issue: CRDB-21437

@cockroach-teamcity cockroach-teamcity added branch-release-22.2.0 C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Nov 13, 2022
@cockroach-teamcity cockroach-teamcity added this to the 22.2 milestone Nov 13, 2022
@blathers-crl blathers-crl bot added the T-kv KV Team label Nov 13, 2022
@nvanbenschoten

n3 does think that it has transferred all leases away by the end of the test.

I221113 08:22:04.569289 69493 1@server/drain.go:285 ⋮ [n3] 2005  drain remaining: 1
I221113 08:22:04.569310 69493 1@server/drain.go:287 ⋮ [n3] 2006  drain details: range lease iterations: 1
I221113 08:22:04.776108 70977 1@server/drain.go:285 ⋮ [n3] 2010  drain remaining: 0

And then n2 does receive the lease in a transfer:

I221113 08:22:09.444855 207 kv/kvserver/replica_proposal.go:268 ⋮ [n2,s2,r37/2:‹/Table/3{6-7}›,raft] 987  new range lease repl=(n2,s2):2 seq=12 start=1668327724.301867742,0 exp=1668327733.301786243,0 pro=1668327724.301786243,0 following repl=(n3,s3):3 seq=11 start=1668327694.466207271,0 epo=3 pro=1668327694.510061860,0
I221113 08:22:09.445062 207 kv/kvserver/replica_proposal.go:374 ⋮ [n2,s2,r37/2:‹/Table/3{6-7}›,raft] 988  upgrading expiration lease repl=(n2,s2):2 seq=12 start=1668327724.301867742,0 exp=1668327733.301786243,0 pro=1668327724.301786243,0 to an epoch-based one
I221113 08:22:09.445125 207 kv/kvserver/replica_proposal.go:430 ⋮ [n2,s2,r37/2:‹/Table/3{6-7}›,raft] 989  is now leaseholder
I221113 08:22:09.449525 209 kv/kvserver/replica_proposal.go:268 ⋮ [n2,s2,r37/2:‹/Table/3{6-7}›,raft] 991  new range lease repl=(n2,s2):2 seq=13 start=1668327724.301867742,0 epo=4 pro=1668327729.445054129,0 following repl=(n2,s2):2 seq=12 start=1668327724.301867742,0 exp=1668327733.301786243,0 pro=1668327724.301786243,0
I221113 08:22:09.449719 209 kv/kvserver/replica_proposal.go:430 ⋮ [n2,s2,r37/2:‹/Table/3{6-7}›,raft] 992  is now leaseholder

I don't understand why the test failed, but this doesn't need to block the release while we find out.

@nvanbenschoten nvanbenschoten removed the release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. label Nov 14, 2022

tbg commented Nov 22, 2022

Looks similar to #75438, as a result of which I added some more debugging. Unfortunately, it looks like I botched it: the file is written in a loop, so we only get a random node's view. In this case, we still see the lease for r37 on n3. But the test only fails if this is the case on all nodes (i.e., if the lease transfer has happened, at least one replica must reflect it), so we need the files from all nodes. I suspect they wouldn't show anything surprising; we'd probably see all leases on n3. There must be a tiny amount of asynchrony somewhere in the drain-lease-transfer path.


tbg commented Nov 22, 2022

Is this code right? `otherNodeID` is `1 + (3 % count)` (which boils down to n1), i.e. independent of which node in the for loop we're looking at. I think we want `q.c.Node(i)` here, not `otherNodeID`.

This would also explain the test failure: We've checked n1 only. The lease for r37 went from n3 to n2, so it's more likely for n1 to learn about this a split second later. But since we're only asking n1, the test doesn't see that.
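The loop-invariant target can be reconstructed with a small sketch (all names here, including `targets`, are invented for illustration, not the actual roachtest code):

```go
package main

import "fmt"

// targets models the node-selection bug described above: otherNodeID is
// computed once from the drained node and the node count, so the buggy
// loop queries the same node (n1) on every iteration instead of node i.
func targets(drained, count int, fixed bool) []int {
	otherNodeID := 1 + (drained % count) // 1 + (3 % 3) == 1, loop-invariant
	var out []int
	for i := 1; i <= count; i++ {
		if i == drained {
			continue
		}
		if fixed {
			out = append(out, i) // fixed: ask each node directly
		} else {
			out = append(out, otherNodeID) // buggy: always n1
		}
	}
	return out
}

func main() {
	fmt.Println(targets(3, 3, false)) // buggy: [1 1]
	fmt.Println(targets(3, 3, true))  // fixed: [1 2]
}
```

With n3 drained in a three-node cluster, the buggy variant asks n1 twice and never consults n2, which is exactly why a lease that n2 already holds can go unseen.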

I think even with that fixed there is a race here. The drain completes (assuming r37 is the only range) once n3 has applied the lease transfer (to n2). But that doesn't mean that n2 has already applied it; for that to happen n3 has to first send the commit index to n2.

The correct check in that method would be to verify that n3 no longer claims that it itself has the lease. I think this race is probably theoretical but I'll add a comment.

Will send a PR to address the correct node for pulling the range info.

@tbg tbg self-assigned this Nov 22, 2022
@tbg tbg added T-kv-replication and removed T-kv KV Team labels Nov 22, 2022

blathers-crl bot commented Nov 22, 2022

cc @cockroachdb/replication


tbg commented Nov 22, 2022

Unfortunately I missed something, so the above explanation doesn't check out. Maybe it really is the "no other node may have applied the transfer yet" race. See #92313 (comment)

craig bot pushed a commit that referenced this issue Dec 7, 2022
90097: kv: collect hot ranges per tenant on store r=koorosh a=koorosh

The main goal of this change is to provide the ability to request hot ranges info per tenant (as part of the changes for UA). To track hot ranges per tenant, a `ReplicaRankingMap` struct is added that works almost identically to the existing `ReplicaRankings` struct, with the only difference that the hottest replicas are stored in an underlying map structure to keep ranges separate for every tenant. We also decided not to extend the existing `ReplicaRankings` functionality because it is used for replica rebalancing, which doesn't need to know about tenants.
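As a rough illustration of the per-tenant bucketing described above (the `replicaRankingMap` type and its methods below are invented for this sketch, not the real CockroachDB types):

```go
package main

import (
	"fmt"
	"sort"
)

// replicaLoad is a toy stand-in for a replica's load statistics.
type replicaLoad struct {
	rangeID int
	qps     float64
}

// replicaRankingMap buckets the hottest replicas per tenant, in contrast
// to a single global ranking used for rebalancing.
type replicaRankingMap map[int][]replicaLoad // tenantID -> replicas by load

func (m replicaRankingMap) record(tenantID int, r replicaLoad) {
	m[tenantID] = append(m[tenantID], r)
	// Keep each tenant's bucket sorted hottest-first.
	sort.Slice(m[tenantID], func(i, j int) bool {
		return m[tenantID][i].qps > m[tenantID][j].qps
	})
}

func (m replicaRankingMap) hottest(tenantID int) replicaLoad {
	return m[tenantID][0]
}

func main() {
	m := replicaRankingMap{}
	m.record(1, replicaLoad{rangeID: 37, qps: 100})
	m.record(1, replicaLoad{rangeID: 42, qps: 250})
	m.record(2, replicaLoad{rangeID: 55, qps: 10})
	fmt.Println(m.hottest(1).rangeID) // hottest range for tenant 1
}
```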

This change might add a small overhead for self-hosted clusters, as it will accumulate the same hot ranges in both the `replRankings` and `replRankingsByTenant` fields in `Store`.

Changes in next PRs are supposed to rely on this change.

Release note: None

Epic: https://cockroachlabs.atlassian.net/browse/CRDB-19124

92313: roachtest: fix transfer-leases r=erikgrinaker a=tbg

`checkNoLeases` verifies that once a node is drained, for each range
there is one replica of the other nodes that sees the lease on one of
the non-drained stores. The reason it asks just for *one* replica to
have the lease as opposed to all of them is because some followers may
be behind.

However, even just the assumption that there is one can be violated.
The drain succeeds once the draining node has locally seen the lease
transfer succeed. It is likely the raft leader at this point, so it
will be the first one to see this event. Other nodes will only see
it after one additional round-trip (when they learn that the log
index has committed, and then go and apply it).

So going and looking for a replica that sees the new lease immediately
after the drain succeeds may fail.

Work around this by sleeping for one second before checking, which ought
to be enough, and is also a small enough delay to make sure that if
leases are actually not getting transferred, the check will continue to
fail (non-cooperative lease failover is on the order of multiple
seconds).

This commit also improves a debugging file which was previously
clobbered over the multiple iterations of the surrounding loop.

It also makes it clearer that we're pulling the range data from
each node in turn (we were previously hitting otherNodeID but asking
it to proxy to the node. Now we're hitting each node directly without
relying on the internal redirect, which is less confusing).
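The check-plus-sleep workaround can be sketched roughly as follows (a toy model with invented names, not the actual `checkNoLeases` implementation):

```go
package main

import (
	"fmt"
	"time"
)

// checkNoLeases models the fixed check: after draining node `drained`,
// wait briefly so followers have time to apply the lease transfer, then
// require that for every range at least one non-drained node reports a
// leaseholder other than the drained node. leasesByNode maps
// nodeID -> rangeID -> leaseholder nodeID, as seen by that node.
func checkNoLeases(leasesByNode map[int]map[string]int, drained int) []string {
	time.Sleep(time.Second) // let followers catch up on the applied transfer
	ranges := map[string]bool{}
	for _, view := range leasesByNode {
		for r := range view {
			ranges[r] = true
		}
	}
	var bad []string
	for r := range ranges {
		ok := false
		for n, view := range leasesByNode {
			if n == drained {
				continue
			}
			if lh, found := view[r]; found && lh != drained {
				ok = true // some replica already sees the new lease
			}
		}
		if !ok {
			bad = append(bad, r)
		}
	}
	return bad
}

func main() {
	// n1 is behind (still sees n3's lease on r37), but n2 has applied it.
	views := map[int]map[string]int{
		1: {"37": 3},
		2: {"37": 2},
	}
	fmt.Println(checkNoLeases(views, 3)) // no ranges flagged
}
```

The one-second sleep is deliberately small: if leases genuinely were not transferred, non-cooperative lease failover takes multiple seconds, so the check would still fail.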

See: #91801 (comment)

Fixes #91801.

Epic: none
Release note: None


92834: tree: improve type-checking for placeholders with ambiguous type r=jordanlewis a=rafiss

fixes #90364

The key fix is to change the typeCheckSplitExprs function so that it marks _all_
placeholder indexes. This then causes the existing type-checking logic
in typeCheckOverloadedExprs to check all placeholder expressions, rather
than just ones that don't have type hints.

Release note (bug fix): Prepared statements that use type hints can now succeed type-checking in more cases when the placeholder type is ambiguous.

93047: base: remove old testing knobs r=andreimatei a=andreimatei

The testing knobs for startup migrations don't exist anymore.

Release note: None
Epic: None

93074: testutils: add missing comment r=andreimatei a=andreimatei

Release note: None
Epic: None

93079: insights: record more txn insights r=xinhaoz a=xinhaoz

Closes #93076

This commit adds the following fields at the txn level when recording
insights in the sql insights system:
- contention: total txn contention time
- start_time: txn start time
- end_time: txn end time
- auto_retry_reason: last reason for auto txn retry
- retry_count
- rows_written
- rows_read

In addition, the following fields have been moved from recording at
the stmt level to recording at the txn level for insights:
- user
- application_name

Release note: None

93160: bazci: insert `--config ci` if necessary r=healthy-pod a=rickystewart

All builds and tests in CI need this --config argument.

Epic: None
Release note: None

93177: dev: fix dev build error when cross building r=rickystewart a=healthy-pod

This is a temporary fix for the issue. In a future change, we should let beaver hub distinguish between normal and cross builds, and then have an actual fix for this.

Release note: None
Epic: none

93193: vendor: bump Pebble to 0d6d19018632 r=nicktrav a=itsbilal

```
0d6d1901 crossversion: don't stream TestMeta output on verbose
d05a6f0e vfs: use SequentialReadsOption in vfs.[Limited]Copy
b85fc64f merging_iter: don't relative-seek past prefix in isNextEntryDeleted
32ad55f8 *: use github.com/cockroachdb/datadriven
6b644274 sstable: don't fatal if file no longer exists during readahead
```

Fixes #93191.

Release note: None.

Co-authored-by: Andrii Vorobiov <[email protected]>
Co-authored-by: Tobias Grieger <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Andrei Matei <[email protected]>
Co-authored-by: Xin Hao Zhang <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
Co-authored-by: healthy-pod <[email protected]>
Co-authored-by: Bilal Akhtar <[email protected]>
@craig craig bot closed this as completed in 4dbbafa Dec 7, 2022
@lunevalex lunevalex added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-testing Testing tools and infrastructure labels Dec 9, 2022
@exalate-issue-sync exalate-issue-sync bot removed C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-testing Testing tools and infrastructure labels Dec 15, 2022
@lunevalex lunevalex added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-testing Testing tools and infrastructure labels Jan 11, 2023
blathers-crl bot pushed a commit that referenced this issue Aug 30, 2023