kv/kvserver: TestLeasePreferencesDuringOutage failed #105101
Comments
This is failing here: cockroach/pkg/kv/kvserver/client_lease_test.go Lines 1025 to 1028 in 7754c4b
The error is coming from lease acquisition attempt here: cockroach/pkg/kv/kvserver/store.go Lines 3640 to 3646 in 7754c4b
The failure is different from previous flakes. I see we backported a deflake commit to 22.2: #103636.
Stressed for 25 minutes with no failure.
cockroach/pkg/kv/kvserver/client_lease_test.go Lines 1025 to 1028 in 7754c4b
I wonder if we could wrap this in a `testutils.SucceedsSoon` to deflake the test.
Reproduced after trying again. Looks like
We either need to disable expiration leases or retry the enqueue. I'll open a PR.
It was possible that another node would acquire the lease, causing the replicate queue enqueue to fail. Wrap the enqueue in a succeeds soon to retry when this happens.

Resolves: cockroachdb#105101
Release note: None
`TestLeasePreferencesDuringOutage` checks that a non-preferred node can up-replicate and transfer the lease in a single replicate queue `process()`.

It was possible that another node would acquire the lease, causing the replicate queue enqueue to fail. Disable expiration based lease transfers and force the transfer to a non-preferred locality. This avoids flakiness in parts of the test which are not directly part of the behavior being asserted on.

This patch also unskips the test.

Resolves: cockroachdb#105101
Release note: None
We have marked this test failure issue as stale because it has been |
This hasn't flaked in a while. I'm going to stress for an hour; if I can't reproduce the problem, I'll close the issue.
nvm this is still flaky due to above.
I'm stressing a candidate fix, where expiration based lease transfers are disabled.
Previously, it was possible for a soon-to-be-dead replica to acquire the range lease in the `TestLeasePreferencesDuringOutage` test. The acquired lease would be expiration based, disallowing the intended leaseholder from acquiring the lease.

This patch disables expiration based lease transfers, deflaking the test.

Resolves: cockroachdb#105101
Epic: none
Release note: None
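To make the failure mode concrete, here is a toy model of why an expiration based lease held by a dying replica blocks the intended leaseholder. This is a simplified sketch, not CockroachDB code: the `lease` type, `valid`, and `canAcquire` are hypothetical, time is a plain integer, and liveness is a map. The only assumption taken from the real system is that an expiration based lease stays valid until its expiration time regardless of the holder's liveness, while an epoch based lease depends on the holder being live.

```go
package main

import "fmt"

type leaseKind int

const (
	epochLease leaseKind = iota
	expirationLease
)

// lease is a toy model of a range lease. All fields are illustrative.
type lease struct {
	holder     int       // node ID of the current leaseholder
	kind       leaseKind // epoch based or expiration based
	expiration int64     // logical time; only meaningful for expiration leases
}

// valid reports whether the lease still prevents other nodes from acquiring.
// Expiration leases ignore liveness; epoch leases depend on it.
func (l lease) valid(now int64, live map[int]bool) bool {
	if l.kind == expirationLease {
		return now < l.expiration
	}
	return live[l.holder]
}

// canAcquire reports whether requester may take the lease at time now.
func canAcquire(cur lease, requester int, now int64, live map[int]bool) bool {
	return cur.holder == requester || !cur.valid(now, live)
}

func main() {
	// Node 1 has just died; node 2 is the intended leaseholder.
	live := map[int]bool{1: false, 2: true}

	epoch := lease{holder: 1, kind: epochLease}
	exp := lease{holder: 1, kind: expirationLease, expiration: 10}

	fmt.Println("epoch lease, t=5:      ", canAcquire(epoch, 2, 5, live))
	fmt.Println("expiration lease, t=5: ", canAcquire(exp, 2, 5, live))
	fmt.Println("expiration lease, t=10:", canAcquire(exp, 2, 10, live))
}
```

In this model node 2 can take over an epoch based lease as soon as node 1 is not live, but has to wait out the expiration based lease, which matches the window in which the test's lease acquisition was rejected.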
108295: roachtest: increase timeout for network_logging to 60s r=dhartunian a=abarganier

Fixes: #108088

The new network_logging roachtest sets a pgclient timeout of 10s to attempt detecting deadlocks. This timeout was hit fairly easily during the nightly runs, which indicates that the 10s timeout is too aggressive. This PR changes the timeout from 10s to 60s, which still achieves the original aim without being so aggressive.

Release note: none

108333: kvserver: deflake lease preferences during outage r=erikgrinaker a=kvoli

*This PR is intended to be backported to `release-22.2`. `TestLeasePreferencesDuringOutage` is currently skipped on master. Stressed for 30 mins without failure on release-22.2.*

Previously, it was possible for a soon-to-be-dead replica to acquire the range lease in the `TestLeasePreferencesDuringOutage` test. The acquired lease would be expiration based, disallowing the intended leaseholder from acquiring the lease.

This patch disables expiration based lease transfers, deflaking the test.

Resolves: #105101
Epic: none
Release note: None

Co-authored-by: Alex Barganier <[email protected]>
Co-authored-by: Austen McClernon <[email protected]>
kv/kvserver.TestLeasePreferencesDuringOutage failed with artifacts on release-22.2 @ 8055184fc991424d1d0eefd7ccf703948b8de3ee:
Parameters:
TAGS=bazel,gss
Help
See also: How To Investigate a Go Test Failure (internal)
This test on roachdash | Improve this report!
Jira issue: CRDB-28862