roachtest: clearrange/checks=false/rangeTs=false failed #104016
a bunch of nodes failed to come back up:
n1:
n2:
n3:
n4:
I'm not sure what's up, or who should take a look at this. @knz do you mind helping me triage this?
Yep this looks like kv.
thanks!
The error originates here: cockroach/pkg/kv/kvclient/kvcoord/replica_slice.go Lines 136 to 138 in 6cbb876
and is wrapped here: cockroach/pkg/sql/physicalplan/replicaoracle/oracle.go Lines 279 to 282 in c91ee21
We have recently done some work to make operations in the critical startup path retry on certain errors (in particular circuit breaker errors): #97710. However, here we are returning a
I think we hit this code through this call: cockroach/pkg/server/server_sql.go Lines 1609 to 1614 in f04439c
which calls into this code: cockroach/pkg/upgrade/upgrademanager/manager.go Lines 216 to 227 in f849ef6
Note that this is wrapped in
104128: startup: retry pgerror.Code == RangeUnavailable r=aliher1911 a=tbg
Closes #104016.
Epic: None
Release note (bug fix): a bug was fixed due to which nodes could terminate with the following message:
server startup failed: cockroach server exited with error: ‹migration-job-find-already-completed›: key range id:X is unavailable: ‹failed to send RPC: no replica node information available via gossip for rX›
Co-authored-by: Tobias Grieger
roachtest.clearrange/checks=false/rangeTs=false failed with artifacts on master @ dd8dbfc4fa61763dc4fa8fbc19d3a7f2a30bb1e2:
Parameters:
ROACHTEST_cloud=gce
ROACHTEST_cpu=16
ROACHTEST_encrypted=false
ROACHTEST_fs=ext4
ROACHTEST_localSSD=true
ROACHTEST_ssd=0
Help
See: roachtest README
See: How To Investigate (internal)
Jira issue: CRDB-28313