roachtest: restore/tpce/32TB/inc-count=400/aws/nodes=15/cpus=16 failed #104006

Closed
cockroach-teamcity opened this issue May 27, 2023 · 2 comments
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-disaster-recovery

@cockroach-teamcity
Member

cockroach-teamcity commented May 27, 2023

roachtest.restore/tpce/32TB/inc-count=400/aws/nodes=15/cpus=16 failed with artifacts on master @ 9d6e7baebdc0d8b1a57ce8dc158fec35b48e9e2a:

test artifacts and logs in: /artifacts/restore/tpce/32TB/inc-count=400/aws/nodes=15/cpus=16/run_1
(monitor.go:127).Wait: monitor failure: monitor task failed: read tcp 172.17.0.3:36150 -> 18.219.155.15:26257: read: connection reset by peer

Parameters: ROACHTEST_cloud=aws, ROACHTEST_cpu=16, ROACHTEST_encrypted=false, ROACHTEST_ssd=0

Help

See: roachtest README

See: How To Investigate (internal)

/cc @cockroachdb/disaster-recovery

This test on roachdash | Improve this report!

Jira issue: CRDB-28304

@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-disaster-recovery labels May 27, 2023
@cockroach-teamcity cockroach-teamcity added this to the 23.1 milestone May 27, 2023
@adityamaru
Contributor

adityamaru commented Jun 5, 2023

Seeing a panic on node 6 coming from processing a split:

E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212  a panic has occurred!
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +panic: ‹iterator with constraint=2 is being used with key /Min that has constraint=1›
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +(1) attached stack trace
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  -- stack trace:
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  | runtime.gopanic
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  |   GOROOT/src/runtime/panic.go:884
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  | github.com/cockroachdb/cockroach/pkg/storage.(*intentInterleavingIter).checkConstraint
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  |   github.com/cockroachdb/cockroach/pkg/storage/intent_interleaving_iter.go:574
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  | github.com/cockroachdb/cockroach/pkg/storage.(*intentInterleavingIter).SeekGE
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  |   github.com/cockroachdb/cockroach/pkg/storage/intent_interleaving_iter.go:480
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  | github.com/cockroachdb/cockroach/pkg/storage.mvccMinSplitKey
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  |   github.com/cockroachdb/cockroach/pkg/storage/mvcc.go:5904
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  | github.com/cockroachdb/cockroach/pkg/storage.MVCCFirstSplitKey
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  |   github.com/cockroachdb/cockroach/pkg/storage/mvcc.go:5955
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).adminSplitWithDescriptor
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  |   github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_command.go:357
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*splitQueue).processAttempt
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  |   github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/split_queue.go:321
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*splitQueue).process
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  |   github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/split_queue.go:225
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*baseQueue).processReplica.func1
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  |   github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/queue.go:1020
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  | github.com/cockroachdb/cockroach/pkg/util/timeutil.RunWithTimeout
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  |   github.com/cockroachdb/cockroach/pkg/util/timeutil/timeout.go:29
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*baseQueue).processReplica
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  |   github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/queue.go:979
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*baseQueue).processLoop.func2.1
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  |   github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/queue.go:890
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  | github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  |   github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  | runtime.goexit
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +  |   GOROOT/src/runtime/asm_amd64.s:1594
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +Wraps: (2) panic: ‹iterator with constraint=2 is being used with key /Min that has constraint=1›
E230527 17:53:34.060640 1714478 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n6,split,s6,r1/12:‹/{Min-System/NodeL…}›] 21212 +Error types: (1) *withstack.withStack (2) *errutil.leafError

The restore still seems to have been running when we hit this panic. Moving this to KV for triage.
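
For readers unfamiliar with this kind of failure: the panic message reads as if an iterator restricted to one class of keys was positioned with a key from a different class. Below is a minimal, self-contained sketch of such a check. It is not the code in `intent_interleaving_iter.go`; the type names, the `localMax` boundary, and the meaning assigned to constraint values 1 and 2 are assumptions chosen only to reproduce the shape of the message.

```go
package main

import (
	"bytes"
	"fmt"
)

// constraint is a hypothetical label for the part of the keyspace an
// iterator may touch. The numeric values are assumptions picked to line
// up with the "constraint=2 ... constraint=1" text in the panic.
type constraint int

const (
	unconstrained constraint = iota // 0
	localOnly                       // 1
	globalOnly                      // 2
)

// localMax is an assumed boundary: keys sorting below it count as local.
var localMax = []byte{0x02}

// classify returns the constraint a key falls under in this sketch.
func classify(key []byte) constraint {
	if bytes.Compare(key, localMax) < 0 {
		return localOnly
	}
	return globalOnly
}

// iter is a stand-in iterator that only remembers its constraint.
type iter struct{ c constraint }

// SeekGE panics when the key's class differs from the iterator's.
func (it *iter) SeekGE(key []byte) {
	if kc := classify(key); kc != it.c {
		panic(fmt.Sprintf(
			"iterator with constraint=%d is being used with key %q that has constraint=%d",
			it.c, key, kc))
	}
}

// trySeek converts the panic into a string so the caller can inspect it.
func trySeek(it *iter, key []byte) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	it.SeekGE(key)
	return ""
}

func main() {
	it := &iter{c: globalOnly}
	// The empty key sorts below localMax, so the check fires.
	fmt.Println(trySeek(it, []byte{}))
	// A key above localMax matches the constraint and seeks cleanly.
	fmt.Println(trySeek(it, []byte("user-key")) == "")
}
```

In this sketch the empty key stands in for `/Min`, the start key of r1 shown in the log tags. Whether that matches the actual cause is for the fix referenced later in this thread to confirm.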

@blathers-crl blathers-crl bot added the T-kv KV Team label Jun 5, 2023
@erikgrinaker
Contributor

Resolved by #104082.

@erikgrinaker erikgrinaker added X-duplicate Closed as a duplicate of another issue. and removed release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-disaster-recovery labels Jun 7, 2023
@exalate-issue-sync exalate-issue-sync bot added release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-disaster-recovery and removed X-duplicate Closed as a duplicate of another issue. T-kv KV Team labels Jun 7, 2023
3 participants