roachtest: failover/chaos/read-write/lease=expiration failed #123186

Closed
cockroach-teamcity opened this issue Apr 28, 2024 · 5 comments
Labels:

  • A-storage: Relating to our storage engine (Pebble) on-disk storage.
  • branch-master: Failures and bugs on the master branch.
  • C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
  • C-test-failure: Broken test (automatically or manually discovered).
  • O-roachtest
  • O-robot: Originated from a bot.
  • release-blocker: Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked.
  • T-storage: Storage Team

Comments

@cockroach-teamcity
Member

cockroach-teamcity commented Apr 28, 2024

roachtest.failover/chaos/read-write/lease=expiration failed with artifacts on master @ 5d02bd9ff6b2bccecf6d43fc6cd647167b91f782:

(failover.go:1766).sleepFor: sleep failed: context canceled
(monitor.go:154).Wait: monitor failure: monitor user task failed: t.Fatal() was called
(cluster.go:2351).Run: context canceled
(cluster.go:2351).Run: context canceled
(cluster.go:2351).Run: context canceled
(cluster.go:2351).Run: context canceled
test artifacts and logs in: /artifacts/failover/chaos/read-write/lease=expiration/run_1

Parameters:

  • ROACHTEST_arch=amd64
  • ROACHTEST_cloud=gce
  • ROACHTEST_coverageBuild=false
  • ROACHTEST_cpu=2
  • ROACHTEST_encrypted=false
  • ROACHTEST_fs=ext4
  • ROACHTEST_localSSD=false
  • ROACHTEST_metamorphicBuild=false
  • ROACHTEST_ssd=0
Help

See: roachtest README

See: How To Investigate (internal)

See: Grafana

Same failure on other branches

/cc @cockroachdb/kv-triage

This test on roachdash | Improve this report!

Jira issue: CRDB-38233

@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-kv KV Team labels Apr 28, 2024
@cockroach-teamcity cockroach-teamcity added this to the 24.1 milestone Apr 28, 2024
@miraradeva miraradeva added the T-storage Storage Team label Apr 29, 2024
@blathers-crl blathers-crl bot added the A-storage Relating to our storage engine (Pebble) on-disk storage. label Apr 29, 2024
@miraradeva
Contributor

Looks like node 9 crashed, and this is at the top of its stack trace:

panic: runtime error: index out of range [1390] with length 300

goroutine 135728 gp=0xc00dc9ac40 m=14 mp=0xc013d65008 [running]:
panic({0x60c44e0?, 0xc02c04af60?})
	GOROOT/src/runtime/panic.go:779 +0x158 fp=0xc00d696b00 sp=0xc00d696a50 pc=0x49bd98
runtime.goPanicIndex(0x56e, 0x12c)
	GOROOT/src/runtime/panic.go:114 +0x7c fp=0xc00d696b40 sp=0xc00d696b00 pc=0x49a6dc
github.com/cockroachdb/cockroach/pkg/storage/disk.(*monitorTracer).String(0xc0017962c0)
	github.com/cockroachdb/cockroach/pkg/storage/disk/monitor_tracer.go:140 +0x4d7 fp=0xc00d696f08 sp=0xc00d696b40 pc=0x2014e37
github.com/cockroachdb/cockroach/pkg/storage/disk.(*Monitor).LogTrace(...)
	github.com/cockroachdb/cockroach/pkg/storage/disk/monitor.go:275
github.com/cockroachdb/cockroach/pkg/storage.(*Pebble).makeMetricEtcEventListener.func4.1()
	github.com/cockroachdb/cockroach/pkg/storage/pebble.go:1460 +0x4f fp=0xc00d696fb0 sp=0xc00d696f08 pc=0x208006f
github.com/cockroachdb/cockroach/pkg/storage.(*Pebble).async.func1()
	github.com/cockroachdb/cockroach/pkg/storage/pebble.go:1378 +0x59 fp=0xc00d696fe0 sp=0xc00d696fb0 pc=0x207f4b9
runtime.goexit({})
	src/runtime/asm_amd64.s:1695 +0x1 fp=0xc00d696fe8 sp=0xc00d696fe0 pc=0x4d7e01
created by github.com/cockroachdb/cockroach/pkg/storage.(*Pebble).async in goroutine 135727
	github.com/cockroachdb/cockroach/pkg/storage/pebble.go:1376 +0x7c

Pinging @cockroachdb/storage to see if this looks familiar.

@andrewbaptist andrewbaptist added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Apr 29, 2024
@cockroach-teamcity
Member Author

roachtest.failover/chaos/read-write/lease=expiration failed with artifacts on master @ 2b5baff23f9b53da26f0fc1a763ca43384b464b0:

(failover.go:1766).sleepFor: sleep failed: context canceled
(monitor.go:154).Wait: monitor failure: monitor user task failed: t.Fatal() was called
(cluster.go:2351).Run: context canceled
(cluster.go:2351).Run: context canceled
(cluster.go:2351).Run: context canceled
(cluster.go:2351).Run: context canceled
test artifacts and logs in: /artifacts/failover/chaos/read-write/lease=expiration/run_1

Parameters:

  • ROACHTEST_arch=amd64
  • ROACHTEST_cloud=gce
  • ROACHTEST_coverageBuild=false
  • ROACHTEST_cpu=2
  • ROACHTEST_encrypted=false
  • ROACHTEST_fs=ext4
  • ROACHTEST_localSSD=false
  • ROACHTEST_metamorphicBuild=false
  • ROACHTEST_ssd=0
Help

See: roachtest README

See: How To Investigate (internal)

See: Grafana

Same failure on other branches

This test on roachdash | Improve this report!

@nicktrav
Collaborator

According to @andrewbaptist, this is likely a dup of #123021.

@nicktrav
Collaborator

cc: @RaduBerinde - as we have a patch up.

@RaduBerinde RaduBerinde self-assigned this Apr 29, 2024
@jbowens
Collaborator

jbowens commented Apr 29, 2024

Fixed by #123218

@jbowens jbowens closed this as completed Apr 29, 2024
@exalate-issue-sync exalate-issue-sync bot removed the T-kv KV Team label Apr 29, 2024
@jbowens jbowens moved this to Done in [Deprecated] Storage Jun 4, 2024
@github-project-automation github-project-automation bot moved this to roachtest/unit test backlog in KV Aug 28, 2024