kv/kvnemesis: TestValidate failed #110765

Closed
cockroach-teamcity opened this issue Sep 16, 2023 · 7 comments · Fixed by #110863
Assignees
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-storage Storage Team
Milestone

Comments

@cockroach-teamcity
Member

cockroach-teamcity commented Sep 16, 2023

kv/kvnemesis.TestValidate failed with artifacts on master @ 985662236d7bf273b93a7b5e32def8e2d1043640:

Fatal error:

panic: decoding ffffffffffffff: value type is not BYTES: 255 [recovered]
	panic: decoding ffffffffffffff: value type is not BYTES: 255

Stack:

goroutine 3443 [running]:
testing.tRunner.func1.2({0x77a59e0, 0xc005a07e78})
	GOROOT/src/testing/testing.go:1526 +0x372
testing.tRunner.func1()
	GOROOT/src/testing/testing.go:1529 +0x650
panic({0x77a59e0, 0xc005a07e78})
	GOROOT/src/runtime/panic.go:890 +0x263
github.com/cockroachdb/cockroach/pkg/kv/kvnemesis.mustGetStringValue({0xc000c2cfe2, 0x7, 0x7})
	github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/validator.go:1313 +0x2a5
github.com/cockroachdb/cockroach/pkg/kv/kvnemesis.validReadTimes(0xc00244d680?, {0xc00574fb18, 0x14, 0x18}, {0xc005d37c89, 0x7, 0x7}, 0x0)
	github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/validator.go:1406 +0x1285
github.com/cockroachdb/cockroach/pkg/kv/kvnemesis.(*validator).checkAtomicCommitted(0xc004f7aa00, {0xc0046a9901, 0xf}, {0xc0008f9fc0, 0x3, 0x4}, {0x3, 0x0, 0x0})
	github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/validator.go:1048 +0x1009
github.com/cockroachdb/cockroach/pkg/kv/kvnemesis.(*validator).checkAtomic(0xc004f7aa00, {0x7b63dcf, 0x5}, {0x1, 0x0, {0x0, 0x0, 0x0}, {0x0, 0x0, ...}, ...})
	github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/validator.go:821 +0x329
github.com/cockroachdb/cockroach/pkg/kv/kvnemesis.(*validator).processOp(0xc004f7aa00, {0xc004498100, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...})
	github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/validator.go:695 +0x53d8
github.com/cockroachdb/cockroach/pkg/kv/kvnemesis.Validate({0xc002325180?, 0x5, 0x5}, 0x4?, 0x4?)
	github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/validator.go:73 +0x265
github.com/cockroachdb/cockroach/pkg/kv/kvnemesis.TestValidate.func9(0xc000f184e0?)
	github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/validator_test.go:2081 +0x910
github.com/cockroachdb/cockroach/pkg/testutils/echotest.(*Walker).Run.func2(0xc000f184e0)
	github.com/cockroachdb/cockroach/pkg/testutils/echotest/echotest.go:170 +0x59
testing.tRunner(0xc000f184e0, 0xc00251d260)
	GOROOT/src/testing/testing.go:1576 +0x217
created by testing.(*T).Run
	GOROOT/src/testing/testing.go:1629 +0x806
Log preceding fatal error

=== RUN   TestValidate
    test_log_scope.go:167: test logs captured to: /artifacts/tmp/_tmp/1f42cf5be2fc021646bf9b2daf5eaef3/logTestValidate209740005
    test_log_scope.go:81: use -show-logs to present logs inline
--- FAIL: TestValidate (0.19s)
=== RUN   TestValidate/batch_of_reads_after_writes_and_deletes
    --- FAIL: TestValidate/batch_of_reads_after_writes_and_deletes (0.00s)

Parameters: TAGS=bazel,gss , stress=true

Help

See also: How To Investigate a Go Test Failure (internal)

/cc @cockroachdb/kv

This test on roachdash | Improve this report!

Jira issue: CRDB-31598

@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-kv KV Team labels Sep 16, 2023
@cockroach-teamcity cockroach-teamcity added this to the 23.2 milestone Sep 16, 2023
@cockroach-teamcity
Member Author

kv/kvnemesis.TestValidate failed with artifacts on master @ 6cbd07ee6fbfb92706e8cdc8c559960b1bc41663:

created by github.com/cockroachdb/cockroach/pkg/util/log.init.5
	github.com/cockroachdb/cockroach/pkg/util/log/log_flush.go:84 +0x36

goroutine 50 [syscall]:
os/signal.signal_recv()
	GOROOT/src/runtime/sigqueue.go:152 +0x2f
os/signal.loop()
	GOROOT/src/os/signal/signal_unix.go:23 +0x25
created by os/signal.Notify.func1.1
	GOROOT/src/os/signal/signal.go:151 +0x51

goroutine 10 [chan receive]:
github.com/cockroachdb/cockroach/pkg/util/goschedstats.init.0.func1()
	github.com/cockroachdb/cockroach/pkg/util/goschedstats/runnable.go:165 +0x13b
created by github.com/cockroachdb/cockroach/pkg/util/goschedstats.init.0
	github.com/cockroachdb/cockroach/pkg/util/goschedstats/runnable.go:157 +0x2a

goroutine 2543 [chan receive]:
testing.(*T).Run(0xc004352ea0, {0x7ba4886, 0x15}, 0xc000787b00)
	GOROOT/src/testing/testing.go:1630 +0x82e
github.com/cockroachdb/cockroach/pkg/kv/kvnemesis.TestValidate(0xc004352ea0)
	github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/validator_test.go:2043 +0x7e218
testing.tRunner(0xc004352ea0, 0x7ec4508)
	GOROOT/src/testing/testing.go:1576 +0x217
created by testing.(*T).Run
	GOROOT/src/testing/testing.go:1629 +0x806

goroutine 2544 [select]:
github.com/cockroachdb/cockroach/pkg/util/log.(*fileSink).gcDaemon(0xc0022b21c0, {0xc0415e8, 0xc005306550})
	github.com/cockroachdb/cockroach/pkg/util/log/file_log_gc.go:25 +0xe6
created by github.com/cockroachdb/cockroach/pkg/util/log.ApplyConfig
	github.com/cockroachdb/cockroach/pkg/util/log/flags.go:319 +0x1b94

goroutine 2578 [select]:
github.com/cockroachdb/cockroach/pkg/util/log.(*fileSink).gcDaemon(0xc0022b2380, {0xc0415e8, 0xc005306550})
	github.com/cockroachdb/cockroach/pkg/util/log/file_log_gc.go:25 +0xe6
created by github.com/cockroachdb/cockroach/pkg/util/log.ApplyConfig
	github.com/cockroachdb/cockroach/pkg/util/log/flags.go:319 +0x1b94

goroutine 2579 [select]:
github.com/cockroachdb/cockroach/pkg/util/log.(*fileSink).gcDaemon(0xc0022b2540, {0xc0415e8, 0xc005306550})
	github.com/cockroachdb/cockroach/pkg/util/log/file_log_gc.go:25 +0xe6
created by github.com/cockroachdb/cockroach/pkg/util/log.ApplyConfig
	github.com/cockroachdb/cockroach/pkg/util/log/flags.go:319 +0x1b94

goroutine 2545 [select]:
github.com/cockroachdb/cockroach/pkg/util/log.(*fileSink).gcDaemon(0xc0022b22a0, {0xc0415e8, 0xc005306550})
	github.com/cockroachdb/cockroach/pkg/util/log/file_log_gc.go:25 +0xe6
created by github.com/cockroachdb/cockroach/pkg/util/log.ApplyConfig
	github.com/cockroachdb/cockroach/pkg/util/log/flags.go:319 +0x1b94

Parameters: TAGS=bazel,gss , stress=true

Help

See also: How To Investigate a Go Test Failure (internal)

This test on roachdash | Improve this report!

@cockroach-teamcity
Member Author

kv/kvnemesis.TestValidate failed with artifacts on master @ 6cbd07ee6fbfb92706e8cdc8c559960b1bc41663:

created by github.com/cockroachdb/cockroach/pkg/util/log.init.5
	github.com/cockroachdb/cockroach/pkg/util/log/log_flush.go:84 +0x36

goroutine 19 [syscall]:
os/signal.signal_recv()
	GOROOT/src/runtime/sigqueue.go:152 +0x2f
os/signal.loop()
	GOROOT/src/os/signal/signal_unix.go:23 +0x25
created by os/signal.Notify.func1.1
	GOROOT/src/os/signal/signal.go:151 +0x51

goroutine 8 [chan receive]:
github.com/cockroachdb/cockroach/pkg/util/goschedstats.init.0.func1()
	github.com/cockroachdb/cockroach/pkg/util/goschedstats/runnable.go:165 +0x13b
created by github.com/cockroachdb/cockroach/pkg/util/goschedstats.init.0
	github.com/cockroachdb/cockroach/pkg/util/goschedstats/runnable.go:157 +0x2a

goroutine 2644 [chan receive]:
testing.(*T).Run(0xc004e631e0, {0x7c2cb72, 0x2a}, 0xc005464a80)
	GOROOT/src/testing/testing.go:1630 +0x82e
github.com/cockroachdb/cockroach/pkg/kv/kvnemesis.TestValidate(0xc004e631e0)
	github.com/cockroachdb/cockroach/pkg/kv/kvnemesis/pkg/kv/kvnemesis/validator_test.go:2043 +0x7e218
testing.tRunner(0xc004e631e0, 0x7ec4508)
	GOROOT/src/testing/testing.go:1576 +0x217
created by testing.(*T).Run
	GOROOT/src/testing/testing.go:1629 +0x806

goroutine 2647 [select]:
github.com/cockroachdb/cockroach/pkg/util/log.(*fileSink).gcDaemon(0xc002a46700, {0xc0415e8, 0xc003488000})
	github.com/cockroachdb/cockroach/pkg/util/log/file_log_gc.go:25 +0xe6
created by github.com/cockroachdb/cockroach/pkg/util/log.ApplyConfig
	github.com/cockroachdb/cockroach/pkg/util/log/flags.go:319 +0x1b94

goroutine 2648 [select]:
github.com/cockroachdb/cockroach/pkg/util/log.(*fileSink).gcDaemon(0xc002a467e0, {0xc0415e8, 0xc003488000})
	github.com/cockroachdb/cockroach/pkg/util/log/file_log_gc.go:25 +0xe6
created by github.com/cockroachdb/cockroach/pkg/util/log.ApplyConfig
	github.com/cockroachdb/cockroach/pkg/util/log/flags.go:319 +0x1b94

goroutine 2645 [select]:
github.com/cockroachdb/cockroach/pkg/util/log.(*fileSink).gcDaemon(0xc002a460e0, {0xc0415e8, 0xc003488000})
	github.com/cockroachdb/cockroach/pkg/util/log/file_log_gc.go:25 +0xe6
created by github.com/cockroachdb/cockroach/pkg/util/log.ApplyConfig
	github.com/cockroachdb/cockroach/pkg/util/log/flags.go:319 +0x1b94

goroutine 2646 [select]:
github.com/cockroachdb/cockroach/pkg/util/log.(*fileSink).gcDaemon(0xc002a46620, {0xc0415e8, 0xc003488000})
	github.com/cockroachdb/cockroach/pkg/util/log/file_log_gc.go:25 +0xe6
created by github.com/cockroachdb/cockroach/pkg/util/log.ApplyConfig
	github.com/cockroachdb/cockroach/pkg/util/log/flags.go:319 +0x1b94

Parameters: TAGS=bazel,gss , stress=true

Help

See also: How To Investigate a Go Test Failure (internal)

This test on roachdash | Improve this report!

@nvanbenschoten
Member

Bisected to 6dc61d5.

@nvanbenschoten nvanbenschoten removed the T-kv KV Team label Sep 18, 2023
@blathers-crl blathers-crl bot added the T-storage Storage Team label Sep 18, 2023
@bananabrick bananabrick self-assigned this Sep 18, 2023
@jbowens
Collaborator

jbowens commented Sep 18, 2023

I'm confused; do Cockroach tests build Pebble with the invariants build tag?

@erikgrinaker
Contributor

I think the invariants checks are also enabled under race? Doesn't look like this was a race build though.

@bananabrick
Contributor

bananabrick commented Sep 18, 2023

I can reproduce this with a race build using the following command:
dev test --stress --race pkg/kv/kvnemesis -f TestValidate/batch_of_reads_after_writes_and_deletes

I wasn't able to reproduce it without the --race flag. Thankfully, with --race it fails easily even without the --stress flag.

Bisected to the following Pebble commit: cockroachdb/pebble@529d256

Edit:
I was able to narrow it down to the MaybeWrap call at https://github.com/cockroachdb/pebble/blob/master/db.go#L1458. I removed the rest of the calls and will debug from here.

@bananabrick
Contributor

I think the problem might be here: https://github.com/cockroachdb/cockroach/blob/master/pkg/kv/kvnemesis/validator.go#L1385.

The tests pass if I copy the valB slice. I don't think ValueAndErr guarantees that the contents of the returned slice stay unchanged as iteration continues.
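
To illustrate the suspected failure mode, here is a minimal, self-contained sketch. It does not use Pebble or kvnemesis code: `reusingIter`, `collectStale`, and `collectCopied` are hypothetical names, and the choice to overwrite the old buffer with 0xff is an assumption made only to mimic the all-0xff value (`ffffffffffffff`) seen in the panic above. It shows why a slice retained across `Next` calls can change underneath the caller, and why copying it first avoids that.

```go
package main

import "fmt"

// reusingIter is a hypothetical iterator (NOT Pebble's API). Values it
// returns are only valid until the next call to Next: on each Next it
// overwrites the previously returned buffer with 0xff so that callers
// holding on to a stale slice fail loudly.
type reusingIter struct {
	vals [][]byte
	pos  int
	buf  []byte
}

func newReusingIter(vals ...string) *reusingIter {
	it := &reusingIter{pos: -1}
	for _, v := range vals {
		it.vals = append(it.vals, []byte(v))
	}
	return it
}

// Next invalidates the previously returned value, then loads the next one.
func (it *reusingIter) Next() bool {
	for i := range it.buf {
		it.buf[i] = 0xff
	}
	it.pos++
	if it.pos >= len(it.vals) {
		return false
	}
	it.buf = append([]byte(nil), it.vals[it.pos]...)
	return true
}

// Value returns a slice that aliases the iterator's internal buffer.
func (it *reusingIter) Value() []byte { return it.buf }

// collectStale keeps references to the iterator's buffers (the bug pattern).
func collectStale(it *reusingIter) [][]byte {
	var out [][]byte
	for it.Next() {
		out = append(out, it.Value())
	}
	return out
}

// collectCopied copies each value before retaining it (the fix pattern).
func collectCopied(it *reusingIter) [][]byte {
	var out [][]byte
	for it.Next() {
		out = append(out, append([]byte(nil), it.Value()...))
	}
	return out
}

func main() {
	fmt.Printf("stale:  %x\n", collectStale(newReusingIter("v1", "v2")))
	fmt.Printf("copied: %q\n", collectCopied(newReusingIter("v1", "v2")))
}
```

In this sketch the retained slices read back as 0xff bytes once iteration has moved on, while the copied slices keep their original contents.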

bananabrick added a commit to cockroachdb/pebble that referenced this issue Sep 18, 2023
craig bot pushed a commit that referenced this issue Sep 19, 2023
110693: sql: wrap each planNode into DistSQL independently when collecting stats r=yuzefovich a=yuzefovich

This commit adjusts the DistSQL physical planner to create a separate pair of `planNodeToRowSource` and `rowSourceToPlanNode` for each `planNode` included in the DistSQL flow whenever execution statistics are collected. This allows us to collect exec stats for each plan node (rather than seeing the execution time of the whole chain of `planNode`s and the number of output rows of only the first `planNode` to be wrapped). This should have negligible overhead. The only exception, where this wrapping is disabled, is when the planNode implements the `batchedPlanNode` interface, since wrapping those types of planNodes breaks some assumptions.

This was useful in a recent query latency investigation where multiple virtual table lookup joins (powered by the corresponding planNodes) were taking the vast majority of the query execution time, but since all of them were hidden behind a single pair of DistSQL adapters, it wasn't clear which particular vtable lookup join was the bottleneck.

Epic: None

Release note: None

110863: kv/kvnemesis: copy value before holding a reference r=bananabrick a=bananabrick

Epic: none
Fixes: #110765

Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Arjun Nair <[email protected]>
@craig craig bot closed this as completed in 78a405a Sep 19, 2023
@jbowens jbowens moved this to Done in [Deprecated] Storage Jun 4, 2024