roachtest: cdc/pubsub-sink failed #87899

Closed
cockroach-teamcity opened this issue Sep 13, 2022 · 1 comment · Fixed by #88130
Labels: branch-release-22.2, C-test-failure, GA-blocker, O-roachtest, O-robot, T-cdc

cockroach-teamcity commented Sep 13, 2022

roachtest.cdc/pubsub-sink failed with artifacts on release-22.2 @ 58d24a3e81a743a9a77bd4e8bf0bd27180eb3c04:

		  |     3.0s        0            3.0            1.0     83.9     88.1     88.1     88.1 orderStatus
		  |     3.0s        0            0.0            0.0      0.0      0.0      0.0      0.0 payment
		  |     3.0s        0            0.0            0.0      0.0      0.0      0.0      0.0 stockLevel
		  |     4.0s        0            0.0            0.2      0.0      0.0      0.0      0.0 delivery
		  |     4.0s        0            0.0            0.0      0.0      0.0      0.0      0.0 newOrder
		  |     4.0s        0            0.0            0.7      0.0      0.0      0.0      0.0 orderStatus
		  |     4.0s        0            3.0            0.7     19.9     27.3     27.3     27.3 payment
		  |     4.0s        0            0.0            0.0      0.0      0.0      0.0      0.0 stockLevel
		  | _elapsed___errors__ops/sec(inst)___ops/sec(cum)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
		  |     5.0s        0            0.0            0.2      0.0      0.0      0.0      0.0 delivery
		  |     5.0s        0            0.0            0.0      0.0      0.0      0.0      0.0 newOrder
		  |     5.0s        0            0.0            0.6      0.0      0.0      0.0      0.0 orderStatus
		  |     5.0s        0            0.0            0.6      0.0      0.0      0.0      0.0 payment
		  |     5.0s        0            0.0            0.0      0.0      0.0      0.0      0.0 stockLevel
		Wraps: (4) secondary error attachment
		  | UNCLASSIFIED_PROBLEM: context canceled
		  | (1) UNCLASSIFIED_PROBLEM
		  | Wraps: (2) Node 4. Command with error:
		  |   | ``````
		  |   | ./workload run tpcc --warehouses=1 --duration=30m  {pgurl:1-3}
		  |   | ``````
		  | Wraps: (3) context canceled
		  | Error types: (1) errors.Unclassified (2) *hintdetail.withDetail (3) *errors.errorString
		Wraps: (5) context canceled
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *cluster.WithCommandDetails (4) *secondary.withSecondaryError (5) *errors.errorString

	monitor.go:127,cdc.go:300,cdc.go:794,test_runner.go:908: monitor failure: monitor task failed: read tcp 172.17.0.3:53792 -> 34.138.195.47:26257: read: connection reset by peer
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitorImpl).WaitE
		  | 	main/pkg/cmd/roachtest/monitor.go:115
		  | main.(*monitorImpl).Wait
		  | 	main/pkg/cmd/roachtest/monitor.go:123
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.cdcBasicTest
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/cdc.go:300
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerCDC.func8
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/cdc.go:794
		  | [...repeated from below...]
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitorImpl).wait.func2
		  | 	main/pkg/cmd/roachtest/monitor.go:171
		  | runtime.goexit
		  | 	GOROOT/src/runtime/asm_amd64.s:1594
		Wraps: (4) monitor task failed
		Wraps: (5) read tcp 172.17.0.3:53792 -> 34.138.195.47:26257
		Wraps: (6) read
		Wraps: (7) connection reset by peer
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *net.OpError (6) *os.SyscallError (7) syscall.Errno

Parameters: ROACHTEST_cloud=gce , ROACHTEST_cpu=16 , ROACHTEST_ssd=0

Help

See: roachtest README

See: How To Investigate (internal)

Same failure on other branches

/cc @cockroachdb/cdc

This test on roachdash | Improve this report!

Jira issue: CRDB-19583

Epic CRDB-11732

cockroach-teamcity added the branch-release-22.2, C-test-failure, O-roachtest, O-robot, and release-blocker labels Sep 13, 2022
cockroach-teamcity added this to the 22.2 milestone Sep 13, 2022
blathers-crl bot added the T-cdc label Sep 13, 2022
@miretskiy

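Definitely an issue: a changefeed worker goroutine hit a fatal runtime error during a map assignment in `gcpPubsubClient.getTopicClient`: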

goroutine 7393 [running]:
runtime.fatal({0x513e354?, 0x47e54a0?})
        GOROOT/src/runtime/panic.go:1066 +0x5d fp=0xc006e53c70 sp=0xc006e53c40 pc=0x48dc9d
runtime.mapassign_faststr(0x495a120, 0xc005a271d0, {0xc0052aa6d8, 0x14})
        GOROOT/src/runtime/map_faststr.go:212 +0x56 fp=0xc006e53ce0 sp=0xc006e53c70 pc=0x467396
github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.(*gcpPubsubClient).getTopicClient(0xc006ee7880, {0xc0052aa6d8, 0x14})
        github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/sink_pubsub.go:360 +0x8e fp=0xc006e53d18 sp=0xc006e53ce0 pc=0x3c2cc0e
github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.(*gcpPubsubClient).sendMessage(0xc006ee7880, {0xc002a12fc0, 0x6c, 0x70}, {0xc0052aa6d8?, 0xc007171e01?}, {0xc0033609e0, 0xc})
        github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/sink_pubsub.go:533 +0x49 fp=0xc006e53d60 sp=0xc006e53d18 pc=0x3c2d8c9
github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.(*pubsubSink).workerLoop(0xc002393050, 0x3f)
        github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/sink_pubsub.go:415 +0x343 fp=0xc006e53f38 sp=0xc006e53d60 pc=0x3c2d243
github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.(*pubsubSink).setupWorkers.func1({0x1145526?, 0x1017780?})
        github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/sink_pubsub.go:381 +0x25 fp=0xc006e53f58 sp=0xc006e53f38 pc=0x3c2cee5
github.com/cockroachdb/cockroach/pkg/util/ctxgroup.Group.GoCtx.func1()
        github.com/cockroachdb/cockroach/pkg/util/ctxgroup/ctxgroup.go:168 +0x25 fp=0xc006e53f78 sp=0xc006e53f58 pc=0x1145c85
golang.org/x/sync/errgroup.(*Group).Go.func1()
        golang.org/x/sync/errgroup/external/org_golang_x_sync/errgroup/errgroup.go:74 +0x64 fp=0xc006e53fe0 sp=0xc006e53f78 pc=0x11453e4
runtime.goexit()
        GOROOT/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc006e53fe8 sp=0xc006e53fe0 pc=0x4c2361
created by golang.org/x/sync/errgroup.(*Group).Go
        golang.org/x/sync/errgroup/external/org_golang_x_sync/errgroup/errgroup.go:71 +0xa5
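
The trace above ends in `runtime.fatal` under `runtime.mapassign_faststr`, reached from `getTopicClient` on a worker goroutine started by `setupWorkers`. That is the shape of several workers lazily filling one shared map. Below is a minimal sketch of one common guard, a mutex held across both the lookup and the insert. The names `topicCache` and `getTopic` are hypothetical, and the `int` value is a placeholder for a real client handle. This is not the code in `sink_pubsub.go` and not necessarily how the eventual fix was written.

```go
package main

import (
	"fmt"
	"sync"
)

// topicCache is a hypothetical stand-in for a lazily populated map of
// per-topic clients that is shared by several worker goroutines.
type topicCache struct {
	mu      sync.Mutex
	topics  map[string]int // placeholder value type for a real client handle
	created int
}

// getTopic returns the cached entry for name, creating it on first use.
// Holding mu across the lookup and the insert means no two goroutines can
// write to the map at the same time.
func (c *topicCache) getTopic(name string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	if t, ok := c.topics[name]; ok {
		return t
	}
	c.created++
	c.topics[name] = c.created
	return c.created
}

func main() {
	c := &topicCache{topics: make(map[string]int)}
	var wg sync.WaitGroup
	// Eight workers race to populate the same four topics.
	for w := 0; w < 8; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				c.getTopic(fmt.Sprintf("topic-%d", i%4))
			}
		}()
	}
	wg.Wait()
	fmt.Println(len(c.topics))
}
```

Without the mutex, the same program can be stopped by the Go runtime with an unrecoverable fatal error, which takes down the whole process rather than a single goroutine.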

miretskiy added the GA-blocker label and removed the release-blocker label Sep 19, 2022
HonoreDB self-assigned this Sep 19, 2022
craig bot pushed a commit that referenced this issue Sep 19, 2022
88129: opt: copy ColSet in CreateLocalityOptimizedLookupJoinPrivateIncludingCols r=mgartner a=mgartner

`CreateLocalityOptimizedLookupJoinPrivateIncludingCols` was mutating a
`opt.ColSet` field of another `LookupJoinPrivate` because it was calling
`ColSet.UnionWith` without copying the `ColSet` first. This commit fixes
the bug.

Fixes #88126

Release note: None


88130: changefeedccl: avoid concurrent map access r=[miretskiy] a=HonoreDB

Go 1.18 introduced more stringent checks for unsafe concurrent map use, surfacing some new and exciting panics in changefeed code.

When backported, fixes #87939
When backported, fixes #88089
When backported, fixes #87899

Release note (bug fix): Fixed crashes in changefeed code when running on recent Go versions.

88134: kvserver: tweak a comment about raft snaps r=nvanbenschoten a=tbg

Suggested by Nathan[^1].

[^1]: #87702 (comment)

Release note: None


Co-authored-by: Marcus Gartner
Co-authored-by: Aaron Zinger
Co-authored-by: Tobias Grieger
@craig craig bot closed this as completed in 03926b3 Sep 19, 2022
HonoreDB added a commit to HonoreDB/cockroach that referenced this issue Sep 20, 2022
Go 1.18 introduced more stringent checks for unsafe concurrent map use, surfacing
some new and exciting panics in changefeed code.

When backported, fixes cockroachdb#87939
When backported, fixes cockroachdb#88089
When backported, fixes cockroachdb#87899

Release note (bug fix): Fixed crashes in changefeed code when running on recent Go versions.