ccl/streamingccl/streamingest: TestDataDriven failed #107930

Closed
cockroach-teamcity opened this issue Aug 1, 2023 · 2 comments · Fixed by #108401
Labels
branch-release-23.1: Used to mark GA and release blockers, technical advisories, and bugs for 23.1
C-test-failure: Broken test (automatically or manually discovered).
O-robot: Originated from a bot.
release-blocker: Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked.
T-disaster-recovery

Comments

@cockroach-teamcity
Member

cockroach-teamcity commented Aug 1, 2023

ccl/streamingccl/streamingest.TestDataDriven failed with artifacts on release-23.1 @ f6c68f6626497c43f2e5bef6f7e189b8792cfefb:

      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:461 +0x619
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask()
      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:332 +0xf4c
  github.com/cockroachdb/cockroach/pkg/server.(*channelOrchestrator).startControlledServer()
      github.com/cockroachdb/cockroach/pkg/server/server_controller_channel_orchestrator.go:292 +0x29
  github.com/cockroachdb/cockroach/pkg/server.(*serverController).createServerEntryLocked()
      github.com/cockroachdb/cockroach/pkg/server/server_controller_orchestration.go:173 +0x2b0
  github.com/cockroachdb/cockroach/pkg/server.(*serverController).scanTenantsForRunnableServices()
      github.com/cockroachdb/cockroach/pkg/server/server_controller_orchestration.go:134 +0x538
  github.com/cockroachdb/cockroach/pkg/server.(*serverController).start.func1()
      github.com/cockroachdb/cockroach/pkg/server/server_controller_orchestration.go:60 +0x21a
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 +0x1f6

Goroutine 147620 (running) created at:
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx()
      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:461 +0x619
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask()
      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:332 +0x404
  github.com/cockroachdb/cockroach/pkg/util/netutil.(*TCPServer).ServeWith()
      github.com/cockroachdb/cockroach/pkg/util/netutil/net.go:185 +0x36
  github.com/cockroachdb/cockroach/pkg/server.startServeSQL.func1()
      github.com/cockroachdb/cockroach/pkg/server/server_sql.go:1756 +0x17b
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 +0x1f6
==================
=== RUN   TestDataDriven/alter_tenant
    datadriven_test.go:102: 
        /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4301/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/streamingccl/streamingest/streamingest_test_/streamingest_test.runfiles/com_github_cockroachdb_cockroach/pkg/ccl/streamingccl/streamingest/testdata/alter_tenant:1:
        create-replication-clusters [0 args]
        <no input to command>
        ----
    datadriven_test.go:102: 
        /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4301/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/streamingccl/streamingest/streamingest_test_/streamingest_test.runfiles/com_github_cockroachdb_cockroach/pkg/ccl/streamingccl/streamingest/testdata/alter_tenant:4:
        start-replication-stream [0 args]
        <no input to command>
        ----
    datadriven_test.go:102: 
        /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4301/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/streamingccl/streamingest/streamingest_test_/streamingest_test.runfiles/com_github_cockroachdb_cockroach/pkg/ccl/streamingccl/streamingest/testdata/alter_tenant:7:
        exec-sql [1 args]
        ALTER TENANT "destination" SET REPLICATION RETENTION = '42s'
        ----
    datadriven_test.go:102: 
        /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4301/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/streamingccl/streamingest/streamingest_test_/streamingest_test.runfiles/com_github_cockroachdb_cockroach/pkg/ccl/streamingccl/streamingest/testdata/alter_tenant:11:
        query-sql [1 args]
        SELECT crdb_internal.pb_to_json('payload', payload)->'streamIngestion'->'replicationTtlSeconds' as retention_ttl_seconds
        FROM crdb_internal.system_jobs
        WHERE id = (SELECT replication_job_id FROM [SHOW TENANT "destination" WITH REPLICATION STATUS])
        ----
        42

Parameters: TAGS=bazel,gss,race

Help

See also: How To Investigate a Go Test Failure (internal)

/cc @cockroachdb/disaster-recovery

This test on roachdash | Improve this report!

Jira issue: CRDB-30261

cockroach-teamcity added branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. T-disaster-recovery labels Aug 1, 2023
cockroach-teamcity added this to the 23.1 milestone Aug 1, 2023
@stevendanna
Collaborator

@lidorcarmel I'm throwing this your way since you've thought about this a bit. Happy to look into it with you though.

(cc @knz for visibility)

@lidorcarmel
Contributor

See the stack below.
I'm thinking of just putting a mutex around `serverStateUsingChannels.server`; I'll send a PR tomorrow.

    ==================
07:39:31     WARNING: DATA RACE
07:39:31     Write at 0x00c003a18828 by goroutine 147262:
07:39:31       github.com/cockroachdb/cockroach/pkg/server.(*channelOrchestrator).startControlledServer.func5()
07:39:31           github.com/cockroachdb/cockroach/pkg/server/server_controller_channel_orchestrator.go:416 +0xa64
07:39:31       github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
07:39:31           github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 +0x1f6
07:39:31     
07:39:31     Previous read at 0x00c003a18828 by goroutine 147620:
07:39:31       github.com/cockroachdb/cockroach/pkg/server.(*serverStateUsingChannels).getServer()
07:39:31           github.com/cockroachdb/cockroach/pkg/server/server_controller_channel_orchestrator.go:109 +0x4e
07:39:31       github.com/cockroachdb/cockroach/pkg/server.(*serverController).getServer()
07:39:31           github.com/cockroachdb/cockroach/pkg/server/server_controller_accessors.go:28 +0x138
07:39:31       github.com/cockroachdb/cockroach/pkg/server.(*serverController).sqlMux()
07:39:31           github.com/cockroachdb/cockroach/pkg/server/server_controller_sql.go:67 +0x2ac
07:39:31       github.com/cockroachdb/cockroach/pkg/server.(*serverController).sqlMux-fm()
07:39:31           <autogenerated>:1 +0xc7
07:39:31       github.com/cockroachdb/cockroach/pkg/server.startServeSQL.func1.1()
07:39:31           github.com/cockroachdb/cockroach/pkg/server/server_sql.go:1766 +0x2eb
07:39:31       github.com/cockroachdb/cockroach/pkg/util/netutil.(*TCPServer).ServeWith.func1()
07:39:31           github.com/cockroachdb/cockroach/pkg/util/netutil/net.go:188 +0x111
07:39:31       github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
07:39:31           github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 +0x1f6
07:39:31     
07:39:31     Goroutine 147262 (running) created at:
07:39:31       github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx()
07:39:31           github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:461 +0x619
07:39:31       github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask()
07:39:31           github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:332 +0xf4c
07:39:31       github.com/cockroachdb/cockroach/pkg/server.(*channelOrchestrator).startControlledServer()
07:39:31           github.com/cockroachdb/cockroach/pkg/server/server_controller_channel_orchestrator.go:292 +0x29
07:39:31       github.com/cockroachdb/cockroach/pkg/server.(*serverController).createServerEntryLocked()
07:39:31           github.com/cockroachdb/cockroach/pkg/server/server_controller_orchestration.go:173 +0x2b0
07:39:31       github.com/cockroachdb/cockroach/pkg/server.(*serverController).scanTenantsForRunnableServices()
07:39:31           github.com/cockroachdb/cockroach/pkg/server/server_controller_orchestration.go:134 +0x538
07:39:31       github.com/cockroachdb/cockroach/pkg/server.(*serverController).start.func1()
07:39:31           github.com/cockroachdb/cockroach/pkg/server/server_controller_orchestration.go:60 +0x21a
07:39:31       github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
07:39:31           github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 +0x1f6
07:39:31     
07:39:31     Goroutine 147620 (running) created at:
07:39:31       github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx()
07:39:31           github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:461 +0x619
07:39:31       github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask()
07:39:31           github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:332 +0x404
07:39:31       github.com/cockroachdb/cockroach/pkg/util/netutil.(*TCPServer).ServeWith()
07:39:31           github.com/cockroachdb/cockroach/pkg/util/netutil/net.go:185 +0x36
07:39:31       github.com/cockroachdb/cockroach/pkg/server.startServeSQL.func1()
07:39:31           github.com/cockroachdb/cockroach/pkg/server/server_sql.go:1756 +0x17b
07:39:31       github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
07:39:31           github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 +0x1f6
07:39:31     ==================
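
For readers unfamiliar with the pattern, the mutex idea floated above could look roughly like the sketch below. This is only an illustration: `tenantServer` and `guardedState` are hypothetical stand-ins, not the actual CockroachDB types, and the real `serverStateUsingChannels` has more fields and different method signatures. The point is that the write in the server-starting goroutine and the read on the SQL connection path both go through the same lock, so the race detector no longer sees an unsynchronized access.

```go
package main

import (
	"fmt"
	"sync"
)

// tenantServer is a hypothetical stand-in for the real tenant server type.
type tenantServer struct{ name string }

// guardedState is a hypothetical, simplified analogue of a state struct
// whose server field is written by one goroutine and read by another.
type guardedState struct {
	mu     sync.Mutex
	server *tenantServer // protected by mu
}

// setServer publishes the server while holding the lock.
func (g *guardedState) setServer(s *tenantServer) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.server = s
}

// getServer reads the field under the same lock; ok is false until a
// server has been published.
func (g *guardedState) getServer() (s *tenantServer, ok bool) {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.server, g.server != nil
}

func main() {
	var g guardedState
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { // writer, playing the role of the goroutine that starts the server
		defer wg.Done()
		g.setServer(&tenantServer{name: "tenant"})
	}()
	go func() { // reader, playing the role of the SQL connection path
		defer wg.Done()
		_, _ = g.getServer()
	}()
	wg.Wait()

	s, ok := g.getServer()
	fmt.Println(ok, s.name)
}
```

Running a program like this with `go run -race` should not report a race, because every access to `server` holds `mu`.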

lidorcarmel added the release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. label Aug 7, 2023
lidorcarmel added a commit to lidorcarmel/cockroach that referenced this issue Aug 8, 2023
Avoid returning `server` when it's not ready - we shouldn't read it when
`started` is false (that's a race).

Testing: this test (sometimes) fails without this pr and succeeds with it:
`./dev test pkg/ccl/streamingccl/streamingest:streamingest_test -f TestDataDriven --race -- --runs_per_test=10`

Epic: none
Informs: cockroachdb#107930

Release note: None
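
The approach described in this commit message (do not hand out `server` until `started` is true) can be sketched as follows. This is a minimal illustration under assumptions, not the code from the PR: `tenantServer`, `serverState`, `markStarted`, and `errNotReady` are hypothetical names, and the sketch uses `sync/atomic`'s `atomic.Bool` (Go 1.19+) where the real code may use a different flag type. The writer stores `server` first and sets the flag afterwards; a reader that observes the flag as true is therefore ordered after the write to `server`, and a reader that observes false never touches the field.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// tenantServer is a hypothetical stand-in for the real tenant server type.
type tenantServer struct{ name string }

// errNotReady is a hypothetical error returned while the server is starting.
var errNotReady = errors.New("server not ready")

// serverState is a hypothetical, simplified analogue of the state struct.
type serverState struct {
	started atomic.Bool   // set to true only after server has been written
	server  *tenantServer // must not be read while started is false
}

// markStarted writes the server and then flips the flag. It is assumed to be
// called at most once, by the goroutine that started the server.
func (s *serverState) markStarted(srv *tenantServer) {
	s.server = srv
	s.started.Store(true)
}

// getServer refuses to read server until the flag has been observed as true.
func (s *serverState) getServer() (*tenantServer, error) {
	if !s.started.Load() {
		return nil, errNotReady
	}
	return s.server, nil
}

func main() {
	var st serverState

	if _, err := st.getServer(); err != nil {
		fmt.Println("before start:", err)
	}

	done := make(chan struct{})
	go func() { // plays the role of the goroutine that starts the server
		defer close(done)
		st.markStarted(&tenantServer{name: "tenant"})
	}()
	<-done

	srv, err := st.getServer()
	fmt.Println("after start:", srv.name, err)
}
```

Compared with a mutex, this keeps the read path lock-free, at the cost of requiring callers to handle the not-ready case.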
craig bot closed this as completed in 01a6aab Aug 9, 2023