roachtest: import/tpcc/warehouses=4000/geo failed #82982

Closed
cockroach-teamcity opened this issue Jun 16, 2022 · 2 comments
Assignees
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. T-disaster-recovery

Comments

@cockroach-teamcity
Member

cockroach-teamcity commented Jun 16, 2022

roachtest.import/tpcc/warehouses=4000/geo failed with artifacts on master @ 47df1bc3a28f705e56ff4efc3d81c6fc90e096b5:

		  | I220616 10:25:49.753177 1 ccl/workloadccl/fixture.go:326  [-] 11  imported 274 GiB bytes in 9 tables (took 28m19.861450004s, 165.06 MiB/s)
		  | I220616 10:27:26.956585 1 ccl/workloadccl/cliccl/fixtures.go:343  [-] 12  fixture is restored; now running consistency checks (ctrl-c to abort)
		  | I220616 10:27:27.561148 1 workload/tpcc/tpcc.go:517  [-] 13  check 3.3.2.1 took 603.840582ms
		  | I220616 10:28:04.104640 1 workload/tpcc/tpcc.go:517  [-] 14  check 3.3.2.2 took 36.543423865s
		  | I220616 10:28:09.737041 1 workload/tpcc/tpcc.go:517  [-] 15  check 3.3.2.3 took 5.63230075s
		  | I220616 10:35:48.371892 1 workload/tpcc/tpcc.go:517  [-] 16  check 3.3.2.4 took 7m38.634742241s
		  | I220616 10:37:46.174922 1 workload/tpcc/tpcc.go:517  [-] 17  check 3.3.2.5 took 1m57.80241359s
		  | I220616 10:39:31.771626 1 workload/tpcc/tpcc.go:517  [-] 18  check 3.3.2.7 took 1m45.596606901s
		  | Error: check failed: 3.3.2.7: pq: inbox communication error: rpc error: code = Canceled desc = context canceled
		  |
		  | stdout:
		Wraps: (4) COMMAND_PROBLEM
		Wraps: (5) Node 1. Command with error:
		  | ``````
		  | ./cockroach workload fixtures import tpcc --warehouses=4000 --csv-server='http://localhost:8081'
		  | ``````
		Wraps: (6) exit status 1
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *cluster.WithCommandDetails (4) errors.Cmd (5) *hintdetail.withDetail (6) *exec.ExitError

	monitor.go:127,import.go:154,import.go:181,test_runner.go:884: monitor failure: monitor task failed: t.Fatal() was called
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitorImpl).WaitE
		  | 	main/pkg/cmd/roachtest/monitor.go:115
		  | main.(*monitorImpl).Wait
		  | 	main/pkg/cmd/roachtest/monitor.go:123
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerImportTPCC.func1
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/import.go:154
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerImportTPCC.func3
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/import.go:181
		  | main.(*testRunner).runTest.func2
		  | 	main/pkg/cmd/roachtest/test_runner.go:884
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitorImpl).wait.func2
		  | 	main/pkg/cmd/roachtest/monitor.go:171
		Wraps: (4) monitor task failed
		Wraps: (5) attached stack trace
		  -- stack trace:
		  | main.init
		  | 	main/pkg/cmd/roachtest/monitor.go:80
		  | runtime.doInit
		  | 	GOROOT/src/runtime/proc.go:6498
		  | runtime.main
		  | 	GOROOT/src/runtime/proc.go:238
		  | runtime.goexit
		  | 	GOROOT/src/runtime/asm_amd64.s:1581
		Wraps: (6) t.Fatal() was called
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError

Parameters: ROACHTEST_cloud=gce , ROACHTEST_cpu=16 , ROACHTEST_ssd=0

Help

See: roachtest README

See: How To Investigate (internal)

Same failure on other branches

/cc @cockroachdb/bulk-io

This test on roachdash | Improve this report!

Jira issue: CRDB-16772

cockroach-teamcity added the branch-master, C-test-failure, O-roachtest, O-robot, and release-blocker labels on Jun 16, 2022
@msbutler
Collaborator

msbutler commented Jun 21, 2022

The tpcc workload check3326() failed with an unexplained error: pq: inbox communication error: rpc error: code = Canceled desc = context canceled. I couldn't find 'inbox communication error' in any of the cockroach logs, nor any evidence of a graceful node shutdown (one known cause of this message), so I'm not sure what the root cause is.

Given that the import completed (the test failed after workloadsql.Setup completed in fixturesImport()) and that the consistency check didn't fail due to data corruption, I'm removing the release blocker.
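
For anyone repeating the log search described above, here is a minimal sketch of how it could be scripted. It is not part of roachtest; the function name `find_in_logs`, the `artifacts` directory in the usage note, and the assumption that log files end in `.log` are all hypothetical and should be adjusted to the actual artifacts layout.

```python
import os


def find_in_logs(root, needle):
    """Return (path, line_number, line) for every log line containing needle.

    Assumption: log files live somewhere under `root` and end in ".log".
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".log"):
                continue  # skip non-log artifacts
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going if a log has stray bytes
            with open(path, encoding="utf-8", errors="replace") as f:
                for lineno, line in enumerate(f, start=1):
                    if needle in line:
                        hits.append((path, lineno, line.rstrip("\n")))
    return hits
```

Calling `find_in_logs("artifacts", "inbox communication error")` on a downloaded artifacts directory would return an empty list if the message never reached the node logs, which matches what was observed here.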

msbutler removed the release-blocker label on Jun 21, 2022
msbutler self-assigned this on Jun 21, 2022
@msbutler
Collaborator

Closing, as it's likely a rare test infra flake.
