roachtest: tpcc/mixed-headroom/n5cpu16 failed #87781

Closed
cockroach-teamcity opened this issue Sep 10, 2022 · 3 comments
Labels: branch-release-22.2 (used to mark GA and release blockers, technical advisories, and bugs for 22.2); C-test-failure (broken test, automatically or manually discovered); O-roachtest; O-robot (originated from a bot)
Milestone: 22.2

Comments

@cockroach-teamcity
Member

cockroach-teamcity commented Sep 10, 2022

roachtest.tpcc/mixed-headroom/n5cpu16 failed with artifacts on release-22.2 @ 1bfe9bcda653f55ed3b4216610433b51b2ef0d8f:

		  | 	main/pkg/cmd/roachtest/monitor.go:123
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runTPCC
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpcc.go:256
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerTPCC.func2.1
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpcc.go:376
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.(*backgroundStepper).launch.func1
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/mixed_version_jobs.go:61
		  | main.(*monitorImpl).Go.func1
		  | 	main/pkg/cmd/roachtest/monitor.go:105
		  | golang.org/x/sync/errgroup.(*Group).Go.func1
		  | 	golang.org/x/sync/errgroup/external/org_golang_x_sync/errgroup/errgroup.go:74
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitorImpl).wait.func2
		  | 	main/pkg/cmd/roachtest/monitor.go:171
		Wraps: (4) monitor task failed
		Wraps: (5) attached stack trace
		  -- stack trace:
		  | main.(*clusterImpl).RunE
		  | 	main/pkg/cmd/roachtest/cluster.go:1981
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runTPCC.func1
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpcc.go:246
		  | main.(*monitorImpl).Go.func1
		  | 	main/pkg/cmd/roachtest/monitor.go:105
		  | golang.org/x/sync/errgroup.(*Group).Go.func1
		  | 	golang.org/x/sync/errgroup/external/org_golang_x_sync/errgroup/errgroup.go:74
		  | runtime.goexit
		  | 	GOROOT/src/runtime/asm_amd64.s:1594
		Wraps: (6) output in run_163303.545810466_n5_cockroach_workload_run_tpcc
		Wraps: (7) ./cockroach workload run tpcc --warehouses=909 --histograms=perf/stats.json  --ramp=5m0s --duration=2h0m0s --prometheus-port=0 --pprofport=33333  {pgurl:1-4} returned
		  | stderr:
		  | I220910 16:33:05.252635 1 workload/cli/run.go:427  [-] 1  creating load generator...
		  | I220910 16:33:05.445322 1 workload/cli/run.go:458  [-] 2  creating load generator... done (took 192.687289ms)
		  |
		  | stdout:
		  | Initializing 1818 connections...
		  | Initializing 0 idle connections...
		  | Initializing 9090 workers and preparing statements...
		Wraps: (8) secondary error attachment
		  | UNCLASSIFIED_PROBLEM: context canceled
		  | (1) UNCLASSIFIED_PROBLEM
		  | Wraps: (2) Node 5. Command with error:
		  |   | ``````
		  |   | ./cockroach workload run tpcc --warehouses=909 --histograms=perf/stats.json  --ramp=5m0s --duration=2h0m0s --prometheus-port=0 --pprofport=33333  {pgurl:1-4}
		  |   | ``````
		  | Wraps: (3) context canceled
		  | Error types: (1) errors.Unclassified (2) *hintdetail.withDetail (3) *errors.errorString
		Wraps: (9) context canceled
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.withPrefix (7) *cluster.WithCommandDetails (8) *secondary.withSecondaryError (9) *errors.errorString

Parameters: ROACHTEST_cloud=gce , ROACHTEST_cpu=16 , ROACHTEST_ssd=0
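
The workload output above is consistent with the `--warehouses=909` flag: 1818 connections and 9090 workers work out to 2 connections and 10 workers per warehouse. A minimal Python sketch of that check follows. The per-warehouse ratios are inferred from this log only, and the constant and function names are hypothetical, not taken from the workload source.

```python
# Numbers copied from the failure report above. The ratios derived here are
# inferred from this single log, not read from the workload implementation.
WAREHOUSES = 909      # --warehouses flag
CONNECTIONS = 1818    # "Initializing 1818 connections..."
WORKERS = 9090        # "Initializing 9090 workers and preparing statements..."


def per_warehouse(total: int, warehouses: int) -> int:
    """Return total // warehouses, raising if the division is not exact."""
    quotient, remainder = divmod(total, warehouses)
    if remainder:
        raise ValueError(f"{total} is not a multiple of {warehouses}")
    return quotient


conns_per_warehouse = per_warehouse(CONNECTIONS, WAREHOUSES)
workers_per_warehouse = per_warehouse(WORKERS, WAREHOUSES)
print(f"connections per warehouse: {conns_per_warehouse}")
print(f"workers per warehouse: {workers_per_warehouse}")
```

This only confirms that the generator finished its setup with the expected sizing before the context was canceled; it says nothing about why the run was canceled.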

/cc @cockroachdb/test-eng

Jira issue: CRDB-19533

@cockroach-teamcity added the branch-release-22.2, C-test-failure, O-roachtest, O-robot, and release-blocker (indicates a release-blocker; use with a branch-release-2x.x label to denote which branch is blocked) labels on Sep 10, 2022
@cockroach-teamcity added this to the 22.2 milestone on Sep 10, 2022
@cockroach-teamcity
Member Author

roachtest.tpcc/mixed-headroom/n5cpu16 failed with artifacts on release-22.2 @ 21e14668fc212614785cb3739378378a0aac4539:

test artifacts and logs in: /artifacts/tpcc/mixed-headroom/n5cpu16/run_1
	monitor.go:127,versionupgrade.go:715,versionupgrade.go:197,tpcc.go:432,test_runner.go:908: monitor failure: monitor task failed: output in run_141949.688081276_n1_v2216cockroach_workload_fixtures_import_bank: v22.1.6/cockroach workload fixtures import bank --payload-bytes=10240 --rows=32552083 --seed=4 --db=bigbank returned: COMMAND_PROBLEM: exit status 1
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitorImpl).WaitE
		  | 	main/pkg/cmd/roachtest/monitor.go:115
		  | main.(*monitorImpl).Wait
		  | 	main/pkg/cmd/roachtest/monitor.go:123
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.importLargeBankStep.func1
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/versionupgrade.go:715
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.(*versionUpgradeTest).run
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/versionupgrade.go:197
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerTPCC.func2
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpcc.go:432
		  | main.(*testRunner).runTest.func2
		  | 	main/pkg/cmd/roachtest/test_runner.go:908
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitorImpl).wait.func2
		  | 	main/pkg/cmd/roachtest/monitor.go:171
		Wraps: (4) monitor task failed
		Wraps: (5) attached stack trace
		  -- stack trace:
		  | main.(*clusterImpl).RunE
		  | 	main/pkg/cmd/roachtest/cluster.go:1981
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.importLargeBankStep.func1.1
		  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/versionupgrade.go:712
		  | main.(*monitorImpl).Go.func1
		  | 	main/pkg/cmd/roachtest/monitor.go:105
		  | golang.org/x/sync/errgroup.(*Group).Go.func1
		  | 	golang.org/x/sync/errgroup/external/org_golang_x_sync/errgroup/errgroup.go:74
		  | runtime.goexit
		  | 	GOROOT/src/runtime/asm_amd64.s:1594
		Wraps: (6) output in run_141949.688081276_n1_v2216cockroach_workload_fixtures_import_bank
		Wraps: (7) v22.1.6/cockroach workload fixtures import bank --payload-bytes=10240 --rows=32552083 --seed=4 --db=bigbank returned
		  | stderr:
		  | I220915 14:19:51.050447 1 ccl/workloadccl/fixture.go:318  [-] 1  starting import of 1 tables
		  | Error: importing fixture: importing table bank: dial tcp 127.0.0.1:26257: connect: connection refused
		  |
		  | stdout:
		Wraps: (8) COMMAND_PROBLEM
		Wraps: (9) Node 1. Command with error:
		  | ``````
		  | v22.1.6/cockroach workload fixtures import bank --payload-bytes=10240 --rows=32552083 --seed=4 --db=bigbank
		  | ``````
		Wraps: (10) exit status 1
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.withPrefix (7) *cluster.WithCommandDetails (8) errors.Cmd (9) *hintdetail.withDetail (10) *exec.ExitError

Parameters: ROACHTEST_cloud=gce , ROACHTEST_cpu=16 , ROACHTEST_ssd=0
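
For a sense of scale, the `--rows` and `--payload-bytes` flags of the failed import command can be multiplied to estimate the raw payload volume. The Python sketch below assumes each row carries exactly `--payload-bytes` of payload and ignores keys, encoding overhead, compression, and replication, so it is a rough estimate rather than the on-disk size.

```python
# Flag values copied from the failed command:
#   cockroach workload fixtures import bank --payload-bytes=10240 --rows=32552083
ROWS = 32_552_083
PAYLOAD_BYTES = 10_240  # assumed to be the payload size of every row

# Raw payload only: no keys, encoding overhead, compression, or replication.
total_payload_bytes = ROWS * PAYLOAD_BYTES
total_payload_gib = total_payload_bytes / 2**30

print(f"raw payload: {total_payload_bytes} bytes")
print(f"raw payload: {total_payload_gib:.2f} GiB")
```

Under those assumptions the import moves on the order of 310 GiB of payload through the cluster.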

@srosenberg
Member

The latest failure is an OOM kill of the cockroach process on n1 during the bank import:

[ 3202.552241] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/cockroach.service,task=cockroach,pid=15364,uid=1000
[ 3202.552397] Out of memory: Killed process 15364 (cockroach) total-vm:16544396kB, anon-rss:8003428kB, file-rss:42856kB, shmem-rss:0kB, UID:1000 pgtables:30184kB oom_score_adj:0
[ 3202.983536] oom_reaper: reaped process 15364 (cockroach), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
run_141949.688081276_n1_v2216cockroach_workload_fixtures_import_bank: 15:06:35 cluster.go:1962: > result: v22.1.6/cockroach workload fixtures import bank --payload-bytes=10240 --rows=32552083 --seed=4 --db=bigbank returned: COMMAND_PROBLEM: exit status 1
(1) v22.1.6/cockroach workload fixtures import bank --payload-bytes=10240 --rows=32552083 --seed=4 --db=bigbank returned
  | stderr:
  | I220915 14:19:51.050447 1 ccl/workloadccl/fixture.go:318  [-] 1  starting import of 1 tables
  | Error: importing fixture: importing table bank: dial tcp 127.0.0.1:26257: connect: connection refused
  |
  | stdout:
Wraps: (2) COMMAND_PROBLEM
Wraps: (3) Node 1. Command with error:
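
When triaging several of these failures, it can help to pull the process name, PID, and resident memory out of the kernel's "Out of memory: Killed process" line. The following is a minimal Python sketch that parses the line quoted above. The regular expression is written against this one sample, and the helper name `parse_oom_kill` is hypothetical; other kernel versions may format the line differently.

```python
import re

# The kernel log line quoted in the comment above.
OOM_LINE = (
    "[ 3202.552397] Out of memory: Killed process 15364 (cockroach) "
    "total-vm:16544396kB, anon-rss:8003428kB, file-rss:42856kB, "
    "shmem-rss:0kB, UID:1000 pgtables:30184kB oom_score_adj:0"
)

# Pattern derived from this single sample; treat it as an assumption.
OOM_RE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\] Out of memory: Killed process "
    r"(?P<pid>\d+) \((?P<comm>[^)]+)\) "
    r"total-vm:(?P<total_vm_kb>\d+)kB, anon-rss:(?P<anon_rss_kb>\d+)kB"
)


def parse_oom_kill(line):
    """Return a dict describing the OOM kill, or None if the line does not match."""
    match = OOM_RE.search(line)
    if match is None:
        return None
    anon_rss_kb = int(match.group("anon_rss_kb"))
    return {
        "uptime_s": float(match.group("uptime")),  # seconds since boot
        "pid": int(match.group("pid")),
        "comm": match.group("comm"),
        "total_vm_kb": int(match.group("total_vm_kb")),
        "anon_rss_kb": anon_rss_kb,
        "anon_rss_gib": anon_rss_kb / 2**20,  # kB (KiB) to GiB
    }


oom = parse_oom_kill(OOM_LINE)
print(oom)
```

For this failure the parsed anonymous RSS is roughly 7.6 GiB at the time the process was killed.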

@srosenberg
Member

OOMs appear to be happening rather frequently; see #83079 (comment).

@srosenberg removed the release-blocker label on Sep 15, 2022