roachtest: tpce/c=5000/nodes=3 failed #61065

Closed
cockroach-teamcity opened this issue Feb 24, 2021 · 31 comments · Fixed by #62285
Labels: branch-master, C-test-failure, O-roachtest, O-robot

Comments

@cockroach-teamcity
Member

(roachtest).tpce/c=5000/nodes=3 failed on master@ec011620c7cf299fdbb898db692b36454defc4a2:

		  | cd798458a46f: Pull complete
		  | Digest: sha256:1e299df6b79d4630bdb394ed98211f9303292b35aff2555ba5b997f2f889fdea
		  | Status: Downloaded newer image for cockroachdb/tpc-e:latest
		  | Error: Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: None, code: SqlState("XXUUU"), message: "relation \"charge\" is offline: importing", detail: None, hint: None, position: None, where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("descriptor.go"), line: Some(525), routine: Some("FilterDescriptorState") }) }
		  | Error: COMMAND_PROBLEM: exit status 1
		  | (1) COMMAND_PROBLEM
		  | Wraps: (2) Node 4. Command with error:
		  |   | ```
		  |   | sudo docker run cockroachdb/tpc-e:latest --customers=5000 --racks=3 --init --hosts=10.128.0.203
		  |   | ```
		  | Wraps: (3) exit status 1
		  | Error types: (1) errors.Cmd (2) *hintdetail.withDetail (3) *exec.ExitError
		  |
		  | stdout:
		  | Initializing schema...
		  | Importing dataset...
		Wraps: (4) exit status 20
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *main.withCommandDetails (4) *exec.ExitError

	cluster.go:2688,tpce.go:96,tpce.go:113,test_runner.go:767: monitor failure: monitor task failed: t.Fatal() was called
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitor).WaitE
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2676
		  | main.(*monitor).Wait
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2684
		  | main.registerTPCE.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:96
		  | main.registerTPCE.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:113
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:767
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitor).wait.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2732
		Wraps: (4) monitor task failed
		Wraps: (5) attached stack trace
		  -- stack trace:
		  | main.init
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2646
		  | runtime.doInit
		  | 	/usr/local/go/src/runtime/proc.go:5652
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:191
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (6) t.Fatal() was called
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError

Artifacts: /tpce/c=5000/nodes=3

See this test on roachdash
powered by pkg/cmd/internal/issues

@cockroach-teamcity added the branch-master, C-test-failure, O-roachtest, O-robot, and release-blocker labels Feb 24, 2021
@tbg
Member

tbg commented Feb 25, 2021

    | Error: Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: None, code: SqlState("XXUUU"), message: "relation \"charge\" is offline: importing", detail: None, hint: None, position: None, where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("descriptor.go"), line: Some(525), routine: Some("FilterDescriptorState") }) }

Doesn't seem like a blocker.

@tbg removed the release-blocker label Feb 25, 2021
@nvanbenschoten
Member

@pbardea how should we think about this relation "charge" is offline: importing error? This test isn't doing anything particularly special, just running an IMPORT INTO statement and then adding a collection of foreign keys to those imported tables. Does this error indicate that the IMPORT completed but then a later statement still saw the descriptor as importing? Is that expected?

@cockroach-teamcity
Member Author

(roachtest).tpce/c=5000/nodes=3 failed on master@6601d827b814d4e85a1081b03bf2562d8ac2a4ab:

		  | cd798458a46f: Pull complete
		  | Digest: sha256:1e299df6b79d4630bdb394ed98211f9303292b35aff2555ba5b997f2f889fdea
		  | Status: Downloaded newer image for cockroachdb/tpc-e:latest
		  | Error: Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: None, code: SqlState("XXUUU"), message: "relation \"commission_rate\" is offline: importing", detail: None, hint: None, position: None, where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("descriptor.go"), line: Some(525), routine: Some("FilterDescriptorState") }) }
		  | Error: COMMAND_PROBLEM: exit status 1
		  | (1) COMMAND_PROBLEM
		  | Wraps: (2) Node 4. Command with error:
		  |   | ```
		  |   | sudo docker run cockroachdb/tpc-e:latest --customers=5000 --racks=3 --init --hosts=10.128.0.5
		  |   | ```
		  | Wraps: (3) exit status 1
		  | Error types: (1) errors.Cmd (2) *hintdetail.withDetail (3) *exec.ExitError
		  |
		  | stdout:
		  | Initializing schema...
		  | Importing dataset...
		Wraps: (4) exit status 20
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *main.withCommandDetails (4) *exec.ExitError

	cluster.go:2688,tpce.go:96,tpce.go:113,test_runner.go:767: monitor failure: monitor task failed: t.Fatal() was called
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitor).WaitE
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2676
		  | main.(*monitor).Wait
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2684
		  | main.registerTPCE.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:96
		  | main.registerTPCE.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:113
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:767
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitor).wait.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2732
		Wraps: (4) monitor task failed
		Wraps: (5) attached stack trace
		  -- stack trace:
		  | main.init
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2646
		  | runtime.doInit
		  | 	/usr/local/go/src/runtime/proc.go:5652
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:191
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (6) t.Fatal() was called
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError

Artifacts: /tpce/c=5000/nodes=3

See this test on roachdash
powered by pkg/cmd/internal/issues

@pbardea
Contributor

pbardea commented Feb 26, 2021

> Does this error indicate that the IMPORT completed but then a later statement still saw the descriptor as importing?

Yes, that would indicate that either the statement is running before the import has completed, or the import is failing to transition the table to PUBLIC. Either way, this is not expected. I can take a pass at diagnosing what's going on here. (I didn't have bandwidth today; I will look early next week.)

@cockroach-teamcity
Member Author

(roachtest).tpce/c=5000/nodes=3 failed on master@9595a158f0233e1c3d86786ec4462dd39c7beb20:

		  | cd798458a46f: Pull complete
		  | Digest: sha256:1e299df6b79d4630bdb394ed98211f9303292b35aff2555ba5b997f2f889fdea
		  | Status: Downloaded newer image for cockroachdb/tpc-e:latest
		  | Error: Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: None, code: SqlState("XXUUU"), message: "relation \"trade_type\" is offline: importing", detail: None, hint: None, position: None, where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("descriptor.go"), line: Some(525), routine: Some("FilterDescriptorState") }) }
		  | Error: COMMAND_PROBLEM: exit status 1
		  | (1) COMMAND_PROBLEM
		  | Wraps: (2) Node 4. Command with error:
		  |   | ```
		  |   | sudo docker run cockroachdb/tpc-e:latest --customers=5000 --racks=3 --init --hosts=10.128.0.241
		  |   | ```
		  | Wraps: (3) exit status 1
		  | Error types: (1) errors.Cmd (2) *hintdetail.withDetail (3) *exec.ExitError
		  |
		  | stdout:
		  | Initializing schema...
		  | Importing dataset...
		Wraps: (4) exit status 20
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *main.withCommandDetails (4) *exec.ExitError

	cluster.go:2688,tpce.go:96,tpce.go:113,test_runner.go:767: monitor failure: monitor task failed: t.Fatal() was called
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitor).WaitE
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2676
		  | main.(*monitor).Wait
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2684
		  | main.registerTPCE.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:96
		  | main.registerTPCE.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:113
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:767
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitor).wait.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2732
		Wraps: (4) monitor task failed
		Wraps: (5) attached stack trace
		  -- stack trace:
		  | main.init
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2646
		  | runtime.doInit
		  | 	/usr/local/go/src/runtime/proc.go:5652
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:191
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (6) t.Fatal() was called
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError

Artifacts: /tpce/c=5000/nodes=3

See this test on roachdash
powered by pkg/cmd/internal/issues

@cockroach-teamcity
Member Author

(roachtest).tpce/c=5000/nodes=3 failed on master@7d1324fa42732f482329a524b0166db8dd7365e6:

		  | cd798458a46f: Pull complete
		  | Digest: sha256:1e299df6b79d4630bdb394ed98211f9303292b35aff2555ba5b997f2f889fdea
		  | Status: Downloaded newer image for cockroachdb/tpc-e:latest
		  | Error: Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: None, code: SqlState("XXUUU"), message: "relation \"sector\" is offline: importing", detail: None, hint: None, position: None, where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("descriptor.go"), line: Some(525), routine: Some("FilterDescriptorState") }) }
		  | Error: COMMAND_PROBLEM: exit status 1
		  | (1) COMMAND_PROBLEM
		  | Wraps: (2) Node 4. Command with error:
		  |   | ```
		  |   | sudo docker run cockroachdb/tpc-e:latest --customers=5000 --racks=3 --init --hosts=10.128.0.12
		  |   | ```
		  | Wraps: (3) exit status 1
		  | Error types: (1) errors.Cmd (2) *hintdetail.withDetail (3) *exec.ExitError
		  |
		  | stdout:
		  | Initializing schema...
		  | Importing dataset...
		Wraps: (4) exit status 20
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *main.withCommandDetails (4) *exec.ExitError

	cluster.go:2688,tpce.go:96,tpce.go:113,test_runner.go:767: monitor failure: monitor task failed: t.Fatal() was called
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitor).WaitE
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2676
		  | main.(*monitor).Wait
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2684
		  | main.registerTPCE.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:96
		  | main.registerTPCE.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:113
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:767
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitor).wait.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2732
		Wraps: (4) monitor task failed
		Wraps: (5) attached stack trace
		  -- stack trace:
		  | main.init
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2646
		  | runtime.doInit
		  | 	/usr/local/go/src/runtime/proc.go:5652
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:191
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (6) t.Fatal() was called
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError

Artifacts: /tpce/c=5000/nodes=3

See this test on roachdash
powered by pkg/cmd/internal/issues

@pbardea self-assigned this Mar 1, 2021
@pbardea
Contributor

pbardea commented Mar 1, 2021

Starting to take a look at this now.

This error appears when something tries to access an OFFLINE table. Looking at the logs for the test, I see:

		  | Initializing schema...
		  | Importing dataset...

This indicates that the error was hit while we were still importing the dataset, so it doesn't look like the IMPORT has finished (which is when it marks the tables PUBLIC again). The only other table accesses that I see are the schema changes issued during schema initialization. I wonder if the schema change jobs are executing asynchronously and racing with the IMPORT INTO when accessing the table descriptor, so that some schema change jobs try to access the table descriptor after IMPORT has taken it offline. I'm going to see if this repros in a smaller test case.
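For readers unfamiliar with the OFFLINE/PUBLIC distinction, here is a toy model of the behavior described above. It is only a sketch and not CockroachDB's actual code: the `Descriptor` class, `begin_import`, `finish_import`, and `try_access` are hypothetical names, and `filter_descriptor_state` merely borrows its name from the routine shown in the error output.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    PUBLIC = "public"
    OFFLINE = "offline"


@dataclass
class Descriptor:
    # Hypothetical stand-in for a table descriptor.
    name: str
    state: State = State.PUBLIC
    offline_reason: str = ""


class OfflineError(Exception):
    pass


def filter_descriptor_state(desc):
    """Toy state filter: reject any descriptor that is not PUBLIC."""
    if desc.state is State.OFFLINE:
        raise OfflineError(f'relation "{desc.name}" is offline: {desc.offline_reason}')
    return desc


def begin_import(desc):
    # Assumption: the import takes the table offline while it runs.
    desc.state, desc.offline_reason = State.OFFLINE, "importing"


def finish_import(desc):
    # Assumption: a finished import marks the table PUBLIC again.
    desc.state, desc.offline_reason = State.PUBLIC, ""


def try_access(desc):
    """Return "ok" or the error text a concurrent accessor would see."""
    try:
        filter_descriptor_state(desc)
        return "ok"
    except OfflineError as err:
        return str(err)
```

In this model, any access that lands between `begin_import` and `finish_import` sees the same message as the failures in this issue, which is why the timing of the access relative to the import matters for the diagnosis.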

@cockroach-teamcity
Member Author

(roachtest).tpce/c=5000/nodes=3 failed on master@9ba48738bc511ad6954682cab41e23b8492facd8:

		  | cd798458a46f: Pull complete
		  | Digest: sha256:1e299df6b79d4630bdb394ed98211f9303292b35aff2555ba5b997f2f889fdea
		  | Status: Downloaded newer image for cockroachdb/tpc-e:latest
		  | Error: Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: None, code: SqlState("XXUUU"), message: "relation \"charge\" is offline: importing", detail: None, hint: None, position: None, where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("descriptor.go"), line: Some(529), routine: Some("FilterDescriptorState") }) }
		  | Error: COMMAND_PROBLEM: exit status 1
		  | (1) COMMAND_PROBLEM
		  | Wraps: (2) Node 4. Command with error:
		  |   | ```
		  |   | sudo docker run cockroachdb/tpc-e:latest --customers=5000 --racks=3 --init --hosts=10.128.0.10
		  |   | ```
		  | Wraps: (3) exit status 1
		  | Error types: (1) errors.Cmd (2) *hintdetail.withDetail (3) *exec.ExitError
		  |
		  | stdout:
		  | Initializing schema...
		  | Importing dataset...
		Wraps: (4) exit status 20
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *main.withCommandDetails (4) *exec.ExitError

	cluster.go:2688,tpce.go:96,tpce.go:113,test_runner.go:767: monitor failure: monitor task failed: t.Fatal() was called
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitor).WaitE
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2676
		  | main.(*monitor).Wait
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2684
		  | main.registerTPCE.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:96
		  | main.registerTPCE.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:113
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:767
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitor).wait.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2732
		Wraps: (4) monitor task failed
		Wraps: (5) attached stack trace
		  -- stack trace:
		  | main.init
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2646
		  | runtime.doInit
		  | 	/usr/local/go/src/runtime/proc.go:5652
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:191
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (6) t.Fatal() was called
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError

Artifacts: /tpce/c=5000/nodes=3

See this test on roachdash
powered by pkg/cmd/internal/issues

@cockroach-teamcity
Member Author

(roachtest).tpce/c=5000/nodes=3 failed on master@b703e663da8ededaee2e28fc39a24e3880ae54cf:

		  | cd798458a46f: Pull complete
		  | Digest: sha256:1e299df6b79d4630bdb394ed98211f9303292b35aff2555ba5b997f2f889fdea
		  | Status: Downloaded newer image for cockroachdb/tpc-e:latest
		  | Error: Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: None, code: SqlState("XXUUU"), message: "relation \"exchange\" is offline: importing", detail: None, hint: None, position: None, where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("descriptor.go"), line: Some(529), routine: Some("FilterDescriptorState") }) }
		  | Error: COMMAND_PROBLEM: exit status 1
		  | (1) COMMAND_PROBLEM
		  | Wraps: (2) Node 4. Command with error:
		  |   | ```
		  |   | sudo docker run cockroachdb/tpc-e:latest --customers=5000 --racks=3 --init --hosts=10.128.0.127
		  |   | ```
		  | Wraps: (3) exit status 1
		  | Error types: (1) errors.Cmd (2) *hintdetail.withDetail (3) *exec.ExitError
		  |
		  | stdout:
		  | Initializing schema...
		  | Importing dataset...
		Wraps: (4) exit status 20
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *main.withCommandDetails (4) *exec.ExitError

	cluster.go:2688,tpce.go:96,tpce.go:113,test_runner.go:767: monitor failure: monitor task failed: t.Fatal() was called
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitor).WaitE
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2676
		  | main.(*monitor).Wait
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2684
		  | main.registerTPCE.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:96
		  | main.registerTPCE.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:113
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:767
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitor).wait.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2732
		Wraps: (4) monitor task failed
		Wraps: (5) attached stack trace
		  -- stack trace:
		  | main.init
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2646
		  | runtime.doInit
		  | 	/usr/local/go/src/runtime/proc.go:5652
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:191
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (6) t.Fatal() was called
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError

Artifacts: /tpce/c=5000/nodes=3

See this test on roachdash
powered by pkg/cmd/internal/issues

@cockroach-teamcity
Member Author

(roachtest).tpce/c=5000/nodes=3 failed on master@15a185606d5e80b47d9fdd0ed4f54cfe29c527c6:

		  | cd798458a46f: Pull complete
		  | Digest: sha256:1e299df6b79d4630bdb394ed98211f9303292b35aff2555ba5b997f2f889fdea
		  | Status: Downloaded newer image for cockroachdb/tpc-e:latest
		  | Error: Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: None, code: SqlState("XXUUU"), message: "relation \"charge\" is offline: importing", detail: None, hint: None, position: None, where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("descriptor.go"), line: Some(530), routine: Some("FilterDescriptorState") }) }
		  | Error: COMMAND_PROBLEM: exit status 1
		  | (1) COMMAND_PROBLEM
		  | Wraps: (2) Node 4. Command with error:
		  |   | ```
		  |   | sudo docker run cockroachdb/tpc-e:latest --customers=5000 --racks=3 --init --hosts=10.128.0.52
		  |   | ```
		  | Wraps: (3) exit status 1
		  | Error types: (1) errors.Cmd (2) *hintdetail.withDetail (3) *exec.ExitError
		  |
		  | stdout:
		  | Initializing schema...
		  | Importing dataset...
		Wraps: (4) exit status 20
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *main.withCommandDetails (4) *exec.ExitError

	cluster.go:2688,tpce.go:96,tpce.go:113,test_runner.go:767: monitor failure: monitor task failed: t.Fatal() was called
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitor).WaitE
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2676
		  | main.(*monitor).Wait
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2684
		  | main.registerTPCE.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:96
		  | main.registerTPCE.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:113
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:767
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitor).wait.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2732
		Wraps: (4) monitor task failed
		Wraps: (5) attached stack trace
		  -- stack trace:
		  | main.init
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2646
		  | runtime.doInit
		  | 	/usr/local/go/src/runtime/proc.go:5652
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:191
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (6) t.Fatal() was called
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError

Artifacts: /tpce/c=5000/nodes=3

See this test on roachdash
powered by pkg/cmd/internal/issues

@cockroach-teamcity
Member Author

(roachtest).tpce/c=5000/nodes=3 failed on master@a69e6549a71f5a0e83eb13509001f4d7351050fb:

		  | cd798458a46f: Pull complete
		  | Digest: sha256:1e299df6b79d4630bdb394ed98211f9303292b35aff2555ba5b997f2f889fdea
		  | Status: Downloaded newer image for cockroachdb/tpc-e:latest
		  | Error: Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: None, code: SqlState("XXUUU"), message: "relation \"sector\" is offline: importing", detail: None, hint: None, position: None, where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("descriptor.go"), line: Some(530), routine: Some("FilterDescriptorState") }) }
		  | Error: COMMAND_PROBLEM: exit status 1
		  | (1) COMMAND_PROBLEM
		  | Wraps: (2) Node 4. Command with error:
		  |   | ```
		  |   | sudo docker run cockroachdb/tpc-e:latest --customers=5000 --racks=3 --init --hosts=10.128.0.126
		  |   | ```
		  | Wraps: (3) exit status 1
		  | Error types: (1) errors.Cmd (2) *hintdetail.withDetail (3) *exec.ExitError
		  |
		  | stdout:
		  | Initializing schema...
		  | Importing dataset...
		Wraps: (4) exit status 20
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *main.withCommandDetails (4) *exec.ExitError

	cluster.go:2688,tpce.go:96,tpce.go:113,test_runner.go:767: monitor failure: monitor task failed: t.Fatal() was called
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitor).WaitE
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2676
		  | main.(*monitor).Wait
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2684
		  | main.registerTPCE.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:96
		  | main.registerTPCE.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:113
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:767
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitor).wait.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2732
		Wraps: (4) monitor task failed
		Wraps: (5) attached stack trace
		  -- stack trace:
		  | main.init
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2646
		  | runtime.doInit
		  | 	/usr/local/go/src/runtime/proc.go:5652
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:191
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (6) t.Fatal() was called
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError

Artifacts: /tpce/c=5000/nodes=3

See this test on roachdash
powered by pkg/cmd/internal/issues

@nvanbenschoten
Member

I just reproduced this and found that the IMPORT is finishing. We're then hitting the issue afterward. I'm not sure if this helps, but it does indicate that this is a different issue than the slowdowns we've been seeing on tpc-c recently.

@pbardea
Contributor

pbardea commented Mar 9, 2021

Thanks! That is very useful. To confirm, you're seeing the IMPORT go to a successful status and then we're seeing the error? I would have only expected this error to happen when trying to read from the table before the IMPORT completes. Given that description I'm adding the ga-blocker tag for now.

Looking at the failures above, I see that most of them log Importing dataset... but never Finished Initialization, which I would expect if the import jobs had finished. Since you're familiar with tpc-e, would you expect to see the Finished Initialization logging (https://github.com/cockroachlabs/tpc-e/blob/3ed1c971168d72a689beae75314193a89cf09208/tier-a/src/schema.rs#L29) before seeing this error pop up? (One thing of note: looking through the debug.zips, the descriptors that the error reported as "OFFLINE" do appear to be eventually marked as PUBLIC, as seen in the system.descriptors.txt file in the debug.zip.)

@nvanbenschoten
Member

> To confirm, you're seeing the IMPORT go to a successful status and then we're seeing the error?

Yes, that's what I'm seeing. However, I haven't been able to determine exactly which statement is returning the error. I'll try to determine that.

@nvanbenschoten
Member

I was able to confirm that the IMPORT statements themselves are returning these errors. For instance, in my most recent test, I saw the following three statements all return errors:

IMPORT INTO industry CSV DATA ('gs://cockroach-fixtures/tpce-csv/fixed/Industry.txt') WITH delimiter = '|'; returned error: db error: ERROR: relation "industry" is offline: importing
IMPORT INTO exchange CSV DATA ('gs://cockroach-fixtures/tpce-csv/fixed/Exchange.txt') WITH delimiter = '|'; returned error: db error: ERROR: relation "exchange" is offline: importing
IMPORT INTO trade_type CSV DATA ('gs://cockroach-fixtures/tpce-csv/fixed/TradeType.txt') WITH delimiter = '|'; returned error: db error: ERROR: relation "trade_type" is offline: importing

One potentially interesting thing to note is that we are performing the IMPORTs in parallel. Another thing to note is that we perform these imports immediately after performing a large series of schema changes on the tables to install duplicate indexes.
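To tally which relations hit the error across the failures in this thread, a small helper along these lines could be used. It is only a sketch: `offline_relations` and its regular expression are illustrative names of my own, and the pattern assumes the message keeps the `relation "<name>" is offline: <reason>` shape seen above, either plain or with the escaped quotes that appear in the Rust `DbError` output.

```python
import re

# Matches both `relation "charge" is offline: importing` and the escaped
# form `relation \"charge\" is offline: importing` from the DbError dump.
OFFLINE_RE = re.compile(r'relation \\?"([^"\\]+)\\?" is offline: (\w+)')


def offline_relations(log_text):
    """Return (table, reason) pairs for every offline-relation error found."""
    return [(m.group(1), m.group(2)) for m in OFFLINE_RE.finditer(log_text)]
```

Running it over the collected roachtest output gives one pair per error, which makes it easy to see whether the same tables (for example `charge` or `sector`) recur across runs.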

@pbardea
Contributor

pbardea commented Mar 11, 2021

That's very useful! That narrows down what it could be. I'm still not quite able to repro on a smaller scale, but it's interesting that the failure comes from some descriptor access inside IMPORT, and that it's failing to access the same descriptor it's importing. I'll double-check the PRs that merged around the time we started seeing this for any erroneous descriptor accesses.

Given that it's the IMPORTs that are hitting the error, I don't suspect the schema changes affect this, but that's good to keep in mind.

@pbardea
Contributor

pbardea commented Mar 11, 2021

It's also interesting that you saw the imports finish and then this error being hit, but it's also the import statements themselves that are returning the error. How did you determine that the import statements were finishing? Were you seeing all of them finish or only some of them? I would be very surprised if they were returning an error after being marked as successful.

@cockroach-teamcity
Member Author

(roachtest).tpce/c=5000/nodes=3 failed on master@4b98115dfda02a9498f566958bd915c45ec7e449:

		  | cd798458a46f: Pull complete
		  | Digest: sha256:1e299df6b79d4630bdb394ed98211f9303292b35aff2555ba5b997f2f889fdea
		  | Status: Downloaded newer image for cockroachdb/tpc-e:latest
		  | Error: Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: None, code: SqlState("XXUUU"), message: "relation \"charge\" is offline: importing", detail: None, hint: None, position: None, where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("descriptor.go"), line: Some(366), routine: Some("FilterDescriptorState") }) }
		  | Error: COMMAND_PROBLEM: exit status 1
		  | (1) COMMAND_PROBLEM
		  | Wraps: (2) Node 4. Command with error:
		  |   | ```
		  |   | sudo docker run cockroachdb/tpc-e:latest --customers=5000 --racks=3 --init --hosts=10.128.0.59
		  |   | ```
		  | Wraps: (3) exit status 1
		  | Error types: (1) errors.Cmd (2) *hintdetail.withDetail (3) *exec.ExitError
		  |
		  | stdout:
		  | Initializing schema...
		  | Importing dataset...
		Wraps: (4) exit status 20
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *main.withCommandDetails (4) *exec.ExitError

	cluster.go:2688,tpce.go:96,tpce.go:113,test_runner.go:767: monitor failure: monitor task failed: t.Fatal() was called
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitor).WaitE
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2676
		  | main.(*monitor).Wait
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2684
		  | main.registerTPCE.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:96
		  | main.registerTPCE.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:113
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:767
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitor).wait.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2732
		Wraps: (4) monitor task failed
		Wraps: (5) attached stack trace
		  -- stack trace:
		  | main.init
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2646
		  | runtime.doInit
		  | 	/usr/local/go/src/runtime/proc.go:5652
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:191
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (6) t.Fatal() was called
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError

More

Artifacts: /tpce/c=5000/nodes=3
Related:

See this test on roachdash
powered by pkg/cmd/internal/issues

@cockroach-teamcity
Member Author

(roachtest).tpce/c=5000/nodes=3 failed on master@4d44ddf24153d8ef8e0a996fdbe75ac5607f9574:

		  | cd798458a46f: Pull complete
		  | Digest: sha256:1e299df6b79d4630bdb394ed98211f9303292b35aff2555ba5b997f2f889fdea
		  | Status: Downloaded newer image for cockroachdb/tpc-e:latest
		  | Error: Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: None, code: SqlState("XXUUU"), message: "another operation is currently operating on the table", detail: None, hint: None, position: None, where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("import_stmt.go"), line: Some(1233), routine: Some("prepareExistingTableDescForIngestion") }) }
		  | Error: COMMAND_PROBLEM: exit status 1
		  | (1) COMMAND_PROBLEM
		  | Wraps: (2) Node 4. Command with error:
		  |   | ```
		  |   | sudo docker run cockroachdb/tpc-e:latest --customers=5000 --racks=3 --init --hosts=10.128.0.27
		  |   | ```
		  | Wraps: (3) exit status 1
		  | Error types: (1) errors.Cmd (2) *hintdetail.withDetail (3) *exec.ExitError
		  |
		  | stdout:
		  | Initializing schema...
		  | Importing dataset...
		Wraps: (4) exit status 20
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *main.withCommandDetails (4) *exec.ExitError

	cluster.go:2688,tpce.go:96,tpce.go:113,test_runner.go:767: monitor failure: monitor task failed: t.Fatal() was called
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitor).WaitE
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2676
		  | main.(*monitor).Wait
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2684
		  | main.registerTPCE.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:96
		  | main.registerTPCE.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:113
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:767
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitor).wait.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2732
		Wraps: (4) monitor task failed
		Wraps: (5) attached stack trace
		  -- stack trace:
		  | main.init
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2646
		  | runtime.doInit
		  | 	/usr/local/go/src/runtime/proc.go:5652
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:191
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (6) t.Fatal() was called
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError

@nvanbenschoten
Member

> How did you determine that the import statements were finishing?

I determined that the imports were succeeding on the admin UI's jobs page.

@cockroach-teamcity
Member Author

(roachtest).tpce/c=5000/nodes=3 failed on master@bdff5338ca725bf1cfddf7e3f648bbf02ab42999:

		  | cd798458a46f: Pull complete
		  | Digest: sha256:1e299df6b79d4630bdb394ed98211f9303292b35aff2555ba5b997f2f889fdea
		  | Status: Downloaded newer image for cockroachdb/tpc-e:latest
		  | Error: Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: None, code: SqlState("XXUUU"), message: "relation \"charge\" is offline: importing", detail: None, hint: None, position: None, where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("descriptor.go"), line: Some(366), routine: Some("FilterDescriptorState") }) }
		  | Error: COMMAND_PROBLEM: exit status 1
		  | (1) COMMAND_PROBLEM
		  | Wraps: (2) Node 4. Command with error:
		  |   | ```
		  |   | sudo docker run cockroachdb/tpc-e:latest --customers=5000 --racks=3 --init --hosts=10.128.0.50
		  |   | ```
		  | Wraps: (3) exit status 1
		  | Error types: (1) errors.Cmd (2) *hintdetail.withDetail (3) *exec.ExitError
		  |
		  | stdout:
		  | Initializing schema...
		  | Importing dataset...
		Wraps: (4) exit status 20
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *main.withCommandDetails (4) *exec.ExitError

	cluster.go:2688,tpce.go:96,tpce.go:113,test_runner.go:767: monitor failure: monitor task failed: t.Fatal() was called
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitor).WaitE
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2676
		  | main.(*monitor).Wait
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2684
		  | main.registerTPCE.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:96
		  | main.registerTPCE.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:113
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:767
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitor).wait.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2732
		Wraps: (4) monitor task failed
		Wraps: (5) attached stack trace
		  -- stack trace:
		  | main.init
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2646
		  | runtime.doInit
		  | 	/usr/local/go/src/runtime/proc.go:5652
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:191
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (6) t.Fatal() was called
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError

@cockroach-teamcity
Member Author

(roachtest).tpce/c=5000/nodes=3 failed on master@e09b93fe62541c3a94f32a723778660b528a0792:

		  | cd798458a46f: Pull complete
		  | Digest: sha256:1e299df6b79d4630bdb394ed98211f9303292b35aff2555ba5b997f2f889fdea
		  | Status: Downloaded newer image for cockroachdb/tpc-e:latest
		  | Error: Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: None, code: SqlState("XXUUU"), message: "relation \"charge\" is offline: importing", detail: None, hint: None, position: None, where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("descriptor.go"), line: Some(366), routine: Some("FilterDescriptorState") }) }
		  | Error: COMMAND_PROBLEM: exit status 1
		  | (1) COMMAND_PROBLEM
		  | Wraps: (2) Node 4. Command with error:
		  |   | ```
		  |   | sudo docker run cockroachdb/tpc-e:latest --customers=5000 --racks=3 --init --hosts=10.128.0.58
		  |   | ```
		  | Wraps: (3) exit status 1
		  | Error types: (1) errors.Cmd (2) *hintdetail.withDetail (3) *exec.ExitError
		  |
		  | stdout:
		  | Initializing schema...
		  | Importing dataset...
		Wraps: (4) exit status 20
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *main.withCommandDetails (4) *exec.ExitError

	cluster.go:2688,tpce.go:96,tpce.go:113,test_runner.go:767: monitor failure: monitor task failed: t.Fatal() was called
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitor).WaitE
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2676
		  | main.(*monitor).Wait
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2684
		  | main.registerTPCE.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:96
		  | main.registerTPCE.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:113
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:767
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitor).wait.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2732
		Wraps: (4) monitor task failed
		Wraps: (5) attached stack trace
		  -- stack trace:
		  | main.init
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2646
		  | runtime.doInit
		  | 	/usr/local/go/src/runtime/proc.go:5652
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:191
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (6) t.Fatal() was called
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError

@pbardea
Contributor

pbardea commented Mar 15, 2021

Looking at the latest failures again.

  • All import jobs succeed (from system.jobs.txt):
$ cat system.jobs.txt | grep -i 'import into' | wc -l
      32
$ cat system.jobs.txt | grep -i 'import into' | grep 'succeeded' | wc -l
      32
  • Charge table is PUBLIC:
    From system.descriptors.txt:
64	"\012\244\007\012\006charge\030@ 4(\020:\000B(\012\010ch_tt_id\020\001\032\015\010\007\020\003\030\0000\007P\223\010`\000 \0000\000h\000p\000x\000\200\001\000B(\012\011ch_c_tier\020\002\032\014\010\001\020\020\030\0000\000P\025`\000 \0000\000h\000p\000x\000\200\001\000B'\012\007ch_chrg\020\003\032\015\010\003\020\002\030\0120\000P\244\015`\000 \0000\000h\000p\000x\000\200\001\000H\004Rb\012\007primary\020\001\030\001""\010ch_tt_id""\011ch_c_tier0\0010\002@\000@\000J\020\010\000\020\000\032\000 \000(\0000\0008\000@\000Z\000z\004\010\000 \000\200\001\000\210\001\000\220\001\002\230\001\000\242\001\006\010\000\022\000\030\000\250\001\000\262\001\000\272\001\000Zz\012\024primary_replicated_0\020\002\030\001""\010ch_tt_id""\011ch_c_tier*\007ch_chrg0\0010\002@\000@\000J\020\010\000\020\000\032\000 \000(\0000\0008\000@\000Z\000p\003z\004\010\000 \000\200\001\000\210\001\001\220\001\002\230\001\000\242\001\006\010\000\022\000\030\000\250\001\000\262\001\000\272\001\000Zz\012\024primary_replicated_1\020\003\030\001""\010ch_tt_id""\011ch_c_tier*\007ch_chrg0\0010\002@\000@\000J\020\010\000\020\000\032\000 \000(\0000\0008\000@\000Z\000p\003z\004\010\000 \000\200\001\000\210\001\001\220\001\002\230\001\000\242\001\006\010\000\022\000\030\000\250\001\000\262\001\000\272\001\000Zz\012\024primary_replicated_2\020\004\030\001""\010ch_tt_id""\011ch_c_tier*\007ch_chrg0\0010\002@\000@\000J\020\010\000\020\000\032\000 \000(\0000\0008\000@\000Z\000p\003z\004\010\000 \000\200\001\000\210\001\001\220\001\002\230\001\000\242\001\006\010\000\022\000\030\000\250\001\000\262\001\000\272\001\000`\005j\035\012\011\012\005admin\020\002\012\010\012\004root\020\002\022\004root\030\001\200\001\005\210\001\003\230\001\000\242\001F\012+ch_c_tier IN (1:::INT8, 2:::INT8, 3:::INT8)\022\017check_ch_c_tier\030\001(\0020\0008\000\242\001/\012\026ch_chrg >= 0:::DECIMAL\022\015check_ch_chrg\030\001(\0030\0008\000\262\0011\012\007primary\020\000\032\010ch_tt_id\032\011ch_c_tier\032\007ch_chrg \001 \002 
\003(\003\270\001\001\302\001\000\350\001\000\362\001\004\010\000\022\000\370\001\000\200\002\000\222\002\000\232\002\012\010\257\311\343\330\370\227\240\266\026\242\002,\010@\020\001\030\001 F*\032fk_ch_tt_id_ref_trade_type0\0018\000@\000H\000\262\002\000\270\002\000\300\002\035\310\002\000\340\002\000"	0aa4070a066368617267651840203428103a0042280a0863685f74745f696410011a0d080710031800300750930860002000300068007000780080010042280a0963685f635f7469657210021a0c0801101018003000501560002000300068007000780080010042270a0763685f6368726710031a0d08031002180a300050a40d600020003000680070007800800100480452620a077072696d61727910011801220863685f74745f6964220963685f635f7469657230013002400040004a10080010001a00200028003000380040005a007a0408002000800100880100900102980100a20106080012001800a80100b20100ba01005a7a0a147072696d6172795f7265706c6963617465645f3010021801220863685f74745f6964220963685f635f746965722a0763685f6368726730013002400040004a10080010001a00200028003000380040005a0070037a0408002000800100880101900102980100a20106080012001800a80100b20100ba01005a7a0a147072696d6172795f7265706c6963617465645f3110031801220863685f74745f6964220963685f635f746965722a0763685f6368726730013002400040004a10080010001a00200028003000380040005a0070037a0408002000800100880101900102980100a20106080012001800a80100b20100ba01005a7a0a147072696d6172795f7265706c6963617465645f3210041801220863685f74745f6964220963685f635f746965722a0763685f6368726730013002400040004a10080010001a00200028003000380040005a0070037a0408002000800100880101900102980100a20106080012001800a80100b20100ba010060056a1d0a090a0561646d696e10020a080a04726f6f7410021204726f6f741801800105880103980100a201460a2b63685f635f7469657220494e2028313a3a3a494e54382c20323a3a3a494e54382c20333a3a3a494e543829120f636865636b5f63685f635f746965721801280230003800a2012f0a1663685f63687267203e3d20303a3a3a444543494d414c120d636865636b5f63685f636872671801280330003800b201310a077072696d61727910001a0863685f74745f69641a0963685f635f746965721a0763685f636872672001200220032803b80101c20100e80100f20104
08001200f801008002009202009a020a08afc9e3d8f897a0b616a2022c08401001180120462a1a666b5f63685f74745f69645f7265665f74726164655f747970653001380040004800b20200b80200c0021dc80200e00200

State:

[email protected]:26257/movr> select crdb_internal.pb_to_json('cockroach.sql.sqlbase.Descriptor', decode('0aa4070a066368617267651840203428103a0042280a0863685f74745f696410011a0d080710031800300750930860002000300068007000780080010042280a0963685f635f7469657210021a0c0801101018003000501560002000300068007000780080010042270a0763685f6368726710031a0d08031002180a300050a40d600020003000680070007800800100480452620a077072696d61727910011801220863685f74745f6964220963685f635f7469657230013002400040004a10080010001a00200028003000380040005a007a0408002000800100880100900102980100a20106080012001800a80100b20100ba01005a7a0a147072696d6172795f7265706c6963617465645f3010021801220863685f74745f6964220963685f635f746965722a0763685f6368726730013002400040004a10080010001a00200028003000380040005a0070037a0408002000800100880101900102980100a20106080012001800a80100b20100ba01005a7a0a147072696d6172795f7265706c6963617465645f3110031801220863685f74745f6964220963685f635f746965722a0763685f6368726730013002400040004a10080010001a00200028003000380040005a0070037a0408002000800100880101900102980100a20106080012001800a80100b20100ba01005a7a0a147072696d6172795f7265706c6963617465645f3210041801220863685f74745f6964220963685f635f746965722a0763685f6368726730013002400040004a10080010001a00200028003000380040005a0070037a0408002000800100880101900102980100a20106080012001800a80100b20100ba010060056a1d0a090a0561646d696e10020a080a04726f6f7410021204726f6f741801800105880103980100a201460a2b63685f635f7469657220494e2028313a3a3a494e54382c20323a3a3a494e54382c20333a3a3a494e543829120f636865636b5f63685f635f746965721801280230003800a2012f0a1663685f63687267203e3d20303a3a3a444543494d414c120d636865636b5f63685f636872671801280330003800b201310a077072696d61727910001a0863685f74745f69641a0963685f635f746965721a0763685f636872672001200220032803b80101c20100e80100f2010408001200f801008002009202009a020a08afc9e3d8f897a0b616a2022c08401001180120462a1a666b5f63685f74745f69645f7265665f74726164655f747970653001380040004800b20200b80200c0021dc80200e00200', 
'hex'))->'table'->'state';
  ?column?
------------
  "PUBLIC"
(1 row)
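
The job counts taken from system.jobs.txt earlier in this comment can also be reproduced with a short Python sketch. It assumes, as the grep pipelines do, that each job occupies one line of the dump. The sample rows are made up for illustration and are not the real column layout of system.jobs.txt.

```python
def count_import_jobs(lines):
    """Return (total, succeeded) counts of IMPORT INTO jobs.

    Mirrors `grep -i 'import into'` (case-insensitive) followed by
    `grep 'succeeded'` (case-sensitive) on a dump with one job per line.
    """
    imports = [line for line in lines if "import into" in line.lower()]
    succeeded = [line for line in imports if "succeeded" in line]
    return len(imports), len(succeeded)


# Hypothetical sample rows; the real system.jobs.txt layout may differ.
sample = [
    "1\tsucceeded\tIMPORT INTO tpce.public.charge CSV DATA (...)",
    "2\tsucceeded\tIMPORT INTO tpce.public.trade_type CSV DATA (...)",
    "3\tsucceeded\tCREATE INDEX ...",
]
total, ok = count_import_jobs(sample)
print(total, ok)
```

If the two numbers differ, the lines present in the first list but not the second are the jobs worth inspecting.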

Will continue to stress this today.

@pbardea
Contributor

pbardea commented Mar 15, 2021

Update: a few of my runs have resulted in import jobs failing with "another operation is currently operating on the table", indicating that the table descriptor version changed between taking the tables OFFLINE and marking them PUBLIC again. Looking at the schema changes being issued, I would expect them to have completed beforehand. Going to check if the import jobs reference other tables (perhaps when dealing with FKs).

@cockroach-teamcity
Member Author

(roachtest).tpce/c=5000/nodes=3 failed on master@e9387a6e5dfdad71c74ccd0a07c907632613fa3e:

		  | cd798458a46f: Pull complete
		  | Digest: sha256:1e299df6b79d4630bdb394ed98211f9303292b35aff2555ba5b997f2f889fdea
		  | Status: Downloaded newer image for cockroachdb/tpc-e:latest
		  | Error: Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: None, code: SqlState("XXUUU"), message: "another operation is currently operating on the table", detail: None, hint: None, position: None, where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("import_stmt.go"), line: Some(1233), routine: Some("prepareExistingTableDescForIngestion") }) }
		  | Error: COMMAND_PROBLEM: exit status 1
		  | (1) COMMAND_PROBLEM
		  | Wraps: (2) Node 4. Command with error:
		  |   | ```
		  |   | sudo docker run cockroachdb/tpc-e:latest --customers=5000 --racks=3 --init --hosts=10.128.0.144
		  |   | ```
		  | Wraps: (3) exit status 1
		  | Error types: (1) errors.Cmd (2) *hintdetail.withDetail (3) *exec.ExitError
		  |
		  | stdout:
		  | Initializing schema...
		  | Importing dataset...
		Wraps: (4) exit status 20
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *main.withCommandDetails (4) *exec.ExitError

	cluster.go:2688,tpce.go:96,tpce.go:113,test_runner.go:768: monitor failure: monitor task failed: t.Fatal() was called
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitor).WaitE
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2676
		  | main.(*monitor).Wait
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2684
		  | main.registerTPCE.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:96
		  | main.registerTPCE.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:113
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:768
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitor).wait.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2732
		Wraps: (4) monitor task failed
		Wraps: (5) attached stack trace
		  -- stack trace:
		  | main.init
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2646
		  | runtime.doInit
		  | 	/usr/local/go/src/runtime/proc.go:5652
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:191
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (6) t.Fatal() was called
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError

@pbardea
Contributor

pbardea commented Mar 16, 2021

I've been able to reproduce this most recent failure. The error is that the table descriptor was unexpectedly modified before setting the descriptors OFFLINE. Adding a bit of logging to the error message reveals the following diff between the tables:

$  diff expected.json actual.json
511c511
<     "wallTime": "1615914513281452644"
---
>     "wallTime": "1615914535851569577"
535c535
<       "validity": "Validated"
---
>       "validity": "Unvalidated"
550c550
<       "validity": "Validated"
---
>       "validity": "Unvalidated"
640c640
<   "version": 30,
---
>   "version": 32,

Both tables were in the PUBLIC state.

Note that IMPORT INTO invalidates FKs, but does so when transitioning the tables from OFFLINE to PUBLIC.
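
A comparison like the one above can be produced without line-based diffing by walking the two JSON documents. This is a minimal sketch, not the logging that was actually added: the field names `wallTime`, `validity`, and `version` come from the diff, while the enclosing keys `modificationTime` and `outboundFks` are hypothetical placeholders for wherever those fields live in the descriptor.

```python
def diff_json(expected, actual, path=""):
    """Yield (path, expected_value, actual_value) for every differing leaf.

    Keys missing on one side are treated as None. Lists of different
    lengths are reported as a single difference.
    """
    if isinstance(expected, dict) and isinstance(actual, dict):
        for key in sorted(set(expected) | set(actual)):
            yield from diff_json(expected.get(key), actual.get(key), f"{path}/{key}")
    elif (isinstance(expected, list) and isinstance(actual, list)
          and len(expected) == len(actual)):
        for i, (e, a) in enumerate(zip(expected, actual)):
            yield from diff_json(e, a, f"{path}/{i}")
    elif expected != actual:
        yield path, expected, actual


# Hypothetical descriptor fragments; only the leaf values come from the diff.
expected = {
    "version": 30,
    "modificationTime": {"wallTime": "1615914513281452644"},
    "outboundFks": [{"validity": "Validated"}, {"validity": "Validated"}],
}
actual = {
    "version": 32,
    "modificationTime": {"wallTime": "1615914535851569577"},
    "outboundFks": [{"validity": "Unvalidated"}, {"validity": "Unvalidated"}],
}
changes = list(diff_json(expected, actual))
for where, old, new in changes:
    print(f"{where}: {old} -> {new}")
```

Reporting paths rather than line numbers makes it easier to see that the version jumped by two and that both foreign keys lost their validated status.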

@pbardea
Contributor

pbardea commented Mar 16, 2021

The most recent failure is explained by there being 2 import jobs importing into the same table. E.g. in https://teamcity.cockroachdb.com/repository/download/Cockroach_Nightlies_WorkloadNightly/2780695:id/tpce/c%3D5000/nodes%3D3/run_1/debug.zip!/debug/system.jobs.txt 2 IMPORT INTO tpce.public.trade jobs have been created and are being run in parallel. The first job modifies the table during the execution of the second.

Looking at https://github.com/cockroachlabs/tpc-e/blob/master/tier-a/src/schema.rs#L822, I'm not yet seeing why the test is issuing multiple import requests.
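
One way to spot this condition in a jobs dump is to group IMPORT INTO jobs by target table. The following is a rough sketch that assumes each job's description appears on one line and contains the text `IMPORT INTO <table>`; the sample lines are invented and the real system.jobs.txt format may differ.

```python
import re
from collections import Counter

# Captures the table name that follows IMPORT INTO, e.g. tpce.public.trade.
IMPORT_TARGET = re.compile(r"IMPORT INTO\s+([\w.]+)", re.IGNORECASE)


def duplicate_import_targets(lines):
    """Return {table: job_count} for tables targeted by more than one job."""
    counts = Counter()
    for line in lines:
        match = IMPORT_TARGET.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return {table: n for table, n in counts.items() if n > 1}


# Hypothetical sample rows for illustration only.
sample_jobs = [
    "101\trunning\tIMPORT INTO tpce.public.trade CSV DATA (...)",
    "102\trunning\tIMPORT INTO tpce.public.trade CSV DATA (...)",
    "103\tsucceeded\tIMPORT INTO tpce.public.charge CSV DATA (...)",
]
print(duplicate_import_targets(sample_jobs))
```

An empty result means every table has at most one import job in the dump.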

@cockroach-teamcity
Member Author

(roachtest).tpce/c=5000/nodes=3 failed on master@597e4a8c487e3c23d64885563d608a692b59055c:

		  | cd798458a46f: Pull complete
		  | Digest: sha256:1e299df6b79d4630bdb394ed98211f9303292b35aff2555ba5b997f2f889fdea
		  | Status: Downloaded newer image for cockroachdb/tpc-e:latest
		  | Error: Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: None, code: SqlState("XXUUU"), message: "another operation is currently operating on the table", detail: None, hint: None, position: None, where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("import_stmt.go"), line: Some(1233), routine: Some("prepareExistingTableDescForIngestion") }) }
		  | Error: COMMAND_PROBLEM: exit status 1
		  | (1) COMMAND_PROBLEM
		  | Wraps: (2) Node 4. Command with error:
		  |   | ```
		  |   | sudo docker run cockroachdb/tpc-e:latest --customers=5000 --racks=3 --init --hosts=10.128.0.246
		  |   | ```
		  | Wraps: (3) exit status 1
		  | Error types: (1) errors.Cmd (2) *hintdetail.withDetail (3) *exec.ExitError
		  |
		  | stdout:
		  | Initializing schema...
		  | Importing dataset...
		Wraps: (4) exit status 20
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *main.withCommandDetails (4) *exec.ExitError

	cluster.go:2688,tpce.go:96,tpce.go:113,test_runner.go:768: monitor failure: monitor task failed: t.Fatal() was called
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitor).WaitE
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2676
		  | main.(*monitor).Wait
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2684
		  | main.registerTPCE.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:96
		  | main.registerTPCE.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:113
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:768
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitor).wait.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2732
		Wraps: (4) monitor task failed
		Wraps: (5) attached stack trace
		  -- stack trace:
		  | main.init
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2646
		  | runtime.doInit
		  | 	/usr/local/go/src/runtime/proc.go:5652
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:191
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (6) t.Fatal() was called
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError

@pbardea
Contributor

pbardea commented Mar 17, 2021

Had a bit more time to look at this this afternoon. My hypothesis is that both of these failures can be explained if IMPORT jobs are being double-created.

The latest failures that we're seeing are explained by:

  • both job records are created
  • one is resumed, and takes the table offline
  • the other sees the table was taken offline between when the record was created and when it tries to take it offline
  • it fails with "another operation is currently operating on the table"
  • we see several job records in the debug.zip

The earlier failure mode can be explained by:

  • the first import job creates its job record
  • the first import job takes the table offline
  • when the second job goes to create its job record, it sees the table as offline, returns the error relation "charge" is offline: importing and never creates the record. This leaves us with only the first job, which happily succeeded.
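
Both interleavings can be illustrated with a toy model. This is a sketch under stated assumptions, not CockroachDB's implementation: the `Table` and `ImportJob` classes, the step names, and the idea that a job compares the descriptor version it saw at planning time against the current one are all hypothetical simplifications. Only the two error strings are taken from the failures in this issue.

```python
class ImportFailed(Exception):
    pass


class Table:
    def __init__(self, name):
        self.name = name
        self.state = "PUBLIC"
        self.version = 1


class ImportJob:
    """Toy model of an IMPORT INTO job; not CockroachDB's actual code."""

    def __init__(self, table):
        self.table = table
        self.seen_version = None

    def create_record(self):
        # Assumed: planning resolves the table and rejects an offline one.
        if self.table.state != "PUBLIC":
            raise ImportFailed(f'relation "{self.table.name}" is offline: importing')
        self.seen_version = self.table.version

    def take_offline(self):
        # Assumed: the job expects the descriptor to be unchanged since planning.
        if self.table.version != self.seen_version:
            raise ImportFailed("another operation is currently operating on the table")
        self.table.state = "OFFLINE"
        self.table.version += 1

    def publish(self):
        self.table.state = "PUBLIC"
        self.table.version += 1


def run(order):
    """Run two jobs against one table using the given (job, step) interleaving."""
    table = Table("charge")
    jobs = [ImportJob(table), ImportJob(table)]
    try:
        for job_index, step in order:
            getattr(jobs[job_index], step)()
    except ImportFailed as err:
        return str(err)
    return "ok"


# Latest failure mode: both records exist before either job goes offline.
latest = [(0, "create_record"), (1, "create_record"),
          (0, "take_offline"), (1, "take_offline")]
# Earlier failure mode: the first job is already offline when the second plans.
earlier = [(0, "create_record"), (0, "take_offline"), (1, "create_record")]
print(run(latest))
print(run(earlier))
```

In this model the only difference between the two errors is where the second job happens to be when the first one takes the table offline, which is consistent with a single underlying cause (a duplicated job) producing both symptoms.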

There were a few jobs-related changes that went in around the time we started seeing this failure (e.g. fe6377c#diff-bbc44b1b8225066d6d73cb8b4efce341bfec316008c10d42cc53dd58010ad781), which are now suspect. I'm going through those changes to see if one explains the double-creation of the job records (and thus the races we're seeing).

@cockroach-teamcity
Member Author

(roachtest).tpce/c=5000/nodes=3 failed on master@ee9f47b9ec9476a693464e2dcd09a01bf9d39ad2:

		  | cd798458a46f: Pull complete
		  | Digest: sha256:1e299df6b79d4630bdb394ed98211f9303292b35aff2555ba5b997f2f889fdea
		  | Status: Downloaded newer image for cockroachdb/tpc-e:latest
		  | Error: Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: None, code: SqlState("XXUUU"), message: "relation \"charge\" is offline: importing", detail: None, hint: None, position: None, where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("descriptor.go"), line: Some(366), routine: Some("FilterDescriptorState") }) }
		  | Error: COMMAND_PROBLEM: exit status 1
		  | (1) COMMAND_PROBLEM
		  | Wraps: (2) Node 4. Command with error:
		  |   | ```
		  |   | sudo docker run cockroachdb/tpc-e:latest --customers=5000 --racks=3 --init --hosts=10.128.0.2
		  |   | ```
		  | Wraps: (3) exit status 1
		  | Error types: (1) errors.Cmd (2) *hintdetail.withDetail (3) *exec.ExitError
		  |
		  | stdout:
		  | Initializing schema...
		  | Importing dataset...
		Wraps: (4) exit status 20
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *main.withCommandDetails (4) *exec.ExitError

	cluster.go:2688,tpce.go:96,tpce.go:113,test_runner.go:768: monitor failure: monitor task failed: t.Fatal() was called
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitor).WaitE
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2676
		  | main.(*monitor).Wait
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2684
		  | main.registerTPCE.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:96
		  | main.registerTPCE.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:113
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:768
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitor).wait.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2732
		Wraps: (4) monitor task failed
		Wraps: (5) attached stack trace
		  -- stack trace:
		  | main.init
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2646
		  | runtime.doInit
		  | 	/usr/local/go/src/runtime/proc.go:5652
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:191
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (6) t.Fatal() was called
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError

@cockroach-teamcity
Member Author

(roachtest).tpce/c=5000/nodes=3 failed on master@893643b63ea0b1cfa4888c6b73b5c68a9c100c3a:

		  | cd798458a46f: Pull complete
		  | Digest: sha256:1e299df6b79d4630bdb394ed98211f9303292b35aff2555ba5b997f2f889fdea
		  | Status: Downloaded newer image for cockroachdb/tpc-e:latest
		  | Error: Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: None, code: SqlState("XXUUU"), message: "relation \"charge\" is offline: importing", detail: None, hint: None, position: None, where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("descriptor.go"), line: Some(366), routine: Some("FilterDescriptorState") }) }
		  | Error: COMMAND_PROBLEM: exit status 1
		  | (1) COMMAND_PROBLEM
		  | Wraps: (2) Node 4. Command with error:
		  |   | ```
		  |   | sudo docker run cockroachdb/tpc-e:latest --customers=5000 --racks=3 --init --hosts=10.128.0.38
		  |   | ```
		  | Wraps: (3) exit status 1
		  | Error types: (1) errors.Cmd (2) *hintdetail.withDetail (3) *exec.ExitError
		  |
		  | stdout:
		  | Initializing schema...
		  | Importing dataset...
		Wraps: (4) exit status 20
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *main.withCommandDetails (4) *exec.ExitError

	cluster.go:2688,tpce.go:96,tpce.go:113,test_runner.go:768: monitor failure: monitor task failed: t.Fatal() was called
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitor).WaitE
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2676
		  | main.(*monitor).Wait
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2684
		  | main.registerTPCE.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:96
		  | main.registerTPCE.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpce.go:113
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:768
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitor).wait.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2732
		Wraps: (4) monitor task failed
		Wraps: (5) attached stack trace
		  -- stack trace:
		  | main.init
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2646
		  | runtime.doInit
		  | 	/usr/local/go/src/runtime/proc.go:5652
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:191
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (6) t.Fatal() was called
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError

Artifacts: /tpce/c=5000/nodes=3
Related:

See this test on roachdash
powered by pkg/cmd/internal/issues
