roachtest: schemachange/mixed/tpcc failed #40935

Closed
cockroach-teamcity opened this issue Sep 20, 2019 · 112 comments · Fixed by #47492
Labels: branch-master (Failures and bugs on the master branch), C-test-failure (Broken test, automatically or manually discovered), O-roachtest, O-robot (Originated from a bot)

Comments

@cockroach-teamcity
Member

SHA: https://github.com/cockroachdb/cockroach/commits/073999b81ddfed3bbc8409d534912fea12b6d500

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=schemachange/mixed/tpcc PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1498740&tab=artifacts#/schemachange/mixed/tpcc

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20190920-1498740/schemachange/mixed/tpcc/run_1
	schemachange.go:463,schemachange.go:426,cluster.go:2091,errgroup.go:57: pq: foreign key violation: "district" row d_w_id=19, d_id=1 has no match in "warehouse"

cockroach-teamcity added the C-test-failure, O-roachtest, and O-robot labels Sep 20, 2019
cockroach-teamcity added this to the 19.2 milestone Sep 20, 2019
@jordanlewis
Member

This is scary... is this a new failure mode? I don't think that FK check should ever fail. cc @lucy-zhang

@jordanlewis
Member

Okay, this failure is actually coming from the VALIDATE CONSTRAINT statement, not from tpcc itself, which makes me significantly less scared:

						// The FK constraint on tpcc.district referencing tpcc.warehouse is
						// unvalidated, thus this operation will not be a noop.
						`ALTER TABLE tpcc.district VALIDATE CONSTRAINT fk_d_w_id_ref_warehouse;`,

Still, this VALIDATE CONSTRAINT should not be failing.

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/d6a7e59e653596b8baca946b6be714956a0e4c2c

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1499672&tab=artifacts#/schemachange/mixed/tpcc

The test failed on branch=provisional_201909201312_v19.1.5, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20190920-1499672/schemachange/mixed/tpcc/run_1
	schemachange.go:463,schemachange.go:426,cluster.go:2091,errgroup.go:57: pq: relation "tpcc.orderpks" does not exist

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/a92c7d01d3076eabafbd536d8a344511ec9081c6

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1500206&tab=artifacts#/schemachange/mixed/tpcc

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20190921-1500206/schemachange/mixed/tpcc/run_1
	schemachange.go:463,schemachange.go:426,cluster.go:2091,errgroup.go:57: pq: foreign key violation: "district" row d_w_id=2, d_id=1 has no match in "warehouse"

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/6b14c0aa3ed1b4ba6d5f937e9352c5383afe1c37

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1501050&tab=artifacts#/schemachange/mixed/tpcc

The test failed on branch=release-19.1, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20190923-1501050/schemachange/mixed/tpcc/run_1
	schemachange.go:463,schemachange.go:426,cluster.go:2091,errgroup.go:57: pq: relation "tpcc.orderpks" does not exist

thoszhang self-assigned this Sep 23, 2019
@thoszhang
Contributor

Separately from the failure on master, this is also failing on 19.1 because my attempt to fix the test for 19.1 (#40912) caused more problems: the table that was created using CREATE TABLE AS is now only created in 19.2, so on 19.1 the later statements that reference it fail because the table doesn't exist (hence the `relation "tpcc.orderpks" does not exist` errors above). I'll fix that at least.

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/6b14c0aa3ed1b4ba6d5f937e9352c5383afe1c37

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1502387&tab=artifacts#/schemachange/mixed/tpcc

The test failed on branch=provisional_201909231358_v19.1.5, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20190923-1502387/schemachange/mixed/tpcc/run_1
	schemachange.go:463,schemachange.go:426,cluster.go:2091,errgroup.go:57: pq: relation "tpcc.orderpks" does not exist

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/4b6ab256d09e9a03e68156a3504e08f249cd0af1

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1505090&tab=artifacts#/schemachange/mixed/tpcc

The test failed on branch=release-19.1, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20190925-1505090/schemachange/mixed/tpcc/run_1
	schemachange.go:463,schemachange.go:426,cluster.go:2118,errgroup.go:57: pq: relation "tpcc.orderpks" does not exist

craig bot pushed a commit that referenced this issue Sep 25, 2019
41079: roachtest: fix schemachange/mixed/tpcc for 19.1 r=lucy-zhang a=lucy-zhang

`schemachange/mixed/tpcc` uses `CREATE TABLE AS` in 19.2. This PR will have the
test correctly create a similar table in 19.1 without using `CREATE TABLE AS`.

See #40935 (comment).

Release justification: Fixes a test.

Release note: None

Co-authored-by: Lucy Zhang <[email protected]>
@thoszhang
Contributor

The FK constraint that's being violated is

alter table district add foreign key (d_w_id) references warehouse (w_id) not valid

and the rows reported as invalid in district in the two test failures are

d_w_id=19, d_id=1
d_w_id=2, d_id=1

which doesn't make sense, since rows in warehouse with w_ids 2 and 19 should obviously exist. So, either some rows were deleted/dropped from warehouse (which seems unlikely), or there's something wrong with the join that's used to find orphaned rows in the origin table.
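
To make the second possibility concrete: the check amounts to an anti-join from district to warehouse. Below is a minimal in-memory sketch in Go of what such a check is supposed to compute. It is not CockroachDB's implementation, and `districtRow` and `findOrphans` are hypothetical names introduced only for illustration.

```go
package main

import "fmt"

// districtRow is a hypothetical, pared-down stand-in for a row of tpcc.district.
type districtRow struct {
	DWID int // d_w_id, the referencing column
	DID  int // d_id
}

// findOrphans returns the district rows whose d_w_id has no matching w_id in
// warehouse. It is an illustrative in-memory anti-join, not the real check.
func findOrphans(districts []districtRow, warehouseIDs []int) []districtRow {
	present := make(map[int]struct{}, len(warehouseIDs))
	for _, id := range warehouseIDs {
		present[id] = struct{}{}
	}
	var orphans []districtRow
	for _, d := range districts {
		if _, ok := present[d.DWID]; !ok {
			orphans = append(orphans, d)
		}
	}
	return orphans
}

func main() {
	districts := []districtRow{{DWID: 2, DID: 1}, {DWID: 19, DID: 1}}

	// Both parent rows exist, so a correct check should report no orphans.
	fmt.Println(len(findOrphans(districts, []int{1, 2, 19})))

	// Only if warehouse 19 were really missing would (19, 1) be an orphan.
	fmt.Println(findOrphans(districts, []int{1, 2}))
}
```

With warehouses 2 and 19 present, a correct check returns nothing for the two reported district rows, so a reported violation would suggest the check (or the data it read) is at fault rather than the table contents.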

I'm stressing this roachtest right now, but haven't been able to reproduce these failures yet.

@thoszhang
Contributor

Got a successful reproduction with a significantly shorter version of this test:

diff --git a/pkg/cmd/roachtest/schemachange.go b/pkg/cmd/roachtest/schemachange.go
index 77848013b6..9528a22a70 100644
--- a/pkg/cmd/roachtest/schemachange.go
+++ b/pkg/cmd/roachtest/schemachange.go
@@ -424,25 +424,14 @@ func makeMixedSchemaChanges(spec clusterSpec, warehouses int, length time.Durati
                                                }
                                        }
                                        return runAndLogStmts(ctx, t, c, "mixed-schema-changes", []string{
-                                               `CREATE INDEX ON tpcc.order (o_carrier_id);`,
-
-                                               `CREATE TABLE tpcc.customerpks (c_w_id INT, c_d_id INT, c_id INT, FOREIGN KEY (c_w_id, c_d_id, c_id) REFERENCES tpcc.customer (c_w_id, c_d_id, c_id));`,
-
-                                               `ALTER TABLE tpcc.order ADD COLUMN orderdiscount INT DEFAULT 0;`,
-                                               `ALTER TABLE tpcc.order ADD CONSTRAINT nodiscount CHECK (orderdiscount = 0);`,
-
                                                `ALTER TABLE tpcc.orderpks ADD CONSTRAINT warehouse_id FOREIGN KEY (o_w_id) REFERENCES tpcc.warehouse (w_id);`,

                                                // The FK constraint on tpcc.district referencing tpcc.warehouse is
                                                // unvalidated, thus this operation will not be a noop.
                                                `ALTER TABLE tpcc.district VALIDATE CONSTRAINT fk_d_w_id_ref_warehouse;`,
-
-                                               `ALTER TABLE tpcc.orderpks RENAME TO tpcc.readytodrop;`,
-                                               `TRUNCATE TABLE tpcc.readytodrop CASCADE;`,
-                                               `DROP TABLE tpcc.readytodrop CASCADE;`,
                                        })
                                },
-                               Duration: length,
+                               Duration: 10 * time.Minute,
                        })
                },
                MinVersion: "v19.1.0",

(this diff is against 073999b)

pq: foreign key violation: "district" row d_w_id=225, d_id=1 has no match in "warehouse"

@thoszhang
Contributor

The patch I posted has produced this failure in 5/10 runs. I checked two of the clusters, and the relevant rows in warehouse and district looked normal, and running VALIDATE CONSTRAINT again turned up no violations (nor did dropping the constraint and creating it again). So we don't appear to have any on-disk inconsistency.

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/77f26d185efb436aaac88243de19a27caa5da9b6

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1509340&tab=artifacts#/schemachange/mixed/tpcc

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20190926-1509340/schemachange/mixed/tpcc/run_1
	schemachange.go:472,schemachange.go:435,cluster.go:2120,errgroup.go:57: pq: foreign key violation: "district" row d_w_id=289, d_id=1 has no match in "warehouse"

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/fb0bd2f43fb87b26752615bbdbc759502c2c8b0b

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1509988&tab=artifacts#/schemachange/mixed/tpcc

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20190926-1509988/schemachange/mixed/tpcc/run_1
	schemachange.go:472,schemachange.go:435,cluster.go:2120,errgroup.go:57: pq: foreign key violation: "district" row d_w_id=576, d_id=1 has no match in "warehouse"

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/cf5c2bd2372e633d2f63e08e5bffca7c2a7ec59f

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1510142&tab=artifacts#/schemachange/mixed/tpcc

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20190927-1510142/schemachange/mixed/tpcc/run_1
	schemachange.go:472,schemachange.go:435,cluster.go:2120,errgroup.go:57: pq: foreign key violation: "district" row d_w_id=87, d_id=1 has no match in "warehouse"

ajwerner mentioned this issue Sep 30, 2019 (18 tasks)
@andreimatei
Contributor

This also failed for the beta release with the FK violation. What's the current thinking on it? Is it bad?
#41128 (comment)

@thoszhang
Contributor

Yeah, the failures on 19.1 are due to a trivial problem with the test, which is now fixed. The real problem is the foreign key violations reported by VALIDATE CONSTRAINT on master. It's definitely a release blocker and is on the list, but I don't think it should block the beta.

@thoszhang
Contributor

I've been able to reproduce this failure by running only the VALIDATE CONSTRAINT alongside the TPCC traffic, so the failure does not depend on other prior schema changes. However, I haven't gotten the failure when I turn off the TPCC loadgen completely (after stressing it a few dozen times overnight), so there is some interaction with the TPCC traffic that's causing this.

I'm currently trying out replacing the VALIDATE CONSTRAINT with just the SELECT query that's used internally, to see if the bug affects a more general class of queries. After these attempts to make the test smaller, the plan is to try to bisect to find the PR that caused this.
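
For reference, a query of roughly the following shape could stand in for the validation. This is an assumption about the general form (a left join that keeps child rows with no parent), not the exact statement VALIDATE CONSTRAINT runs internally. `orphanQuery` is a hypothetical helper, and it interpolates identifiers unquoted, so it is only suitable for trusted names in a test.

```go
package main

import "fmt"

// orphanQuery builds a SELECT that returns at most one row of child whose
// childCol is non-NULL and has no match on parent.parentCol.
// Assumption: single-column foreign key, identifiers already trusted.
func orphanQuery(child, childCol, parent, parentCol string) string {
	return fmt.Sprintf(
		"SELECT c.%[2]s FROM %[1]s AS c LEFT JOIN %[3]s AS p ON c.%[2]s = p.%[4]s "+
			"WHERE c.%[2]s IS NOT NULL AND p.%[4]s IS NULL LIMIT 1",
		child, childCol, parent, parentCol,
	)
}

func main() {
	// The constraint under discussion: district.d_w_id references warehouse.w_id.
	fmt.Println(orphanQuery("tpcc.district", "d_w_id", "tpcc.warehouse", "w_id"))
}
```

Running the generated statement in a loop alongside the TPCC load would show whether a plain SELECT also returns rows that should not exist, independent of the schema change machinery.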

@andreimatei
Contributor

> but I don't think it should block the beta

Then please tick @mjibson's box on #41128 (comment)

@cockroach-teamcity
Member Author

(roachtest).schemachange/mixed/tpcc failed on master@14094e3d5ea5f548dfada7bdb6e0e1158f53e168:

		     4479.0s  1188122          855.8          643.7    469.8    805.3    939.5   1073.7 payment
		     4479.0s  1188122           94.1           64.4    318.8    536.9    637.5    704.6 stockLevel
		     4480.0s  1188122           54.0           64.4    973.1   1543.5   1677.7   1811.9 delivery
		     4480.0s  1188122          528.3          378.4    838.9   1208.0   1476.4   1677.7 newOrder
		     4480.0s  1188122           56.0           64.4    167.8    352.3    402.7    436.2 orderStatus
		     4480.0s  1188122          683.4          643.8    536.9    872.4   1006.6   1275.1 payment
		     4480.0s  1188122           66.0           64.4    335.5    671.1    906.0   1140.9 stockLevel
		    _elapsed___errors__ops/sec(inst)___ops/sec(cum)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
		     4481.0s  1188122           49.0           64.4   1275.1   2013.3   2281.7   2281.7 delivery
		     4481.0s  1188122          653.7          378.5    973.1   1543.5   1811.9   2013.3 newOrder
		     4481.0s  1188122           67.0           64.4    176.2    486.5    570.4    604.0 orderStatus
		     4481.0s  1188122          514.7          643.7    536.9    973.1   1208.0   1409.3 payment
		     4481.0s  1188122           52.0           64.4    302.0    570.4    604.0    838.9 stockLevel
		     4482.0s  1188122           67.0           64.4   1073.7   2684.4   2818.6   2952.8 delivery
		     4482.0s  1188122          625.2          378.6    872.4   1543.5   1744.8   2013.3 newOrder
		     4482.0s  1188122           57.0           64.4    125.8    335.5    486.5    536.9 orderStatus
		     4482.0s  1188122          574.2          643.7    570.4    939.5   1140.9   1409.3 payment
		     4482.0s  1188122           65.0           64.4    402.7    906.0    973.1   1140.9 stockLevel:
		  - context canceled

	cluster.go:2368,tpcc.go:168,schemachange.go:416,test_runner.go:741: error with attached stack trace:
		    main.(*monitor).WaitE
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2356
		    main.(*monitor).Wait
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2364
		    main.runTPCC
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcc.go:168
		    main.makeMixedSchemaChanges.func1
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/schemachange.go:416
		    main.(*testRunner).runTest.func2
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:741
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357
		  - monitor failure:
		  - error with attached stack trace:
		    main.(*monitor).wait.func2
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2412
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357
		  - monitor task failed:
		  - error with attached stack trace:
		    main.init
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2309
		    runtime.doInit
		    	/usr/local/go/src/runtime/proc.go:5222
		    runtime.main
		    	/usr/local/go/src/runtime/proc.go:190
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357
		  - Goexit() was called

Artifacts: /schemachange/mixed/tpcc
Related:

See this test on roachdash
powered by pkg/cmd/internal/issues

@cockroach-teamcity
Member Author

(roachtest).schemachange/mixed/tpcc failed on master@860c137d602b8420a24be23e097c7fbe1a0a3547:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20200302-1777546/schemachange/mixed/tpcc/run_1
	test_runner.go:756: test timed out (9h0m0s)

@cockroach-teamcity
Member Author

(roachtest).schemachange/mixed/tpcc failed on master@5f9a71adb995837bcff27b9456188018434be4b8:

		     7212.0s     1227          510.1          547.8    805.3   1409.3   1744.8   1946.2 payment
		     7212.0s     1227           60.0           54.8    318.8   1744.8   1811.9   1811.9 stockLevel
		    _elapsed___errors__ops/sec(inst)___ops/sec(cum)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
		     7213.0s     1227           54.0           54.8   1073.7   1811.9   1946.2   2281.7 delivery
		     7213.0s     1227          510.9          547.6    771.8   1476.4   1677.7   2080.4 newOrder
		     7213.0s     1227           52.0           54.8    100.7    268.4    352.3    402.7 orderStatus
		     7213.0s     1227          597.9          547.8    906.0   1610.6   1744.8   2013.3 payment
		     7213.0s     1227           49.0           54.8   1040.2   2281.7   2550.1   2550.1 stockLevel
		     7214.0s     1227           57.0           54.8    973.1   1744.8   2013.3   2147.5 delivery
		     7214.0s     1227          532.2          547.6    906.0   1610.6   1879.0   2415.9 newOrder
		     7214.0s     1227           58.0           54.8    125.8    260.0    285.2    352.3 orderStatus
		     7214.0s     1227          507.2          547.8    838.9   1409.3   1677.7   2013.3 payment
		     7214.0s     1227           63.0           54.8    469.8   1946.2   2080.4   2080.4 stockLevel
		     7215.0s     1227           57.9           54.8    771.8   1811.9   1946.2   1946.2 delivery
		     7215.0s     1227          581.5          547.6    771.8   1208.0   1476.4   2080.4 newOrder
		     7215.0s     1227           59.9           54.8    142.6    302.0    402.7    402.7 orderStatus
		     7215.0s     1227          590.5          547.8    805.3   1342.2   1677.7   2013.3 payment
		     7215.0s     1227           56.0           54.8    369.1   1409.3   1744.8   1946.2 stockLevel:
		  - context canceled

	cluster.go:2368,tpcc.go:168,schemachange.go:416,test_runner.go:741: error with attached stack trace:
		    main.(*monitor).WaitE
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2356
		    main.(*monitor).Wait
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2364
		    main.runTPCC
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcc.go:168
		    main.makeMixedSchemaChanges.func1
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/schemachange.go:416
		    main.(*testRunner).runTest.func2
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:741
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357
		  - monitor failure:
		  - error with attached stack trace:
		    main.(*monitor).wait.func2
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2412
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357
		  - monitor task failed:
		  - error with attached stack trace:
		    main.init
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2309
		    runtime.doInit
		    	/usr/local/go/src/runtime/proc.go:5222
		    runtime.main
		    	/usr/local/go/src/runtime/proc.go:190
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357
		  - Goexit() was called

@cockroach-teamcity
Member Author

(roachtest).schemachange/mixed/tpcc failed on master@d6a52a8f6f07b3b6b32af2243075e5365fb21c45:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20200304-1783198/schemachange/mixed/tpcc/run_1
	test_runner.go:756: test timed out (9h0m0s)

@cockroach-teamcity
Member Author

(roachtest).schemachange/mixed/tpcc failed on master@c9b189b3a2171c8da78c75fe80dd02933e18a5e5:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20200305-1786843/schemachange/mixed/tpcc/run_1
	test_runner.go:756: test timed out (9h0m0s)

@cockroach-teamcity
Member Author

(roachtest).schemachange/mixed/tpcc failed on master@752dea867f3aeb142e98c22f8d320ce19041aa8d:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20200307-1793930/schemachange/mixed/tpcc/run_1
	test_runner.go:756: test timed out (9h0m0s)

@cockroach-teamcity
Member Author

(roachtest).schemachange/mixed/tpcc failed on master@dfa5bd527ae7d7373dd03c62118df87a87a77130:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20200308-1795062/schemachange/mixed/tpcc/run_1
	test_runner.go:756: test timed out (9h0m0s)

@cockroach-teamcity
Member Author

(roachtest).schemachange/mixed/tpcc failed on master@c473f40078994551cebcbe00fdbf1fa388957658:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20200309-1796240/schemachange/mixed/tpcc/run_1
	schemachange.go:476,schemachange.go:439,cluster.go:2344,errgroup.go:57: pq: server is not accepting clients

	cluster.go:2368,tpcc.go:168,schemachange.go:416,test_runner.go:741: error with attached stack trace:
		    main.(*monitor).WaitE
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2356
		    main.(*monitor).Wait
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2364
		    main.runTPCC
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcc.go:168
		    main.makeMixedSchemaChanges.func1
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/schemachange.go:416
		    main.(*testRunner).runTest.func2
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:741
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357
		  - monitor failure:
		  - error with attached stack trace:
		    main.(*monitor).wait.func3
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2416
		    main.(*monitor).wait.func4
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2445
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357
		  - monitor command failure:
		  - signal: interrupt

	cluster.go:2050,cluster.go:2069,cluster.go:2173,cluster.go:1470,context.go:135,cluster.go:1467,test_runner.go:774: context canceled

@cockroach-teamcity
Member Author

(roachtest).schemachange/mixed/tpcc failed on master@72c4a1bd411f2f82bf9aaa22883821a946614148:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20200310-1799071/schemachange/mixed/tpcc/run_1
	test_runner.go:779: test timed out (9h0m0s)

@cockroach-teamcity
Member Author

(roachtest).schemachange/mixed/tpcc failed on master@793a9200c16693aff32aa6a4dd9d8bbcbddb30aa:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20200312-1804460/schemachange/mixed/tpcc/run_1
	test_runner.go:779: test timed out (9h0m0s)

@cockroach-teamcity
Member Author

(roachtest).schemachange/mixed/tpcc failed on master@69dc87d68addedf2fabfb2b14c098cfb35b5f3d0:

		    10282.0s        0          537.0          489.8    939.5   1476.4   1879.0   2281.7 newOrder
		    10282.0s        0           42.0           49.0    113.2    302.0    738.2    738.2 orderStatus
		    10282.0s        0          429.0          489.8    771.8   1275.1   1543.5   1811.9 payment
		    10282.0s        0           54.0           49.0    419.4   1811.9   2952.8   3221.2 stockLevel
		    10283.0s        0           68.0           49.0   1073.7   1811.9   1946.2   1946.2 delivery
		    10283.0s        0          544.7          489.8    838.9   1476.4   2013.3   2818.6 newOrder
		    10283.0s        0           48.0           49.0     92.3    302.0    402.7    402.7 orderStatus
		    10283.0s        0          560.7          489.8    771.8   1476.4   1811.9   2080.4 payment
		    10283.0s        0           35.0           49.0    402.7   1610.6   2415.9   2415.9 stockLevel
		    10284.0s        0           53.0           49.0   1006.6   1342.2   1543.5   1744.8 delivery
		    10284.0s        0          419.3          489.8   1006.6   1744.8   2013.3   2147.5 newOrder
		    10284.0s        0           53.0           49.0    121.6    302.0    335.5    335.5 orderStatus
		    10284.0s        0          486.4          489.8    838.9   1677.7   1811.9   2013.3 payment
		    10284.0s        0           51.0           49.0    503.3   1610.6   1677.7   2818.6 stockLevel:
		  - context canceled

	cluster.go:2368,tpcc.go:168,schemachange.go:416,test_runner.go:753: error with attached stack trace:
		    main.(*monitor).WaitE
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2356
		    main.(*monitor).Wait
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2364
		    main.runTPCC
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcc.go:168
		    main.makeMixedSchemaChanges.func1
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/schemachange.go:416
		    main.(*testRunner).runTest.func2
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:753
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357
		  - monitor failure:
		  - error with attached stack trace:
		    main.(*monitor).wait.func2
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2412
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357
		  - monitor task failed:
		  - error with attached stack trace:
		    main.init
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2309
		    runtime.doInit
		    	/usr/local/go/src/runtime/proc.go:5222
		    runtime.main
		    	/usr/local/go/src/runtime/proc.go:190
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357
		  - Goexit() was called


Failed to find issue assignee: 
couldn't find GitHub commits for user email [email protected]
More

Artifacts: /schemachange/mixed/tpcc
Related:

See this test on roachdash
powered by pkg/cmd/internal/issues

@cockroach-teamcity
Member Author

(roachtest).schemachange/mixed/tpcc failed on master@f5585e933a097b53242d8d5800127b821a9a4d41:

		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2010
		    main.(*cluster).Run
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:1933
		    main.runTPCC
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcc.go:170
		    main.makeMixedSchemaChanges.func1
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/schemachange.go:416
		    main.(*testRunner).runTest.func2
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:753
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357
		  - error with embedded safe details: output in %s
		    -- arg 1: <string>
		  - output in run_190418.811_n5_workload_check_tpcc:
		  - error with attached stack trace:
		    main.execCmd
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:406
		    main.(*cluster).RunL
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2019
		    main.(*cluster).RunE
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2000
		    main.(*cluster).Run
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:1933
		    main.runTPCC
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcc.go:170
		    main.makeMixedSchemaChanges.func1
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/schemachange.go:416
		    main.(*testRunner).runTest.func2
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:753
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357
		  - error with embedded safe details: %s returned:
		    stderr:
		    %s
		    stdout:
		    %s
		    -- arg 1: <string>
		    -- arg 2: <string>
		    -- arg 3: <string>
		  - /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1807675-1584170966-89-n5cpu16:5 -- ./workload check tpcc --warehouses=1000 {pgurl:1} returned:
		    stderr:
		    I200314 19:04:21.233426 1 workload/tpcc/tpcc.go:372  check 3.3.2.1 took 482.62863ms
		    I200314 19:04:40.845718 1 workload/tpcc/tpcc.go:372  check 3.3.2.2 took 19.612242413s
		    I200314 19:04:45.923032 1 workload/tpcc/tpcc.go:372  check 3.3.2.3 took 5.077249758s
		    I200314 19:09:42.567124 1 workload/tpcc/tpcc.go:372  check 3.3.2.4 took 4m56.644009043s
		    I200314 19:12:23.536500 1 workload/tpcc/tpcc.go:372  check 3.3.2.5 took 2m40.969273915s
		    I200314 19:14:53.353251 1 workload/tpcc/tpcc.go:372  check 3.3.2.7 took 2m29.81661868s
		    
		    stdout::
		  - context canceled


@cockroach-teamcity
Member Author

(roachtest).schemachange/mixed/tpcc failed on master@d8bdc938177ea4b5226b95ee6cc51af0951054fc:

		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2010
		    main.(*cluster).Run
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:1933
		    main.runTPCC
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcc.go:170
		    main.makeMixedSchemaChanges.func1
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/schemachange.go:416
		    main.(*testRunner).runTest.func2
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:753
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357
		  - error with embedded safe details: output in %s
		    -- arg 1: <string>
		  - output in run_190659.571_n5_workload_check_tpcc:
		  - error with attached stack trace:
		    main.execCmd
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:406
		    main.(*cluster).RunL
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2019
		    main.(*cluster).RunE
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2000
		    main.(*cluster).Run
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:1933
		    main.runTPCC
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcc.go:170
		    main.makeMixedSchemaChanges.func1
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/schemachange.go:416
		    main.(*testRunner).runTest.func2
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:753
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357
		  - error with embedded safe details: %s returned:
		    stderr:
		    %s
		    stdout:
		    %s
		    -- arg 1: <string>
		    -- arg 2: <string>
		    -- arg 3: <string>
		  - /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1809758-1584345030-86-n5cpu16:5 -- ./workload check tpcc --warehouses=1000 {pgurl:1} returned:
		    stderr:
		    I200316 19:07:02.471566 1 workload/tpcc/tpcc.go:372  check 3.3.2.1 took 1.021690374s
		    I200316 19:07:45.154051 1 workload/tpcc/tpcc.go:372  check 3.3.2.2 took 42.68243788s
		    I200316 19:07:49.074491 1 workload/tpcc/tpcc.go:372  check 3.3.2.3 took 3.920375765s
		    I200316 19:15:14.435677 1 workload/tpcc/tpcc.go:372  check 3.3.2.4 took 7m25.361141509s
		    I200316 19:20:03.119103 1 workload/tpcc/tpcc.go:372  check 3.3.2.5 took 4m48.683374407s
		    I200316 19:24:31.809825 1 workload/tpcc/tpcc.go:372  check 3.3.2.7 took 4m28.690663585s
		    
		    stdout::
		  - context canceled


@cockroach-teamcity
Member Author

(roachtest).schemachange/mixed/tpcc failed on master@5a3d0c9539a671f0e55b680d3021b18dde9d190d:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20200317-1811809/schemachange/mixed/tpcc/run_1
	test_runner.go:785: test timed out (9h0m0s)


@cockroach-teamcity
Member Author

(roachtest).schemachange/mixed/tpcc failed on master@d571624ae4f833c49c717728a74cc7be78a791f0:

		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2010
		    main.(*cluster).Run
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:1933
		    main.runTPCC
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcc.go:170
		    main.makeMixedSchemaChanges.func1
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/schemachange.go:416
		    main.(*testRunner).runTest.func2
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:753
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357
		  - error with embedded safe details: output in %s
		    -- arg 1: <string>
		  - output in run_180753.587_n5_workload_check_tpcc:
		  - error with attached stack trace:
		    main.execCmd
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:406
		    main.(*cluster).RunL
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2019
		    main.(*cluster).RunE
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2000
		    main.(*cluster).Run
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:1933
		    main.runTPCC
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcc.go:170
		    main.makeMixedSchemaChanges.func1
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/schemachange.go:416
		    main.(*testRunner).runTest.func2
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:753
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357
		  - error with embedded safe details: %s returned:
		    stderr:
		    %s
		    stdout:
		    %s
		    -- arg 1: <string>
		    -- arg 2: <string>
		    -- arg 3: <string>
		  - /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1816964-1584603715-85-n5cpu16:5 -- ./workload check tpcc --warehouses=1000 {pgurl:1} returned:
		    stderr:
		    I200319 18:07:55.785641 1 workload/tpcc/tpcc.go:372  check 3.3.2.1 took 290.903623ms
		    I200319 18:08:29.858022 1 workload/tpcc/tpcc.go:372  check 3.3.2.2 took 34.07233073s
		    I200319 18:08:34.739190 1 workload/tpcc/tpcc.go:372  check 3.3.2.3 took 4.881107107s
		    I200319 18:13:08.290740 1 workload/tpcc/tpcc.go:372  check 3.3.2.4 took 4m33.551493711s
		    I200319 18:16:03.283703 1 workload/tpcc/tpcc.go:372  check 3.3.2.5 took 2m54.992905367s
		    I200319 18:20:59.754147 1 workload/tpcc/tpcc.go:372  check 3.3.2.7 took 4m56.470391205s
		    
		    stdout::
		  - context canceled


@cockroach-teamcity
Member Author

(roachtest).schemachange/mixed/tpcc failed on master@2032dafccfa311c7538960e974953cb9dc1d4e50:

		     7417.0s        0          397.1          490.8   1140.9   2415.9   3355.4   3623.9 newOrder
		     7417.0s        0           46.0           49.1    151.0    402.7    536.9    536.9 orderStatus
		     7417.0s        0          452.2          490.9    771.8   1275.1   1744.8   2281.7 payment
		     7417.0s        0           53.0           49.1    604.0   1811.9   2013.3   2684.4 stockLevel
		     7418.0s        0           39.0           49.1   1677.7   2550.1   3892.3   3892.3 delivery
		     7418.0s        0          487.7          490.8   1208.0   2147.5   2952.8   3758.1 newOrder
		     7418.0s        0           47.0           49.1    159.4    352.3    369.1    369.1 orderStatus
		     7418.0s        0          431.7          490.9    805.3   1610.6   1744.8   2550.1 payment
		     7418.0s        0           44.0           49.1    503.3   1744.8   3221.2   3221.2 stockLevel
		     7419.0s        0           53.0           49.1   1476.4   2952.8   2952.8   3221.2 delivery
		     7419.0s        0          494.0          490.8   1040.2   2415.9   2952.8   3355.4 newOrder
		     7419.0s        0           38.0           49.1    125.8    335.5    469.8    469.8 orderStatus
		     7419.0s        0          453.0          490.8    838.9   1208.0   2281.7   2952.8 payment
		     7419.0s        0           44.0           49.1    671.1   1610.6   3087.0   3087.0 stockLevel:
		  - context canceled

	cluster.go:2368,tpcc.go:168,schemachange.go:416,test_runner.go:753: error with attached stack trace:
		    main.(*monitor).WaitE
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2356
		    main.(*monitor).Wait
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2364
		    main.runTPCC
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcc.go:168
		    main.makeMixedSchemaChanges.func1
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/schemachange.go:416
		    main.(*testRunner).runTest.func2
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:753
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357
		  - monitor failure:
		  - error with attached stack trace:
		    main.(*monitor).wait.func2
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2412
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357
		  - monitor task failed:
		  - error with attached stack trace:
		    main.init
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2309
		    runtime.doInit
		    	/usr/local/go/src/runtime/proc.go:5222
		    runtime.main
		    	/usr/local/go/src/runtime/proc.go:190
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357
		  - Goexit() was called


Failed to find issue assignee: 
couldn't find GitHub commits for user email [email protected]

@cockroach-teamcity
Member Author

(roachtest).schemachange/mixed/tpcc failed on master@7b0f60fe2034ba8677242dbcdad86d3e5587c0d4:

		Wraps: (6) context canceled
		Error types: (1) *withstack.withStack (2) *safedetails.withSafeDetails (3) *errutil.withMessage (4) *main.withCommandDetails (5) *secondary.withSecondaryError (6) *errors.errorString

	cluster.go:2418,tpcc.go:168,schemachange.go:416,test_runner.go:753: monitor failure: unexpected node event: 2: dead
		(1) attached stack trace
		  | main.(*monitor).WaitE
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2406
		  | main.(*monitor).Wait
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2414
		  | main.runTPCC
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcc.go:168
		  | main.makeMixedSchemaChanges.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/schemachange.go:416
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:753
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1357
		Wraps: (2) monitor failure
		Wraps: (3) unexpected node event: 2: dead
		Error types: (1) *withstack.withStack (2) *errutil.withMessage (3) *errors.errorString

	cluster.go:1460,context.go:135,cluster.go:1449,test_runner.go:825: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-1869957-1586676672-77-n5cpu16 --oneshot --ignore-empty-nodes: exit status 1 5: skipped
		1: 4250
		2: dead
		4: 4126
		3: 4163
		Error: UNCLASSIFIED_PROBLEM: 2: dead
		(1) UNCLASSIFIED_PROBLEM
		Wraps: (2) 2: dead
		  | main.glob..func13
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1129
		  | main.wrap.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:272
		  | github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra.(*Command).execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:766
		  | github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra.(*Command).ExecuteC
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:852
		  | github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra.(*Command).Execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:800
		  | main.main
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1793
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:203
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1357
		Error types: (1) errors.Unclassified (2) *errors.fundamental


Failed to find issue assignee: 
couldn't find GitHub commits for user email [email protected]

@nvanbenschoten
Member

F200412 15:39:08.307836 95496295 kv/kvclient/kvcoord/dist_sender.go:1250  [n2] received 215 results, limit was 200
goroutine 95496295 [running]:
github.com/cockroachdb/cockroach/pkg/util/log.getStacks(0x763c101, 0xed6252d1c, 0x0, 0xc00d2572f8)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/get_stacks.go:25 +0xb8
github.com/cockroachdb/cockroach/pkg/util/log.(*loggerT).outputLogEntry(0x7638e20, 0xc000000004, 0x6b7b4b5, 0x22, 0x4e2, 0xc00e29d410, 0x28)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/clog.go:212 +0xa0c
github.com/cockroachdb/cockroach/pkg/util/log.addStructured(0x4c68ae0, 0xc01672ee10, 0x4, 0x2, 0x43e8a63, 0x21, 0xc00d257420, 0x2, 0x2)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/structured.go:66 +0x2c9
github.com/cockroachdb/cockroach/pkg/util/log.logDepth(0x4c68ae0, 0xc01672ee10, 0x1, 0x4, 0x43e8a63, 0x21, 0xc00d257420, 0x2, 0x2)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:44 +0x8c
github.com/cockroachdb/cockroach/pkg/util/log.Fatalf(...)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:155
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*DistSender).divideAndSendBatchToRanges(0xc000970a20, 0x4c68ae0, 0xc01672ee10, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go:1250 +0xcfd
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*DistSender).Send(0xc000970a20, 0x4c68ae0, 0xc01672edb0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go:730 +0x785
github.com/cockroachdb/cockroach/pkg/kv.(*CrossRangeTxnWrapperSender).Send(0xc00027a060, 0x4c68ae0, 0xc01672edb0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/db.go:223 +0x95
github.com/cockroachdb/cockroach/pkg/internal/client/requestbatcher.(*RequestBatcher).sendBatch.func1.1(0x4c68ae0, 0xc01672edb0, 0x4c68ae0, 0xc026e8cf00)
	/go/src/github.com/cockroachdb/cockroach/pkg/internal/client/requestbatcher/batcher.go:275 +0xdc
github.com/cockroachdb/cockroach/pkg/internal/client/requestbatcher.(*RequestBatcher).sendBatch.func1(0x4c68ae0, 0xc01672edb0)
	/go/src/github.com/cockroachdb/cockroach/pkg/internal/client/requestbatcher/batcher.go:306 +0x88d
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunWorker.func1(0xc004d61370, 0xc0006eba70, 0xc005c7e8a0)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:198 +0x13e
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunWorker
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:191 +0xa8

We also saw this in #47471. Let's track it there.
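
For readers unfamiliar with the fatal message above: the trace shows it is raised by `log.Fatalf` inside `DistSender.divideAndSendBatchToRanges`, and the text implies a comparison between the number of results and the key limit on the batch. A minimal sketch of that kind of check follows. The function name `checkResultCount` is hypothetical, the "0 means no limit" convention is an assumption, and the sketch returns an error rather than crashing the process.

```go
package main

import "fmt"

// checkResultCount is a hypothetical stand-in for a sender-side sanity
// check. Assumption: a limit of 0 means "no limit"; otherwise the number
// of returned results must not exceed the limit.
func checkResultCount(results, limit int64) error {
	if limit > 0 && results > limit {
		return fmt.Errorf("received %d results, limit was %d", results, limit)
	}
	return nil
}

func main() {
	// Illustrative numbers taken from the log line above.
	fmt.Println(checkResultCount(215, 200))
	// At or under the limit, the check passes.
	fmt.Println(checkResultCount(200, 200))
}
```

A check like this only detects the overshoot after the fact; it says nothing about which request ignored the limit.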

craig bot pushed a commit that referenced this issue Apr 14, 2020
47492: kv: respect exhausted key limit during ranged intent resolution r=nvanbenschoten a=nvanbenschoten

Fixes #47471.
Fixes #40935.

This commit fixes a long-standing bug where ranged intent resolution would not respect the MaxSpanRequestKeys set on a batch once the limit had already been exhausted by other requests in the same batch. Instead of treating the limit as exhausted, ranged intent resolution would consider the limit nonexistent (unbounded). This bug was triggering an assertion in DistSender. We became more likely to hit this issue in v20.1 because we started performing ranged intent resolution more often due to implicit SELECT FOR UPDATE.

This commit fixes the bug in two ways:
1. it addresses the root cause, updating MVCCResolveWriteIntentRangeUsingIter to properly respect the limit placed on the request when it is exhausted.
2. it disables the assertion in DistSender when it detects that we are hitting this bug. This ensures that we don't hit the assertion in mixed version clusters (see #40935). By the time we're in DistSender, the damage is already done and has already potentially resulted in a large Raft entry. Maintaining the assertion doesn't do us any good.
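
The "exhausted limit read as no limit" failure mode described above can be sketched in a few lines of Go. This is an illustration under stated assumptions, not the code from the commit: `resolveRange`, `remainingBuggy`, and `remainingFixed` are hypothetical names, and the conventions "0 means unbounded" and "negative means exhausted" are assumed for the sketch.

```go
package main

import "fmt"

// resolveRange is a hypothetical stand-in for ranged intent resolution.
// It returns how many of the pending intents it resolves.
// Assumed convention: max == 0 means "no limit", max < 0 means
// "limit already exhausted".
func resolveRange(pending, max int64) int64 {
	if max < 0 {
		// An exhausted limit resolves nothing.
		return 0
	}
	if max == 0 || pending < max {
		return pending
	}
	return max
}

// remainingBuggy mirrors the described bug: once earlier requests in the
// batch have used the whole limit, the remainder is 0, which the callee
// reads as "no limit".
func remainingBuggy(limit, used int64) int64 {
	return limit - used
}

// remainingFixed reports an exhausted limit as -1 so that it cannot be
// mistaken for "no limit".
func remainingFixed(limit, used int64) int64 {
	if limit == 0 {
		return 0 // no limit was set on the batch
	}
	if used >= limit {
		return -1
	}
	return limit - used
}

func main() {
	// Illustrative numbers only.
	const limit, used, pending = 200, 200, 15
	fmt.Println(resolveRange(pending, remainingBuggy(limit, used))) // 15
	fmt.Println(resolveRange(pending, remainingFixed(limit, used))) // 0
}
```

With the buggy remainder, the batch ends up with more results than its limit allows, which is the condition a sender-side check would then flag.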

Release notes (bug fix): a bug that could trigger an assertion with the text "received X results, limit was Y" has been fixed. The underlying bug was only performance related and could not cause user-visible correctness violations.

Release justification: fixes a medium-priority bug in existing functionality. The bug could result in an assertion failure and a node crashing. Even though this was an old bug (present in many releases before v20.1), it became a lot easier to hit in v20.1 because we started performing ranged intent resolution more often due to implicit SELECT FOR UPDATE.

Co-authored-by: Nathan VanBenschoten <[email protected]>
@craig craig bot closed this as completed in 1ab9eca Apr 14, 2020
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Apr 15, 2020
Fixes cockroachdb#47471.
Fixes cockroachdb#40935.

This commit fixes a long-standing bug where ranged intent resolution
would not respect the MaxSpanRequestKeys set on a batch once the limit
had already been exhausted by other requests in the same batch. Instead
of treating the limit as exhausted, ranged intent resolution would
consider the limit nonexistent (unbounded). This bug was triggering an
assertion in DistSender. We became more likely to hit this issue in
v20.1 because we started performing ranged intent resolution more often
due to implicit SELECT FOR UPDATE.

This commit fixes the bug in two ways:
1. it addresses the root cause, updating MVCCResolveWriteIntentRangeUsingIter
  to properly respect the limit placed on the request when it is exhausted.
2. it disables the assertion in DistSender when it detects that we are hitting
  this bug. This ensures that we don't hit the assertion in mixed version
  clusters (see cockroachdb#40935). By the time we're in DistSender, the damage is
  already done and has already potentially resulted in a large Raft entry.
  Maintaining the assertion doesn't do us any good.

Release notes (bug fix): a bug that could trigger an assertion
with the text "received X results, limit was Y" has been fixed. The
underlying bug was only performance related and could not cause
user-visible correctness violations.

Release justification: fixes a medium-priority bug in existing
functionality. The bug could result in an assertion failure and a node
crashing. Even though this was an old bug (present in many releases
before v20.1), it became a lot easier to hit in v20.1 because we started
performing ranged intent resolution more often due to implicit SELECT
FOR UPDATE.