roachtest: schemachange/index/tpcc/w=100 failed #35734

Closed
cockroach-teamcity opened this issue Mar 14, 2019 · 6 comments

@cockroach-teamcity
Member

SHA: https://github.com/cockroachdb/cockroach/commits/57e825a7940495b67e0cc7213a5fabc24e12be0e

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=schemachange/index/tpcc/w=100 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1176948&tab=buildLog

The test failed on master:
	test.go:1202: test timed out (30m0s)
	cluster.go:1251,tpcc.go:138,schemachange.go:310,test.go:1214: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1176948-schemachange-index-tpcc-w-100:5 -- ./workload check tpcc --warehouses=100 {pgurl:1} returned:
		stderr:
		
		stdout:
		I190314 14:01:51.325134 1 workload/tpcc/tpcc.go:291  check 3.3.2.1 took 358.616857ms
		I190314 14:01:54.595379 1 workload/tpcc/tpcc.go:291  check 3.3.2.2 took 3.270139023s
		I190314 14:01:55.345421 1 workload/tpcc/tpcc.go:291  check 3.3.2.3 took 749.965545ms
		I190314 14:02:18.857901 1 workload/tpcc/tpcc.go:291  check 3.3.2.4 took 23.51238816s
		I190314 14:02:25.166335 1 workload/tpcc/tpcc.go:291  check 3.3.2.5 took 6.308355498s
		I190314 14:02:57.935942 1 workload/tpcc/tpcc.go:291  check 3.3.2.7 took 32.769537106s
		I190314 14:03:02.368606 1 workload/tpcc/tpcc.go:291  check 3.3.2.8 took 4.432602026s
		: signal: killed
	test.go:978,asm_amd64.s:523,panic.go:513,log.go:219,test.go:1174,asm_amd64.s:522,panic.go:397,test.go:774,test.go:760,cluster.go:1251,tpcc.go:138,schemachange.go:310,test.go:1214: write /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20190314-1176948/schemachange/index/tpcc/w=100/test.log: file already closed

@cockroach-teamcity cockroach-teamcity added this to the 19.1 milestone Mar 14, 2019
@cockroach-teamcity cockroach-teamcity added C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. labels Mar 14, 2019
@vivekmenezes
Contributor

This is timing out because of #34834.

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/04ef15974085e14f758b20c552a84052eac9fa2b

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=schemachange/index/tpcc/w=100 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1180753&tab=buildLog

The test failed on master:
	schemachange.go:446,schemachange.go:314,cluster.go:1605,errgroup.go:57: read tcp 172.17.0.2:59392->35.196.208.131:26257: read: connection reset by peer
	cluster.go:1267,tpcc.go:130,cluster.go:1605,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1180753-schemachange-index-tpcc-w-100:5 -- ./workload run tpcc --warehouses=100 --histograms=logs/stats.json --wait=false --tolerate-errors --ramp=5m0s --duration=15m0s {pgurl:1-4} returned:
		stderr:
		
		stdout:
		0.0      0.0 delivery
		  13m16s    39606            0.0            0.8      0.0      0.0      0.0      0.0 newOrder
		  13m16s    39606            0.0            0.1      0.0      0.0      0.0      0.0 orderStatus
		  13m16s    39606            0.0            0.7      0.0      0.0      0.0      0.0 payment
		  13m16s    39606            0.0            0.0      0.0      0.0      0.0      0.0 stockLevel
		E190316 14:38:43.637780 1 workload/cli/run.go:420  error in payment: EOF
		_elapsed___errors__ops/sec(inst)___ops/sec(cum)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
		  13m17s    62522            0.0            0.1      0.0      0.0      0.0      0.0 delivery
		  13m17s    62522            0.0            0.8      0.0      0.0      0.0      0.0 newOrder
		  13m17s    62522            0.0            0.1      0.0      0.0      0.0      0.0 orderStatus
		  13m17s    62522            0.0            0.7      0.0      0.0      0.0      0.0 payment
		  13m17s    62522            0.0            0.0      0.0      0.0      0.0      0.0 stockLevel
		: signal: killed
	cluster.go:1626,tpcc.go:140,schemachange.go:310,test.go:1214: unexpected node event: 3: dead

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/c2939ec9a4f15b7fb8683a5805deeb241953e7aa

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=schemachange/index/tpcc/w=100 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1182991&tab=buildLog

The test failed on master:
	test.go:1202: test timed out (30m0s)
	schemachange.go:446,schemachange.go:314,cluster.go:1605,errgroup.go:57: pq: server is not accepting clients
	cluster.go:1626,tpcc.go:140,schemachange.go:310,test.go:1214: Goexit() was called
	test.go:978,asm_amd64.s:523,panic.go:513,log.go:219,cluster.go:926,context.go:90,cluster.go:916,test.go:1159,asm_amd64.s:522,panic.go:397,test.go:774,test.go:760,cluster.go:1626,tpcc.go:140,schemachange.go:310,test.go:1214: write /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20190318-1182991/schemachange/index/tpcc/w=100/test.log: file already closed

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/a512e390f7f2f2629f3f811bab5866c46e3e5713

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=schemachange/index/tpcc/w=100 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1183678&tab=buildLog

The test failed on provisional_201903122203_v19.1.0-beta.20190318:
	test.go:1202: test timed out (30m0s)
	cluster.go:1267,tpcc.go:130,cluster.go:1605,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1183678-schemachange-index-tpcc-w-100:5 -- ./workload run tpcc --warehouses=100 --histograms=logs/stats.json --wait=false --tolerate-errors --ramp=5m0s --duration=15m0s {pgurl:1-4} returned:
		stderr:
		
		stdout:
		       4            9.0            7.9     32.5    771.8    771.8    771.8 stockLevel
		  13m31s        4            6.0            7.9    184.5   1275.1   1275.1   1275.1 delivery
		  13m31s        4           47.0           78.4    570.4   1208.0   2952.8   2952.8 newOrder
		  13m31s        4            6.0            7.8     41.9    335.5    335.5    335.5 orderStatus
		  13m31s        4           45.0           77.8    243.3   1006.6   1140.9   1140.9 payment
		  13m31s        4            4.0            7.9     37.7    125.8    125.8    125.8 stockLevel
		  13m32s        4            3.0            7.9   1208.0   1208.0   1208.0   1208.0 delivery
		  13m32s        4           35.0           78.4    805.3   1040.2  81604.4  81604.4 newOrder
		  13m32s        4            3.0            7.8      6.3     19.9     19.9     19.9 orderStatus
		  13m32s        4           35.0           77.8    268.4   1342.2 103079.2 103079.2 payment
		  13m32s        4            2.0            7.9    109.1    536.9    536.9    536.9 stockLevel
		: signal: killed
	cluster.go:1626,tpcc.go:140,schemachange.go:310,test.go:1214: Goexit() was called

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/f95d45653df6be5587fb9887de241f50b6932000

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=schemachange/index/tpcc/w=100 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1185396&tab=buildLog

The test failed on master:
	schemachange.go:446,schemachange.go:314,cluster.go:1605,errgroup.go:57: pq: internal error: table: 61 has lease: node_id:4 expiration_time:1553005072948309736 , expected: {1 1553004746044062251 {} 0}: the schema change lease has expired
	cluster.go:1626,tpcc.go:140,schemachange.go:310,test.go:1214: Goexit() was called

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/3a7ea2d8c9d4a3e0d97f8f106fcf95b3f03765ec

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=schemachange/index/tpcc/w=100 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1187480&tab=buildLog

The test failed on master:
	schemachange.go:446,schemachange.go:314,cluster.go:1605,errgroup.go:57: pq: internal error: table: 58 has lease: node_id:4 expiration_time:1553086847557452293 , expected: {1 1553086534453426832 {} 0}: the schema change lease has expired
	cluster.go:1267,tpcc.go:130,cluster.go:1605,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1187480-schemachange-index-tpcc-w-100:5 -- ./workload run tpcc --warehouses=100 --histograms=logs/stats.json --wait=false --tolerate-errors --ramp=5m0s --duration=15m0s {pgurl:1-4} returned:
		stderr:
		
		stdout:
		       0            0.0            4.2      0.0      0.0      0.0      0.0 stockLevel
		  14m18s        0            0.0            4.3      0.0      0.0      0.0      0.0 delivery
		  14m18s        0            1.0           42.7  19327.4  19327.4  19327.4  19327.4 newOrder
		  14m18s        0            0.0            4.3      0.0      0.0      0.0      0.0 orderStatus
		  14m18s        0            3.0           42.2   7784.6  10200.5  10200.5  10200.5 payment
		  14m18s        0            0.0            4.2      0.0      0.0      0.0      0.0 stockLevel
		  14m19s        0            1.0            4.3  45097.2  45097.2  45097.2  45097.2 delivery
		  14m19s        0            2.0           42.6  14495.5  27917.3  27917.3  27917.3 newOrder
		  14m19s        0            0.0            4.3      0.0      0.0      0.0      0.0 orderStatus
		  14m19s        0            1.0           42.2   2818.6   2818.6   2818.6   2818.6 payment
		  14m19s        0            0.0            4.2      0.0      0.0      0.0      0.0 stockLevel
		: signal: killed
	cluster.go:1626,tpcc.go:140,schemachange.go:310,test.go:1214: Goexit() was called

craig bot pushed a commit that referenced this issue Mar 20, 2019
36011: sql: avoid wrapping the special lease expiration error object r=knz a=knz

Fixes #35734.

Issue #35854 notwithstanding, I misunderstood the logic and treated
a "perfectly normal" case as an internal error, which was
wrong. This caused long DDL txns to abort on lease
expirations that they would not renew.

Release note: None

Co-authored-by: Raphael 'kena' Poss <[email protected]>
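
For readers unfamiliar with why wrapping a "special" error object matters, here is a minimal Go sketch of the general pattern. It is not the CockroachDB code: `errLeaseExpired`, `checkLeaseWrapped`, `checkLeaseDirect`, and `shouldRenew` are hypothetical names, and the sketch assumes the caller recognizes the lease-expiration case by comparing against a sentinel error value. Under that assumption, wrapping the sentinel in a new "internal error" hides it from the comparison, so the caller never takes the renewal path.

```go
package main

import (
	"errors"
	"fmt"
)

// errLeaseExpired is a hypothetical sentinel error; the message text is
// borrowed from the failure logs in this issue for illustration only.
var errLeaseExpired = errors.New("the schema change lease has expired")

// checkLeaseWrapped reports the condition by building a new error value
// around the sentinel. The result is a different object, so an identity
// comparison against errLeaseExpired no longer matches.
func checkLeaseWrapped() error {
	return fmt.Errorf("internal error: %v", errLeaseExpired)
}

// checkLeaseDirect returns the sentinel itself, unwrapped.
func checkLeaseDirect() error {
	return errLeaseExpired
}

// shouldRenew mimics a caller that renews the lease only when it sees the
// sentinel by identity (an assumption made for this sketch).
func shouldRenew(err error) bool {
	return err == errLeaseExpired
}

func main() {
	fmt.Println(shouldRenew(checkLeaseWrapped())) // false: the sentinel is hidden
	fmt.Println(shouldRenew(checkLeaseDirect()))  // true: the sentinel is recognized
}
```

In this sketch, the wrapped variant makes an expected condition look like an unexpected failure, which matches the symptom described in the commit message above: the transaction aborts instead of renewing its lease.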
@craig craig bot closed this as completed in #36011 Mar 20, 2019
craig bot pushed a commit that referenced this issue Mar 22, 2019
35953: roachtest: extend the schema change test timeout by a bit r=vivekmenezes a=vivekmenezes

We have #34834 tracking why a schema change is taking more
time. Until that is fixed we'd like to extend this timeout
and run the tests reliably so that we can find other
problems.

related to #35734 #35658 

Release note: None

Co-authored-by: Vivek Menezes <[email protected]>