roachtest: scaledata/filesystem_simulator/nodes=6 failed #39618

Closed
cockroach-teamcity opened this issue Aug 13, 2019 · 14 comments
Labels
C-test-failure: Broken test (automatically or manually discovered).
O-roachtest
O-robot: Originated from a bot.

Comments

@cockroach-teamcity
Member

SHA: https://github.com/cockroachdb/cockroach/commits/51a6fdedf0ce1d1329d40d801a7deaf8206b6b07

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=scaledata/filesystem_simulator/nodes=6 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1436116&tab=buildLog

The test failed on branch=provisional_201908060405_v19.1.4, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20190812-1436116/scaledata/filesystem_simulator/nodes=6/run_1
	cluster.go:2099,scaledata.go:121,scaledata.go:48,test_runner.go:691: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1565651234-44-n7cpu4:7 -- ./filesystem_simulator  --duration_secs=600 --num_workers=16 --cockroach_ip_addresses_csv='10.128.0.121:26257,10.128.0.116:26257,10.128.0.120:26257,10.128.0.115:26257,10.128.0.117:26257,10.128.0.118:26257'  returned:
		stderr:
		09:09:54 RobustDB.RandomDB chose DB at index 4
		2019/08/13 09:09:54 ExecuteTx retry attempt 1 failed, started at 2019-08-13 09:09:54.618583318 +0000 UTC m=+332.747019663, now = 2019-08-13 09:09:54.659717208 +0000 UTC m=+332.788153603, took 41.13394ms
		2019/08/13 09:09:54 Attempt failed with error driver: bad connection: ... Retrying after sleeping 5ns
		2019/08/13 09:09:54 ExecuteTx retry attempt 1 failed, started at 2019-08-13 09:09:51.160817373 +0000 UTC m=+329.289253739, now = 2019-08-13 09:09:54.65977589 +0000 UTC m=+332.788212254, took 3.498958515s
		2019/08/13 09:09:54 Attempt failed with error driver: bad connection: ... Retrying after sleeping 5ns
		2019/08/13 09:09:54 ExecuteTx retry attempt 1 failed, started at 2019-08-13 09:09:54.505491201 +0000 UTC m=+332.633927547, now = 2019-08-13 09:09:54.659874671 +0000 UTC m=+332.788311083, took 154.383536ms
		2019/08/13 09:09:54 Attempt failed with error driver: bad connection: ... Retrying after sleeping 5ns
		2019/08/13 09:09:54 unexpected EOF
		Error:  exit status 255
		
		stdout:
		: exit status 1

cockroach-teamcity added the C-test-failure, O-roachtest, and O-robot labels on Aug 13, 2019
cockroach-teamcity added this to the 19.2 milestone on Aug 13, 2019
@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/e8faca611a902766154ed82581d6d3a7483ad231

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=scaledata/filesystem_simulator/nodes=6 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1462518&tab=buildLog

The test failed on branch=provisional_201908291837_v19.2.0-beta.20190903, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20190830-1462518/scaledata/filesystem_simulator/nodes=6/run_1
	cluster.go:2114,scaledata.go:121,scaledata.go:48,test_runner.go:673: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1567188851-66-n7cpu4:7 -- ./filesystem_simulator  --duration_secs=600 --num_workers=16 --cockroach_ip_addresses_csv='10.128.0.112:26257,10.128.0.91:26257,10.128.15.207:26257,10.128.0.165:26257,10.128.0.159:26257,10.128.0.111:26257'  returned:
		stderr:
		 files - 17026, childRelations - 17025, stripes - 2637
		2019/08/30 22:44:58 Consistency Test 5_240 @ 1567205096964092271.0000000000: sizes :- files - 17026, childRelations - 17025, stripes - 2637
		2019/08/30 22:44:58 Consistency Test 8_254 @ 1567205096986336588.0000000000: sizes :- files - 17027, childRelations - 17026, stripes - 2637
		2019/08/30 22:44:58 ExecuteTx retry attempt 1 failed, started at 2019-08-30 22:44:57.871844929 +0000 UTC m=+576.308444056, now = 2019-08-30 22:44:58.208154207 +0000 UTC m=+576.644753369, took 336.309313ms
		2019/08/30 22:44:58 Attempt failed with error driver: bad connection: ... Retrying after sleeping 5ns
		2019/08/30 22:44:58 ExecuteTx retry attempt 1 failed, started at 2019-08-30 22:44:57.205234313 +0000 UTC m=+575.641833432, now = 2019-08-30 22:44:58.208845895 +0000 UTC m=+576.645445061, took 1.003611629s
		2019/08/30 22:44:58 Aborting Retries because this error of type *errors.errorString is not retryable : unexpected EOF
		2019/08/30 22:44:58 unexpected EOF
		Error:  exit status 255
		
		stdout:
		: exit status 1

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/d51fa78ff90a113c9009d263dfaf58d3672670a6

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=scaledata/filesystem_simulator/nodes=6 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1463583&tab=buildLog

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20190902-1463583/scaledata/filesystem_simulator/nodes=6/run_1
	cluster.go:2114,scaledata.go:121,scaledata.go:48,test_runner.go:673: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1567405952-63-n7cpu4:7 -- ./filesystem_simulator  --duration_secs=600 --num_workers=16 --cockroach_ip_addresses_csv='10.128.0.95:26257,10.128.0.162:26257,10.128.0.17:26257,10.128.0.163:26257,10.128.0.161:26257,10.128.0.86:26257'  returned:
		stderr:
		-09-02 10:42:28.585095794 +0000 UTC m=+576.561619524, took 615.931274ms
		2019/09/02 10:42:28 Attempt failed with error driver: bad connection: ... Retrying after sleeping 5ns
		2019/09/02 10:42:28 ExecuteTx retry attempt 1 failed, started at 2019-09-02 10:42:25.755124359 +0000 UTC m=+573.731648048, now = 2019-09-02 10:42:28.593759035 +0000 UTC m=+576.570282774, took 2.838634726s
		2019/09/02 10:42:28 pq error - Error code : XX000, Error class : XX
		2019/09/02 10:42:28 Attempt failed with error pq: internal error: unexpected error from the vectorized runtime: rpc error: code = Canceled desc = context canceled: ... Retrying after sleeping 5ns
		2019/09/02 10:42:28 ExecuteTx retry attempt 1 failed, started at 2019-09-02 10:42:27.59436839 +0000 UTC m=+575.570892082, now = 2019-09-02 10:42:28.593807934 +0000 UTC m=+576.570331643, took 999.439561ms
		2019/09/02 10:42:28 Aborting Retries because this error of type *errors.errorString is not retryable : unexpected EOF
		2019/09/02 10:42:28 unexpected EOF
		Error:  exit status 255
		
		stdout:
		: exit status 1

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/991282eacbbe1315fde694be9785ad8f6fa929d3

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=scaledata/filesystem_simulator/nodes=6 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1481778&tab=buildLog

The test failed on branch=release-2.1, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20190912-1481778/scaledata/filesystem_simulator/nodes=6/run_1
	test_runner.go:703: test timed out (20m0s)

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/09d51e9f6265ed70caf49385be905606ebf722c7

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=scaledata/filesystem_simulator/nodes=6 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1515124&tab=artifacts#/scaledata/filesystem_simulator/nodes=6

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20191001-1515124/scaledata/filesystem_simulator/nodes=6/run_1
	cluster.go:2143,scaledata.go:121,scaledata.go:48,test_runner.go:689: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1569906547-66-n7cpu4:7 -- ./filesystem_simulator  --duration_secs=600 --num_workers=16 --cockroach_ip_addresses_csv='10.128.0.8:26257,10.128.0.114:26257,10.128.0.74:26257,10.128.0.29:26257,10.128.0.28:26257,10.128.0.95:26257'  returned:
		stderr:
		 Retrying after sleeping 5ns
		2019/10/01 10:00:16 ExecuteTx retry attempt 1 failed, started at 2019-10-01 10:00:15.878804464 +0000 UTC m=+454.095151532, now = 2019-10-01 10:00:16.433826126 +0000 UTC m=+454.650173199, took 555.021667ms
		2019/10/01 10:00:16 Attempt failed with error driver: bad connection: ... Retrying after sleeping 5ns
		2019/10/01 10:00:16 ExecuteTx retry attempt 1 failed, started at 2019-10-01 10:00:16.203075516 +0000 UTC m=+454.419422568, now = 2019-10-01 10:00:16.433874607 +0000 UTC m=+454.650221721, took 230.799153ms
		2019/10/01 10:00:16 Attempt failed with error driver: bad connection: ... Retrying after sleeping 5ns
		2019/10/01 10:00:16 ExecuteTx retry attempt 1 failed, started at 2019-10-01 10:00:14.438030346 +0000 UTC m=+452.654377403, now = 2019-10-01 10:00:16.437988787 +0000 UTC m=+454.654335893, took 1.99995849s
		2019/10/01 10:00:16 Aborting Retries because this error of type *errors.errorString is not retryable : unexpected EOF
		2019/10/01 10:00:16 unexpected EOF
		Error:  exit status 255
		
		stdout:
		: exit status 1

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/948a00e7b418b82d5421ad6aa0d651d7e9eeec91

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=scaledata/filesystem_simulator/nodes=6 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1520710&tab=artifacts#/scaledata/filesystem_simulator/nodes=6

The test failed on branch=release-19.1, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20191004-1520710/scaledata/filesystem_simulator/nodes=6/run_1
	cluster.go:2143,scaledata.go:121,scaledata.go:48,test_runner.go:689: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1570165931-62-n7cpu4:7 -- ./filesystem_simulator  --duration_secs=600 --num_workers=16 --cockroach_ip_addresses_csv='10.128.0.100:26257,10.128.0.95:26257,10.128.0.27:26257,10.128.0.99:26257,10.128.0.97:26257,10.128.0.91:26257'  returned:
		stderr:
		9 +0000 UTC m=+454.982951886, now = 2019-10-04 09:47:17.906329228 +0000 UTC m=+454.987231228, took 4.279342ms
		2019/10/04 09:47:17 Attempt failed with error dial tcp 10.128.0.91:26257: connect: connection refused: ... Retrying after sleeping 10ns
		2019/10/04 09:47:17 ExecuteTx retry attempt 2 failed, started at 2019-10-04 09:47:17.903057424 +0000 UTC m=+454.983959417, now = 2019-10-04 09:47:17.906489324 +0000 UTC m=+454.987391321, took 3.431904ms
		2019/10/04 09:47:17 Attempt failed with error dial tcp 10.128.0.91:26257: connect: connection refused: ... Retrying after sleeping 10ns
		2019/10/04 09:47:17 RobustDB.RandomDB chose DB at index 0
		2019/10/04 09:47:17 ExecuteTx retry attempt 1 failed, started at 2019-10-04 09:47:15.136811684 +0000 UTC m=+452.217713663, now = 2019-10-04 09:47:17.90934123 +0000 UTC m=+454.990243237, took 2.772529574s
		2019/10/04 09:47:17 Aborting Retries because this error of type *errors.errorString is not retryable : unexpected EOF
		2019/10/04 09:47:17 unexpected EOF
		Error:  exit status 255
		
		stdout:
		: exit status 1

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/35e138aa3c2be545fb4e17a85ea6f1b8d6525e53

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=scaledata/filesystem_simulator/nodes=6 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1584763&tab=artifacts#/scaledata/filesystem_simulator/nodes=6

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20191110-1584763/scaledata/filesystem_simulator/nodes=6/run_1
	cluster.go:2163,scaledata.go:121,scaledata.go:48,test_runner.go:697: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1584763-1573370983-65-n7cpu4:7 -- ./filesystem_simulator  --duration_secs=600 --num_workers=16 --cockroach_ip_addresses_csv='10.128.0.114:26257,10.128.0.111:26257,10.128.0.106:26257,10.128.0.90:26257,10.128.0.105:26257,10.128.0.108:26257'  returned:
		stderr:
		de : 58C01, Error class : 58
		2019/11/10 12:04:46 pq error - Error code : 58C01, Error class : 58
		2019/11/10 12:04:46 Aborting Retries because this error of type *pq.Error is not retryable : pq: communication error: rpc error: code = Canceled desc = context canceled
		2019/11/10 12:04:46 postgres error code is 58C01 and class is 58
		2019/11/10 12:04:46 pq: communication error: rpc error: code = Canceled desc = context canceled
		2019/11/10 12:04:46 ExecuteTx retry attempt 1 failed, started at 2019-11-10 12:04:45.943982452 +0000 UTC m=+88.636632915, now = 2019-11-10 12:04:46.494561976 +0000 UTC m=+89.187212470, took 550.579555ms
		2019/11/10 12:04:46 pq error - Error code : 58C01, Error class : 58
		2019/11/10 12:04:46 pq error - Error code : 58C01, Error class : 58
		2019/11/10 12:04:46 Aborting Retries because this error of type *pq.Error is not retryable : pq: communication error: rpc error: code = Canceled desc = context canceled
		2019/11/10 12:04:46 postgres error code is 58C01 and class is 58
		Error:  exit status 255
		
		stdout:
		: exit status 1

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/35e138aa3c2be545fb4e17a85ea6f1b8d6525e53

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=scaledata/filesystem_simulator/nodes=6 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1588906&tab=artifacts#/scaledata/filesystem_simulator/nodes=6

The test failed on branch=provisional_201911111508_v20.1.0-alpha.20191118, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20191112-1588906/scaledata/filesystem_simulator/nodes=6/run_1
	cluster.go:2163,scaledata.go:121,scaledata.go:48,test_runner.go:697: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1588906-1573600978-64-n7cpu4:7 -- ./filesystem_simulator  --duration_secs=600 --num_workers=16 --cockroach_ip_addresses_csv='10.128.0.194:26257,10.128.0.197:26257,10.128.0.77:26257,10.128.0.198:26257,10.128.0.193:26257,10.128.0.200:26257'  returned:
		stderr:
		662124113865.0000000000: sizes :- files - 22846, childRelations - 22845, stripes - 3556
		2019/11/13 04:01:04 ExecuteTx retry attempt 1 failed, started at 2019-11-13 04:01:03.382655574 +0000 UTC m=+575.991273353, now = 2019-11-13 04:01:04.013570277 +0000 UTC m=+576.622188100, took 630.914747ms
		2019/11/13 04:01:04 Attempt failed with error restarting txn failed. ROLLBACK TO SAVEPOINT encountered error: driver: bad connection. Original error: pq: restart transaction: TransactionRetryWithProtoRefreshError: WriteTooOldError: write at timestamp 1573617663.687283126,1 too old; wrote at 1573617663.799338297,1.: ... Retrying after sleeping 5ns
		2019/11/13 04:01:04 ExecuteTx retry attempt 1 failed, started at 2019-11-13 04:01:00.546825849 +0000 UTC m=+573.155443625, now = 2019-11-13 04:01:04.013955591 +0000 UTC m=+576.622573409, took 3.467129784s
		2019/11/13 04:01:04 Aborting Retries because this error of type *errors.errorString is not retryable : unexpected EOF
		2019/11/13 04:01:04 unexpected EOF
		Error:  exit status 255
		
		stdout:
		: exit status 1

andreimatei removed their assignment on Nov 13, 2019
@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/9012e6449e63baca6b59c00d3e350bbc3ab0dd3b

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=scaledata/filesystem_simulator/nodes=6 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1611118&tab=artifacts#/scaledata/filesystem_simulator/nodes=6

The test failed on branch=release-19.1, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20191126-1611118/scaledata/filesystem_simulator/nodes=6/run_1
	cluster.go:2163,scaledata.go:121,scaledata.go:48,test_runner.go:697: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1611118-1574751452-67-n7cpu4:7 -- ./filesystem_simulator  --duration_secs=600 --num_workers=16 --cockroach_ip_addresses_csv='10.128.0.128:26257,10.128.0.113:26257,10.128.0.125:26257,10.128.0.122:26257,10.128.0.60:26257,10.128.0.115:26257'  returned:
		stderr:
		Retrying after sleeping 5ns
		2019/11/26 11:15:15 ExecuteTx retry attempt 1 failed, started at 2019-11-26 11:15:14.870431999 +0000 UTC m=+210.714698236, now = 2019-11-26 11:15:15.185883112 +0000 UTC m=+211.030149381, took 315.451145ms
		2019/11/26 11:15:15 Attempt failed with error driver: bad connection: ... Retrying after sleeping 5ns
		2019/11/26 11:15:15 ExecuteTx retry attempt 1 failed, started at 2019-11-26 11:15:14.072387804 +0000 UTC m=+209.916654048, now = 2019-11-26 11:15:15.185990166 +0000 UTC m=+211.030256466, took 1.113602418s
		2019/11/26 11:15:15 Aborting Retries because this error of type *errors.errorString is not retryable : unexpected EOF
		2019/11/26 11:15:15 ExecuteTx retry attempt 1 failed, started at 2019-11-26 11:15:14.785724254 +0000 UTC m=+210.629990514, now = 2019-11-26 11:15:15.186093205 +0000 UTC m=+211.030359473, took 400.368959ms
		2019/11/26 11:15:15 Attempt failed with error driver: bad connection: ... Retrying after sleeping 5ns
		2019/11/26 11:15:15 unexpected EOF
		Error:  exit status 255
		
		stdout:
		: exit status 1

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/f97dc13163020a032b098ef3eb88e4d9f54a04ba

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=scaledata/filesystem_simulator/nodes=6 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1613952&tab=artifacts#/scaledata/filesystem_simulator/nodes=6

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20191127-1613952/scaledata/filesystem_simulator/nodes=6/run_1
	cluster.go:2163,scaledata.go:121,scaledata.go:48,test_runner.go:697: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1613952-1574840180-71-n7cpu4:7 -- ./filesystem_simulator  --duration_secs=600 --num_workers=16 --cockroach_ip_addresses_csv='10.128.1.12:26257,10.128.1.7:26257,10.128.1.19:26257,10.128.1.11:26257,10.128.1.14:26257,10.128.1.6:26257'  returned:
		stderr:
		eTx retry attempt 1 failed, started at 2019-11-27 12:20:20.82536647 +0000 UTC m=+454.588592788, now = 2019-11-27 12:20:20.827095374 +0000 UTC m=+454.590321719, took 1.728931ms
		2019/11/27 12:20:20 Attempt failed with error dial tcp 10.128.1.12:26257: connect: connection refused: ... Retrying after sleeping 5ns
		2019/11/27 12:20:20 RobustDB.RandomDB chose DB at index 5
		2019/11/27 12:20:20 Removing &{4871cb9b-38c0-46db-a8ef-618d4acc6dda 1 0 80 default}
		2019/11/27 12:20:20 Created file 9_2226 with uuid 7612448b-9507-4d60-83ee-e3a33c0fab6b and parent /default
		2019/11/27 12:20:20 Deleted stripes for uuid 4871cb9b-38c0-46db-a8ef-618d4acc6dda
		2019/11/27 12:20:20 ExecuteTx retry attempt 1 failed, started at 2019-11-27 12:20:19.197426217 +0000 UTC m=+452.960652549, now = 2019-11-27 12:20:20.836469369 +0000 UTC m=+454.599695734, took 1.639043185s
		2019/11/27 12:20:20 Aborting Retries because this error of type *errors.errorString is not retryable : unexpected EOF
		2019/11/27 12:20:20 unexpected EOF
		Error:  exit status 255
		
		stdout:
		: exit status 1

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/6c13f01ef0d999095a16345b21fc455648796e0c

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=scaledata/filesystem_simulator/nodes=6 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1619815&tab=artifacts#/scaledata/filesystem_simulator/nodes=6

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20191202-1619815/scaledata/filesystem_simulator/nodes=6/run_1
	cluster.go:2163,scaledata.go:121,scaledata.go:48,test_runner.go:697: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1619815-1575271421-66-n7cpu4:7 -- ./filesystem_simulator  --duration_secs=600 --num_workers=16 --cockroach_ip_addresses_csv='10.128.0.104:26257,10.128.0.105:26257,10.128.0.98:26257,10.128.0.103:26257,10.128.0.127:26257,10.128.0.102:26257'  returned:
		stderr:
		r from the vectorized runtime: rpc error: code = Canceled desc = context canceled: ... Retrying after sleeping 5ns
		2019/12/02 12:03:12 ExecuteTx retry attempt 1 failed, started at 2019-12-02 12:03:10.806499321 +0000 UTC m=+574.281111051, now = 2019-12-02 12:03:12.963079032 +0000 UTC m=+576.437690802, took 2.156579751s
		2019/12/02 12:03:12 pq error - Error code : XX000, Error class : XX
		2019/12/02 12:03:12 Attempt failed with error pq: internal error: unexpected error from the vectorized runtime: rpc error: code = Canceled desc = context canceled: ... Retrying after sleeping 5ns
		2019/12/02 12:03:12 RobustDB.RandomDB chose DB at index 4
		2019/12/02 12:03:12 ExecuteTx retry attempt 1 failed, started at 2019-12-02 12:03:10.857418941 +0000 UTC m=+574.332030680, now = 2019-12-02 12:03:12.968458729 +0000 UTC m=+576.443070512, took 2.111039832s
		2019/12/02 12:03:12 Aborting Retries because this error of type *errors.errorString is not retryable : unexpected EOF
		2019/12/02 12:03:12 unexpected EOF
		Error:  exit status 255
		
		stdout:
		: exit status 1

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/5224df9c8bd9a36dfde24ae3abe7b7e42a5d9660

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=scaledata/filesystem_simulator/nodes=6 PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1631737&tab=artifacts#/scaledata/filesystem_simulator/nodes=6

The test failed on branch=release-19.2, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20191207-1631737/scaledata/filesystem_simulator/nodes=6/run_1
	cluster.go:2163,scaledata.go:121,scaledata.go:48,test_runner.go:697: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1631737-1575703996-67-n7cpu4:7 -- ./filesystem_simulator  --duration_secs=600 --num_workers=16 --cockroach_ip_addresses_csv='10.128.0.215:26257,10.128.0.221:26257,10.128.0.202:26257,10.128.0.200:26257,10.128.0.214:26257,10.128.0.206:26257'  returned:
		stderr:
		file 13_2051 with uuid 3a527888-88a5-483f-a9ff-ab0377737e2b and parent /default
		2019/12/07 12:12:07 RobustDB.RandomDB chose DB at index 4
		2019/12/07 12:12:07 Writing new stripe 0
		2019/12/07 12:12:07 &{ea75f8cb-8150-48c3-9062-dd486b08db49 0 default}
		2019/12/07 12:12:07 RobustDB.RandomDB chose DB at index 3
		2019/12/07 12:12:07 Created file 12_2133 with uuid 90b44fdc-ba1a-4005-9761-b89186c9db87 and parent /default
		2019/12/07 12:12:07 Writing new stripe 0
		2019/12/07 12:12:07 &{f8aca0bf-bfc9-4790-8f61-817274a072cd 0 default}
		2019/12/07 12:12:07 RobustDB.RandomDB chose DB at index 2
		2019/12/07 12:12:07 RobustDB.RandomDB chose DB at index 2
		2019/12/07 12:12:07 ExecuteTx retry attempt 1 failed, started at 2019-12-07 12:12:05.407812423 +0000 UTC m=+330.954961884, now = 2019-12-07 12:12:07.366194393 +0000 UTC m=+332.913343943, took 1.958382059s
		2019/12/07 12:12:07 Aborting Retries because this error of type *errors.errorString is not retryable : unexpected EOF
		2019/12/07 12:12:07 unexpected EOF
		Error:  exit status 255
		
		stdout:
		: exit status 1

@cockroach-teamcity
Member Author

(roachtest).scaledata/filesystem_simulator/nodes=6 failed on provisional_201912100444_v2.1.10@9ad9eb5fb8806e4b74546910ca8bda66786d4288:

The test failed on branch=provisional_201912100444_v2.1.10, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20191210-1634845/scaledata/filesystem_simulator/nodes=6/run_1
	test_runner.go:715: test timed out (20m0s)

details

Artifacts: /scaledata/filesystem_simulator/nodes=6

make stressrace TESTS=scaledata/filesystem_simulator/nodes=6 PKG=./pkg/roachtest TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1

powered by pkg/cmd/internal/issues

@cockroach-teamcity
Member Author

(roachtest).scaledata/filesystem_simulator/nodes=6 failed on provisional_201912132008_v20.1.0-alpha20191216@e9e2a80361a25fd9f9b179f84be4c5c3d7e7d8cb:

The test failed on branch=provisional_201912132008_v20.1.0-alpha20191216, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20191213-1642926/scaledata/filesystem_simulator/nodes=6/run_1
	cluster.go:2163,scaledata.go:121,scaledata.go:48,test_runner.go:700: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1642926-1576273385-71-n7cpu4:7 -- ./filesystem_simulator  --duration_secs=600 --num_workers=16 --cockroach_ip_addresses_csv='10.128.0.33:26257,10.128.0.57:26257,10.128.0.65:26257,10.128.0.68:26257,10.128.0.75:26257,10.128.0.64:26257'  returned:
		stderr:
		019/12/14 02:19:03 RobustDB.RandomDB chose DB at index 2
		2019/12/14 02:19:03 Writing new stripe 0
		2019/12/14 02:19:03 &{205436ff-f414-435f-8ab1-17b6fb4d3185 0 default}
		2019/12/14 02:19:03 Removing &{4c336530-983d-4d04-a88e-b7e84d1bf463 1 0 4 default}
		2019/12/14 02:19:03 Deleted &{d791039c-cb13-48b3-b4a5-b957b0509146 1 1 37 default}
		2019/12/14 02:19:03 RobustDB.RandomDB chose DB at index 3
		2019/12/14 02:19:03 Removing &{125084df-8728-4a0b-b05c-69a112829af3 1 1 180 default}
		2019/12/14 02:19:03 RobustDB.RandomDB chose DB at index 5
		2019/12/14 02:19:03 Created file 6_2080 with uuid 6138d6eb-8e9b-4be7-9705-136962fbc85c and parent /default
		2019/12/14 02:19:03 ExecuteTx retry attempt 1 failed, started at 2019-12-14 02:19:01.841426347 +0000 UTC m=+330.893171845, now = 2019-12-14 02:19:03.774015233 +0000 UTC m=+332.825760754, took 1.932588909s
		2019/12/14 02:19:03 Aborting Retries because this error of type *errors.errorString is not retryable : unexpected EOF
		2019/12/14 02:19:03 unexpected EOF
		Error:  exit status 255
		
		stdout:
		: exit status 1
Repro

Artifacts: /scaledata/filesystem_simulator/nodes=6

make stressrace TESTS=scaledata/filesystem_simulator/nodes=6 PKG=./pkg/roachtest TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1

powered by pkg/cmd/internal/issues

nvanbenschoten added a commit to nvanbenschoten/rksql that referenced this issue Dec 17, 2019
Fixes cockroachdb/cockroach#36981.
Fixes cockroachdb/cockroach#39618.
Fixes cockroachdb/cockroach#40552.
Fixes cockroachdb/cockroach#41735.

cockroachdb/cockroach#41451 switched two forms
of errors that can be thrown during chaos events over to a new error code
class - 58, internal system errors. This commit updates `pqConnectionError`
to consider this error code class as retry-worthy.
