roachtest: tpccbench/nodes=9/cpu=4/chaos/partition failed #32135

Closed
cockroach-teamcity opened this issue Nov 2, 2018 · 89 comments
Labels: C-test-failure (Broken test, automatically or manually discovered), O-robot (Originated from a bot)

Comments

@cockroach-teamcity
Member

SHA: https://github.com/cockroachdb/cockroach/commits/acd1250b15b7ed3c8938dfd53b8bc53bb53c578c

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=999222&tab=buildLog

The test failed on master:
	test.go:639,cluster.go:1461,tpcc.go:667,tpcc.go:346: signal: interrupt

@cockroach-teamcity cockroach-teamcity added this to the 2.2 milestone Nov 2, 2018
@cockroach-teamcity cockroach-teamcity added C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. labels Nov 2, 2018
@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/fb4a974646b8fd440ed60471e70fbfdb79d95a76

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1000678&tab=buildLog

The test failed on master:
	test.go:639,cluster.go:1461,tpcc.go:667,tpcc.go:346: error running tpcc load generator:
		
		Error: EOF
		Error:  exit status 1
		
		: /home/agent/work/.go/bin/roachprod run teamcity-1000678-tpccbench-nodes-9-cpu-4-chaos-partition:10 -- ./workload run tpcc --warehouses=2000 --active-warehouses=600 --tolerate-errors --ramp=5m0s --duration=10m0s --partitions=3 --split {pgurl:10}: exit status 1

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/cb25dd55d1bfcaf54615ade8cb92b88fdc677129

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1003388&tab=buildLog

The test failed on master:
	test.go:639,cluster.go:1461,tpcc.go:667,tpcc.go:346: context canceled

@tbg
Member

tbg commented Nov 12, 2018

@nvanbenschoten the last one is a fluke (clearrange held up the test harness) but the one before that seems real-er.

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/b3ba03fdb86cbefceb42f46069ffd685749aa7b0

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1008034&tab=buildLog

The test failed on master:
	test.go:639,cluster.go:1461,tpcc.go:667,tpcc.go:346: error running tpcc load generator:
		
		Error: EOF
		Error:  exit status 1
		
		: /home/agent/work/.go/bin/roachprod run teamcity-1008034-tpccbench-nodes-9-cpu-4-chaos-partition:10 -- ./workload run tpcc --warehouses=2000 --active-warehouses=600 --tolerate-errors --ramp=5m0s --duration=10m0s --partitions=3 --split {pgurl:10}: exit status 1

@tbg
Member

tbg commented Nov 13, 2018

^- can't get anything actionable from this failure. Looks like workload doesn't log anything except EOF?

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/96aafe6226579176f496dfadae78b52d687c3faa

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1008356&tab=buildLog

The test failed on master:
	test.go:639,test.go:651,cluster.go:1078,tpcc.go:525,search.go:47,search.go:177,tpcc.go:520,cluster.go:1474,errgroup.go:58: /home/agent/work/.go/bin/roachprod start --racks=3 teamcity-1008356-tpccbench-nodes-9-cpu-4-chaos-partition:1-9 returned:
		stderr:
		
		stdout:
		a in your cluster.
		* - There is no network encryption nor authentication, and thus no confidentiality.
		* 
		* Check out how to secure your cluster: https://www.cockroachlabs.com/docs/v2.2/secure-a-cluster.html
		*
		*
		* ERROR: cockroach server exited with error: failed to create engines: could not open rocksdb instance: Invalid argument: encryption was used on this store before, but no encryption flags specified. You need a CCL build and must fully specify the --enterprise-encryption flag
		*
		Failed running "start"
		E181113 19:01:04.397436 1 cli/error.go:230  exit status 1
		Error: exit status 1
		Failed running "start"
		
		github.com/cockroachdb/roachprod/install.Cockroach.Start.func6
			/home/agent/work/.go/src/github.com/cockroachdb/roachprod/install/cockroach.go:377
		github.com/cockroachdb/roachprod/install.(*SyncedCluster).Parallel.func1.1
			/home/agent/work/.go/src/github.com/cockroachdb/roachprod/install/cluster_synced.go:1118
		runtime.goexit
			/usr/local/go/src/runtime/asm_amd64.s:2361: 
		2018/11/13 19:01:04 command failed
		: exit status 1
	test.go:639,cluster.go:1495,tpcc.go:667,tpcc.go:346: Goexit() was called

@tbg
Member

tbg commented Nov 13, 2018

^- this particular one shouldn't happen tomorrow, it was fixed in #32259

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/eb8345b19cf6c15b3e4fcb9c156136357d83cb2d

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1015656&tab=buildLog

The test failed on master:
	test.go:645,cluster.go:1484,tpcc.go:667,tpcc.go:346: error running tpcc load generator:
		
		Error: EOF
		Error:  exit status 1
		
		: /home/agent/work/.go/bin/roachprod run teamcity-1015656-tpccbench-nodes-9-cpu-4-chaos-partition:10 -- ./workload run tpcc --warehouses=2000 --active-warehouses=600 --tolerate-errors --ramp=5m0s --duration=10m0s --partitions=3 --split {pgurl:10}: exit status 1

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/96f03c95d5078ebad7167c5cdb145e365978a008

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1022695&tab=buildLog

The test failed on master:
	test.go:645,cluster.go:1484,tpcc.go:667,tpcc.go:346: error running tpcc load generator:
		
		Error: EOF
		Error:  exit status 1
		
		: /home/agent/work/.go/bin/roachprod run teamcity-1022695-tpccbench-nodes-9-cpu-4-chaos-partition:10 -- ./workload run tpcc --warehouses=2000 --active-warehouses=600 --tolerate-errors --ramp=5m0s --duration=10m0s --partitions=3 --split {pgurl:10}: exit status 1

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/07edde23460e8ffe0ec40f89975c3b95fc28343e

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1030507&tab=buildLog

The test failed on master:
	test.go:645,cluster.go:1484,tpcc.go:667,tpcc.go:346: error running tpcc load generator:
		
		Error: EOF
		Error:  exit status 1
		
		: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1030507-tpccbench-nodes-9-cpu-4-chaos-partition:10 -- ./workload run tpcc --warehouses=2000 --active-warehouses=600 --tolerate-errors --ramp=5m0s --duration=10m0s --partitions=3 --split {pgurl:10}: exit status 1

@tbg
Member

tbg commented Nov 30, 2018

@nvanbenschoten this is on your radar, right?

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/63f5dc75d0c69950ce0dfb4e7ef3f2e2be2889b9

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1035651&tab=buildLog

The test failed on master:
	test.go:645,cluster.go:1488,tpcc.go:666,tpcc.go:346: error running tpcc load generator:
		
		Error: EOF
		Error:  exit status 1
		
		: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1035651-tpccbench-nodes-9-cpu-4-chaos-partition:10 -- ./workload run tpcc --warehouses=2000 --active-warehouses=600 --tolerate-errors --ramp=5m0s --duration=10m0s --partitions=3 --split {pgurl:10}: exit status 1

@nvanbenschoten
Member

Yes, on my radar, I just haven't gotten around to it. Looks like it's failing ~20% of the time.

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/e6cb0c5c329617b560eee37527248171b5e06382

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1038478&tab=buildLog

The test failed on master:
	test.go:630,cluster.go:1488,tpcc.go:662,tpcc.go:342: error running tpcc load generator:
		
		Error: EOF
		Error:  exit status 1
		
		: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1038478-tpccbench-nodes-9-cpu-4-chaos-partition:10 -- ./workload run tpcc --warehouses=2000 --active-warehouses=600 --tolerate-errors --ramp=5m0s --duration=10m0s --partitions=3 --split {pgurl:10}: exit status 1

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/1146a03cc217cb57bdddd795e2d2fe2806c64985

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1042719&tab=buildLog

The test failed on release-2.1:
	test.go:630,cluster.go:1488,tpcc.go:662,tpcc.go:342: error running tpcc load generator:
		
		Error: EOF
		Error:  exit status 1
		
		: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1042719-tpccbench-nodes-9-cpu-4-chaos-partition:10 -- ./workload run tpcc --warehouses=2000 --active-warehouses=600 --tolerate-errors --ramp=5m0s --duration=10m0s --partitions=3 --split {pgurl:10}: exit status 1

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/06d2222fd9010f01a8cdf6a6c24597bbed181f36

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1044826&tab=buildLog

The test failed on master:
	test.go:630,cluster.go:1486,tpcc.go:661,tpcc.go:341: unexpected node event: 1: dead

@tbg
Member

tbg commented Dec 7, 2018

F181207 17:55:01.856897 4003 storage/replica_raftstorage.go:955 [n1,s1,r182/1:/Table/57/1/"\x1{4v&\…-6ln>…}] unable to mark replica initialized while applying snapshot: [n1,s1]: cannot initialize replica; range [n1,s1,r182/1:/Table/57/1/"\x1{4v&\…-6ln>…}] has overlapping range r108:/Table/5{7/1/"\a֥\x01\xec\xe8MŮ\x13{f\x9a\x93\x12\xdc"/PrefixEnd-8} [(n1,s1):1, (n7,s7):2, (n4,s4):3, (n6,s6):4, (n8,s8):5, next=6, gen=0]

Uh oh, but I bet this is on me and #32817
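
The fatal error above rejects a snapshot because the incoming range's key span intersects a range the store already holds. A minimal sketch of that kind of check, assuming ranges are half-open key intervals `[start, end)` compared bytewise. The `span` type and `overlaps` function are hypothetical names for illustration, not CockroachDB's code:

```go
package main

import (
	"bytes"
	"fmt"
)

// span is a half-open key interval [start, end). Hypothetical type.
type span struct{ start, end []byte }

// overlaps reports whether two half-open spans share at least one key.
func overlaps(a, b span) bool {
	return bytes.Compare(a.start, b.end) < 0 && bytes.Compare(b.start, a.end) < 0
}

func main() {
	existing := span{[]byte("g"), []byte("z")}
	incoming := span{[]byte("a"), []byte("m")}
	adjacent := span{[]byte("a"), []byte("g")}

	// incoming intersects existing; adjacent only touches it at "g".
	fmt.Println(overlaps(incoming, existing))
	fmt.Println(overlaps(adjacent, existing))
}
```

Because the end key is exclusive, two spans that merely share a boundary key do not count as overlapping, which is why the adjacent case passes the check.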

@tbg
Member

tbg commented Dec 11, 2018

^- fix for this last failure is incoming, but this issue had earlier failures so I won't close it.

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/dfa23c01e4ea39b19ca8b2e5c8a4e7cf9b9445f4

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1189954&tab=buildLog

The test failed on master:
	cluster.go:1626,tpcc.go:725,tpcc.go:420,test.go:1214: unexpected node event: 2: dead

@tbg
Member

tbg commented Mar 22, 2019

cc @nvanbenschoten

panic: batch size 197 -> 195 bytes
re-marshaled protobuf:
00000000  0a a5 01 0a 00 12 06 08  02 10 02 18 02 18 f5 19  |................|
00000010  2a 8e 01 0a 22 0a 10 f1  bd 50 46 46 dc 4f 7e 88  |*..."....PFF.O~.|
00000020  bd 7e 27 6c 41 3e 81 2a  0a 08 80 a0 cc ed c0 83  |.~'lA>.*........|
00000030  83 c7 15 30 c6 bb 3b 12  07 75 6e 6e 61 6d 65 64  |...0..;..unnamed|
00000040  2a 0c 08 ec 86 d0 b9 b1  84 83 c7 15 10 3e 32 0a  |*............>2.|
00000050  08 80 a0 cc ed c0 83 83  c7 15 3a 0a 08 80 a0 cc  |..........:.....|
00000060  ed c0 83 83 c7 15 42 0e  08 02 12 0a 08 87 c9 9e  |......B.........|
00000070  8b b2 84 83 c7 15 42 10  08 04 12 0c 08 e9 ca de  |......B.........|
00000080  84 b2 84 83 c7 15 10 06  42 10 08 07 12 0c 08 e9  |........B.......|
00000090  ca de 84 b2 84 83 c7 15  10 3c 72 00 7a 00 80 01  |.........<r.z...|
000000a0  01 40 90 4e 50 01 58 08  12 19 3a 17 0a 13 1a 08  |.@.NP.X...:.....|
000000b0  c1 89 9a f8 01 48 08 88  22 07 c1 89 9b f8 01 5a  |.....H.."......Z|
000000c0  07 20 01                                          |. .|

original panic:  <nil>
 [recovered]
	panic: batch size 197 -> 195 bytes
re-marshaled protobuf:
00000000  0a a5 01 0a 00 12 06 08  02 10 02 18 02 18 f5 19  |................|
00000010  2a 8e 01 0a 22 0a 10 f1  bd 50 46 46 dc 4f 7e 88  |*..."....PFF.O~.|
00000020  bd 7e 27 6c 41 3e 81 2a  0a 08 80 a0 cc ed c0 83  |.~'lA>.*........|
00000030  83 c7 15 30 c6 bb 3b 12  07 75 6e 6e 61 6d 65 64  |...0..;..unnamed|
00000040  2a 0c 08 ec 86 d0 b9 b1  84 83 c7 15 10 3e 32 0a  |*............>2.|
00000050  08 80 a0 cc ed c0 83 83  c7 15 3a 0a 08 80 a0 cc  |..........:.....|
00000060  ed c0 83 83 c7 15 42 0e  08 02 12 0a 08 87 c9 9e  |......B.........|
00000070  8b b2 84 83 c7 15 42 10  08 04 12 0c 08 e9 ca de  |......B.........|
00000080  84 b2 84 83 c7 15 10 06  42 10 08 07 12 0c 08 e9  |........B.......|
00000090  ca de 84 b2 84 83 c7 15  10 3c 72 00 7a 00 80 01  |.........<r.z...|
000000a0  01 40 90 4e 50 01 58 08  12 19 3a 17 0a 13 1a 08  |.@.NP.X...:.....|
000000b0  c1 89 9a f8 01 48 08 88  22 07 c1 89 9b f8 01 5a  |.....H.."......Z|
000000c0  07 20 01                                          |. .|

original panic:  <nil>


goroutine 155643 [running]:
panic(0x2d98760, 0xc011d20b30)
	/usr/local/go/src/runtime/panic.go:556 +0x2cb fp=0xc00c613060 sp=0xc00c612fd0 pc=0x72b49b
github.com/cockroachdb/cockroach/pkg/kv.(*DistSender).divideAndSendBatchToRanges.func1(0xc00c614398, 0xc00c614688, 0xc00c6145c8, 0xc00c614680, 0xc00c6142ff, 0xc00c614380, 0xc00c614304)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/dist_sender.go:825 +0x50e fp=0xc00c613228 sp=0xc00c613060 pc=0x1596fde
runtime.call64(0x0, 0x33f8970, 0xc006c03370, 0x3800000038)
	/usr/local/go/src/runtime/asm_amd64.s:523 +0x3b fp=0xc00c613278 sp=0xc00c613228 pc=0x75a0eb
panic(0x2d98760, 0xc011d20b30)
	/usr/local/go/src/runtime/panic.go:513 +0x1b9 fp=0xc00c613308 sp=0xc00c613278 pc=0x72b389
github.com/cockroachdb/cockroach/pkg/kv.withMarshalingDebugging.func1(0xc00914a300, 0xc5)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/transport.go:181 +0x264 fp=0xc00c6133d8 sp=0xc00c613308 pc=0x15992f4
github.com/cockroachdb/cockroach/pkg/kv.withMarshalingDebugging(0x39fcf80, 0xc0104c2420, 0x0, 0x0, 0x200000002, 0x2, 0xcf5, 0x0, 0xc00b310a00, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/transport.go:185 +0xac fp=0xc00c613410 sp=0xc00c6133d8 pc=0x158555c
github.com/cockroachdb/cockroach/pkg/kv.(*grpcTransport).SendNext(0xc0104c23f0, 0x39fcf80, 0xc00f95e3f0, 0x0, 0x0, 0x200000002, 0x2, 0xcf5, 0x0, 0xc00b310a00, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/transport.go:202 +0x211 fp=0xc00c6135f0 sp=0xc00c613410 pc=0x15857c1
github.com/cockroachdb/cockroach/pkg/kv.(*DistSender).sendToReplicas(0xc00068a000, 0x39fcf80, 0xc00f95e3f0, 0xc00068a050, 0xcf5, 0xc00eab4690, 0x3, 0x3, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/dist_sender.go:1365 +0x2d3 fp=0xc00c613998 sp=0xc00c6135f0 pc=0x157b563
github.com/cockroachdb/cockroach/pkg/kv.(*DistSender).sendRPC(0xc00068a000, 0x39fcf80, 0xc00f95e3f0, 0xcf5, 0xc00eab4690, 0x3, 0x3, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/dist_sender.go:416 +0x244 fp=0xc00c613ab8 sp=0xc00c613998 pc=0x1575824
github.com/cockroachdb/cockroach/pkg/kv.(*DistSender).sendSingleRange(0xc00068a000, 0x39fcf80, 0xc00f95e3f0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc00b310a00, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/dist_sender.go:496 +0x221 fp=0xc00c613c38 sp=0xc00c613ab8 pc=0x1575da1
github.com/cockroachdb/cockroach/pkg/kv.(*DistSender).sendPartialBatch(0xc00068a000, 0x39fcf80, 0xc00f95e3f0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc00b310a00, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/dist_sender.go:1141 +0x322 fp=0xc00c6141e8 sp=0xc00c613c38 pc=0x15796b2
github.com/cockroachdb/cockroach/pkg/kv.(*DistSender).divideAndSendBatchToRanges(0xc00068a000, 0x39fcf80, 0xc00f95e3f0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc00b310a00, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/dist_sender.go:962 +0x8b3 fp=0xc00c6145b0 sp=0xc00c6141e8 pc=0x15784c3
github.com/cockroachdb/cockroach/pkg/kv.(*DistSender).Send(0xc00068a000, 0x39fcf80, 0xc00f95e3f0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc00df8ce00, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/dist_sender.go:710 +0x48b fp=0xc00c614898 sp=0xc00c6145b0 pc=0x157735b
github.com/cockroachdb/cockroach/pkg/kv.(*txnLockGatekeeper).SendLocked(0xc00586bf10, 0x39fcf80, 0xc00f95e3f0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc00df8ce00, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/txn_coord_sender.go:234 +0xe8 fp=0xc00c614950 sp=0xc00c614898 pc=0x1586838
github.com/cockroachdb/cockroach/pkg/kv.(*txnSpanRefresher).sendLockedWithRefreshAttempts(0xc00586be48, 0x39fcf80, 0xc00f95e3f0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc00df8ce00, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/txn_interceptor_span_refresher.go:144 +0x83 fp=0xc00c614a30 sp=0xc00c614950 pc=0x1593e13
github.com/cockroachdb/cockroach/pkg/kv.(*txnSpanRefresher).SendLocked(0xc00586be48, 0x39fcf80, 0xc00f95e3f0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc00df8ce00, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/txn_interceptor_span_refresher.go:100 +0xf9 fp=0xc00c614b08 sp=0xc00c614a30 pc=0x1593a09
github.com/cockroachdb/cockroach/pkg/kv.(*txnPipeliner).SendLocked(0xc00586bdc0, 0x39fcf80, 0xc00f95e3f0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc00df8ce00, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/txn_interceptor_pipeliner.go:183 +0xf9 fp=0xc00c614c20 sp=0xc00c614b08 pc=0x1591a09
github.com/cockroachdb/cockroach/pkg/kv.(*txnSeqNumAllocator).SendLocked(0xc00586bd68, 0x39fcf80, 0xc00f95e3f0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc00df8ce00, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/txn_interceptor_seq_num_allocator.go:92 +0x23b fp=0xc00c614d50 sp=0xc00c614c20 pc=0x15937cb
github.com/cockroachdb/cockroach/pkg/kv.(*TxnCoordSender).Send(0xc00586bb00, 0x39fcf80, 0xc00f95e3f0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/txn_coord_sender.go:780 +0x591 fp=0xc00c615098 sp=0xc00c614d50 pc=0x1589de1
github.com/cockroachdb/cockroach/pkg/internal/client.(*DB).sendUsingSender(0xc0007a9200, 0x39fcf80, 0xc00f95e3c0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/internal/client/db.go:622 +0x119 fp=0xc00c615168 sp=0xc00c615098 pc=0x1175459
github.com/cockroachdb/cockroach/pkg/internal/client.(*Txn).Send(0xc009ee4ab0, 0x39fcf80, 0xc00f95e3c0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/internal/client/txn.go:791 +0x13c fp=0xc00c6152c8 sp=0xc00c615168 pc=0x117f5fc
github.com/cockroachdb/cockroach/pkg/sql/row.(*txnKVFetcher).fetch(0xc00a72c8f0, 0x39fcf80, 0xc00f95e3c0, 0xc00778ecf0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/row/kv_batch_fetcher.go:242 +0x626 fp=0xc00c6157c8 sp=0xc00c6152c8 pc=0x1c3bd96
github.com/cockroachdb/cockroach/pkg/sql/row.(*txnKVFetcher).nextBatch(0xc00a72c8f0, 0x39fcf80, 0xc00f95e3c0, 0x7fac61f486c0, 0x7a, 0xc007a4aa90, 0x70b01f, 0xc0116c0000, 0x5000, 0x4b08, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/row/kv_batch_fetcher.go:326 +0x1dd fp=0xc00c6159c8 sp=0xc00c6157c8 pc=0x1c3c86d
github.com/cockroachdb/cockroach/pkg/sql/row.(*kvFetcher).nextKV(0xc01010bcd8, 0x39fcf80, 0xc00f95e3c0, 0xc00eb60540, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/row/kv_fetcher.go:72 +0x113 fp=0xc00c615b20 sp=0xc00c6159c8 pc=0x1c3d0c3
github.com/cockroachdb/cockroach/pkg/sql/row.(*Fetcher).NextKey(0xc01010bca0, 0x39fcf80, 0xc00f95e3c0, 0x0, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/row/fetcher.go:490 +0x82 fp=0xc00c615c20 sp=0xc00c615b20 pc=0x1c2dec2
github.com/cockroachdb/cockroach/pkg/sql/row.(*Fetcher).StartScanFrom(0xc01010bca0, 0x39fcf80, 0xc00f95e3c0, 0x39d7380, 0xc00a72c8f0, 0x0, 0xe39201)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/row/fetcher.go:480 +0x97 fp=0xc00c615c60 sp=0xc00c615c20 pc=0x1c2ddc7
github.com/cockroachdb/cockroach/pkg/sql/row.(*Fetcher).StartScan(0xc01010bca0, 0x39fcf80, 0xc00f95e3c0, 0xc009ee4ab0, 0xc00b927000, 0x157, 0x157, 0x1, 0x0, 0x2fc2100, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/row/fetcher.go:471 +0x1e7 fp=0xc00c615e78 sp=0xc00c615c60 pc=0x1c2dc87
github.com/cockroachdb/cockroach/pkg/sql/distsqlrun.(*tableReader).Start(0xc01010b800, 0x39fcec0, 0xc009289340, 0xc000030d00, 0xc000a93f70)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/tablereader.go:253 +0x271 fp=0xc00c615f20 sp=0xc00c615e78 pc=0x1e04ee1
github.com/cockroachdb/cockroach/pkg/sql/distsqlrun.(*samplerProcessor).Run(0xc0119d0a80, 0x39fcec0, 0xc009289340)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/sampler.go:165 +0x5b fp=0xc00c615fa0 sp=0xc00c615f20 pc=0x1dea6db
github.com/cockroachdb/cockroach/pkg/sql/distsqlrun.(*Flow).startInternal.func1(0xc00abeb4a0, 0xc011023840, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/flow.go:567 +0x67 fp=0xc00c615fc8 sp=0xc00c615fa0 pc=0x1e13c27
runtime.goexit()
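
The panic message and the `withMarshalingDebugging` frame in the trace suggest a guard that compares the batch's encoded size before and after the send, and dumps the re-marshaled bytes when they differ. Note that the dump above ends at offset 0xc3, i.e. 195 bytes, matching the "after" size. The following is only a sketch of such a guard under those assumptions. The `message` interface, `toyBatch`, `sizeReport` and `checkStableSize` are hypothetical names, not the actual code in `transport.go`:

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// message is a hypothetical stand-in for a protobuf message: it can report
// its encoded size and produce its encoded bytes.
type message interface {
	Size() int
	Marshal() []byte
}

// toyBatch is a toy message whose encoding is simply its payload.
type toyBatch struct{ payload []byte }

func (b *toyBatch) Size() int       { return len(b.payload) }
func (b *toyBatch) Marshal() []byte { return append([]byte(nil), b.payload...) }

// sizeReport describes a send during which the encoded size changed,
// or which panicked.
type sizeReport struct {
	Pre, Post int
	Dump      string      // hex dump of the re-marshaled message
	Panic     interface{} // value recovered from send, nil if none
}

func (r *sizeReport) String() string {
	return fmt.Sprintf("batch size %d -> %d bytes\nre-marshaled protobuf:\n%s\noriginal panic:  %v",
		r.Pre, r.Post, r.Dump, r.Panic)
}

// checkStableSize runs send and returns a report if the encoded size of msg
// differs before and after, or if send panicked. It returns nil otherwise.
func checkStableSize(msg message, send func()) (report *sizeReport) {
	pre := msg.Size()
	defer func() {
		r := recover()
		post := msg.Size()
		if r != nil || pre != post {
			report = &sizeReport{Pre: pre, Post: post, Dump: hex.Dump(msg.Marshal()), Panic: r}
		}
	}()
	send()
	return nil
}

func main() {
	b := &toyBatch{payload: make([]byte, 197)}
	// Simulate something mutating the batch while it is being sent.
	if report := checkStableSize(b, func() { b.payload = b.payload[:195] }); report != nil {
		fmt.Println(report)
	}
}
```

A size change with a nil original panic, as in the log above, would mean the send itself did not panic and the message was modified while the guard was active.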

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/a200cea4368ec90aaee12337d7ab5f9ca555108f

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1191939&tab=buildLog

The test failed on master:
	cluster.go:1626,tpcc.go:725,tpcc.go:420,test.go:1214: unexpected node event: 4: dead

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/b5768aecd39461ab9a54e2e7db059a3fe8b00459

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1191957&tab=buildLog

The test failed on release-19.1:
	cluster.go:1193,tpcc.go:613,search.go:47,search.go:177,tpcc.go:608,cluster.go:1605,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod start --racks=3 teamcity-1191957-tpccbench-nodes-9-cpu-4-chaos-partition:1-9 returned:
		stderr:
		
		stdout:
		t how to secure your cluster: https://www.cockroachlabs.com/docs/v19.1/secure-a-cluster.html
		*
		*
		* ERROR: could not cleanup temporary directories from record file: could not lock temporary directory /mnt/data1/cockroach/cockroach-temp638443614, may still be in use: IO error: While lock file: /mnt/data1/cockroach/cockroach-temp638443614/TEMP_DIR.LOCK: Resource temporarily unavailable
		*
		Failed running "start"
		E190322 18:09:24.117020 1 cli/error.go:229  exit status 1
		Error: exit status 1
		Failed running "start"
		
		github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.Cockroach.Start.func7
			/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:397
		github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
			/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1320
		runtime.goexit
			/usr/local/go/src/runtime/asm_amd64.s:1333: 
		I190322 18:09:26.894888 1 cluster_synced.go:1402  command failed
		: exit status 1
	cluster.go:1626,tpcc.go:725,tpcc.go:420,test.go:1214: Goexit() was called

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/9399d559ae196e5cf2ad122195048ff9115ab56a

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1194326&tab=buildLog

The test failed on release-19.1:
	cluster.go:1193,tpcc.go:613,search.go:47,search.go:177,tpcc.go:608,cluster.go:1605,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod start --racks=3 teamcity-1194326-tpccbench-nodes-9-cpu-4-chaos-partition:1-9 returned:
		stderr:
		
		stdout:
		t how to secure your cluster: https://www.cockroachlabs.com/docs/v19.1/secure-a-cluster.html
		*
		*
		* ERROR: could not cleanup temporary directories from record file: could not lock temporary directory /mnt/data1/cockroach/cockroach-temp582912183, may still be in use: IO error: While lock file: /mnt/data1/cockroach/cockroach-temp582912183/TEMP_DIR.LOCK: Resource temporarily unavailable
		*
		Failed running "start"
		E190323 17:59:44.291265 1 cli/error.go:229  exit status 1
		Error: exit status 1
		Failed running "start"
		
		github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.Cockroach.Start.func7
			/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:397
		github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
			/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1320
		runtime.goexit
			/usr/local/go/src/runtime/asm_amd64.s:1333: 
		I190323 17:59:55.352805 1 cluster_synced.go:1402  command failed
		: exit status 1
	cluster.go:1626,tpcc.go:725,tpcc.go:420,test.go:1214: Goexit() was called

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/6cac063ae1cb578130afbafb2abf4035268a10c9

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1194308&tab=buildLog

The test failed on master:
	cluster.go:1626,tpcc.go:725,tpcc.go:420,test.go:1214: unexpected node event: 4: dead

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/5a746073c3f8ede851f37dd895cf1a91d6dcc3cf

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1195714&tab=buildLog

The test failed on master:
	cluster.go:1626,tpcc.go:725,tpcc.go:420,test.go:1214: unexpected node event: 1: dead

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/7f8a0969e8e9eb7e9fc0d2fe96e03849d30dd561

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1199677&tab=buildLog

The test failed on release-19.1:
	cluster.go:1193,chaos.go:98,cluster.go:1605,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod start teamcity-1199677-tpccbench-nodes-9-cpu-4-chaos-partition:3 returned:
		stderr:
		
		stdout:
		i-node clusters: --advertise-addr=<host/IP addr>
		* 
		*
		*
		* WARNING: RUNNING IN INSECURE MODE!
		* 
		* - Your cluster is open for any client that can access <all your IP addresses>.
		* - Any user, even root, can log in without providing a password.
		* - Any user, connecting as root, can read or write any data in your cluster.
		* - There is no network encryption nor authentication, and thus no confidentiality.
		* 
		* Check out how to secure your cluster: https://www.cockroachlabs.com/docs/v19.1/secure-a-cluster.html
		*
		
		github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.Cockroach.Start.func7
			/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:397
		github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
			/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1320
		runtime.goexit
			/usr/local/go/src/runtime/asm_amd64.s:1333: 
		I190326 18:54:14.534927 1 cluster_synced.go:1402  command failed
		: exit status 1
	cluster.go:1193,tpcc.go:613,search.go:47,search.go:177,tpcc.go:608,cluster.go:1605,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod start --racks=3 teamcity-1199677-tpccbench-nodes-9-cpu-4-chaos-partition:1-9 returned:
		stderr:
		
		stdout:
		teamcity-1199677-tpccbench-nodes-9-cpu-4-chaos-partition: starting.: signal: killed
	cluster.go:1626,tpcc.go:725,tpcc.go:420,test.go:1214: Goexit() was called

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/ed21f6da64b022a3e1e550fad5850fdffe2a7d17

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1201887&tab=buildLog

The test failed on master:
	cluster.go:1193,tpcc.go:613,search.go:47,search.go:177,tpcc.go:608,cluster.go:1605,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod start --racks=3 teamcity-1201887-tpccbench-nodes-9-cpu-4-chaos-partition:1-9 returned:
		stderr:
		
		stdout:
		ash COCKROACH_SKIP_ENABLING_DIAGNOSTIC_REPORTING=1 COCKROACH_ENABLE_RPC_COMPRESSION=false ./cockroach start --insecure --store=path=/mnt/data1/cockroach --log-dir=${HOME}/logs --background --cache=25% --max-sql-memory=25% --port=26257 --http-port=26258 --locality=cloud=gce,region=us-east1,zone=us-east1-b,rack=2 --join=35.237.204.235:26257 >> ${HOME}/logs/cockroach.stdout 2>> ${HOME}/logs/cockroach.stderr || (x=$?; cat ${HOME}/logs/cockroach.stderr; exit $x)
		Connection to 34.73.136.31 closed by remote host.
		
		github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.Cockroach.Start.func7
			/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:397
		github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
			/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1320
		runtime.goexit
			/usr/local/go/src/runtime/asm_amd64.s:1333: 
		I190327 17:32:56.893124 1 cluster_synced.go:1402  command failed
		: exit status 1
	cluster.go:1626,tpcc.go:725,tpcc.go:420,test.go:1216: Goexit() was called

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/4f23ef547ad7af684f7b8cc349be8c1dc4d30aa3

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1204603&tab=buildLog

The test failed on release-2.1:
	cluster.go:1626,tpcc.go:732,tpcc.go:420,test.go:1216: error running tpcc load generator: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1204603-tpccbench-nodes-9-cpu-4-chaos-partition:10 -- ./workload run tpcc --warehouses=2000 --active-warehouses=590 --tolerate-errors --ramp=5m0s --duration=10m0s --partitions=3 --split {pgurl:10} --histograms=logs/warehouses=590/stats.json returned:
		stderr:
		
		stdout:
		Tables are not being partitioned because they've been previously partitioned.
		I190328 14:48:38.251731 1 workload/workload.go:441  starting 199 splits
		I190328 14:48:50.303946 1 workload/workload.go:441  starting 199 splits
		I190328 14:48:54.792278 1 workload/workload.go:441  starting 999 splits
		Error: ALTER TABLE history SPLIT AT VALUES ('89fbe76c-8b43-9435-0000-000000000000'): EOF
		Error:  exit status 1
		: exit status 1

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/17565100d1e7c66341e6db3e39bb66202958cb81

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1204567&tab=buildLog

The test failed on master:
	cluster.go:1233,tpcc.go:619,search.go:47,search.go:177,tpcc.go:615,cluster.go:1605,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod stop teamcity-1204567-tpccbench-nodes-9-cpu-4-chaos-partition:1-9 returned:
		stderr:
		
		stdout:
		teamcity-1204567-tpccbench-nodes-9-cpu-4-chaos-partition: stopping and waiting........................................................................................................................
		2: exit status 255: Connection to 35.196.75.10 closed by remote host.
		
		I190328 17:37:46.418792 1 cluster_synced.go:1402  command failed
		: exit status 1
	cluster.go:1626,tpcc.go:732,tpcc.go:420,test.go:1216: Goexit() was called

@tbg
Member

tbg commented Mar 29, 2019

Looks like the test did run through, so perhaps the dead node detection misfired on a node that was left down intentionally. I fixed the output, so tomorrow's run should show which node it thought was down.

tbg added a commit to tbg/cockroach that referenced this issue Mar 29, 2019
@tbg
Member

tbg commented Mar 29, 2019

nvm, the test harness shuts down the nodes at the end, so that explains it

craig bot pushed a commit that referenced this issue Mar 29, 2019
36328: storage: skip TestStoreRangeRemoveDead, TestLeaseNotUsedAfterRestart r=andreimatei,nvanbenschoten a=tbg

We need to investigate both (in particular the former), but for now
minimize disruption.

Release note: None

36329: roachtest: don't stop nodes at end of tpc{c,h}bench r=andreimatei,nvanbenschoten a=tbg

Fixes #36325.
Touches #36277.
Touches #32135.
Fixes #36022.

Release note: None

Co-authored-by: Tobias Schottdorf <[email protected]>
@cockroachdb cockroachdb deleted a comment from cockroach-teamcity Apr 1, 2019
@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/c6df752eefe4609b8a5bbada0955f79a2cfb790e

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1217763&tab=buildLog

The test failed on master:
	cluster.go:1107,tpcc.go:577,tpcc.go:420,test.go:1228: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod put teamcity-1217763-tpccbench-nodes-9-cpu-4-chaos-partition:1-9 /home/agent/work/.go/src/github.com/cockroachdb/cockroach/cockroach.linux-2.6.32-gnu-amd64 ./cockroach returned:
		stderr:
		
		stdout:
		teamcity-1217763-tpccbench-nodes-9-cpu-4-chaos-partition: putting (dist) /home/agent/work/.go/src/github.com/cockroachdb/cockroach/cockroach.linux-2.6.32-gnu-amd64 ./cockroach
		...............................................................................................................................
		   1: done
		   2: done
		   3: done
		   4: done
		   5: done
		   6: done
		   7: ~ scp -r -C -o StrictHostKeyChecking=no -i /root/.ssh/id_rsa -i /root/.ssh/google_compute_engine [email protected]:./cockroach [email protected]:./cockroach
		Connection to 35.227.98.255 closed by remote host.
		: exit status 1
		   8: done
		   9: done
		I190403 14:46:09.310786 1 cluster_synced.go:962  put /home/agent/work/.go/src/github.com/cockroachdb/cockroach/cockroach.linux-2.6.32-gnu-amd64 failed
		: exit status 1

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/5267932f6fec0405b31328c1ad43711b0bb013e5

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1220238&tab=buildLog

The test failed on master:
	cluster.go:1255,tpcc.go:620,search.go:47,search.go:177,tpcc.go:615,cluster.go:1667,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod start --racks=3 teamcity-1220238-tpccbench-nodes-9-cpu-4-chaos-partition:1-9 returned:
		stderr:
		
		stdout:
		t how to secure your cluster: https://www.cockroachlabs.com/docs/v19.1/secure-a-cluster.html
		*
		*
		* ERROR: could not cleanup temporary directories from record file: could not lock temporary directory /mnt/data1/cockroach/cockroach-temp338017004, may still be in use: IO error: While lock file: /mnt/data1/cockroach/cockroach-temp338017004/TEMP_DIR.LOCK: Resource temporarily unavailable
		*
		Failed running "start"
		E190404 17:49:51.270932 1 cli/error.go:229  exit status 1
		Error: exit status 1
		Failed running "start"
		
		github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.Cockroach.Start.func7
			/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:397
		github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
			/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1417
		runtime.goexit
			/usr/local/go/src/runtime/asm_amd64.s:1333: 
		I190404 17:49:57.037158 1 cluster_synced.go:1499  command failed
		: exit status 1
	cluster.go:1688,tpcc.go:732,tpcc.go:420,test.go:1228: Goexit() was called

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/b88a6ce86bfe507e14e1e80fdaefd219b5f0b046

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1222890&tab=buildLog

The test failed on release-19.1:
	cluster.go:1255,tpcc.go:653,search.go:47,search.go:177,tpcc.go:648,cluster.go:1667,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod start --racks=3 teamcity-1222890-tpccbench-nodes-9-cpu-4-chaos-partition:1-9 --encrypt returned:
		stderr:
		
		stdout:
		t how to secure your cluster: https://www.cockroachlabs.com/docs/v19.1/secure-a-cluster.html
		*
		*
		* ERROR: could not cleanup temporary directories from record file: could not lock temporary directory /mnt/data1/cockroach/cockroach-temp086456445, may still be in use: IO error: While lock file: /mnt/data1/cockroach/cockroach-temp086456445/TEMP_DIR.LOCK: Resource temporarily unavailable
		*
		Failed running "start"
		E190405 18:39:33.164351 1 cli/error.go:229  exit status 1
		Error: exit status 1
		Failed running "start"
		
		github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.Cockroach.Start.func7
			/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:397
		github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
			/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1417
		runtime.goexit
			/usr/local/go/src/runtime/asm_amd64.s:1333: 
		I190405 18:39:44.608207 1 cluster_synced.go:1499  command failed
		: exit status 1
	cluster.go:1688,tpcc.go:765,tpcc.go:453,test.go:1228: Goexit() was called

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/304007d94a5354dc8fa23d47d29f4ba3f214251d

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1224738&tab=buildLog

The test failed on release-2.1:
	cluster.go:1688,tpcc.go:765,tpcc.go:453,test.go:1228: error running tpcc load generator: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1224738-tpccbench-nodes-9-cpu-4-chaos-partition:10 -- ./workload run tpcc --warehouses=2000 --active-warehouses=600 --tolerate-errors --ramp=5m0s --duration=10m0s --partitions=3 --split {pgurl:10} --histograms=logs/warehouses=600/stats.json returned:
		stderr:
		
		stdout:
		Error: EOF
		Error:  exit status 1
		: exit status 1

@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/1a5eabad4511a3371a6b2809d2bfc29e8aff66a6

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=tpccbench/nodes=9/cpu=4/chaos/partition PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1224702&tab=buildLog

The test failed on master:
	cluster.go:1688,tpcc.go:765,tpcc.go:453,test.go:1228: could not restart node :9: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod start teamcity-1224702-tpccbench-nodes-9-cpu-4-chaos-partition:9 --encrypt returned:
		stderr:
		
		stdout:
		--insecure --store=path=/mnt/data1/cockroach --log-dir=${HOME}/logs --background --cache=25% --max-sql-memory=25% --port=26257 --http-port=26258 --locality=cloud=gce,region=us-east1,zone=us-east1-b --join=35.196.158.128:26257 --enterprise-encryption=path=/mnt/data1/cockroach,key=/mnt/data1/cockroach/aes-128.key,old-key=plain >> ${HOME}/logs/cockroach.stdout.log 2>> ${HOME}/logs/cockroach.stderr.log || (x=$?; cat ${HOME}/logs/cockroach.stderr.log; exit $x)
		Connection to 35.196.242.234 closed by remote host.
		
		github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.Cockroach.Start.func7
			/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:397
		github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
			/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1417
		runtime.goexit
			/usr/local/go/src/runtime/asm_amd64.s:1333: 
		I190406 16:55:38.397921 1 cluster_synced.go:1499  command failed
		: exit status 1
	cluster.go:953,context.go:90,cluster.go:942,asm_amd64.s:522,panic.go:397,test.go:776,test.go:762,cluster.go:1688,tpcc.go:765,tpcc.go:453,test.go:1228: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-1224702-tpccbench-nodes-9-cpu-4-chaos-partition --oneshot --ignore-empty-nodes: exit status 1 10: skipped
		9: dead
		8: 5148
		5: 4935
		6: 4729
		2: 4695
		3: 4904
		4: 4868
		1: 4706
		7: 4544
		Error:  9: dead

craig bot pushed a commit that referenced this issue Apr 9, 2019
33242: workload/tpcc: split, partition, and scatter during initialization r=nvanbenschoten a=nvanbenschoten

Fixes #32135.

This PR pushes splitting, partitioning, and scattering logic into `tpcc`'s `PostLoad` hook, removing it from its `Ops` method.

Co-authored-by: Nathan VanBenschoten <[email protected]>
@craig craig bot closed this as completed in #33242 Apr 9, 2019