Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ccl/backupccl: TestBackupRestoreWithConcurrentWrites failed #46827

Closed
cockroach-teamcity opened this issue Apr 1, 2020 · 9 comments
Closed
Assignees
Labels
C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot.
Milestone

Comments

@cockroach-teamcity
Copy link
Member

(ccl/backupccl).TestBackupRestoreWithConcurrentWrites failed on release-20.1@e17e3ad399a0c553ecde81236aec27b979c5c499:

Fatal error:

panic: event RetriableErr{CanAutoRetry:false, IsCommit:false} inappropriate in current state NoTxn{} [recovered]
	panic: panic while executing 1 statements: UPDATE _._ SET _ = $1 WHERE _ = $2; caused by event RetriableErr{CanAutoRetry:false, IsCommit:false} inappropriate in current state NoTxn{}

Stack:

goroutine 173672 [running]:
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).closeWrapper(0xc001955000, 0x6ea0ea0, 0xc0044405c0, 0x5c4c8c0, 0xc0057896a0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:810 +0x4e1
github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn.func1(0xc001955000, 0x6ea0ea0, 0xc0044405c0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:474 +0x6f
panic(0x5c4c8c0, 0xc0057896a0)
	/usr/local/go/src/runtime/panic.go:679 +0x1b2
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).txnStateTransitionsApplyWrapper(0xc001955000, 0x6e23720, 0xc005789680, 0x5c4b2e0, 0xc00069fe60, 0x7f779022f348, 0xc0016f6f80, 0x120, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:2085 +0x16b5
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execCmd(0xc001955000, 0x6ea0ea0, 0xc0044405c0, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1497 +0xa77
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).run(0xc001955000, 0x6ea0ea0, 0xc0044405c0, 0xc0045ae110, 0x5400, 0x15000, 0xc0045ae1b0, 0xc0002ade20, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1297 +0x484
github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn(0xc0021ff080, 0x6ea0ea0, 0xc0044405c0, 0xc001955000, 0x5400, 0x15000, 0xc0045ae1b0, 0xc0002ade20, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:476 +0x132
github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*conn).processCommandsAsync.func1(0xc004cd5e4d, 0xc0053c82c0, 0x6ea0ea0, 0xc0044405c0, 0xc0002ade20, 0xc0021ff080, 0xc0016f6900, 0x6ec52a0, 0xc0022a0120, 0xc0045379e0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:594 +0x36c
created by github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*conn).processCommandsAsync
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:526 +0x1ea

Log preceding fatal error

I200401 00:21:42.079582 173192 kv/kvserver/replica_command.go:397  [n1,s1,r34/1:/{Table/53/1/5-Max}] initiating a split of this range at key /Table/53/1/8 [r36] (manual)
I200401 00:21:42.604741 172131 kv/kvserver/replica_command.go:397  [n1,s1,r33/1:/Table/53{-/1/3}] initiating a split of this range at key /Table/53/1/2 [r37] (manual)
I200401 00:21:42.794219 173218 kv/kvserver/replica_command.go:397  [n3,s3,r35/2:/Table/53/1/{3-5}] initiating a split of this range at key /Table/53/1/4 [r53] (manual)
I200401 00:21:42.957431 173276 kv/kvserver/replica_command.go:397  [n1,s1,r34/1:/Table/53/1/{5-8}] initiating a split of this range at key /Table/53/1/7 [r38] (manual)
I200401 00:21:43.089714 173350 kv/kvserver/replica_command.go:397  [n3,s3,r36/2:/{Table/53/1/8-Max}] initiating a split of this range at key /Table/53/1/9 [r54] (manual); delayed split for 0.2s to avoid Raft snapshot
I200401 00:21:43.645984 172131 kv/kvserver/replica_command.go:397  [n1,s1,r33/1:/Table/53{-/1/2}] initiating a split of this range at key /Table/53/1/1 [r39] (manual)
I200401 00:21:43.919091 173276 kv/kvserver/replica_command.go:397  [n1,s1,r34/1:/Table/53/1/{5-7}] initiating a split of this range at key /Table/53/1/6 [r40] (manual)
I200401 00:21:44.775099 167706 testutils/testcluster/testcluster.go:760  WaitForFullReplication took: 83.069015ms
I200401 00:21:45.641686 167942 server/status/runtime.go:498  [n1] runtime stats: 1.1 GiB RSS, 771 goroutines, 60 MiB/45 MiB/109 MiB GO alloc/idle/total, 92 MiB/141 MiB CGO alloc/total, 3219.7 CGO/sec, 212.9/17.1 %(u/s)time, 0.9 %gc (11x), 3.8 MiB/3.8 MiB (r/w)net
I200401 00:21:49.063394 174204 jobs/registry.go:797  [n1] job 542736737633337345: stepping through state running with error <nil>
I200401 00:21:49.873820 169358 server/status/runtime.go:498  [n2] runtime stats: 1.1 GiB RSS, 776 goroutines, 46 MiB/60 MiB/113 MiB GO alloc/idle/total, 94 MiB/143 MiB CGO alloc/total, 1782.3 CGO/sec, 211.2/17.9 %(u/s)time, 0.8 %gc (11x), 3.8 MiB/3.8 MiB (r/w)net
I200401 00:21:50.186224 174421 ccl/storageccl/export.go:102  [n1,s1,r33/1:/Table/53{-/1/1}] export [/Table/53/1,/Table/53/1/1)
I200401 00:21:50.194886 174324 ccl/storageccl/export.go:102  [n1,s1,r39/1:/Table/53/1/{1-2}] export [/Table/53/1/1,/Table/53/1/2)
I200401 00:21:50.223058 174291 ccl/storageccl/export.go:102  [n2,s2,r37/3:/Table/53/1/{2-3}] export [/Table/53/1/2,/Table/53/1/3)
I200401 00:21:50.237904 174292 ccl/storageccl/export.go:102  [n2,s2,r40/3:/Table/53/1/{6-7}] export [/Table/53/1/6,/Table/53/1/7)
I200401 00:21:50.248867 174371 ccl/storageccl/export.go:102  [n1,s1,r34/1:/Table/53/1/{5-6}] export [/Table/53/1/5,/Table/53/1/6)
I200401 00:21:50.259299 174391 ccl/storageccl/export.go:102  [n3,s3,r38/2:/Table/53/1/{7-8}] export [/Table/53/1/7,/Table/53/1/8)
I200401 00:21:50.263767 174439 ccl/storageccl/export.go:102  [n1,s1,r54/1:/{Table/53/1/9-Max}] export [/Table/53/1/9,/Table/53/2)
I200401 00:21:50.266212 174438 ccl/storageccl/export.go:102  [n1,s1,r53/1:/Table/53/1/{4-5}] export [/Table/53/1/4,/Table/53/1/5)
I200401 00:21:50.277082 174409 ccl/storageccl/export.go:102  [n3,s3,r36/2:/Table/53/1/{8-9}] export [/Table/53/1/8,/Table/53/1/9)
I200401 00:21:50.281148 174410 ccl/storageccl/export.go:102  [n3,s3,r35/2:/Table/53/1/{3-4}] export [/Table/53/1/3,/Table/53/1/4)
I200401 00:21:50.866128 169706 server/status/runtime.go:498  [n3] runtime stats: 1.1 GiB RSS, 774 goroutines, 44 MiB/62 MiB/108 MiB GO alloc/idle/total, 94 MiB/143 MiB CGO alloc/total, 1717.1 CGO/sec, 207.4/17.3 %(u/s)time, 0.7 %gc (11x), 3.7 MiB/3.7 MiB (r/w)net
I200401 00:21:51.615633 174204 jobs/registry.go:797  [n1] job 542736737633337345: stepping through state succeeded with error <nil>
I200401 00:21:52.231787 174204 jobs/registry.go:952  [n1] job 542736737633337345: status succeeded after adoption finished
I200401 00:21:54.436076 174782 kv/kvserver/replica_command.go:663  [n2,merge,s2,r37/3:/Table/53/1/{2-3}] initiating a merge of r35:/Table/53/1/{3-4} [(n1,s1):1, (n3,s3):2, (n2,s2):3, next=4, gen=8] into this range (lhs+rhs has (size=1.4 KiB+1.7 KiB=3.1 KiB qps=7.96+7.32=15.28qps) below threshold (size=128 MiB, qps=1250.00))
I200401 00:21:55.376849 173261 sql/table.go:942  [n1,client=127.0.0.1:40598,hostssl,user=root] queued new schema change job for table 53, mutation 0
I200401 00:21:55.380065 173261 sql/sqlbase/structured.go:1554  [n1,client=127.0.0.1:40598,hostssl,user=root] publish: descID=53 (bank) version=2 mtime=1970-01-01 00:00:00 +0000 UTC
I200401 00:21:55.410929 169322 kv/kvserver/store_remove_replica.go:122  [n2,s2,r37/3:/Table/53/1/{2-3}] removing replica r35/3
I200401 00:21:55.462886 169619 kv/kvserver/store_remove_replica.go:122  [n3,s3,r37/2:/Table/53/1/{2-3}] removing replica r35/2
I200401 00:21:55.467180 167882 kv/kvserver/store_remove_replica.go:122  [n1,s1,r37/1:/Table/53/1/{2-3}] removing replica r35/1
I200401 00:21:55.540333 174967 kv/kvserver/replica_command.go:663  [n1,merge,s1,r39/1:/Table/53/1/{1-2}] initiating a merge of r37:/Table/53/1/{2-4} [(n1,s1):1, (n3,s3):2, (n2,s2):3, next=4, gen=9] into this range (lhs+rhs has (size=2.8 KiB+1.7 KiB=4.5 KiB qps=13.75+7.96=21.71qps) below threshold (size=128 MiB, qps=1250.00))
I200401 00:21:55.616399 175077 kv/kvserver/replica_command.go:663  [n2,merge,s2,r40/3:/Table/53/1/{6-7}] initiating a merge of r38:/Table/53/1/{7-8} [(n1,s1):1, (n3,s3):2, (n2,s2):3, next=4, gen=8] into this range (lhs+rhs has (size=2.0 KiB+2.4 KiB=4.4 KiB qps=4.94+6.34=11.28qps) below threshold (size=128 MiB, qps=1250.00))
I200401 00:21:55.661232 167942 server/status/runtime.go:498  [n1] runtime stats: 1.1 GiB RSS, 794 goroutines, 52 MiB/53 MiB/108 MiB GO alloc/idle/total, 98 MiB/147 MiB CGO alloc/total, 966.5 CGO/sec, 206.5/17.4 %(u/s)time, 1.0 %gc (11x), 3.7 MiB/3.7 MiB (r/w)net
I200401 00:21:55.994052 173261 sql/event_log.go:132  [n1,client=127.0.0.1:40598,hostssl,user=root] Event: "drop_table", target: 53, info: {TableName:data.public.bank Statement:DROP TABLE data.public.bank User:root CascadeDroppedViews:[]}
I200401 00:21:55.994778 173261 jobs/registry.go:285  [n1,client=127.0.0.1:40598,hostssl,user=root] scheduled jobs [542736761821659137]
I200401 00:21:56.656832 168016 jobs/jobs.go:743  job 542736761821659137: adopting
I200401 00:21:56.739423 167868 kv/kvserver/store_remove_replica.go:122  [n1,s1,r39/1:/Table/53/1/{1-2}] removing replica r37/1
I200401 00:21:56.743060 169295 kv/kvserver/store_remove_replica.go:122  [n2,s2,r39/3:/Table/53/1/{1-2}] removing replica r37/3
I200401 00:21:56.843830 169564 kv/kvserver/store_remove_replica.go:122  [n3,s3,r39/2:/Table/53/1/{1-2}] removing replica r37/2
I200401 00:21:57.232405 169320 kv/kvserver/store_remove_replica.go:122  [n2,s2,r40/3:/Table/53/1/{6-7}] removing replica r38/3
I200401 00:21:57.332909 167830 kv/kvserver/store_remove_replica.go:122  [n1,s1,r40/1:/Table/53/1/{6-7}] removing replica r38/1
I200401 00:21:57.350377 169617 kv/kvserver/store_remove_replica.go:122  [n3,s3,r40/2:/Table/53/1/{6-7}] removing replica r38/2
I200401 00:21:57.436308 175372 kv/kvserver/replica_command.go:663  [n1,merge,s1,r34/1:/Table/53/1/{5-6}] initiating a merge of r40:/Table/53/1/{6-8} [(n1,s1):1, (n3,s3):2, (n2,s2):3, next=4, gen=10] into this range (lhs+rhs has (size=1.7 KiB+2.3 KiB=3.9 KiB qps=5.72+7.33=13.05qps) below threshold (size=128 MiB, qps=1250.00))
I200401 00:21:57.713474 175412 kv/kvserver/replica_command.go:663  [n2,merge,s2,r40/3:/Table/53/1/{6-8}] initiating a merge of r36:/Table/53/1/{8-9} [(n1,s1):1, (n3,s3):2, (n2,s2):3, next=4, gen=8] into this range (lhs+rhs has (size=4.6 KiB+2.7 KiB=7.3 KiB qps=7.33+1.78=9.10qps) below threshold (size=128 MiB, qps=1250.00))
I200401 00:21:57.810257 168016 jobs/registry.go:1178  job 542736761821659137: resuming execution
I200401 00:21:58.107235 175565 jobs/registry.go:797  [n1] job 542736761821659137: stepping through state running with error <nil>
I200401 00:21:58.194765 175565 sql/schema_changer.go:445  [n1,job=542736761821659137,scExec] schema change on bank (53 v2) mutation 0 starting execution...
I200401 00:21:58.508238 175565 sql/sqlbase/structured.go:1554  [n1,job=542736761821659137,scExec] publish: descID=53 (bank) version=3 mtime=1970-01-01 00:00:00 +0000 UTC
E200401 00:21:59.324749 173672 sql/conn_executor.go:796  [n3,client=127.0.0.1:36166,hostssl,user=root] a SQL panic has occurred while executing "UPDATE data.bank SET payload = $1 WHERE id = $2": event RetriableErr{CanAutoRetry:false, IsCommit:false} inappropriate in current state NoTxn{}
E200401 00:21:59.325793 173672 util/log/crash_reporting.go:208  [n3,client=127.0.0.1:36166,hostssl,user=root] a panic has occurred!

More

Parameters:

  • GOFLAGS=-json
make stressrace TESTS=TestBackupRestoreWithConcurrentWrites PKG=./pkg/ccl/backupccl TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1

Related:

See this test on roachdash
powered by pkg/cmd/internal/issues

@cockroach-teamcity cockroach-teamcity added branch-release-20.1 C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. labels Apr 1, 2020
@cockroach-teamcity cockroach-teamcity added this to the 20.1 milestone Apr 1, 2020
@cockroach-teamcity
Copy link
Member Author

(ccl/backupccl).TestBackupRestoreWithConcurrentWrites failed on release-20.1@2f0eaf4891eec71d2f75a254f59b5c4af5e4c7b0:

Fatal error:

panic: event RetriableErr{CanAutoRetry:false, IsCommit:false} inappropriate in current state NoTxn{} [recovered]
	panic: panic while executing 1 statements: UPDATE _._ SET _ = $1 WHERE _ = $2; caused by event RetriableErr{CanAutoRetry:false, IsCommit:false} inappropriate in current state NoTxn{}

Stack:

goroutine 165830 [running]:
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).closeWrapper(0xc000c77000, 0x6ea1b00, 0xc005fac140, 0x5c4d140, 0xc004f105e0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:810 +0x4e1
github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn.func1(0xc000c77000, 0x6ea1b00, 0xc005fac140)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:474 +0x6f
panic(0x5c4d140, 0xc004f105e0)
	/usr/local/go/src/runtime/panic.go:679 +0x1b2
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).txnStateTransitionsApplyWrapper(0xc000c77000, 0x6e24380, 0xc004f105c0, 0x5c4bb60, 0xc000e3be90, 0x7f3b2c33b0e8, 0xc002d63880, 0xfc, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:2085 +0x16b5
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execCmd(0xc000c77000, 0x6ea1b00, 0xc005fac140, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1497 +0xa77
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).run(0xc000c77000, 0x6ea1b00, 0xc005fac140, 0xc000ae3290, 0x5400, 0x15000, 0xc000ae3330, 0xc0002cecb0, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1297 +0x484
github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn(0xc000f74100, 0x6ea1b00, 0xc005fac140, 0xc000c77000, 0x5400, 0x15000, 0xc000ae3330, 0xc0002cecb0, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:476 +0x132
github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*conn).processCommandsAsync.func1(0xc00593a642, 0xc0048e3380, 0x6ea1b00, 0xc005fac140, 0xc0002cecb0, 0xc000f74100, 0xc002d63200, 0x6ec5f00, 0xc0050bb1a0, 0xc003e75f80, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:594 +0x36c
created by github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*conn).processCommandsAsync
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:526 +0x1ea

Log preceding fatal error

I200401 01:06:08.000040 165339 kv/kvserver/replica_command.go:397  [n1,s1,r35/1:/Table/53/1/{3-5}] initiating a split of this range at key /Table/53/1/4 [r37] (manual); delayed split for 0.2s to avoid Raft snapshot
I200401 01:06:08.478138 165503 kv/kvserver/replica_command.go:397  [n3,s3,r34/3:/Table/53/1/{5-8}] initiating a split of this range at key /Table/53/1/7 [r44] (manual)
I200401 01:06:08.681079 165492 kv/kvserver/replica_command.go:397  [n1,s1,r43/1:/{Table/53/1/8-Max}] initiating a split of this range at key /Table/53/1/9 [r39] (manual)
I200401 01:06:09.319537 165765 kv/kvserver/replica_command.go:397  [n3,s3,r34/3:/Table/53/1/{5-7}] initiating a split of this range at key /Table/53/1/6 [r45] (manual)
I200401 01:06:09.762420 159976 server/status/runtime.go:498  [n1] runtime stats: 1.2 GiB RSS, 757 goroutines, 70 MiB/102 MiB/114 MiB GO alloc/idle/total, 92 MiB/140 MiB CGO alloc/total, 2346.2 CGO/sec, 182.3/16.0 %(u/s)time, 1.3 %gc (11x), 4.3 MiB/4.3 MiB (r/w)net
I200401 01:06:09.837644 159789 testutils/testcluster/testcluster.go:760  WaitForFullReplication took: 138.4578ms
I200401 01:06:13.316426 166161 jobs/registry.go:797  [n1] job 542745469280419841: stepping through state running with error <nil>
I200401 01:06:13.843354 161403 server/status/runtime.go:498  [n2] runtime stats: 1.2 GiB RSS, 771 goroutines, 51 MiB/119 MiB/111 MiB GO alloc/idle/total, 96 MiB/141 MiB CGO alloc/total, 1953.7 CGO/sec, 192.5/16.8 %(u/s)time, 0.8 %gc (11x), 4.1 MiB/4.1 MiB (r/w)net
I200401 01:06:14.006949 166422 ccl/storageccl/export.go:102  [n1,s1,r35/1:/Table/53/1/{3-4}] export [/Table/53/1/3,/Table/53/1/4)
I200401 01:06:14.009275 166425 ccl/storageccl/export.go:102  [n1,s1,r37/1:/Table/53/1/{4-5}] export [/Table/53/1/4,/Table/53/1/5)
I200401 01:06:14.011343 166424 ccl/storageccl/export.go:102  [n1,s1,r39/1:/{Table/53/1/9-Max}] export [/Table/53/1/9,/Table/53/2)
I200401 01:06:14.017762 166423 ccl/storageccl/export.go:102  [n1,s1,r33/1:/Table/53{-/1/1}] export [/Table/53/1,/Table/53/1/1)
I200401 01:06:14.028941 166426 ccl/storageccl/export.go:102  [n1,s1,r44/1:/Table/53/1/{7-8}] export [/Table/53/1/7,/Table/53/1/8)
I200401 01:06:14.044738 166430 ccl/storageccl/export.go:102  [n1,s1,r36/1:/Table/53/1/{2-3}] export [/Table/53/1/2,/Table/53/1/3)
I200401 01:06:14.047485 166428 ccl/storageccl/export.go:102  [n1,s1,r43/1:/Table/53/1/{8-9}] export [/Table/53/1/8,/Table/53/1/9)
I200401 01:06:14.050837 166431 ccl/storageccl/export.go:102  [n1,s1,r45/1:/Table/53/1/{6-7}] export [/Table/53/1/6,/Table/53/1/7)
I200401 01:06:14.064173 166406 ccl/storageccl/export.go:102  [n3,s3,r34/3:/Table/53/1/{5-6}] export [/Table/53/1/5,/Table/53/1/6)
I200401 01:06:14.104900 166257 ccl/storageccl/export.go:102  [n3,s3,r38/3:/Table/53/1/{1-2}] export [/Table/53/1/1,/Table/53/1/2)
I200401 01:06:14.743059 161752 server/status/runtime.go:498  [n3] runtime stats: 1.2 GiB RSS, 773 goroutines, 50 MiB/120 MiB/113 MiB GO alloc/idle/total, 96 MiB/142 MiB CGO alloc/total, 1866.7 CGO/sec, 195.3/17.5 %(u/s)time, 0.6 %gc (11x), 4.0 MiB/4.0 MiB (r/w)net
I200401 01:06:15.057022 166161 jobs/registry.go:797  [n1] job 542745469280419841: stepping through state succeeded with error <nil>
I200401 01:06:15.451964 166161 jobs/registry.go:952  [n1] job 542745469280419841: status succeeded after adoption finished
I200401 01:06:18.336996 166928 kv/kvserver/replica_command.go:663  [n1,merge,s1,r35/1:/Table/53/1/{3-4}] initiating a merge of r37:/Table/53/1/{4-5} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=8] into this range (lhs+rhs has (size=2.1 KiB+2.1 KiB=4.3 KiB qps=0.00+0.88=0.88qps) below threshold (size=128 MiB, qps=1250.00))
I200401 01:06:18.520936 165339 sql/table.go:942  [n1,client=127.0.0.1:37168,hostssl,user=root] queued new schema change job for table 53, mutation 0
I200401 01:06:18.521135 165339 sql/sqlbase/structured.go:1554  [n1,client=127.0.0.1:37168,hostssl,user=root] publish: descID=53 (bank) version=2 mtime=1970-01-01 00:00:00 +0000 UTC
I200401 01:06:18.714085 159895 kv/kvserver/store_remove_replica.go:122  [n1,s1,r35/1:/Table/53/1/{3-4}] removing replica r37/1
I200401 01:06:18.716197 161324 kv/kvserver/store_remove_replica.go:122  [n2,s2,r35/2:/Table/53/1/{3-4}] removing replica r37/2
I200401 01:06:18.759245 161658 kv/kvserver/store_remove_replica.go:122  [n3,s3,r35/3:/Table/53/1/{3-4}] removing replica r37/3
I200401 01:06:18.833273 165339 sql/event_log.go:132  [n1,client=127.0.0.1:37168,hostssl,user=root] Event: "drop_table", target: 53, info: {TableName:data.public.bank Statement:DROP TABLE data.public.bank User:root CascadeDroppedViews:[]}
I200401 01:06:18.837165 165339 jobs/registry.go:285  [n1,client=127.0.0.1:37168,hostssl,user=root] scheduled jobs [542745488294379521]
I200401 01:06:19.306214 160102 jobs/jobs.go:743  job 542745488294379521: adopting
I200401 01:06:19.375998 167349 kv/kvserver/replica_command.go:663  [n1,merge,s1,r44/1:/Table/53/1/{7-8}] initiating a merge of r43:/Table/53/1/{8-9} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=8] into this range (lhs+rhs has (size=1.8 KiB+2.4 KiB=4.1 KiB qps=2.78+10.64=13.42qps) below threshold (size=128 MiB, qps=1250.00))
I200401 01:06:19.414109 167143 kv/kvserver/replica_command.go:663  [n3,merge,s3,r34/3:/Table/53/1/{5-6}] initiating a merge of r45:/Table/53/1/{6-7} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=9] into this range (lhs+rhs has (size=1.4 KiB+2.6 KiB=4.0 KiB qps=9.70+7.86=17.56qps) below threshold (size=128 MiB, qps=1250.00))
I200401 01:06:19.773754 159976 server/status/runtime.go:498  [n1] runtime stats: 1.2 GiB RSS, 794 goroutines, 62 MiB/109 MiB/115 MiB GO alloc/idle/total, 100 MiB/146 MiB CGO alloc/total, 1136.2 CGO/sec, 198.8/16.3 %(u/s)time, 0.7 %gc (11x), 4.0 MiB/4.0 MiB (r/w)net
I200401 01:06:19.799358 160102 jobs/registry.go:1178  job 542745488294379521: resuming execution
I200401 01:06:19.866788 159933 kv/kvserver/store_remove_replica.go:122  [n1,s1,r44/1:/Table/53/1/{7-8}] removing replica r43/1
I200401 01:06:19.892381 161699 kv/kvserver/store_remove_replica.go:122  [n3,s3,r44/3:/Table/53/1/{7-8}] removing replica r43/3
I200401 01:06:19.914121 161358 kv/kvserver/store_remove_replica.go:122  [n2,s2,r44/2:/Table/53/1/{7-8}] removing replica r43/2
I200401 01:06:20.100599 161666 kv/kvserver/store_remove_replica.go:122  [n3,s3,r34/3:/Table/53/1/{5-6}] removing replica r45/3
I200401 01:06:20.153551 161329 kv/kvserver/store_remove_replica.go:122  [n2,s2,r34/2:/Table/53/1/{5-6}] removing replica r45/2
I200401 01:06:20.173614 159931 kv/kvserver/store_remove_replica.go:122  [n1,s1,r34/1:/Table/53/1/{5-6}] removing replica r45/1
I200401 01:06:20.195987 167450 jobs/registry.go:797  [n1] job 542745488294379521: stepping through state running with error <nil>
I200401 01:06:20.243843 167450 sql/schema_changer.go:445  [n1,job=542745488294379521,scExec] schema change on bank (53 v2) mutation 0 starting execution...
I200401 01:06:20.371056 167428 kv/kvserver/replica_command.go:663  [n1,merge,s1,r36/1:/Table/53/1/{2-3}] initiating a merge of r35:/Table/53/1/{3-5} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=9] into this range (lhs+rhs has (size=1.9 KiB+4.4 KiB=6.3 KiB qps=0.93+2.00=2.92qps) below threshold (size=128 MiB, qps=1250.00))
I200401 01:06:20.412874 167498 kv/kvserver/replica_command.go:663  [n3,merge,s3,r38/3:/Table/53/1/{1-2}] initiating a merge of r36:/Table/53/1/{2-3} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=8] into this range (lhs+rhs has (size=2.1 KiB+1.9 KiB=4.0 KiB qps=2.57+0.93=3.49qps) below threshold (size=128 MiB, qps=1250.00))
I200401 01:06:20.543067 167450 sql/sqlbase/structured.go:1554  [n1,job=542745488294379521,scExec] publish: descID=53 (bank) version=3 mtime=1970-01-01 00:00:00 +0000 UTC
I200401 01:06:21.100420 159909 kv/kvserver/store_remove_replica.go:122  [n1,s1,r36/1:/Table/53/1/{2-3}] removing replica r35/1
I200401 01:06:21.139666 161347 kv/kvserver/store_remove_replica.go:122  [n2,s2,r36/2:/Table/53/1/{2-3}] removing replica r35/2
I200401 01:06:21.148601 161648 kv/kvserver/store_remove_replica.go:122  [n3,s3,r36/3:/Table/53/1/{2-3}] removing replica r35/3
E200401 01:06:21.590972 165830 sql/conn_executor.go:796  [n2,client=127.0.0.1:59126,hostssl,user=root] a SQL panic has occurred while executing "UPDATE data.bank SET payload = $1 WHERE id = $2": event RetriableErr{CanAutoRetry:false, IsCommit:false} inappropriate in current state NoTxn{}
E200401 01:06:21.593089 165830 util/log/crash_reporting.go:208  [n2,client=127.0.0.1:59126,hostssl,user=root] a panic has occurred!

More

Parameters:

  • GOFLAGS=-json
make stressrace TESTS=TestBackupRestoreWithConcurrentWrites PKG=./pkg/ccl/backupccl TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1

Related:

See this test on roachdash
powered by pkg/cmd/internal/issues

@tbg
Copy link
Member

tbg commented Apr 1, 2020

Also saw this immediately with the big hammer that is

make roachprod-stressrace CLUSTER=tobias-stress PKG=./pkg/ccl/backupccl/ TESTS=TestBackupRestoreWithConcurrentWrites

(49s)

cc @nvanbenschoten or @andreimatei

@dt dt assigned ajwerner and unassigned dt Apr 1, 2020
@cockroach-teamcity
Copy link
Member Author

(ccl/backupccl).TestBackupRestoreWithConcurrentWrites failed on release-20.1@6d087fd618457c55498a6e73bd8d16e64d0093ff:

Fatal error:

panic: event RetriableErr{CanAutoRetry:false, IsCommit:false} inappropriate in current state NoTxn{} [recovered]
	panic: panic while executing 1 statements: UPDATE _._ SET _ = $1 WHERE _ = $2; caused by event RetriableErr{CanAutoRetry:false, IsCommit:false} inappropriate in current state NoTxn{}

Stack:

goroutine 166948 [running]:
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).closeWrapper(0xc002a3c000, 0x6ea50a0, 0xc000080a00, 0x5c4f960, 0xc006b0bba0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:810 +0x4e1
github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn.func1(0xc002a3c000, 0x6ea50a0, 0xc000080a00)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:474 +0x6f
panic(0x5c4f960, 0xc006b0bba0)
	/usr/local/go/src/runtime/panic.go:679 +0x1b2
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).txnStateTransitionsApplyWrapper(0xc002a3c000, 0x6e27920, 0xc006b0bb80, 0x5c4e380, 0xc002999e60, 0x7fded5384188, 0xc002374680, 0x132, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:2085 +0x16b5
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execCmd(0xc002a3c000, 0x6ea50a0, 0xc000080a00, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1497 +0xa77
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).run(0xc002a3c000, 0x6ea50a0, 0xc000080a00, 0xc005489a10, 0x5400, 0x15000, 0xc005489ab0, 0xc0002605e0, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1297 +0x484
github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn(0xc000be2c00, 0x6ea50a0, 0xc000080a00, 0xc002a3c000, 0x5400, 0x15000, 0xc005489ab0, 0xc0002605e0, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:476 +0x132
github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*conn).processCommandsAsync.func1(0xc0022244f2, 0xc0031e82e0, 0x6ea50a0, 0xc000080a00, 0xc0002605e0, 0xc000be2c00, 0xc002374000, 0x6ec94a0, 0xc0091bf650, 0xc00850b860, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:594 +0x36c
created by github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*conn).processCommandsAsync
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:526 +0x1ea

Log preceding fatal error

Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count
estimated_pending_compaction_bytes: 0 B
I200401 15:01:20.888309 161051 testutils/testcluster/testcluster.go:760  WaitForFullReplication took: 142.981434ms
I200401 15:01:25.168942 161275 server/status/runtime.go:498  [n1] runtime stats: 1.3 GiB RSS, 776 goroutines, 77 MiB/92 MiB/130 MiB GO alloc/idle/total, 93 MiB/138 MiB CGO alloc/total, 1891.3 CGO/sec, 187.0/16.4 %(u/s)time, 0.8 %gc (8x), 3.9 MiB/3.9 MiB (r/w)net
I200401 15:01:26.887727 167477 kv/kvserver/replica_consistency.go:255  [n1,consistencyChecker,s1,r4/1:/System{/tsd-tse}] triggering stats recomputation to resolve delta of {ContainsEstimates:12690 LastUpdateNanos:1585753281985974785 IntentAge:0 GCBytesAge:0 LiveBytes:54552 LiveCount:-2034 KeyBytes:-99006 KeyCount:-2034 ValBytes:153558 ValCount:-2034 IntentBytes:0 IntentCount:0 SysBytes:0 SysCount:0}
I200401 15:01:29.319574 162754 server/status/runtime.go:498  [n2] runtime stats: 1.3 GiB RSS, 771 goroutines, 49 MiB/119 MiB/133 MiB GO alloc/idle/total, 94 MiB/141 MiB CGO alloc/total, 2008.1 CGO/sec, 162.6/15.2 %(u/s)time, 0.3 %gc (7x), 3.6 MiB/3.6 MiB (r/w)net
I200401 15:01:30.116685 163038 server/status/runtime.go:498  [n3] runtime stats: 1.3 GiB RSS, 776 goroutines, 81 MiB/89 MiB/133 MiB GO alloc/idle/total, 96 MiB/143 MiB CGO alloc/total, 1759.2 CGO/sec, 159.1/15.2 %(u/s)time, 0.3 %gc (6x), 3.6 MiB/3.6 MiB (r/w)net
I200401 15:01:30.367583 167844 jobs/registry.go:797  [n1] job 542909691597586433: stepping through state running with error <nil>
I200401 15:01:32.009225 168053 ccl/storageccl/export.go:102  [n1,s1,r33/1:/Table/53{-/1/1}] export [/Table/53/1,/Table/53/1/1)
I200401 15:01:32.108473 168057 ccl/storageccl/export.go:102  [n2,s2,r40/2:/Table/53/1/{6-7}] export [/Table/53/1/6,/Table/53/1/7)
I200401 15:01:32.117402 168056 ccl/storageccl/export.go:102  [n2,s2,r38/2:/Table/53/1/{7-8}] export [/Table/53/1/7,/Table/53/1/8)
I200401 15:01:32.123497 168055 ccl/storageccl/export.go:102  [n2,s2,r36/2:/Table/53/1/{8-9}] export [/Table/53/1/8,/Table/53/1/9)
I200401 15:01:32.131946 167968 ccl/storageccl/export.go:102  [n1,s1,r53/1:/Table/53/1/{4-5}] export [/Table/53/1/4,/Table/53/1/5)
I200401 15:01:32.137336 167970 ccl/storageccl/export.go:102  [n1,s1,r34/1:/Table/53/1/{5-6}] export [/Table/53/1/5,/Table/53/1/6)
I200401 15:01:32.137524 167994 ccl/storageccl/export.go:102  [n3,s3,r37/3:/Table/53/1/{2-3}] export [/Table/53/1/2,/Table/53/1/3)
I200401 15:01:32.158673 167996 ccl/storageccl/export.go:102  [n3,s3,r35/3:/Table/53/1/{3-4}] export [/Table/53/1/3,/Table/53/1/4)
I200401 15:01:32.181850 167951 ccl/storageccl/export.go:102  [n2,s2,r63/2:/{Table/53/1/9-Max}] export [/Table/53/1/9,/Table/53/2)
I200401 15:01:32.195520 167934 ccl/storageccl/export.go:102  [n3,s3,r39/3:/Table/53/1/{1-2}] export [/Table/53/1/1,/Table/53/1/2)
I200401 15:01:33.649068 167844 jobs/registry.go:797  [n1] job 542909691597586433: stepping through state succeeded with error <nil>
I200401 15:01:34.054851 167844 jobs/registry.go:952  [n1] job 542909691597586433: status succeeded after adoption finished
I200401 15:01:35.189823 161275 server/status/runtime.go:498  [n1] runtime stats: 1.3 GiB RSS, 775 goroutines, 87 MiB/87 MiB/130 MiB GO alloc/idle/total, 98 MiB/145 MiB CGO alloc/total, 1657.0 CGO/sec, 171.4/14.6 %(u/s)time, 0.4 %gc (8x), 3.6 MiB/3.6 MiB (r/w)net
I200401 15:01:36.166696 166581 sql/table.go:942  [n1,client=127.0.0.1:49768,hostssl,user=root] queued new schema change job for table 53, mutation 0
I200401 15:01:36.167549 166581 sql/sqlbase/structured.go:1554  [n1,client=127.0.0.1:49768,hostssl,user=root] publish: descID=53 (bank) version=2 mtime=1970-01-01 00:00:00 +0000 UTC
I200401 15:01:36.505320 166581 sql/event_log.go:132  [n1,client=127.0.0.1:49768,hostssl,user=root] Event: "drop_table", target: 53, info: {TableName:data.public.bank Statement:DROP TABLE data.public.bank User:root CascadeDroppedViews:[]}
I200401 15:01:36.507024 166581 jobs/registry.go:285  [n1,client=127.0.0.1:49768,hostssl,user=root] scheduled jobs [542909713750163457]
I200401 15:01:36.572244 161236 kv/kvserver/queue.go:578  [n1,s1,r3/1:/System/{NodeLive…-tsd}] rate limited in MaybeAdd (merge): throttled on async limiting semaphore
I200401 15:01:36.927872 161416 jobs/jobs.go:743  job 542909713750163457: adopting
I200401 15:01:37.106019 168575 kv/kvserver/replica_command.go:663  [n1,merge,s1,r33/1:/Table/53{-/1/1}] initiating a merge of r39:/Table/53/1/{1-2} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=9] into this range (lhs+rhs has (size=2.6 KiB+2.0 KiB=4.6 KiB qps=3.18+5.20=8.38qps) below threshold (size=128 MiB, qps=1250.00))
I200401 15:01:37.187380 168589 kv/kvserver/replica_command.go:663  [n3,merge,s3,r37/3:/Table/53/1/{2-3}] initiating a merge of r35:/Table/53/1/{3-4} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=8] into this range (lhs+rhs has (size=2.0 KiB+2.1 KiB=4.1 KiB qps=2.54+12.64=15.19qps) below threshold (size=128 MiB, qps=1250.00))
I200401 15:01:37.306230 168888 kv/kvserver/replica_command.go:663  [n2,merge,s2,r40/2:/Table/53/1/{6-7}] initiating a merge of r38:/Table/53/1/{7-8} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=8] into this range (lhs+rhs has (size=3.1 KiB+2.3 KiB=5.3 KiB qps=13.17+9.14=22.30qps) below threshold (size=128 MiB, qps=1250.00))
I200401 15:01:37.664260 161416 jobs/registry.go:1178  job 542909713750163457: resuming execution
I200401 15:01:37.924301 169020 jobs/registry.go:797  [n1] job 542909713750163457: stepping through state running with error <nil>
I200401 15:01:38.074876 169020 sql/schema_changer.go:445  [n1,job=542909713750163457,scExec] schema change on bank (53 v2) mutation 0 starting execution...
I200401 15:01:38.188791 161213 kv/kvserver/store_remove_replica.go:122  [n1,s1,r33/1:/Table/53{-/1/1}] removing replica r39/1
I200401 15:01:38.209168 162710 kv/kvserver/store_remove_replica.go:122  [n2,s2,r33/2:/Table/53{-/1/1}] removing replica r39/2
I200401 15:01:38.242428 163017 kv/kvserver/store_remove_replica.go:122  [n3,s3,r33/3:/Table/53{-/1/1}] removing replica r39/3
I200401 15:01:38.308760 169020 sql/sqlbase/structured.go:1554  [n1,job=542909713750163457,scExec] publish: descID=53 (bank) version=3 mtime=1970-01-01 00:00:00 +0000 UTC
I200401 15:01:38.457957 169002 kv/kvserver/replica_command.go:663  [n1,merge,s1,r53/1:/Table/53/1/{4-5}] initiating a merge of r34:/Table/53/1/{5-6} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=9] into this range (lhs+rhs has (size=1.5 KiB+2.0 KiB=3.6 KiB qps=0.00+0.00=0.00qps) below threshold (size=128 MiB, qps=1250.00))
I200401 15:01:38.556202 162674 kv/kvserver/store_remove_replica.go:122  [n2,s2,r40/2:/Table/53/1/{6-7}] removing replica r38/2
I200401 15:01:38.598361 162914 kv/kvserver/store_remove_replica.go:122  [n3,s3,r40/3:/Table/53/1/{6-7}] removing replica r38/3
I200401 15:01:38.600405 161254 kv/kvserver/store_remove_replica.go:122  [n1,s1,r40/1:/Table/53/1/{6-7}] removing replica r38/1
I200401 15:01:38.609322 162993 kv/kvserver/store_remove_replica.go:122  [n3,s3,r37/3:/Table/53/1/{2-3}] removing replica r35/3
I200401 15:01:38.633899 162681 kv/kvserver/store_remove_replica.go:122  [n2,s2,r37/2:/Table/53/1/{2-3}] removing replica r35/2
I200401 15:01:38.610107 161212 kv/kvserver/store_remove_replica.go:122  [n1,s1,r37/1:/Table/53/1/{2-3}] removing replica r35/1
I200401 15:01:38.693683 169116 kv/kvserver/replica_command.go:663  [n2,merge,s2,r36/2:/Table/53/1/{8-9}] initiating a merge of r63:/{Table/53/1/9-Max} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=8] into this range (lhs+rhs has (size=3.0 KiB+1.9 KiB=4.9 KiB qps=0.00+1.18=1.18qps) below threshold (size=128 MiB, qps=1250.00))
I200401 15:01:39.042564 161217 kv/kvserver/store_remove_replica.go:122  [n1,s1,r53/1:/Table/53/1/{4-5}] removing replica r34/1
I200401 15:01:39.068662 162690 kv/kvserver/store_remove_replica.go:122  [n2,s2,r53/2:/Table/53/1/{4-5}] removing replica r34/2
I200401 15:01:39.161299 163013 kv/kvserver/store_remove_replica.go:122  [n3,s3,r53/3:/Table/53/1/{4-5}] removing replica r34/3
E200401 15:01:39.201586 166948 sql/conn_executor.go:796  [n3,client=127.0.0.1:41996,hostssl,user=root] a SQL panic has occurred while executing "UPDATE data.bank SET payload = $1 WHERE id = $2": event RetriableErr{CanAutoRetry:false, IsCommit:false} inappropriate in current state NoTxn{}
E200401 15:01:39.202615 166948 util/log/crash_reporting.go:208  [n3,client=127.0.0.1:41996,hostssl,user=root] a panic has occurred!

More

Parameters:

  • GOFLAGS=-json
make stressrace TESTS=TestBackupRestoreWithConcurrentWrites PKG=./pkg/ccl/backupccl TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1

Related:

See this test on roachdash
powered by pkg/cmd/internal/issues

@cockroach-teamcity
Copy link
Member Author

(ccl/backupccl).TestBackupRestoreWithConcurrentWrites failed on release-20.1@4833237a56b698a809f1a56d77e573af2775e727:

Fatal error:

panic: event RetriableErr{CanAutoRetry:false, IsCommit:false} inappropriate in current state NoTxn{} [recovered]
	panic: panic while executing 1 statements: UPDATE _._ SET _ = $1 WHERE _ = $2; caused by event RetriableErr{CanAutoRetry:false, IsCommit:false} inappropriate in current state NoTxn{}

Stack:

goroutine 163579 [running]:
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).closeWrapper(0xc001489000, 0x6ea8720, 0xc003f24dc0, 0x5c52100, 0xc0060fcba0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:810 +0x4e1
github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn.func1(0xc001489000, 0x6ea8720, 0xc003f24dc0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:474 +0x6f
panic(0x5c52100, 0xc0060fcba0)
	/usr/local/go/src/runtime/panic.go:679 +0x1b2
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).txnStateTransitionsApplyWrapper(0xc001489000, 0x6e2af20, 0xc0060fcb20, 0x5c50b20, 0xc0009825a0, 0x7fa6a4f49a98, 0xc0015bc680, 0x126, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:2085 +0x16b5
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execCmd(0xc001489000, 0x6ea8720, 0xc003f24dc0, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1497 +0xa77
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).run(0xc001489000, 0x6ea8720, 0xc003f24dc0, 0xc0064a9c90, 0x5400, 0x15000, 0xc0064a9d30, 0xc0003910f0, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1297 +0x484
github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn(0xc000de0680, 0x6ea8720, 0xc003f24dc0, 0xc001489000, 0x5400, 0x15000, 0xc0064a9d30, 0xc0003910f0, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:476 +0x132
github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*conn).processCommandsAsync.func1(0xc0058ffc82, 0xc000f91460, 0x6ea8720, 0xc003f24dc0, 0xc0003910f0, 0xc000de0680, 0xc0015bc000, 0x6ecca60, 0xc006d75560, 0xc005735260, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:594 +0x36c
created by github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*conn).processCommandsAsync
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:526 +0x1ea

Log preceding fatal error

I200401 18:18:50.778133 163174 kv/kvserver/replica_command.go:397  [n1,s1,r33/1:/Table/53{-/1/2}] initiating a split of this range at key /Table/53/1/1 [r39] (manual)
I200401 18:18:51.774389 163464 kv/kvserver/replica_command.go:397  [n2,s2,r35/2:/Table/53/1/{5-7}] initiating a split of this range at key /Table/53/1/6 [r45] (manual)
I200401 18:18:52.637764 157516 testutils/testcluster/testcluster.go:760  WaitForFullReplication took: 152.961996ms
I200401 18:18:52.896748 157746 server/status/runtime.go:498  [n1] runtime stats: 1.1 GiB RSS, 771 goroutines, 59 MiB/111 MiB/116 MiB GO alloc/idle/total, 94 MiB/144 MiB CGO alloc/total, 2379.1 CGO/sec, 175.8/14.3 %(u/s)time, 1.0 %gc (11x), 3.5 MiB/3.5 MiB (r/w)net
I200401 18:18:56.259771 164065 jobs/registry.go:797  [n1] job 542948508402515969: stepping through state running with error <nil>
I200401 18:18:56.870230 159027 server/status/runtime.go:498  [n2] runtime stats: 1.2 GiB RSS, 792 goroutines, 55 MiB/111 MiB/117 MiB GO alloc/idle/total, 96 MiB/144 MiB CGO alloc/total, 2145.5 CGO/sec, 182.1/16.0 %(u/s)time, 1.5 %gc (11x), 3.7 MiB/3.7 MiB (r/w)net
I200401 18:18:57.280915 163939 kv/kvserver/replica_consistency.go:255  [n1,consistencyChecker,s1,r4/1:/System{/tsd-tse}] triggering stats recomputation to resolve delta of {ContainsEstimates:14100 LastUpdateNanos:1585765135122460013 IntentAge:0 GCBytesAge:0 LiveBytes:80880 LiveCount:-2034 KeyBytes:-99006 KeyCount:-2034 ValBytes:179886 ValCount:-2034 IntentBytes:0 IntentCount:0 SysBytes:0 SysCount:0}
I200401 18:18:57.340091 164256 ccl/storageccl/export.go:102  [n1,s1,r44/1:/Table/53/1/{7-8}] export [/Table/53/1/7,/Table/53/1/8)
I200401 18:18:57.344377 164267 ccl/storageccl/export.go:102  [n3,s3,r36/3:/Table/53/1/{3-4}] export [/Table/53/1/3,/Table/53/1/4)
I200401 18:18:57.347149 164278 ccl/storageccl/export.go:102  [n1,s1,r43/1:/Table/53/1/{8-9}] export [/Table/53/1/8,/Table/53/1/9)
I200401 18:18:57.348505 164255 ccl/storageccl/export.go:102  [n1,s1,r39/1:/Table/53/1/{1-2}] export [/Table/53/1/1,/Table/53/1/2)
I200401 18:18:57.349089 164258 ccl/storageccl/export.go:102  [n1,s1,r33/1:/Table/53{-/1/1}] export [/Table/53/1,/Table/53/1/1)
I200401 18:18:57.386051 164223 ccl/storageccl/export.go:102  [n3,s3,r38/3:/{Table/53/1/9-Max}] export [/Table/53/1/9,/Table/53/2)
I200401 18:18:57.427892 164224 ccl/storageccl/export.go:102  [n3,s3,r45/3:/Table/53/1/{6-7}] export [/Table/53/1/6,/Table/53/1/7)
I200401 18:18:57.428863 164139 ccl/storageccl/export.go:102  [n2,s2,r37/2:/Table/53/1/{2-3}] export [/Table/53/1/2,/Table/53/1/3)
I200401 18:18:57.432017 164142 ccl/storageccl/export.go:102  [n2,s2,r63/2:/Table/53/1/{4-5}] export [/Table/53/1/4,/Table/53/1/5)
I200401 18:18:57.470030 164141 ccl/storageccl/export.go:102  [n2,s2,r35/2:/Table/53/1/{5-6}] export [/Table/53/1/5,/Table/53/1/6)
I200401 18:18:57.704044 159494 server/status/runtime.go:498  [n3] runtime stats: 1.1 GiB RSS, 775 goroutines, 47 MiB/119 MiB/119 MiB GO alloc/idle/total, 95 MiB/144 MiB CGO alloc/total, 2659.1 CGO/sec, 184.1/16.9 %(u/s)time, 1.6 %gc (11x), 3.7 MiB/3.7 MiB (r/w)net
I200401 18:18:59.623141 164065 jobs/registry.go:797  [n1] job 542948508402515969: stepping through state succeeded with error <nil>
I200401 18:19:00.291920 164065 jobs/registry.go:952  [n1] job 542948508402515969: status succeeded after adoption finished
I200401 18:19:02.914753 157746 server/status/runtime.go:498  [n1] runtime stats: 1.2 GiB RSS, 780 goroutines, 53 MiB/112 MiB/120 MiB GO alloc/idle/total, 98 MiB/148 MiB CGO alloc/total, 1772.0 CGO/sec, 188.9/18.5 %(u/s)time, 1.7 %gc (10x), 3.7 MiB/3.7 MiB (r/w)net
I200401 18:19:03.159913 162956 sql/table.go:942  [n1,client=127.0.0.1:57142,hostssl,user=root] queued new schema change job for table 53, mutation 0
I200401 18:19:03.160194 162956 sql/sqlbase/structured.go:1554  [n1,client=127.0.0.1:57142,hostssl,user=root] publish: descID=53 (bank) version=2 mtime=1970-01-01 00:00:00 +0000 UTC
I200401 18:19:03.608074 164935 kv/kvserver/replica_command.go:663  [n2,merge,s2,r63/2:/Table/53/1/{4-5}] initiating a merge of r35:/Table/53/1/{5-6} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=9] into this range (lhs+rhs has (size=2.0 KiB+1.8 KiB=3.8 KiB qps=5.31+7.92=13.23qps) below threshold (size=128 MiB, qps=1250.00))
I200401 18:19:03.824619 162956 sql/event_log.go:132  [n1,client=127.0.0.1:57142,hostssl,user=root] Event: "drop_table", target: 53, info: {TableName:data.public.bank Statement:DROP TABLE data.public.bank User:root CascadeDroppedViews:[]}
I200401 18:19:03.825259 162956 jobs/registry.go:285  [n1,client=127.0.0.1:57142,hostssl,user=root] scheduled jobs [542948534270099457]
I200401 18:19:04.008594 164984 kv/kvserver/replica_command.go:663  [n1,merge,s1,r39/1:/Table/53/1/{1-2}] initiating a merge of r37:/Table/53/1/{2-3} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=8] into this range (lhs+rhs has (size=1.8 KiB+2.8 KiB=4.6 KiB qps=6.87+11.90=18.77qps) below threshold (size=128 MiB, qps=1250.00))
I200401 18:19:04.121837 157862 jobs/jobs.go:743  job 542948534270099457: adopting
I200401 18:19:04.614250 164972 kv/kvserver/replica_command.go:663  [n3,merge,s3,r45/3:/Table/53/1/{6-7}] initiating a merge of r44:/Table/53/1/{7-8} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=8] into this range (lhs+rhs has (size=1.5 KiB+2.7 KiB=4.3 KiB qps=4.77+7.49=12.25qps) below threshold (size=128 MiB, qps=1250.00))
I200401 18:19:04.775001 159117 kv/kvserver/store_remove_replica.go:122  [n2,s2,r63/2:/Table/53/1/{4-5}] removing replica r35/2
I200401 18:19:04.895211 157682 kv/kvserver/store_remove_replica.go:122  [n1,s1,r63/1:/Table/53/1/{4-5}] removing replica r35/1
I200401 18:19:04.929912 159416 kv/kvserver/store_remove_replica.go:122  [n3,s3,r63/3:/Table/53/1/{4-5}] removing replica r35/3
I200401 18:19:05.024306 157862 jobs/registry.go:1178  job 542948534270099457: resuming execution
I200401 18:19:05.224432 157714 kv/kvserver/store_remove_replica.go:122  [n1,s1,r39/1:/Table/53/1/{1-2}] removing replica r37/1
I200401 18:19:05.266621 165431 jobs/registry.go:797  [n1] job 542948534270099457: stepping through state running with error <nil>
I200401 18:19:05.283768 159131 kv/kvserver/store_remove_replica.go:122  [n2,s2,r39/2:/Table/53/1/{1-2}] removing replica r37/2
I200401 18:19:05.285828 159426 kv/kvserver/store_remove_replica.go:122  [n3,s3,r39/3:/Table/53/1/{1-2}] removing replica r37/3
I200401 18:19:05.295836 165431 sql/schema_changer.go:445  [n1,job=542948534270099457,scExec] schema change on bank (53 v2) mutation 0 starting execution...
I200401 18:19:05.695161 165415 kv/kvserver/replica_command.go:663  [n1,merge,s1,r43/1:/Table/53/1/{8-9}] initiating a merge of r38:/{Table/53/1/9-Max} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=8] into this range (lhs+rhs has (size=1.2 KiB+2.3 KiB=3.4 KiB qps=0.00+1.38=1.38qps) below threshold (size=128 MiB, qps=1250.00))
I200401 18:19:05.730905 165431 sql/sqlbase/structured.go:1554  [n1,job=542948534270099457,scExec] publish: descID=53 (bank) version=3 mtime=1970-01-01 00:00:00 +0000 UTC
W200401 18:19:05.938674 165502 kv/kvserver/intentresolver/intent_resolver.go:428  failed to push during intent resolution: failed to push meta={id=c1a884ad key=/Local/Range/Table/53/1/6/RangeDescriptor pri=0.02304777 epo=0 ts=1585765143.806372968,0 min=1585765143.806372968,0 seq=0} lock=true stat=PENDING rts=0,0 wto=false max=0,0
I200401 18:19:06.173544 159437 kv/kvserver/store_remove_replica.go:122  [n3,s3,r45/3:/Table/53/1/{6-7}] removing replica r44/3
I200401 18:19:06.207432 159122 kv/kvserver/store_remove_replica.go:122  [n2,s2,r45/2:/Table/53/1/{6-7}] removing replica r44/2
I200401 18:19:06.235814 157663 kv/kvserver/store_remove_replica.go:122  [n1,s1,r45/1:/Table/53/1/{6-7}] removing replica r44/1
I200401 18:19:06.387706 165452 kv/kvserver/replica_command.go:663  [n2,merge,s2,r63/2:/Table/53/1/{4-6}] initiating a merge of r45:/Table/53/1/{6-8} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=10] into this range (lhs+rhs has (size=4.3 KiB+4.4 KiB=8.6 KiB qps=1.98+1.06=3.04qps) below threshold (size=128 MiB, qps=1250.00))
I200401 18:19:06.521689 157690 kv/kvserver/store_remove_replica.go:122  [n1,s1,r43/1:/Table/53/1/{8-9}] removing replica r38/1
I200401 18:19:06.539354 159404 kv/kvserver/store_remove_replica.go:122  [n3,s3,r43/3:/Table/53/1/{8-9}] removing replica r38/3
I200401 18:19:06.607378 159114 kv/kvserver/store_remove_replica.go:122  [n2,s2,r43/2:/Table/53/1/{8-9}] removing replica r38/2
E200401 18:19:06.837519 163579 sql/conn_executor.go:796  [n3,client=127.0.0.1:43276,hostssl,user=root] a SQL panic has occurred while executing "UPDATE data.bank SET payload = $1 WHERE id = $2": event RetriableErr{CanAutoRetry:false, IsCommit:false} inappropriate in current state NoTxn{}
E200401 18:19:06.839123 163579 util/log/crash_reporting.go:208  [n3,client=127.0.0.1:43276,hostssl,user=root] a panic has occurred!

More

Parameters:

  • GOFLAGS=-json
make stressrace TESTS=TestBackupRestoreWithConcurrentWrites PKG=./pkg/ccl/backupccl TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1

Related:

See this test on roachdash
powered by pkg/cmd/internal/issues

@ajwerner
Copy link
Contributor

ajwerner commented Apr 1, 2020

Duplicate of #43251

@ajwerner ajwerner marked this as a duplicate of #43251 Apr 1, 2020
@ajwerner ajwerner closed this as completed Apr 1, 2020
@ajwerner
Copy link
Contributor

ajwerner commented Apr 1, 2020

re-opening for the backport.

@cockroach-teamcity
Copy link
Member Author

(ccl/backupccl).TestBackupRestoreWithConcurrentWrites failed on release-20.1@e8ab9905a2af48d2651464f957ea0ca54382ebb0:

Fatal error:

panic: event RetriableErr{CanAutoRetry:false, IsCommit:false} inappropriate in current state NoTxn{} [recovered]
	panic: panic while executing 1 statements: UPDATE _._ SET _ = $1 WHERE _ = $2; caused by event RetriableErr{CanAutoRetry:false, IsCommit:false} inappropriate in current state NoTxn{}

Stack:

goroutine 165187 [running]:
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).closeWrapper(0xc003266000, 0x6ea9640, 0xc0033df680, 0x5c52b60, 0xc0033c2160)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:810 +0x4e1
github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn.func1(0xc003266000, 0x6ea9640, 0xc0033df680)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:474 +0x6f
panic(0x5c52b60, 0xc0033c2160)
	/usr/local/go/src/runtime/panic.go:679 +0x1b2
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).txnStateTransitionsApplyWrapper(0xc003266000, 0x6e2be80, 0xc0033c2140, 0x5c51580, 0xc0047f2e40, 0x7f5373ff5a98, 0xc0032da680, 0x279, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:2085 +0x16b5
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execCmd(0xc003266000, 0x6ea9640, 0xc0033df680, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1497 +0xa77
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).run(0xc003266000, 0x6ea9640, 0xc0033df680, 0xc000b08610, 0x5400, 0x15000, 0xc000b086b0, 0xc00444bb40, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1297 +0x484
github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn(0xc002395b80, 0x6ea9640, 0xc0033df680, 0xc003266000, 0x5400, 0x15000, 0xc000b086b0, 0xc00444bb40, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:476 +0x132
github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*conn).processCommandsAsync.func1(0xc0029c2e0d, 0xc004b8f360, 0x6ea9640, 0xc0033df680, 0xc00444bb40, 0xc002395b80, 0xc0032da000, 0x6ecd980, 0xc00402ca20, 0xc000888c60, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:594 +0x36c
created by github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*conn).processCommandsAsync
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:526 +0x1ea

Log preceding fatal error

I200401 21:32:49.153226 164576 sql/table.go:942  [n1,client=127.0.0.1:52060,hostssl,user=root] queued new schema change job for table 53, mutation 0
I200401 21:32:49.153502 164576 sql/sqlbase/structured.go:1554  [n1,client=127.0.0.1:52060,hostssl,user=root] publish: descID=53 (bank) version=2 mtime=1970-01-01 00:00:00 +0000 UTC
I200401 21:32:49.572867 161145 kv/kvserver/store_remove_replica.go:122  [n3,s3,r35/3:/Table/53/1/{3-4}] removing replica r44/3
I200401 21:32:49.613582 159394 kv/kvserver/store_remove_replica.go:122  [n1,s1,r35/1:/Table/53/1/{3-4}] removing replica r44/1
I200401 21:32:49.650333 160849 kv/kvserver/store_remove_replica.go:122  [n2,s2,r35/2:/Table/53/1/{3-4}] removing replica r44/2
I200401 21:32:50.169368 164576 sql/table.go:942  [n1,client=127.0.0.1:52060,hostssl,user=root] queued new schema change job for table 53, mutation 0
I200401 21:32:50.169615 164576 sql/sqlbase/structured.go:1554  [n1,client=127.0.0.1:52060,hostssl,user=root] publish: descID=53 (bank) version=2 mtime=1970-01-01 00:00:00 +0000 UTC
I200401 21:32:50.277520 167073 kv/kvserver/replica_command.go:663  [n2,merge,s2,r43/2:/Table/53/1/{8-9}] initiating a merge of r45:/{Table/53/1/9-Max} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=8] into this range (lhs+rhs has (size=2.8 KiB+3.5 KiB=6.4 KiB qps=0.60+2.38=2.98qps) below threshold (size=128 MiB, qps=1250.00))
I200401 21:32:51.009389 160909 server/status/runtime.go:498  [n2] runtime stats: 1.3 GiB RSS, 790 goroutines, 102 MiB/68 MiB/166 MiB GO alloc/idle/total, 99 MiB/145 MiB CGO alloc/total, 1163.5 CGO/sec, 192.3/23.1 %(u/s)time, 0.3 %gc (6x), 3.9 MiB/3.9 MiB (r/w)net
I200401 21:32:51.153661 160855 kv/kvserver/store_remove_replica.go:122  [n2,s2,r43/2:/Table/53/1/{8-9}] removing replica r45/2
I200401 21:32:51.160555 159389 kv/kvserver/store_remove_replica.go:122  [n1,s1,r43/1:/Table/53/1/{8-9}] removing replica r45/1
I200401 21:32:51.168286 161164 kv/kvserver/store_remove_replica.go:122  [n3,s3,r43/3:/Table/53/1/{8-9}] removing replica r45/3
I200401 21:32:51.400022 164576 sql/table.go:942  [n1,client=127.0.0.1:52060,hostssl,user=root] queued new schema change job for table 53, mutation 0
I200401 21:32:51.400590 164576 sql/sqlbase/structured.go:1554  [n1,client=127.0.0.1:52060,hostssl,user=root] publish: descID=53 (bank) version=2 mtime=1970-01-01 00:00:00 +0000 UTC
I200401 21:32:52.110281 161224 server/status/runtime.go:498  [n3] runtime stats: 1.3 GiB RSS, 784 goroutines, 82 MiB/85 MiB/166 MiB GO alloc/idle/total, 99 MiB/145 MiB CGO alloc/total, 1199.2 CGO/sec, 192.4/22.7 %(u/s)time, 0.3 %gc (6x), 3.8 MiB/3.8 MiB (r/w)net
I200401 21:32:52.238861 167314 kv/kvserver/replica_command.go:663  [n1,merge,s1,r33/1:/Table/53{-/1/3}] initiating a merge of r35:/Table/53/1/{3-5} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=9] into this range (lhs+rhs has (size=9.7 KiB+4.9 KiB=15 KiB qps=19.23+7.34=26.57qps) below threshold (size=128 MiB, qps=1250.00))
I200401 21:32:52.354377 164576 sql/table.go:942  [n1,client=127.0.0.1:52060,hostssl,user=root] queued new schema change job for table 53, mutation 0
I200401 21:32:52.354634 164576 sql/sqlbase/structured.go:1554  [n1,client=127.0.0.1:52060,hostssl,user=root] publish: descID=53 (bank) version=2 mtime=1970-01-01 00:00:00 +0000 UTC
I200401 21:32:52.926822 159391 kv/kvserver/store_remove_replica.go:122  [n1,s1,r33/1:/Table/53{-/1/3}] removing replica r35/1
I200401 21:32:52.937107 161155 kv/kvserver/store_remove_replica.go:122  [n3,s3,r33/3:/Table/53{-/1/3}] removing replica r35/3
I200401 21:32:52.947101 160848 kv/kvserver/store_remove_replica.go:122  [n2,s2,r33/2:/Table/53{-/1/3}] removing replica r35/2
I200401 21:32:53.259909 164576 sql/table.go:942  [n1,client=127.0.0.1:52060,hostssl,user=root] queued new schema change job for table 53, mutation 0
I200401 21:32:53.260872 164576 sql/sqlbase/structured.go:1554  [n1,client=127.0.0.1:52060,hostssl,user=root] publish: descID=53 (bank) version=2 mtime=1970-01-01 00:00:00 +0000 UTC
I200401 21:32:53.427723 167591 kv/kvserver/replica_command.go:663  [n2,merge,s2,r34/2:/Table/53/1/{5-8}] initiating a merge of r43:/{Table/53/1/8-Max} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=9] into this range (lhs+rhs has (size=10 KiB+8.5 KiB=19 KiB qps=17.84+25.33=43.16qps) below threshold (size=128 MiB, qps=1250.00))
I200401 21:32:54.321581 164576 sql/table.go:942  [n1,client=127.0.0.1:52060,hostssl,user=root] queued new schema change job for table 53, mutation 0
I200401 21:32:54.321849 164576 sql/sqlbase/structured.go:1554  [n1,client=127.0.0.1:52060,hostssl,user=root] publish: descID=53 (bank) version=2 mtime=1970-01-01 00:00:00 +0000 UTC
I200401 21:32:54.349652 159400 kv/kvserver/store_remove_replica.go:122  [n1,s1,r34/1:/Table/53/1/{5-8}] removing replica r43/1
I200401 21:32:54.353481 161174 kv/kvserver/store_remove_replica.go:122  [n3,s3,r34/3:/Table/53/1/{5-8}] removing replica r43/3
I200401 21:32:54.359532 160842 kv/kvserver/store_remove_replica.go:122  [n2,s2,r34/2:/Table/53/1/{5-8}] removing replica r43/2
I200401 21:32:54.825552 164576 sql/event_log.go:132  [n1,client=127.0.0.1:52060,hostssl,user=root] Event: "drop_table", target: 53, info: {TableName:data.public.bank Statement:DROP TABLE data.public.bank User:root CascadeDroppedViews:[]}
I200401 21:32:54.825835 164576 sql/event_log.go:132  [n1,client=127.0.0.1:52060,hostssl,user=root] Event: "drop_table", target: 53, info: {TableName:data.public.bank Statement:DROP TABLE data.public.bank User:root CascadeDroppedViews:[]}
I200401 21:32:54.826006 164576 sql/event_log.go:132  [n1,client=127.0.0.1:52060,hostssl,user=root] Event: "drop_table", target: 53, info: {TableName:data.public.bank Statement:DROP TABLE data.public.bank User:root CascadeDroppedViews:[]}
I200401 21:32:54.826106 164576 sql/event_log.go:132  [n1,client=127.0.0.1:52060,hostssl,user=root] Event: "drop_table", target: 53, info: {TableName:data.public.bank Statement:DROP TABLE data.public.bank User:root CascadeDroppedViews:[]}
I200401 21:32:54.826268 164576 sql/event_log.go:132  [n1,client=127.0.0.1:52060,hostssl,user=root] Event: "drop_table", target: 53, info: {TableName:data.public.bank Statement:DROP TABLE data.public.bank User:root CascadeDroppedViews:[]}
I200401 21:32:54.826379 164576 sql/event_log.go:132  [n1,client=127.0.0.1:52060,hostssl,user=root] Event: "drop_table", target: 53, info: {TableName:data.public.bank Statement:DROP TABLE data.public.bank User:root CascadeDroppedViews:[]}
I200401 21:32:54.826463 164576 sql/event_log.go:132  [n1,client=127.0.0.1:52060,hostssl,user=root] Event: "drop_table", target: 53, info: {TableName:data.public.bank Statement:DROP TABLE data.public.bank User:root CascadeDroppedViews:[]}
I200401 21:32:54.826572 164576 sql/event_log.go:132  [n1,client=127.0.0.1:52060,hostssl,user=root] Event: "drop_table", target: 53, info: {TableName:data.public.bank Statement:DROP TABLE data.public.bank User:root CascadeDroppedViews:[]}
I200401 21:32:54.826657 164576 sql/event_log.go:132  [n1,client=127.0.0.1:52060,hostssl,user=root] Event: "drop_table", target: 53, info: {TableName:data.public.bank Statement:DROP TABLE data.public.bank User:root CascadeDroppedViews:[]}
I200401 21:32:54.831995 164576 jobs/registry.go:285  [n1,client=127.0.0.1:52060,hostssl,user=root] scheduled jobs [542986646973579265]
I200401 21:32:55.273424 159459 jobs/jobs.go:743  job 542986646973579265: adopting
I200401 21:32:55.279107 167890 kv/kvserver/replica_command.go:663  [n1,merge,s1,r33/1:/Table/53{-/1/5}] initiating a merge of r34:/{Table/53/1/5-Max} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=12] into this range (lhs+rhs has (size=16 KiB+20 KiB=36 KiB qps=13.99+16.76=30.76qps) below threshold (size=128 MiB, qps=1250.00))
I200401 21:32:55.790023 159459 jobs/registry.go:1178  job 542986646973579265: resuming execution
I200401 21:32:55.957682 168132 jobs/registry.go:797  [n1] job 542986646973579265: stepping through state running with error <nil>
I200401 21:32:55.958540 159414 kv/kvserver/store_remove_replica.go:122  [n1,s1,r33/1:/Table/53{-/1/5}] removing replica r34/1
I200401 21:32:55.964942 161140 kv/kvserver/store_remove_replica.go:122  [n3,s3,r33/3:/Table/53{-/1/5}] removing replica r34/3
I200401 21:32:55.971168 168132 sql/schema_changer.go:445  [n1,job=542986646973579265,scExec] schema change on bank (53 v2) mutation 0 starting execution...
I200401 21:32:56.044477 160816 kv/kvserver/store_remove_replica.go:122  [n2,s2,r33/2:/Table/53{-/1/5}] removing replica r34/2
I200401 21:32:56.120281 168132 sql/sqlbase/structured.go:1554  [n1,job=542986646973579265,scExec] publish: descID=53 (bank) version=3 mtime=1970-01-01 00:00:00 +0000 UTC
E200401 21:32:56.443308 165187 sql/conn_executor.go:796  [n2,client=127.0.0.1:51258,hostssl,user=root] a SQL panic has occurred while executing "UPDATE data.bank SET payload = $1 WHERE id = $2": event RetriableErr{CanAutoRetry:false, IsCommit:false} inappropriate in current state NoTxn{}
E200401 21:32:56.444286 165187 util/log/crash_reporting.go:208  [n2,client=127.0.0.1:51258,hostssl,user=root] a panic has occurred!

More

Parameters:

  • GOFLAGS=-json
make stressrace TESTS=TestBackupRestoreWithConcurrentWrites PKG=./pkg/ccl/backupccl TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1

Related:

See this test on roachdash
powered by pkg/cmd/internal/issues

@cockroach-teamcity
Copy link
Member Author

(ccl/backupccl).TestBackupRestoreWithConcurrentWrites failed on release-20.1@3de1723bb3f3e48f4231c48eaec11d52511feb65:

Fatal error:

panic: event RetriableErr{CanAutoRetry:false, IsCommit:false} inappropriate in current state NoTxn{} [recovered]
	panic: panic while executing 1 statements: UPDATE _._ SET _ = $1 WHERE _ = $2; caused by event RetriableErr{CanAutoRetry:false, IsCommit:false} inappropriate in current state NoTxn{}

Stack:

goroutine 149158 [running]:
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).closeWrapper(0xc002915000, 0x6eaa400, 0xc0019f76c0, 0x5c53760, 0xc002c4c900)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:810 +0x4e1
github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn.func1(0xc002915000, 0x6eaa400, 0xc0019f76c0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:474 +0x6f
panic(0x5c53760, 0xc002c4c900)
	/usr/local/go/src/runtime/panic.go:679 +0x1b2
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).txnStateTransitionsApplyWrapper(0xc002915000, 0x6e2ccc0, 0xc002c4c8e0, 0x5c52180, 0xc00596b800, 0x7fb524255050, 0xc001082180, 0x185, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:2085 +0x16b5
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execCmd(0xc002915000, 0x6eaa400, 0xc0019f76c0, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1497 +0xa77
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).run(0xc002915000, 0x6eaa400, 0xc0019f76c0, 0xc001bc0d90, 0x5400, 0x15000, 0xc001bc0e30, 0xc00422bac0, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1297 +0x484
github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn(0xc003c98580, 0x6eaa400, 0xc0019f76c0, 0xc002915000, 0x5400, 0x15000, 0xc001bc0e30, 0xc00422bac0, 0x0, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:476 +0x132
github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*conn).processCommandsAsync.func1(0xc003b2912d, 0xc002e949a0, 0x6eaa400, 0xc0019f76c0, 0xc00422bac0, 0xc003c98580, 0xc001081b00, 0x6ece740, 0xc005478cc0, 0xc0020c06c0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:594 +0x36c
created by github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*conn).processCommandsAsync
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:526 +0x1ea

Log preceding fatal error

I200401 22:24:49.288741 143950 server/status/runtime.go:498  [n1] runtime stats: 1.2 GiB RSS, 784 goroutines, 99 MiB/70 MiB/160 MiB GO alloc/idle/total, 98 MiB/152 MiB CGO alloc/total, 849.6 CGO/sec, 178.2/13.4 %(u/s)time, 0.6 %gc (6x), 3.6 MiB/3.6 MiB (r/w)net
I200401 22:24:49.436245 148101 sql/table.go:942  [n1,client=127.0.0.1:60234,hostssl,user=root] queued new schema change job for table 53, mutation 0
I200401 22:24:49.436508 148101 sql/sqlbase/structured.go:1554  [n1,client=127.0.0.1:60234,hostssl,user=root] publish: descID=53 (bank) version=2 mtime=1970-01-01 00:00:00 +0000 UTC
I200401 22:24:49.856113 145168 kv/kvserver/store_remove_replica.go:122  [n2,s2,r33/2:/Table/53{-/1/1}] removing replica r38/2
I200401 22:24:49.880254 145458 kv/kvserver/store_remove_replica.go:122  [n3,s3,r33/3:/Table/53{-/1/1}] removing replica r38/3
I200401 22:24:49.927023 143999 kv/kvserver/store_remove_replica.go:122  [n1,s1,r33/1:/Table/53{-/1/1}] removing replica r38/1
I200401 22:24:50.353757 151174 kv/kvserver/replica_command.go:663  [n1,merge,s1,r37/1:/Table/53/1/{2-3}] initiating a merge of r36:/Table/53/1/{3-4} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=8] into this range (lhs+rhs has (size=2.6 KiB+2.1 KiB=4.7 KiB qps=3.43+7.70=11.13qps) below threshold (size=128 MiB, qps=1250.00))
I200401 22:24:50.845176 144003 kv/kvserver/store_remove_replica.go:122  [n1,s1,r37/1:/Table/53/1/{2-3}] removing replica r36/1
I200401 22:24:50.903639 145143 kv/kvserver/store_remove_replica.go:122  [n2,s2,r37/2:/Table/53/1/{2-3}] removing replica r36/2
I200401 22:24:50.908184 145453 kv/kvserver/store_remove_replica.go:122  [n3,s3,r37/3:/Table/53/1/{2-3}] removing replica r36/3
I200401 22:24:51.173444 148101 sql/table.go:942  [n1,client=127.0.0.1:60234,hostssl,user=root] queued new schema change job for table 53, mutation 0
I200401 22:24:51.174032 148101 sql/sqlbase/structured.go:1554  [n1,client=127.0.0.1:60234,hostssl,user=root] publish: descID=53 (bank) version=2 mtime=1970-01-01 00:00:00 +0000 UTC
I200401 22:24:51.544799 151293 kv/kvserver/replica_command.go:663  [n3,merge,s3,r35/3:/Table/53/1/{5-6}] initiating a merge of r55:/Table/53/1/{6-7} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=9] into this range (lhs+rhs has (size=2.1 KiB+2.5 KiB=4.6 KiB qps=0.00+2.10=2.10qps) below threshold (size=128 MiB, qps=1250.00))
I200401 22:24:51.857267 145188 server/status/runtime.go:498  [n2] runtime stats: 1.2 GiB RSS, 790 goroutines, 83 MiB/86 MiB/155 MiB GO alloc/idle/total, 99 MiB/153 MiB CGO alloc/total, 930.6 CGO/sec, 180.0/14.5 %(u/s)time, 0.8 %gc (6x), 3.7 MiB/3.7 MiB (r/w)net
I200401 22:24:51.968290 148101 sql/event_log.go:132  [n1,client=127.0.0.1:60234,hostssl,user=root] Event: "drop_table", target: 53, info: {TableName:data.public.bank Statement:DROP TABLE data.public.bank User:root CascadeDroppedViews:[]}
I200401 22:24:51.970090 148101 sql/event_log.go:132  [n1,client=127.0.0.1:60234,hostssl,user=root] Event: "drop_table", target: 53, info: {TableName:data.public.bank Statement:DROP TABLE data.public.bank User:root CascadeDroppedViews:[]}
I200401 22:24:51.970844 148101 jobs/registry.go:285  [n1,client=127.0.0.1:60234,hostssl,user=root] scheduled jobs [542996859261190145]
W200401 22:24:52.025283 151477 kv/kvserver/intentresolver/intent_resolver.go:428  failed to push during intent resolution: failed to push meta={id=a5c9f27f key=/Local/Range/Table/53/1/5/RangeDescriptor pri=0.00244519 epo=0 ts=1585779891.166839046,0 min=1585779891.166839046,0 seq=8} lock=true stat=PENDING rts=0,0 wto=false max=0,0
I200401 22:24:52.233003 145499 kv/kvserver/store_remove_replica.go:122  [n3,s3,r35/3:/Table/53/1/{5-6}] removing replica r55/3
I200401 22:24:52.311416 144121 kv/kvserver/store_remove_replica.go:122  [n1,s1,r35/1:/Table/53/1/{5-6}] removing replica r55/1
I200401 22:24:52.321488 145168 kv/kvserver/store_remove_replica.go:122  [n2,s2,r35/2:/Table/53/1/{5-6}] removing replica r55/2
I200401 22:24:52.487012 144897 kv/kvserver/queue.go:578  [n2,s2] rate limited in MaybeAdd (split): throttled on async limiting semaphore
I200401 22:24:52.646985 151572 kv/kvserver/replica_command.go:663  [n1,merge,s1,r37/1:/Table/53/1/{2-4}] initiating a merge of r63:/Table/53/1/{4-5} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=8] into this range (lhs+rhs has (size=5.2 KiB+2.6 KiB=7.8 KiB qps=8.09+0.95=9.04qps) below threshold (size=128 MiB, qps=1250.00))
I200401 22:24:52.699121 151626 kv/kvserver/replica_command.go:663  [n3,merge,s3,r63/3:/Table/53/1/{4-5}] initiating a merge of r35:/Table/53/1/{5-7} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=10] into this range (lhs+rhs has (size=2.6 KiB+5.1 KiB=7.7 KiB qps=0.95+7.10=8.05qps) below threshold (size=128 MiB, qps=1250.00))
I200401 22:24:53.000887 145521 server/status/runtime.go:498  [n3] runtime stats: 1.3 GiB RSS, 814 goroutines, 114 MiB/60 MiB/158 MiB GO alloc/idle/total, 100 MiB/153 MiB CGO alloc/total, 962.5 CGO/sec, 185.1/15.3 %(u/s)time, 0.9 %gc (7x), 3.7 MiB/3.7 MiB (r/w)net
I200401 22:24:53.447145 151533 kv/kvserver/replica_command.go:663  [n2,merge,s2,r53/2:/Table/53/1/{8-9}] initiating a merge of r64:/{Table/53/1/9-Max} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=8] into this range (lhs+rhs has (size=1.9 KiB+2.8 KiB=4.7 KiB qps=3.72+3.94=7.66qps) below threshold (size=128 MiB, qps=1250.00))
I200401 22:24:53.765895 145247 kv/kvserver/store_remove_replica.go:122  [n3,s3,r63/3:/Table/53/1/{4-5}] removing replica r35/3
I200401 22:24:53.777883 144318 jobs/jobs.go:743  job 542996859261190145: adopting
I200401 22:24:53.820608 144168 kv/kvserver/store_remove_replica.go:122  [n1,s1,r63/1:/Table/53/1/{4-5}] removing replica r35/1
I200401 22:24:53.827646 145074 kv/kvserver/store_remove_replica.go:122  [n2,s2,r63/2:/Table/53/1/{4-5}] removing replica r35/2
I200401 22:24:54.384899 151873 kv/kvserver/replica_command.go:663  [n3,merge,s3,r63/3:/Table/53/1/{4-7}] initiating a merge of r54:/Table/53/1/{7-8} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=8] into this range (lhs+rhs has (size=7.8 KiB+3.8 KiB=12 KiB qps=6.04+0.54=6.58qps) below threshold (size=128 MiB, qps=1250.00))
I200401 22:24:55.025367 145159 kv/kvserver/store_remove_replica.go:122  [n2,s2,r53/2:/Table/53/1/{8-9}] removing replica r64/2
I200401 22:24:55.037776 144318 jobs/registry.go:1178  job 542996859261190145: resuming execution
I200401 22:24:55.075707 145479 kv/kvserver/store_remove_replica.go:122  [n3,s3,r53/3:/Table/53/1/{8-9}] removing replica r64/3
I200401 22:24:55.189539 143998 kv/kvserver/store_remove_replica.go:122  [n1,s1,r53/1:/Table/53/1/{8-9}] removing replica r64/1
I200401 22:24:55.381945 145472 kv/kvserver/store_remove_replica.go:122  [n3,s3,r63/3:/Table/53/1/{4-7}] removing replica r54/3
I200401 22:24:55.475828 144128 kv/kvserver/store_remove_replica.go:122  [n1,s1,r63/1:/Table/53/1/{4-7}] removing replica r54/1
I200401 22:24:55.490060 145140 kv/kvserver/store_remove_replica.go:122  [n2,s2,r63/2:/Table/53/1/{4-7}] removing replica r54/2
E200401 22:24:55.655837 152112 kv/kvserver/consistency_queue.go:135  [n3,consistencyChecker,s3,r64/3:/{Table/53/1/9-Max}] computing own checksum: rpc error: code = Unknown desc = r64 was not found on s3
E200401 22:24:55.656769 152112 kv/kvserver/queue.go:1089  [n3,consistencyChecker,s3,r64/3:/{Table/53/1/9-Max}] computing own checksum: rpc error: code = Unknown desc = r64 was not found on s3
I200401 22:24:55.688084 152035 jobs/registry.go:797  [n1] job 542996859261190145: stepping through state running with error <nil>
I200401 22:24:55.713583 152035 sql/schema_changer.go:445  [n1,job=542996859261190145,scExec] schema change on bank (53 v2) mutation 0 starting execution...
I200401 22:24:55.716643 151572 kv/kvserver/replica_command.go:663  [n1,merge,s1,r37/1:/Table/53/1/{2-4}] initiating a merge of r63:/Table/53/1/{4-8} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=12] into this range (lhs+rhs has (size=5.2 KiB+2.6 KiB=7.8 KiB qps=8.09+0.95=9.04qps) below threshold (size=128 MiB, qps=1250.00))
I200401 22:24:56.049365 152035 sql/sqlbase/structured.go:1554  [n1,job=542996859261190145,scExec] publish: descID=53 (bank) version=3 mtime=1970-01-01 00:00:00 +0000 UTC
I200401 22:24:56.512146 144145 kv/kvserver/store_remove_replica.go:122  [n1,s1,r37/1:/Table/53/1/{2-4}] removing replica r63/1
I200401 22:24:56.520937 145074 kv/kvserver/store_remove_replica.go:122  [n2,s2,r37/2:/Table/53/1/{2-4}] removing replica r63/2
I200401 22:24:56.526045 145479 kv/kvserver/store_remove_replica.go:122  [n3,s3,r37/3:/Table/53/1/{2-4}] removing replica r63/3
I200401 22:24:56.636527 152288 kv/kvserver/replica_command.go:663  [n1,merge,s1,r33/1:/Table/53{-/1/2}] initiating a merge of r37:/Table/53/1/{2-8} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=13] into this range (lhs+rhs has (size=5.1 KiB+17 KiB=22 KiB qps=0.00+1.53=1.53qps) below threshold (size=128 MiB, qps=1250.00))
E200401 22:24:56.668339 149158 sql/conn_executor.go:796  [n1,client=127.0.0.1:60240,hostssl,user=root] a SQL panic has occurred while executing "UPDATE data.bank SET payload = $1 WHERE id = $2": event RetriableErr{CanAutoRetry:false, IsCommit:false} inappropriate in current state NoTxn{}
E200401 22:24:56.669305 149158 util/log/crash_reporting.go:208  [n1,client=127.0.0.1:60240,hostssl,user=root] a panic has occurred!

More

Parameters:

  • GOFLAGS=-json
make stressrace TESTS=TestBackupRestoreWithConcurrentWrites PKG=./pkg/ccl/backupccl TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1

Related:

See this test on roachdash
powered by pkg/cmd/internal/issues

@ajwerner
Copy link
Contributor

ajwerner commented Apr 2, 2020

Closed by #46877

@dt dt closed this as completed Apr 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot.
Projects
None yet
Development

No branches or pull requests

4 participants