importerccl: deadlock while running TestImportMultiRegion #101713

Closed
cockroach-teamcity opened this issue Apr 18, 2023 · 4 comments
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. T-sql-queries SQL Queries Team
Milestone

Comments

@cockroach-teamcity
Member

cockroach-teamcity commented Apr 18, 2023

pkg/ccl/importerccl/importerccl_test.TestImportMultiRegion failed with artifacts on master @ 1cd507a7c6582fd91bb56123e62bd4fde45c4d22:

=== RUN   TestImportMultiRegion
    test_log_scope.go:161: test logs captured to: /artifacts/tmp/_tmp/48a201a2352f28545820257a52e2244a/logTestImportMultiRegion1858791545
    test_log_scope.go:79: use -show-logs to present logs inline
=== CONT  TestImportMultiRegion
    ccl_test.go:294: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/48a201a2352f28545820257a52e2244a/logTestImportMultiRegion1858791545
--- FAIL: TestImportMultiRegion (540.99s)
=== RUN   TestImportMultiRegion/avro
    --- FAIL: TestImportMultiRegion/avro (511.61s)
=== RUN   TestImportMultiRegion/avro/import-into-multi-region-global-to-multi-region-database
    ccl_test.go:282: 
        	Error Trace:	pkg/ccl/importerccl/importerccl_test/pkg/ccl/importerccl/ccl_test.go:282
        	Error:      	Received unexpected error:
        	            	pq: job 857633032011284481: could not mark as reverting: job 857633032011284481: select-job: replica unavailable: (n3,s3):3 unable to serve request to r13:/Table/1{1-2} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=4]: closed timestamp: 1681799401.653294543,0 (2023-04-18 06:30:01); raft status: {"id":"3","term":7,"vote":"3","commit":755,"lead":"3","raftState":"StateLeader","applied":755,"progress":{"1":{"match":750,"next":778,"state":"StateReplicate"},"2":{"match":755,"next":778,"state":"StateReplicate"},"3":{"match":776,"next":777,"state":"StateReplicate"}},"leadtransferee":"0"}: have been waiting 61.00s for slow proposal RequestLease [/Table/11,/Min)
        	Test:       	TestImportMultiRegion/avro/import-into-multi-region-global-to-multi-region-database
        --- FAIL: TestImportMultiRegion/avro/import-into-multi-region-global-to-multi-region-database (441.81s)

Parameters: TAGS=bazel,gss,deadlock

Help

See also: How To Investigate a Go Test Failure (internal)

/cc @cockroachdb/sql-queries

This test on roachdash | Improve this report!

Jira issue: CRDB-27098

@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. labels Apr 18, 2023
@cockroach-teamcity cockroach-teamcity added this to the 23.1 milestone Apr 18, 2023
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Apr 18, 2023
@cockroach-teamcity
Member Author

pkg/ccl/importerccl/importerccl_test.TestImportMultiRegion failed with artifacts on master @ b767225a6893ea6e81d50618d634dbba92d3b680:

github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:299 kvserver.(*raftScheduler).Start.func2 ???
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 stop.(*Stopper).RunAsyncTaskEx.func2 ???

goroutine 467431 lock 0xc00adadf00
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:1193 kvserver.(*Replica).tick ??? <<<<<
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:1192 kvserver.(*Replica).tick ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:674 kvserver.(*Store).processTick ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:386 kvserver.(*raftSchedulerShard).worker ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:299 kvserver.(*raftScheduler).Start.func2 ???
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 stop.(*Stopper).RunAsyncTaskEx.func2 ???

goroutine 467353 lock 0xc005ac59d8
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:1191 kvserver.(*Replica).tick ??? <<<<<
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:1190 kvserver.(*Replica).tick ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:674 kvserver.(*Store).processTick ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:386 kvserver.(*raftSchedulerShard).worker ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:299 kvserver.(*raftScheduler).Start.func2 ???
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 stop.(*Stopper).RunAsyncTaskEx.func2 ???

goroutine 38716 lock 0xc005a520d8
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:712 kvserver.(*Replica).handleRaftReady ??? <<<<<
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:711 kvserver.(*Replica).handleRaftReady ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:645 kvserver.(*Store).processReady ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:394 kvserver.(*raftSchedulerShard).worker ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:299 kvserver.(*raftScheduler).Start.func2 ???
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 stop.(*Stopper).RunAsyncTaskEx.func2 ???

goroutine 467241 lock 0xc0055472d8
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:712 kvserver.(*Replica).handleRaftReady ??? <<<<<
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:711 kvserver.(*Replica).handleRaftReady ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:645 kvserver.(*Store).processReady ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:394 kvserver.(*raftSchedulerShard).worker ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:299 kvserver.(*raftScheduler).Start.func2 ???
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 stop.(*Stopper).RunAsyncTaskEx.func2 ???

goroutine 38460 lock 0xc0052280d8
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:1191 kvserver.(*Replica).tick ??? <<<<<
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:1190 kvserver.(*Replica).tick ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:674 kvserver.(*Store).processTick ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:386 kvserver.(*raftSchedulerShard).worker ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:299 kvserver.(*raftScheduler).Start.func2 ???
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 stop.(*Stopper).RunAsyncTaskEx.func2 ???



=== RUN   TestImportMultiRegion/avro/import-into-multi-region-regional-by-row-to-multi-region-database-concurrent-table-add
=== RUN   TestImportMultiRegion/avro/import-into-multi-region-regional-by-row-to-multi-region-database-wrong-value
=== RUN   TestImportMultiRegion/mysqldump
=== RUN   TestImportMultiRegion/pgdump
=== RUN   TestImportMultiRegion/avro

Parameters: TAGS=bazel,gss,deadlock


@DrewKimball
Collaborator

POTENTIAL DEADLOCK:
Previous place where the lock was grabbed
goroutine 167127 lock 0xc008e8e988
github.com/cockroachdb/cockroach/pkg/server/node.go:1513 server.(*lockedMuxStream).Send ??? <<<<<
github.com/cockroachdb/cockroach/pkg/server/node.go:1512 server.(*lockedMuxStream).Send ???
github.com/cockroachdb/cockroach/pkg/server/node.go:1495 server.(*setRangeIDEventSink).Send ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_rangefeed.go:91 kvserver.(*lockedRangefeedStream).Send ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/rangefeed/registry.go:330 rangefeed.(*registration).outputLoop ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/rangefeed/registry.go:351 rangefeed.(*registration).runOutputLoop ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/rangefeed/processor.go:321 rangefeed.(*Processor).run.func1 ???
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 stop.(*Stopper).RunAsyncTaskEx.func2 ???

Have been trying to lock it again for more than 5m0s
goroutine 22192508 lock 0xc008e8e988
github.com/cockroachdb/cockroach/pkg/server/node.go:1513 server.(*lockedMuxStream).Send ??? <<<<<
github.com/cockroachdb/cockroach/pkg/server/node.go:1512 server.(*lockedMuxStream).Send ???
github.com/cockroachdb/cockroach/pkg/server/node.go:1495 server.(*setRangeIDEventSink).Send ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_rangefeed.go:91 kvserver.(*lockedRangefeedStream).Send ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/rangefeed/registry.go:330 rangefeed.(*registration).outputLoop ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/rangefeed/registry.go:351 rangefeed.(*registration).runOutputLoop ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/rangefeed/processor.go:321 rangefeed.(*Processor).run.func1 ???
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 stop.(*Stopper).RunAsyncTaskEx.func2 ???

Here is what goroutine 167127 doing now
goroutine 167127 [select, 5 minutes]:
github.com/cockroachdb/cockroach/pkg/rpc.(*pipe).send(0x55bb1a0?, {0x74e0e80, 0xc009195320}, {0x5d3a920?, 0xc034c03a80?})
	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:1414 +0x97
github.com/cockroachdb/cockroach/pkg/rpc.pipeWriter.send(...)
	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:1452
github.com/cockroachdb/cockroach/pkg/rpc.serverStream.SendMsg(...)
	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:1540
github.com/cockroachdb/cockroach/pkg/rpc.muxRangeFeedServerAdapter.Send(...)
	github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:1373
github.com/cockroachdb/cockroach/pkg/server.(*lockedMuxStream).Send(0xc008e8e978, 0xc02b64da00?)
	github.com/cockroachdb/cockroach/pkg/server/node.go:1515 +0xb2
github.com/cockroachdb/cockroach/pkg/server.(*setRangeIDEventSink).Send(0xc00900e2d0, 0xc00b450c00)
	github.com/cockroachdb/cockroach/pkg/server/node.go:1496 +0x93
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*lockedRangefeedStream).Send(0xc008f9eb70, 0xc0065dfde4?)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_rangefeed.go:92 +0xb2
github.com/cockroachdb/cockroach/pkg/kv/kvserver/rangefeed.(*registration).outputLoop(0xc0025d9e00, {0x74e0dd8, 0xc008e95140})
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/rangefeed/registry.go:328 +0x29e
github.com/cockroachdb/cockroach/pkg/kv/kvserver/rangefeed.(*registration).runOutputLoop(0xc0025d9e00, {0x74e0e80, 0xc008c86f60}, 0xc005d20b40?)
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/rangefeed/registry.go:351 +0xc5
github.com/cockroachdb/cockroach/pkg/kv/kvserver/rangefeed.(*Processor).run.func1({0x74e0e80?, 0xc008c86f60?})
	github.com/cockroachdb/cockroach/pkg/kv/kvserver/rangefeed/processor.go:320 +0x4c
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
	github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 +0x146
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx
	github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:461 +0x43b
Other goroutines holding locks:
goroutine 38767 lock 0xc003e239d8
github.com/sasha-s/go-deadlock/external/com_github_sasha_s_go_deadlock/deadlock.go:137 go-deadlock.(*RWMutex).RLock ??? <<<<<
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_create_replica.go:109 kvserver.(*Store).tryGetReplica ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_create_replica.go:172 kvserver.(*Store).tryGetOrCreateReplica ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_create_replica.go:78 kvserver.(*Store).getOrCreateReplica ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:337 kvserver.(*Store).withReplicaForRequest ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:589 kvserver.(*Store).processRequestQueue ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:373 kvserver.(*raftSchedulerShard).worker ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:299 kvserver.(*raftScheduler).Start.func2 ???
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 stop.(*Stopper).RunAsyncTaskEx.func2 ???

goroutine 38416 lock 0xc0063ec0d8
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:1191 kvserver.(*Replica).tick ??? <<<<<
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:1190 kvserver.(*Replica).tick ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:674 kvserver.(*Store).processTick ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:386 kvserver.(*raftSchedulerShard).worker ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:299 kvserver.(*raftScheduler).Start.func2 ???
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 stop.(*Stopper).RunAsyncTaskEx.func2 ???

goroutine 467721 lock 0xc005af1280
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:1193 kvserver.(*Replica).tick ??? <<<<<
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:1192 kvserver.(*Replica).tick ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:674 kvserver.(*Store).processTick ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:386 kvserver.(*raftSchedulerShard).worker ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:299 kvserver.(*raftScheduler).Start.func2 ???
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 stop.(*Stopper).RunAsyncTaskEx.func2 ???

goroutine 38637 lock 0xc0012272d8
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:712 kvserver.(*Replica).handleRaftReady ??? <<<<<
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:711 kvserver.(*Replica).handleRaftReady ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:645 kvserver.(*Store).processReady ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:394 kvserver.(*raftSchedulerShard).worker ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:299 kvserver.(*raftScheduler).Start.func2 ???
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 stop.(*Stopper).RunAsyncTaskEx.func2 ???

goroutine 541544 lock 0xc0116302a0
github.com/cockroachdb/cockroach/pkg/util/mon/bytes_usage.go:985 mon.(*BytesMonitor).releaseBytes ??? <<<<<
github.com/cockroachdb/cockroach/pkg/util/mon/bytes_usage.go:984 mon.(*BytesMonitor).releaseBytes ???
github.com/cockroachdb/cockroach/pkg/util/mon/bytes_usage.go:829 mon.(*BoundAccount).Close ???
github.com/cockroachdb/cockroach/pkg/util/mon/bytes_usage.go:812 mon.(*BoundAccount).Clear ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:77 kvserver.(*raftReceiveQueue).drainLocked ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:68 kvserver.(*raftReceiveQueue).Drain ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:579 kvserver.(*Store).processRequestQueue ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:373 kvserver.(*raftSchedulerShard).worker ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:299 kvserver.(*raftScheduler).Start.func2 ???
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 stop.(*Stopper).RunAsyncTaskEx.func2 ???

goroutine 467226 lock 0xc0063092d8
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:712 kvserver.(*Replica).handleRaftReady ??? <<<<<
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:711 kvserver.(*Replica).handleRaftReady ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:645 kvserver.(*Store).processReady ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:394 kvserver.(*raftSchedulerShard).worker ???
github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:299 kvserver.(*raftScheduler).Start.func2 ???
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 stop.(*Stopper).RunAsyncTaskEx.func2 ???

@DrewKimball changed the title from "pkg/ccl/importerccl/importerccl_test: TestImportMultiRegion failed" to "importerccl: deadlock while running TestImportMultiRegion" on Apr 20, 2023
@DrewKimball
Collaborator

Looks similar to #99640 and maybe #100468.

@yuzefovich
Member

Let's close this as a dup of #101614, cc @miretskiy
