db: improve panic handling inside makeRoomForWrite #2300

Open
knz opened this issue Feb 2, 2023 · 2 comments

@knz

knz commented Feb 2, 2023

Found here: https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_BazelExtendedCi/8561354?showRootCauses=false&expandBuildChangesSection=true&expandBuildProblemsSection=true&expandBuildTestsSection=true

=== RUN   TestAddNewStoresToExistingNodes
    test_log_scope.go:161: test logs captured to: /artifacts/tmp/_tmp/5b2c9b3a394428c7572d34050aad8975/logTestAddNewStoresToExistingNodes4038090786
    test_log_scope.go:79: use -show-logs to present logs inline
fatal error: sync: unlock of unlocked mutex
goroutine 1338982 [running]:
sync.fatal({0x6f37cdd?, 0xffffffff?})
  GOROOT/src/runtime/panic.go:1031 +0x1e
sync.(*Mutex).unlockSlow(0xc007fd5078, 0xffffffff)
  GOROOT/src/sync/mutex.go:229 +0x49
sync.(*Mutex).Unlock(0xc007fd5078)
  GOROOT/src/sync/mutex.go:223 +0x55
panic({0x670d600, 0x9eb4dd0})
  GOROOT/src/runtime/panic.go:890 +0x262
github.com/cockroachdb/pebble/vfs.(*diskHealthCheckingFile).timeDiskOp(0xc00d28c000, 0x2, 0xc00d6dd7a0)
  github.com/cockroachdb/pebble/vfs/external/com_github_cockroachdb_pebble/vfs/disk_health.go:262 +0x1a5
github.com/cockroachdb/pebble/vfs.(*diskHealthCheckingFile).Sync(0xc00d28c000)
  github.com/cockroachdb/pebble/vfs/external/com_github_cockroachdb_pebble/vfs/disk_health.go:219 +0x69
github.com/cockroachdb/pebble/vfs.(*enospcFile).Sync(0xc004779068)
  github.com/cockroachdb/pebble/vfs/external/com_github_cockroachdb_pebble/vfs/disk_full.go:391 +0x6d
github.com/cockroachdb/pebble.(*DB).makeRoomForWrite(0xc007fd4f00, 0x0)
  github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/db.go:1978 +0x1559
github.com/cockroachdb/pebble.(*DB).ingest.func1()
  github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/ingest.go:748 +0x2d2
github.com/cockroachdb/pebble.(*commitPipeline).AllocateSeqNum(0xc005f9d300, 0x6, 0xc00d6de0b0, 0xc00d6de0e8)
  github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/commit.go:385 +0x447
github.com/cockroachdb/pebble.(*DB).ingest(0xc007fd4f00, {0xc00097cb00, 0x6, 0x8}, 0x71b7648)
  github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/ingest.go:785 +0x5ba
github.com/cockroachdb/pebble.(*DB).IngestWithStats(0xc007fd4f00, {0xc00097cb00, 0x6, 0x8})
  github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/ingest.go:680 +0x111
github.com/cockroachdb/cockroach/pkg/storage.(*Pebble).IngestExternalFilesWithStats(0xc002b878c0, {0x0?, 0x0?}, {0xc00097cb00, 0x6, 0x8})
  github.com/cockroachdb/cockroach/pkg/storage/pebble.go:1764 +0x66
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).applySnapshot(0xc004f08000, {0x9ef3c68, _}, {{0x4b, 0x2f, 0x2d, 0x5f, 0x62, 0x99, 0x4c, ...}, ...}, ...)
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raftstorage.go:657 +0x1274
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).handleRaftReadyRaftMuLocked(_, {_, _}, {{0x4b, 0x2f, 0x2d, 0x5f, 0x62, 0x99, 0x4c, ...}, ...})
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:806 +0xa0e
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).processRaftSnapshotRequest.func1({0x9ef3c68, 0xc0084f2c90}, 0xc004f08000)
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:463 +0x3b0
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).withReplicaForRequest(0x0?, {0x9ef3c68, 0xc0084f2c90}, 0xc007999148, 0xc00d6e00b8)
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:341 +0x15c
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).processRaftSnapshotRequest(0xc003132000, {0x9ef3c68, 0xc0084f2c90}, 0xc0079990e0, {{0x4b, 0x2f, 0x2d, 0x5f, 0x62, 0x99, ...}, ...})
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:404 +0xe5
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).receiveSnapshot(0xc003132000, {0x9ef3c68, 0xc004bd4cc0}, 0xc0079990e0, {0x7fe4dcb6f698, 0xc00633b0c0})
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_snapshot.go:1089 +0xc6e
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).HandleSnapshot.func1({0x9ef3c68, 0xc004bd4780})
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:210 +0xfd
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunTaskWithErr(0xc0082f5680, {0x9ef3c68, 0xc004bd4780}, {0xc0084f31a0?, 0x2?}, 0xc00f314b68)
  github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:322 +0x148
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).HandleSnapshot(0xc003132000, {0x9ef3bc0, 0xc000a5ad80}, 0xc0079990e0, {0x7fe4dcb6f670, 0xc00633b0c0})
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:207 +0x114
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*RaftTransport).RaftSnapshot(0xc0085ca600, {0x9f2fd58, 0xc00633b0c0})
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/raft_transport.go:375 +0x25b
github.com/cockroachdb/cockroach/pkg/kv/kvserver._MultiRaft_RaftSnapshot_Handler({0x6e10d40?, 0xc0085ca600}, {0x9f23d60?, 0xc0091109a0})
  github.com/cockroachdb/cockroach/pkg/kv/kvserver/bazel-out/k8-fastbuild/bin/pkg/kv/kvserver/kvserver_go_proto_/github.com/cockroachdb/cockroach/pkg/kv/kvserver/storage_services.pb.go:270 +0xc3
github.com/cockroachdb/cockroach/pkg/util/tracing/grpcinterceptor.StreamServerInterceptor.func1({0x6e10d40, 0xc0085ca600}, {0x9f24270?, 0xc0065e60e0?}, 0xc00b2006f0, 0x71b0660)
  github.com/cockroachdb/cockroach/pkg/util/tracing/grpcinterceptor/grpc_interceptor.go:163 +0x67a
google.golang.org/grpc.chainStreamInterceptors.func1.1({0x6e10d40, 0xc0085ca600}, {0x9f24270, 0xc0065e60e0})
  google.golang.org/grpc/external/org_golang_google_grpc/server.go:1482 +0x106
github.com/cockroachdb/cockroach/pkg/rpc.NewServerEx.func4({0x6e10d40, 0xc0085ca600}, {0x9f24270, 0xc0065e60e0}, 0xc00b2006f0, 0xc000a5ac40)
  github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:272 +0xe9
google.golang.org/grpc.chainStreamInterceptors.func1.1({0x6e10d40, 0xc0085ca600}, {0x9f24270, 0xc0065e60e0})
  google.golang.org/grpc/external/org_golang_google_grpc/server.go:1485 +0x1ea
github.com/cockroachdb/cockroach/pkg/rpc.kvAuth.streamInterceptor({{{0x63d8720?}}}, {0x6e10d40, 0xc0085ca600}, {0x9f24270, 0xc0065e60e0}, 0xc00b2006f0, 0xc000a5ac40)
  github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/auth.go:136 +0x468
google.golang.org/grpc.chainStreamInterceptors.func1.1({0x6e10d40, 0xc0085ca600}, {0x9f24270, 0xc0065e60e0})
  google.golang.org/grpc/external/org_golang_google_grpc/server.go:1485 +0x1ea
github.com/cockroachdb/cockroach/pkg/rpc.NewServerEx.func2.1({0xc0082f5680?, 0x6c37be0?})
  github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:241 +0x70
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunTaskWithErr(0xc0082f5680, {0x9ef3c68, 0xc004bd4510}, {0x203000?, 0x203000?}, 0xc0019ba6c0)
  github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:322 +0x148
github.com/cockroachdb/cockroach/pkg/rpc.NewServerEx.func2({0x6e10d40, 0xc0085ca600}, {0x9f24270?, 0xc0065e60e0?}, 0xc00b2006f0, 0xc000a5ac40)
  github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:240 +0x14d
google.golang.org/grpc.chainStreamInterceptors.func1.1({0x6e10d40, 0xc0085ca600}, {0x9f24270, 0xc0065e60e0})
  google.golang.org/grpc/external/org_golang_google_grpc/server.go:1485 +0x1ea
google.golang.org/grpc.chainStreamInterceptors.func1({0x6e10d40, 0xc0085ca600}, {0x9f24270, 0xc0065e60e0}, 0xc00b2006f0, 0x71b0660)
  google.golang.org/grpc/external/org_golang_google_grpc/server.go:1487 +0x276
google.golang.org/grpc.(*Server).processStreamingRPC(0xc0079981e0, {0x9f3a980, 0xc0095cd380}, 0xc007fec7e0, 0xc000aab9b0, 0xc786e00, 0x0)
  google.golang.org/grpc/external/org_golang_google_grpc/server.go:1636 +0x1ef6
google.golang.org/grpc.(*Server).handleStream(0xc0079981e0, {0x9f3a980, 0xc0095cd380}, 0xc007fec7e0, 0x0)
  google.golang.org/grpc/external/org_golang_google_grpc/server.go:1717 +0xfaf
google.golang.org/grpc.(*Server).serveStreams.func1.2()
  google.golang.org/grpc/external/org_golang_google_grpc/server.go:965 +0xed
created by google.golang.org/grpc.(*Server).serveStreams.func1
  google.golang.org/grpc/external/org_golang_google_grpc/server.go:963 +0x4de

Jira issue: PEBBLE-137

@knz added the C-bug and T-storage labels on Feb 2, 2023
@RaduBerinde
Member

It looks like it's coming from

panic("concurrent write operations detected on file")

so it's another manifestation of cockroachdb/cockroach#96414 and cockroachdb/cockroach#96422.

The fix is in the process of being merged: cockroachdb/cockroach#96446

I will still look at the code to see why unwinding the panic hit a double unlock.

@RaduBerinde self-assigned this on Feb 2, 2023
@RaduBerinde
Member

The double unlock comes from the large code block in makeRoomForWrite, which is clearly not panic-friendly. I'll repurpose this bug to track improving that.
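
For readers unfamiliar with this failure mode, here is a minimal sketch of how a function that temporarily releases a mutex can turn an ordinary panic into `fatal error: sync: unlock of unlocked mutex`, along with one possible way to make it panic-safe. This is not Pebble's code: `store`, `rotateUnsafe`, `rotateSafe`, `write`, and `tryWrite` are hypothetical names, and the assumption is only that the callee drops the lock around a slow sync while a caller further up the stack has deferred the matching `Unlock`.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical stand-in for a DB guarded by a single mutex.
type store struct {
	mu        sync.Mutex
	rotations int
}

// rotateUnsafe shows the fragile shape. It expects mu to be held, releases it
// around slow I/O, and re-acquires it afterwards. If syncFn panics, mu is left
// unlocked, so a caller's deferred mu.Unlock() aborts the process with
// "fatal error: sync: unlock of unlocked mutex" (a fatal error, which recover
// cannot intercept). It is intentionally never called in this demo.
func (s *store) rotateUnsafe(syncFn func() error) error {
	s.mu.Unlock()
	err := syncFn()
	s.mu.Lock()
	if err == nil {
		s.rotations++
	}
	return err
}

// rotateSafe re-acquires mu in a defer, so the "mu is held on exit" invariant
// is restored even when syncFn panics and the stack unwinds.
func (s *store) rotateSafe(syncFn func() error) (err error) {
	s.mu.Unlock()
	func() {
		defer s.mu.Lock()
		err = syncFn()
	}()
	if err == nil {
		s.rotations++
	}
	return err
}

// write holds mu for its whole body and relies on the callee to hand it back
// locked.
func (s *store) write(syncFn func() error) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.rotateSafe(syncFn)
}

// tryWrite converts a panic from write into an error so the demo can continue.
func tryWrite(s *store, syncFn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return s.write(syncFn)
}

func main() {
	s := &store{}
	// The original panic stays visible instead of being masked by a fatal error.
	fmt.Println(tryWrite(s, func() error {
		panic("concurrent write operations detected on file")
	}))
	// The mutex is still usable after the panic.
	fmt.Println(tryWrite(s, func() error { return nil }), s.rotations)
}
```

Swapping `rotateSafe` for `rotateUnsafe` inside `write` would make the first call crash the whole process rather than surface the original panic message, which matches the shape of the trace above, where `sync.(*Mutex).Unlock` runs during panic unwinding.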

@RaduBerinde changed the title from "sync: unlock of unlocked mutex in (*diskHealthCheckingFile).timeDiskOp()" to "db: improve panic handling inside makeRoomForWrite" on Feb 2, 2023
@RaduBerinde removed their assignment on Apr 1, 2024
@RaduBerinde added the E-quick-win label and removed the C-bug label on Apr 1, 2024
@jbowens moved this to Backlog in [Deprecated] Storage on Jun 4, 2024
Status: Backlog