
storage: EngineCapacity call is progressively expensive on an import in a large cluster #83137

Closed
lunevalex opened this issue Jun 21, 2022 · 1 comment · Fixed by #84100
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. E-quick-win Likely to be a quick win for someone experienced. T-storage Storage Team

Comments

@lunevalex
Collaborator

lunevalex commented Jun 21, 2022

While running a large import, we noticed that a majority of the time spent in the EvalAddSSTable call was spent in the Pebble.EngineCapacity() call. This call gets more and more expensive because of how it computes available space in the auxiliary directory: to compute the space used by that directory, the code walks the whole directory tree and sums up the space used by each directory, so the more directories there are, the slower this method becomes over time. The root cause of the slowdown is the accumulation of empty directories by the snapshot receiver. The receiver creates a new directory for every snapshot received here:

snapDir := filepath.Join(s.dir, strconv.Itoa(int(rangeID)), snapUUID.String())
and cleans it up here:
return s.engine.RemoveAll(s.dir)
but this ends up leaving an empty directory per range in the auxiliary directory.

In the cluster under test, a 10-node cluster with 200k+ ranges and a running import, there were 20k+ empty directories in that folder, which made the capacity call noticeably slow. A CPU profile capturing this is attached to this ticket.
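To make the cost concrete, here is a minimal, self-contained Go sketch. It is not the actual Pebble or CockroachDB code; the dirSize helper and the directory layout are invented for illustration. It only shows why summing usage by walking the whole auxiliary directory gets slower as empty per-range directories pile up:

    // Sketch only: simulate an aux directory where every received snapshot
    // created aux/<rangeID>/<snapUUID> and only the <snapUUID> scratch dir
    // was removed afterwards, leaving the <rangeID> parents behind.
    package main

    import (
        "fmt"
        "os"
        "path/filepath"
    )

    // dirSize sums the sizes of all regular files under root. The walk has to
    // visit every directory entry, so with tens of thousands of leftover empty
    // directories the traversal itself dominates the cost of each call.
    func dirSize(root string) (int64, error) {
        var total int64
        err := filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
            if err != nil {
                return err
            }
            if info.Mode().IsRegular() {
                total += info.Size()
            }
            return nil
        })
        return total, err
    }

    func main() {
        aux, err := os.MkdirTemp("", "aux")
        if err != nil {
            panic(err)
        }
        defer os.RemoveAll(aux)

        // Simulate 20k ranges that each received one snapshot.
        for rangeID := 0; rangeID < 20000; rangeID++ {
            snapDir := filepath.Join(aux, fmt.Sprint(rangeID), "snap-uuid")
            _ = os.MkdirAll(snapDir, 0755)
            _ = os.RemoveAll(snapDir) // parent aux/<rangeID> stays behind, empty
        }

        // Every "capacity" check now re-walks 20k empty directories even
        // though there is nothing to add up.
        size, err := dirSize(aux)
        fmt.Println(size, err)
    }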

Jira issue: CRDB-16880

Epic CRDB-16237

@lunevalex lunevalex added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Jun 21, 2022
@blathers-crl blathers-crl bot added the T-storage Storage Team label Jun 21, 2022
@lunevalex
Collaborator Author

profile (1).pb.gz

@mwang1026 mwang1026 added the E-quick-win Likely to be a quick win for someone experienced. label Jun 22, 2022
itsbilal added a commit to itsbilal/cockroach that referenced this issue Jul 8, 2022
Previously, we were creating subdirectories for ranges and
range snapshots in the auxiliary directory every time we
accepted a snapshot, but only cleaning up the snapshot
subdirectories after a snapshot scratch space closed. This
left empty parent range directories around on the FS,
slowing down future calls to Pebble.Capacity() and indirectly
slowing down AddSSTable in the future.

This change adds code to clean up empty range directories
in the aux directory if they're not being used. Some coordination
and synchronization code had to be added to ensure we wouldn't
remove a directory that was just created by a concurrent snapshot.

Fixes cockroachdb#83137.

Release note (performance improvement): Addresses issue where
imports and rebalances were being slowed down due to the accumulation
of empty directories from range snapshot applications.
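A rough sketch of that cleanup idea is below. It is not the actual kvserver implementation; the auxScratch type and its methods are invented here. The point is the coordination: directory creation and removal share a lock and a per-range reference count, and the final removal uses os.Remove, which refuses to delete a non-empty directory, so a directory that a concurrent snapshot just started using is never lost:

    package scratch

    import (
        "os"
        "path/filepath"
        "sync"
    )

    // auxScratch hands out per-snapshot scratch directories under
    // auxDir/<rangeDir>/<snapUUID> and removes the per-range parent once the
    // last in-flight snapshot for that range has closed.
    type auxScratch struct {
        mu     sync.Mutex
        auxDir string
        inUse  map[string]int // range directory name -> active snapshots
    }

    // newSnapshotDir creates the scratch directory and records the range
    // directory as in use. Creation and bookkeeping happen under the same lock
    // that cleanup takes, so cleanup cannot race with a concurrent snapshot.
    func (s *auxScratch) newSnapshotDir(rangeDir, snapUUID string) (string, error) {
        s.mu.Lock()
        defer s.mu.Unlock()
        dir := filepath.Join(s.auxDir, rangeDir, snapUUID)
        if err := os.MkdirAll(dir, 0755); err != nil {
            return "", err
        }
        s.inUse[rangeDir]++
        return dir, nil
    }

    // closeSnapshotDir removes the snapshot's scratch directory and, if no
    // other snapshot for the range is in flight, the now-empty parent.
    func (s *auxScratch) closeSnapshotDir(rangeDir, snapDir string) error {
        if err := os.RemoveAll(snapDir); err != nil {
            return err
        }
        s.mu.Lock()
        defer s.mu.Unlock()
        s.inUse[rangeDir]--
        if s.inUse[rangeDir] > 0 {
            return nil // a concurrent snapshot still owns the parent directory
        }
        delete(s.inUse, rangeDir)
        // os.Remove (unlike RemoveAll) fails on a non-empty directory, so this
        // best-effort cleanup can never delete live snapshot data.
        _ = os.Remove(filepath.Join(s.auxDir, rangeDir))
        return nil
    }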
itsbilal added a commit to itsbilal/cockroach that referenced this issue Jul 11, 2022
craig bot pushed a commit that referenced this issue Jul 11, 2022
82161: ui: Add Jest as test runner to DB Console r=nathanstilwell a=nathanstilwell

DB Console is the last place Cockroach Labs is using a test runner other than [Jest](https://jestjs.io/). This PR adds Jest as the test runner intended to replace [Mocha](https://mochajs.org/). Mocha runs in a headless browser via [Karma](https://karma-runner.github.io/latest/index.html), whereas Jest will run tests in NodeJS and simulate a browser environment using [jsdom](https://github.com/jsdom/jsdom). Due to this change in environment, you will see not only files to set up the Jest test runner, but also changes to some tests, some mocking of browser globals that are not included in jsdom by default, and some configuration adjustments to the `tsconfig.json`.

Since configuration changes are infrequent and are highly contextual, we decided to err on the side of verbose inline documentation in configuration files.

Details about individual changes to configs or tests are documented in commit messages.

84068: streamingccl: fix span use-after-finish in ingestion frontier r=samiskin a=stevendanna

This fixes the following use-after-finish panic:

    panic: use of Span after Finish. Span: ingestfntr. Finish previously
    called at: <stack not captured. Set debugUseAfterFinish>

    goroutine 1617744 [running]:
    github.com/cockroachdb/cockroach/pkg/util/tracing.(*Span).detectUseAfterFinish(0xc002c5f180)
    	github.com/cockroachdb/cockroach/pkg/util/tracing/span.go:186 +0x279
    github.com/cockroachdb/cockroach/pkg/util/tracing.(*Tracer).startSpanGeneric(0xc0138b2a50, {0x72291c0, 0xc019ebbe40}, {0x68e4eb2, 0x1d}, {{0x0}, 0x0, {0x0, 0x0, {{0x0, ...}, ...}, ...}, ...})
    	github.com/cockroachdb/cockroach/pkg/util/tracing/tracer.go:1207 +0x997
    github.com/cockroachdb/cockroach/pkg/util/tracing.(*Tracer).StartSpanCtx(0xc0138b2a50, {0x72291c0, 0xc019ebbe40}, {0x68e4eb2, 0x1d}, {0xc010cbb760, 0x1, 0x1})
    	github.com/cockroachdb/cockroach/pkg/util/tracing/tracer.go:1062 +0x1a7
    github.com/cockroachdb/cockroach/pkg/util/tracing.ChildSpan({0x72291c0, 0xc019ebbe40}, {0x68e4eb2, 0x1d})
    	github.com/cockroachdb/cockroach/pkg/util/tracing/tracer.go:1577 +0x145
    github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamclient.(*partitionedStreamClient).Heartbeat(0xc00b311080, {0x72291c0, 0xc019ebbe40}, 0xac93aa873318001, {0x16ffc388c7d7c42a, 0x0, 0x0})
    	github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamclient/partitioned_stream_client.go:83 +0xb7
    github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamingest.(*heartbeatSender).maybeHeartbeat(0xc01980f880, {0x72291c0, 0xc019ebbe40}, {0x16ffc388c7d7c42a, 0x0, 0x0})
    	github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamingest/stream_ingestion_frontier_processor.go:180 +0x250
    github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamingest.(*heartbeatSender).startHeartbeatLoop.func1.1()
    	github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamingest/stream_ingestion_frontier_processor.go:207 +0x41b
    github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamingest.(*heartbeatSender).startHeartbeatLoop.func1({0x72291c0, 0xc019ebbe40})
    	github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamingest/stream_ingestion_frontier_processor.go:227 +0xb5
    github.com/cockroachdb/cockroach/pkg/util/ctxgroup.Group.GoCtx.func1()
    	github.com/cockroachdb/cockroach/pkg/util/ctxgroup/ctxgroup.go:169 +0x52
    golang.org/x/sync/errgroup.(*Group).Go.func1()
    	golang.org/x/sync/errgroup/external/org_golang_x_sync/errgroup/errgroup.go:74 +0xb4
    created by golang.org/x/sync/errgroup.(*Group).Go
    	golang.org/x/sync/errgroup/external/org_golang_x_sync/errgroup/errgroup.go:71 +0xdd
    I220708 05:29:42.017167 1 (gostd) testmain.go:90  [-] 1  Test //pkg/ccl/streamingccl/streamingest:streamingest_test exited with error code 2

The use after finish was caused by goroutines in the ingestion
frontier processor that lived past a call to
(*ProcessorBase).InternalClose, which finishes the span attached to
the context passed to the processor in Start.

To address this we:

- ensure that we stop our heartbeat thread before calling
  InternalClose in ConsumerClosed,

- provide a TrailingMetaCallback so that we can perform cleanup when
  DrainHelper() is called. When a TrailingMetaCallback is provided,
  DrainHelper() calls it instead of InternalClose(), allowing us to
  correctly clean up before the span is closed, and

- use context cancellation rather than a channel to control the
  heartbeat loop's exit, avoiding the need to guard against
  double-closing the channel (see the sketch below).
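As a hedged sketch of that last point (the names below are invented, not the streamingest code): cancelling a context is idempotent, so shutdown can be triggered from several paths without the double-close panic that close(ch) would risk.

    package main

    import (
        "context"
        "fmt"
        "time"
    )

    // heartbeatLoop runs until its context is cancelled.
    func heartbeatLoop(ctx context.Context, interval time.Duration, beat func()) {
        ticker := time.NewTicker(interval)
        defer ticker.Stop()
        for {
            select {
            case <-ctx.Done():
                return // exit when the controlling context is cancelled
            case <-ticker.C:
                beat()
            }
        }
    }

    func main() {
        ctx, cancel := context.WithCancel(context.Background())
        go heartbeatLoop(ctx, 10*time.Millisecond, func() { fmt.Println("heartbeat") })

        time.Sleep(50 * time.Millisecond)
        cancel()
        cancel() // safe to call again, unlike closing a channel twice
        time.Sleep(20 * time.Millisecond)
    }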

Fixes #84054

Release note: None

84100: kvserver: Clean up empty range directories after snapshots r=nicktrav a=itsbilal

Previously, we were creating subdirectories for ranges and
range snapshots in the auxiliary directory every time we
accepted a snapshot, but only cleaning up the snapshot
subdirectories after a snapshot scratch space closed. This
left empty parent range directories around on the FS,
slowing down future calls to Pebble.Capacity() and indirectly
slowing down AddSSTable in the future.

This change adds code to clean up empty range directories
in the aux directory if they're not being used. Some coordination
and synchronization code had to be added to ensure we wouldn't
remove a directory that was just created by a concurrent snapshot.

Fixes #83137 

Release note (bug fix, performance improvement): Addresses issue where
imports and rebalances were being slowed down due to the accumulation
of empty directories from range snapshot applications.

84170: sql/sqlstats: record QuerySummary when merging stats r=ericharmeling a=stevendanna

During execution of a transaction, all statement statistics are
collected in a struct local to that transaction, and then flushed to
the main ApplicationStats container when the transaction finishes.

Previously, when flushing, we failed to copy the QuerySummary field,
leading to `metadata->'querySummary'` being empty in most cases.

Prior to ce1b42b this only affected
statements in an explicit transaction. After that commit, it affected
all statements.

Release note (bug fix): Fix a bug that led to the querySummary field
in crdb_internal.statement_statistics's metadata column being empty.
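For illustration, here is a hedged Go sketch of this class of bug; the types below are invented and are not the sqlstats API. When per-transaction stats are folded into the long-lived entry field by field, any field left out of the merge, such as the query summary, silently stays empty:

    package main

    import "fmt"

    type stmtStats struct {
        Count        int64
        QuerySummary string
    }

    // merge folds transaction-local stats into the application-level entry.
    // The bug pattern: Count was carried over but QuerySummary was not.
    func merge(dst, src *stmtStats) {
        dst.Count += src.Count
        if dst.QuerySummary == "" {
            dst.QuerySummary = src.QuerySummary // the line such a fix adds
        }
    }

    func main() {
        appLevel := &stmtStats{}
        txnLocal := &stmtStats{Count: 3, QuerySummary: "SELECT * FROM t"}
        merge(appLevel, txnLocal)
        fmt.Printf("%+v\n", *appLevel) // {Count:3 QuerySummary:SELECT * FROM t}
    }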

84194: opt: mark SimplifyRootOrdering as an essential rule r=mgartner a=mgartner

The unoptimized query oracle, which disables rules, found a bug in the
execution engine that is only possible to hit if the
`SimplifyRootOrdering` rule is disabled (see #84191). Until the bug is
fixed, we mark the rule as essential so that it is not disabled by these
tests.

Fixes #84067

Release note: None

Co-authored-by: Nathan Stilwell <[email protected]>
Co-authored-by: Sean Barag <[email protected]>
Co-authored-by: Steven Danna <[email protected]>
Co-authored-by: Bilal Akhtar <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
@craig craig bot closed this as completed in #84100 Jul 11, 2022
@craig craig bot closed this as completed in 73c5980 Jul 11, 2022
blathers-crl bot pushed a commit that referenced this issue Jul 11, 2022
itsbilal added a commit to itsbilal/cockroach that referenced this issue Sep 19, 2022