
db: adaptively throttle file deletions #2662

Closed
jbowens opened this issue Jun 21, 2023 · 7 comments · Fixed by #2673
Comments

@jbowens
Collaborator

jbowens commented Jun 21, 2023

Observed during a scale test that strove to write 50 MiB/s/node of user data. The goroutine count was increasing monotonically. The growth was entirely in goroutines responsible for deleting files, which were being paced at the default rate of 128 MiB of deleted files per second. The already in-flight #2641 will resolve the goroutine growth, but it does not resolve the issue of falling behind on file deletions.

We should not set a fixed deletion rate, and should instead smooth deletions based on the observed rate of file deletions. See #2641 (review).

*(two screenshots attached, Jun 21, 2023)*
@RaduBerinde
Member

With the new design, we can use the number of queued jobs as a signal to increase the target byte rate. I'll think about a more specific proposal.

@RaduBerinde RaduBerinde self-assigned this Jun 22, 2023
@RaduBerinde
Member

Or perhaps the total number of bytes to be deleted, rather than the number of jobs.

@RaduBerinde
Member

RaduBerinde commented Jun 22, 2023

@jbowens here is a proposal.

Currently we disable pacing when free space falls below 16GB or when the obsolete to live bytes ratio goes above 20%. Otherwise we pace at the target rate (the default in CRDB is 128MB/s).

New proposal: we have pairs of low/high thresholds; on one side of the thresholds we pace at the target rate, and on the other side we disable pacing. In between, we scale the pacing wait time per byte linearly. We also look at the absolute number of obsolete bytes: with very large stores, we would already have fallen far behind by the time the obsolete to live ratio grows large enough.

Proposed thresholds:

|                              | Start increasing rate at | Disable pacing at |
| ---------------------------- | ------------------------ | ----------------- |
| Free space                   | 32GB                     | 16GB              |
| Obsolete to live bytes ratio | 5%                       | 20%               |
| Obsolete bytes (new)         | 1GB                      | 10GB              |
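A minimal sketch of the linear interpolation in this proposal (the function name and shape are hypothetical, not Pebble's actual code). One formula handles both directions: for obsolete bytes the "disable" threshold is above the "start increasing" threshold, while for free space the two are reversed.

```go
package main

import "fmt"

// pacingFraction returns the fraction of the target wait time per byte
// to apply, interpolating linearly between the two thresholds.
// lo is the "start increasing rate" threshold and hi the "disable
// pacing" threshold; for free space the thresholds run the other way
// (lo=32GB, hi=16GB) and the same formula still works.
func pacingFraction(value, lo, hi float64) float64 {
	f := (hi - value) / (hi - lo)
	if f > 1 {
		f = 1 // fully paced: wait the full target time per byte
	}
	if f < 0 {
		f = 0 // pacing disabled
	}
	return f
}

func main() {
	// Obsolete bytes: thresholds 1GB -> 10GB.
	fmt.Println(pacingFraction(5.5e9, 1e9, 10e9)) // midpoint → 0.5
	// Free space: thresholds reversed, 32GB -> 16GB.
	fmt.Println(pacingFraction(24e9, 32e9, 16e9)) // midpoint → 0.5
}
```

With multiple factors, the smallest fraction (i.e., the most urgent factor) would presumably win.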

RaduBerinde added a commit to RaduBerinde/pebble that referenced this issue Jun 22, 2023
We improve the delete pacer to be able to scale delete pacing instead
of making an on/off decision.

The new pacer uses low/high thresholds for various factors; on one
side of the threshold we pace at the target rate and on the other side
we don't throttle at all. In-between we scale the wait time per byte
linearly.

Thresholds are as follows:

|                              | Start increasing rate at | Disable pacing at |
| ---------------------------- | ------------------------ | ----------------- |
| Free space                   | 32GB                     | 16GB              |
| Obsolete to live bytes ratio | 5%                       | 20%               |
| Obsolete bytes (*new*)       | 1GB                      | 10GB              |

Fixes cockroachdb#2662.
@jbowens
Collaborator Author

jbowens commented Jun 22, 2023

With high-throughput stores, I wonder if we might see deletions be disruptive to foreground traffic due to obsolete bytes thresholds. A fully-utilized disk with 2 GiB/s bandwidth and 12 concurrent compactions can generate significant obsolete data very quickly. Maybe it's just a matter of choosing the right thresholds.

In-between we scale the pacing wait time per byte linearly.

Can you explain the scaling a bit more?


Do you think there is value in recording the rate at which deletions have been requested (eg, in the past 5 minutes) and adjusting our pacing rate to that rate, with the Options' rate setting a floor? I'm thinking it would keep deletions well-distributed over time, while still tolerating spikes.

@jbowens
Collaborator Author

jbowens commented Jun 22, 2023

Linking this issue to #18, which is a generalization of the concept of pacing and prioritizing I/O.

@RaduBerinde
Member

Can you explain the scaling a bit more?

Let's say we have thresholds 16GB and 32GB for free space. Above 32GB of free space, we limit at the target rate. At or below 16GB, we don't throttle at all. In between, the wait time per byte scales with the fraction of the way from 16GB to 32GB, so at the midpoint (24GB) we throttle at double the target rate, and at 17.6GB at 10x the target rate.

Do you think there is value in recording the rate at which deletions have been requested (eg, in the past 5 minutes) and adjusting our pacing rate to that rate, with the Options' rate setting a floor? I'm thinking it would keep deletions well-distributed over time, while still tolerating spikes.

Sounds smart, will think about it.

@RaduBerinde
Member

I like your suggestion more. I will leave the current checks as safety valves that should never fire in practice and add the history-based rate increase.

RaduBerinde added a commit to RaduBerinde/pebble that referenced this issue Jun 23, 2023
Delete pacing is currently an on or off decision. If we are running
out of space or have too many obsolete bytes in relation to live
bytes, we disable pacing. Otherwise we pace at the configured rate
(128MB by default in CRDB).

This change improves pacing by keeping track of the average deletion
rate over the last 5 minutes and increasing the target rate to match
this rate if necessary. The intention is to avoid deletions lagging
behind.

Fixes cockroachdb#2662.