kvserver: add garbage collection for range tombstones #76783

Closed

erikgrinaker
Contributor

@erikgrinaker erikgrinaker commented Feb 18, 2022

storage: add IterKeyTypePointsWithRanges for MVCCIterator

This patch adds an iteration key type IterKeyTypePointsWithRanges for
MVCCIterator which iterates only over point keys, but surfaces range
keys overlapping those points.
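
To make the iteration semantics concrete, here is a minimal, self-contained sketch. It is not the `MVCCIterator` implementation: the types `pointKey`, `rangeKey`, and `visit` and the function `pointsWithRanges` are hypothetical names invented for illustration, and timestamps are plain integers. It only models the behaviour described above: point keys are visited, overlapping range keys are attached to each visit, and range keys that cover no point are never surfaced.

```go
package main

import (
	"fmt"
	"sort"
)

// pointKey is a simplified MVCC point version: a user key plus a timestamp.
type pointKey struct {
	Key string
	TS  int
}

// rangeKey is a simplified MVCC range key covering [Start, End) at a timestamp.
type rangeKey struct {
	Start, End string
	TS         int
}

// visit is what the sketch surfaces per point key: the point itself and
// every range key whose span contains the point's user key.
type visit struct {
	Point  pointKey
	Ranges []rangeKey
}

// pointsWithRanges visits point keys only, ordered by key and then by
// descending timestamp, and attaches the overlapping range keys to each
// visit. Range keys that overlap no point are never returned.
func pointsWithRanges(points []pointKey, ranges []rangeKey) []visit {
	sorted := append([]pointKey(nil), points...)
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].Key != sorted[j].Key {
			return sorted[i].Key < sorted[j].Key
		}
		return sorted[i].TS > sorted[j].TS
	})
	out := make([]visit, 0, len(sorted))
	for _, p := range sorted {
		v := visit{Point: p}
		for _, r := range ranges {
			if r.Start <= p.Key && p.Key < r.End {
				v.Ranges = append(v.Ranges, r)
			}
		}
		out = append(out, v)
	}
	return out
}

func main() {
	points := []pointKey{{"b", 1}, {"a", 2}, {"e", 3}}
	ranges := []rangeKey{{"a", "c", 5}, {"x", "z", 7}}
	for _, v := range pointsWithRanges(points, ranges) {
		fmt.Printf("%s@%d covered by %d range key(s)\n", v.Point.Key, v.Point.TS, len(v.Ranges))
	}
}
```

In this toy model the range key over `[x, z)` is never seen by the caller, because no point key falls inside it.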

Release note: None

storage: clear range keys in some Engine.Clear* methods

This patch automatically clears all range keys (at any timestamp) in the
Engine methods ClearMVCCRangeAndIntents and ClearIterRange. Range
keys are not affected by ClearRawRange (which clears raw engine keys)
and ClearMVCCRange (which is intended for clearing a subset of
versions for a single key).

This is implemented via a new Engine.ExperimentalClearMVCCRangeKeys
method which calls through to Pebble's RangeKeyDelete, efficiently
deleting all range keys in a key span.

By extension, the ClearRange RPC method now also clears range keys,
both when using point deletions and range deletions.
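
As a rough illustration of what clearing a span now means, the sketch below uses a toy in-memory engine. The names `toyEngine` and `clearSpan` are hypothetical and do not correspond to the real `Engine` API. It assumes that a range key which only partially overlaps the cleared span is truncated to the part outside the span, which is how the sketch models deleting all range keys in a key span.

```go
package main

import "fmt"

// rangeKey is a simplified MVCC range key covering [Start, End) at a timestamp.
type rangeKey struct {
	Start, End string
	TS         int
}

// toyEngine is a hypothetical in-memory stand-in for a storage engine.
type toyEngine struct {
	points    map[string][]int // user key -> version timestamps
	rangeKeys []rangeKey
}

// clearSpan removes every point version in [start, end) and every range
// key fragment inside that span, regardless of timestamp. Range keys that
// extend beyond the span keep the parts that lie outside it.
func (e *toyEngine) clearSpan(start, end string) {
	for k := range e.points {
		if start <= k && k < end {
			delete(e.points, k)
		}
	}
	var kept []rangeKey
	for _, r := range e.rangeKeys {
		if r.End <= start || end <= r.Start {
			kept = append(kept, r) // no overlap with the cleared span
			continue
		}
		if r.Start < start {
			kept = append(kept, rangeKey{r.Start, start, r.TS}) // left remainder
		}
		if end < r.End {
			kept = append(kept, rangeKey{end, r.End, r.TS}) // right remainder
		}
	}
	e.rangeKeys = kept
}

func main() {
	e := &toyEngine{
		points:    map[string][]int{"a": {1}, "c": {2, 4}, "e": {3}},
		rangeKeys: []rangeKey{{"a", "f", 5}, {"c", "d", 3}},
	}
	e.clearSpan("b", "d")
	fmt.Println(len(e.points), e.rangeKeys)
}
```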

Release note: None

kvserver: add garbage collection for range tombstones

This patch adds basic support for MVCC range tombstones in garbage
collection. It garbage collects points below a range tombstone, as well
as the range tombstones themselves. However, MVCCStats do not
currently take range tombstones into account for garbage statistics --
this will be addressed separately.

Garbage collection below range tombstones does not do anything fancy
like dropping a Pebble range tombstone when there are no newer versions
above the range tombstone -- it still uses point clears for every GCed
key. This can be optimized later.
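
The behaviour described above can be sketched roughly as follows. This is a simplified model, not the kvserver GC code: `collectGarbage`, `pointKey`, and `rangeTombstone` are hypothetical names, timestamps are plain integers, and the sketch ignores intents and ordinary shadowing of one point version by a newer one. It only shows that range tombstones at or below the GC threshold are collected, and that each point version below such a tombstone is listed individually for a point clear.

```go
package main

import "fmt"

// pointKey is a simplified MVCC point version.
type pointKey struct {
	Key string
	TS  int
}

// rangeTombstone is a simplified MVCC range tombstone over [Start, End).
type rangeTombstone struct {
	Start, End string
	TS         int
}

// collectGarbage returns the point versions and range tombstones that the
// sketch treats as garbage for the given threshold. A tombstone is garbage
// when its timestamp is at or below the threshold. A point version is
// garbage when such a tombstone covers its key at a higher timestamp.
func collectGarbage(points []pointKey, tombs []rangeTombstone, threshold int) ([]pointKey, []rangeTombstone) {
	var gcPoints []pointKey
	var gcTombs []rangeTombstone
	for _, t := range tombs {
		if t.TS <= threshold {
			gcTombs = append(gcTombs, t)
		}
	}
	for _, p := range points {
		for _, t := range gcTombs {
			if t.Start <= p.Key && p.Key < t.End && p.TS < t.TS {
				gcPoints = append(gcPoints, p) // one point clear per version
				break
			}
		}
	}
	return gcPoints, gcTombs
}

func main() {
	points := []pointKey{{"a", 1}, {"a", 6}, {"b", 2}, {"z", 1}}
	tombs := []rangeTombstone{{"a", "c", 5}, {"a", "c", 9}}
	gcPoints, gcTombs := collectGarbage(points, tombs, 7)
	fmt.Println(gcPoints, gcTombs)
}
```

Here `a@6` survives because it sits above the collectable tombstone, and `z@1` survives because no tombstone covers it.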

Resolves #70414.

Release note: None

batcheval: add ExperimentalRanges parameter for GCRequest

This adds a parameter ExperimentalRanges to GCRequest, and a
corresponding ExperimentalGCRanges version gate, which allows GCing
large swathes of keys using Pebble range tombstones.

This parameter is not yet in use, but is added preemptively to allow
garbage collection to make use of it in the future when GCing below an
MVCC range tombstone with no keys above it. Since MVCC range tombstones
are experimental, this can possibly be added in a 22.1 patch release.
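
Since the parameter is not yet used, the following is only a guess at how a caller might eventually choose between the two paths. Everything here is hypothetical: `planGC`, `gcRequest`, `span`, and the `gateActive` flag are illustrative stand-ins, not the real `GCRequest` proto or version gate API. The sketch assumes the range path is taken only when the gate is active and no point version in the tombstone's span lies above the tombstone; otherwise it falls back to listing point keys.

```go
package main

import "fmt"

// pointKey is a simplified MVCC point version.
type pointKey struct {
	Key string
	TS  int
}

// span is a key span [Start, End).
type span struct {
	Start, End string
}

// rangeTombstone is a simplified MVCC range tombstone.
type rangeTombstone struct {
	Span span
	TS   int
}

// gcRequest is a hypothetical stand-in for a GC request with both a
// per-key list and an experimental list of spans to clear wholesale.
type gcRequest struct {
	Keys               []pointKey
	ExperimentalRanges []span
}

// planGC builds a request for garbage below tombstone t. If the gate is
// active and no point version in the span is newer than t, the whole span
// is requested as one range. Otherwise each point version at or below t
// is requested individually.
func planGC(points []pointKey, t rangeTombstone, gateActive bool) gcRequest {
	var below []pointKey
	anyAbove := false
	for _, p := range points {
		if p.Key < t.Span.Start || p.Key >= t.Span.End {
			continue // outside the tombstone's span
		}
		if p.TS > t.TS {
			anyAbove = true
		} else {
			below = append(below, p)
		}
	}
	if gateActive && !anyAbove {
		return gcRequest{ExperimentalRanges: []span{t.Span}}
	}
	return gcRequest{Keys: below}
}

func main() {
	t := rangeTombstone{Span: span{"a", "c"}, TS: 5}
	points := []pointKey{{"a", 1}, {"b", 2}}
	fmt.Println(planGC(points, t, true))
	fmt.Println(planGC(append(points, pointKey{"b", 8}), t, true))
}
```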

Release note: None

@erikgrinaker erikgrinaker requested review from a team as code owners February 18, 2022 18:22
@erikgrinaker erikgrinaker self-assigned this Feb 18, 2022
@cockroach-teamcity
Member

This change is Reviewable

@erikgrinaker
Contributor Author

erikgrinaker commented Feb 18, 2022

Realized MVCCGarbageCollect() will choke on GCed points below a range tombstone, because it believes these points are still visible. Will look into this later, and add an end-to-end test.

@erikgrinaker erikgrinaker marked this pull request as draft February 18, 2022 18:29
@erikgrinaker
Contributor Author

erikgrinaker commented Feb 18, 2022

@jbowens One of the changes here is to add an IterKeyTypePointsWithRanges key type for iterators, which only iterates over point keys but surfaces range keys that overlap those point keys. Range keys are not surfaced by themselves. This is useful e.g. for GC. Anything you might be interested in moving down to Pebble?

@erikgrinaker erikgrinaker force-pushed the gc-range-tombstone branch 3 times, most recently from 4e74115 to 67d71b2 Compare February 19, 2022 17:08
@erikgrinaker
Contributor Author

erikgrinaker commented Feb 19, 2022

Ok, I think this should be ready for review now. I added a parameter ExperimentalRanges to GCRequest which will allow for a future optimization where we GC below MVCC range tombstones using a Pebble range tombstone, but I have not implemented that optimization yet. We may be able to get away with adding it during the stability period, since it would only apply when experimental MVCC range tombstones are enabled.

This optimization could possibly also simply call ClearRange, but I figured we might want some additional checks during GC (e.g. that no one has written anything to the range since the GC request was sent).

@erikgrinaker erikgrinaker marked this pull request as draft February 23, 2022 09:34
@erikgrinaker erikgrinaker deleted the gc-range-tombstone branch August 5, 2022 11:16
Successfully merging this pull request may close these issues.

kvserver: garbage collect MVCC range tombstones