Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert v1 to v2 range-del blocks on the fly #146

Merged
merged 2 commits into from
May 28, 2019

Conversation

petermattis
Copy link
Collaborator

RocksDB generates range-del blocks that contain range tombstones that are
not fragmented and sorted. Pebble generates range-del blocks containing
tombstones that are fragmented and sorted. The fragmented and sorted
range-del blocks are suitable for serving from directly, while the
unfragmented/unsorted blocks are not. We now transform RocksDB format
range-del blocks into Pebble format blocks on the fly.

The transformation is done via a hook that allows passing a block through a
function after it has been read from disk but before it is added to the
cache.

Fixes #68

Provide key stability across blockIter positioning calls

Provide stability for the keys returned from a blockIter across positioning
calls (Seek*, First, Last, Prev, Next) when the block has a restart
interval of 1. This is achieved by noticing when a key does not share any
bytes with the previous key and changing the returned key to point directly
into the block instead of the key buffer. This behavior was already being
expected by rangedel.Truncate and by forthcoming code to transform v1
range-del blocks to v2 format on the fly.

@petermattis petermattis requested a review from ajkr May 24, 2019 18:58
@petermattis
Copy link
Collaborator Author

This change is Reviewable

@petermattis
Copy link
Collaborator Author

I need to add testing for the v1 -> v2 transformation, which is why this is still marked [WIP]. The first commit is ready to go and something I just stumbled upon. I'm not entirely settled on that approach. Perhaps it would be better for rangedel.Truncate and this now range-del block transform to copy the keys returned during block iteration.

@petermattis petermattis force-pushed the pmattis/range-del-v1-convert branch from bdc377f to bfc7eb3 Compare May 24, 2019 20:12
@petermattis petermattis changed the title [WIP] Convert v1 to v2 range-del blocks on the fly Convert v1 to v2 range-del blocks on the fly May 24, 2019
@petermattis petermattis requested a review from Ryanfsdf May 24, 2019 20:13
@petermattis
Copy link
Collaborator Author

Tests have been added. I did this by adding a private Writer.rangeDelV1Format flag that causes an sstable writer to allow unfragmented and unsorted tombstones to be added to the sstable. I think this PR is ready to go.

Provide stability for the keys returned from a blockIter across
positioning calls (Seek*, First, Last, Prev, Next) when the block has a
restart interval of 1. This is achieved by noticing when a key does not
share any bytes with the previous key and changing the returned key to
point directly into the block instead of the key buffer. This behavior
was already being expected by `rangedel.Truncate` and by forthcoming
code to transform v1 range-del blocks to v2 format on the fly.
RocksDB generates range-del blocks that contain range tombstones that
are not fragmented and sorted. Pebble generates range-del blocks
containing tombstones that are fragmented and sorted. The fragmented and
sorted range-del blocks are suitable for serving from directly, while
the unfragmented/unsorted blocks are not. We now transform RocksDB
format range-del blocks into Pebble format blocks on the fly.

The transformation is done via a hook that allows passing a block
through a function after it has been read from disk but before it is
added to the cache.

Fixes #68
@petermattis petermattis force-pushed the pmattis/range-del-v1-convert branch from bfc7eb3 to 13d77b0 Compare May 26, 2019 00:13
Copy link
Contributor

@Ryanfsdf Ryanfsdf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 2 files at r1, 5 of 5 files at r2, 2 of 2 files at r3.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @ajkr)

@petermattis
Copy link
Collaborator Author

TFTR!

@petermattis petermattis merged commit ce8e5ee into master May 28, 2019
@petermattis petermattis deleted the pmattis/range-del-v1-convert branch May 28, 2019 19:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sstable: convert v1 range-del blocks to v2 on-the-fly
2 participants