Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

db: optimize for reads without range keys #1605

Closed
jbowens opened this issue Mar 28, 2022 · 0 comments · Fixed by #1639
Closed

db: optimize for reads without range keys #1605

jbowens opened this issue Mar 28, 2022 · 0 comments · Fixed by #1639
Assignees

Comments

@jbowens
Copy link
Collaborator

jbowens commented Mar 28, 2022

Range keys are intended to be rare, and in the CockroachDB use case of MVCC Delete Range especially rare. We should ensure that a read in a part of the key space NOT covered by range keys loads zero range key blocks. There is at least one TODO documenting this optimization, but I'll sketch out something larger here.

The new keyspan.Span type supports modeling spans without range keys as a valid Span with an empty Keys slice. These empty spans will be introduced by the range key level iterator (introduced in #1576). Imagine a SeekGE(k) that finds a file with bounds m-p. The span [k,m) contains no range keys within the level. The level iterator can return a Span{Start: k, End: m, Keys: nil}.

  • The internal/keyspan.LevelIter needs to change to return these empty spans.
  • The internal/keyspan.MergingIter currently elides empty spans. That logic should be removed, allowing empty spans to be propagated up the iterator stack.
  • The DefragmentingIter defragments abutting, logically equivalent spans. Two empty spans are logically equivalent, so defragmentation will defragment empty spans. This is a problem, because defragmenting requires Next/Prev-ing adjacent spans until a nonequivalent span is encountered. The defragmenting iterator would trigger a block load at both the beginning and the end of the defragmented empty span. The defragmenting iterator should be adapted to avoid defragmenting empty spans.

Related to #1444

@itsbilal itsbilal self-assigned this Mar 28, 2022
itsbilal added a commit to itsbilal/pebble that referenced this issue Apr 11, 2022
This change updates keyspan.LevelIter to emit empty Spans (i.e.
spans with valid start/end but no values/Keys) between files.
This is used as an optimization to avoid loading the next file
after a file has been exhausted.

To preserve this optimization up the iterator stack, the
merging iter and defragmenting iter were also updated to
special-case empty spans returned by child iterators and
preserve them as-is instead of defragmenting them
or iterating beyond them.

Fixes cockroachdb#1605
itsbilal added a commit to itsbilal/pebble that referenced this issue Apr 13, 2022
This change updates keyspan.LevelIter to emit empty Spans (i.e.
spans with valid start/end but no values/Keys) between files.
This is used as an optimization to avoid loading the next file
after a file has been exhausted.

To preserve this optimization up the iterator stack, the
merging iter and defragmenting iter were also updated to
special-case empty spans returned by child iterators and
preserve them as-is instead of defragmenting them
or iterating beyond them.

The top-level Iterator was also updated to elide
rangekey boundary keys that correspond to empty spans that do
not cover any point keys.

Fixes cockroachdb#1605
itsbilal added a commit to itsbilal/pebble that referenced this issue Apr 14, 2022
This change updates keyspan.LevelIter to emit empty Spans (i.e.
spans with valid start/end but no values/Keys) between files.
This is used as an optimization to avoid loading the next file
after a file has been exhausted.

To preserve this optimization up the iterator stack, the
merging iter and defragmenting iter were also updated to
special-case empty spans returned by child iterators and
preserve them as-is instead of defragmenting them
or iterating beyond them.

The top-level Iterator was also updated to elide
rangekey boundary keys that correspond to empty spans that do
not cover any point keys.

Fixes cockroachdb#1605
itsbilal added a commit to itsbilal/pebble that referenced this issue Apr 19, 2022
This change updates keyspan.LevelIter to emit empty Spans (i.e.
spans with valid start/end but no values/Keys) between files.
This is used as an optimization to avoid loading the next file
after a file has been exhausted.

To preserve this optimization up the iterator stack, the
merging iter and defragmenting iter were also updated to
special-case empty spans returned by child iterators and
preserve them as-is instead of defragmenting them
or iterating beyond them.

The top-level Iterator was also updated to elide
rangekey boundary keys that correspond to empty spans that do
not cover any point keys.

Fixes cockroachdb#1605
itsbilal added a commit to itsbilal/pebble that referenced this issue Apr 19, 2022
This change updates keyspan.LevelIter to emit empty Spans (i.e.
spans with valid start/end but no values/Keys) between files.
This is used as an optimization to avoid loading the next file
after a file has been exhausted.

To preserve this optimization up the iterator stack, the
merging iter and defragmenting iter were also updated to
special-case empty spans returned by child iterators and
preserve them as-is instead of defragmenting them
or iterating beyond them.

The top-level Iterator was also updated to elide
rangekey boundary keys that correspond to empty spans that do
not cover any point keys.

Fixes cockroachdb#1605
itsbilal added a commit that referenced this issue Apr 19, 2022
This change updates keyspan.LevelIter to emit empty Spans (i.e.
spans with valid start/end but no values/Keys) between files.
This is used as an optimization to avoid loading the next file
after a file has been exhausted.

To preserve this optimization up the iterator stack, the
merging iter and defragmenting iter were also updated to
special-case empty spans returned by child iterators and
preserve them as-is instead of defragmenting them
or iterating beyond them.

The top-level Iterator was also updated to elide
rangekey boundary keys that correspond to empty spans that do
not cover any point keys.

Fixes #1605
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants