From 303d580233812b60226952b442d1ed2a05aad0b4 Mon Sep 17 00:00:00 2001
From: Jackson Owens
Date: Wed, 21 Feb 2024 15:42:20 -0500
Subject: [PATCH] internal/base: add doc comment discussing TrySeekUsingNext

---
 internal/base/doc.go      | 64 +++++++++++++++++++++++++++++++++++++++
 internal/base/iterator.go | 27 ++++++++++-------
 2 files changed, 80 insertions(+), 11 deletions(-)
 create mode 100644 internal/base/doc.go

diff --git a/internal/base/doc.go b/internal/base/doc.go
new file mode 100644
index 00000000000..d24f6015f8b
--- /dev/null
+++ b/internal/base/doc.go
@@ -0,0 +1,64 @@
+// Copyright 2024 The LevelDB-Go and Pebble Authors. All rights reserved. Use
+// of this source code is governed by a BSD-style license that can be found in
+// the LICENSE file.
+
+// Package base defines fundamental types used across Pebble, including keys,
+// iterators, etc.
+//
+// # Iterators
+//
+// The [InternalIterator] interface defines the iterator interface implemented
+// by all iterators over point keys. Internal iterators are composed to form an
+// "iterator stack," resulting in a single internal iterator (see mergingIter in
+// the pebble package) that yields a merged view of the LSM.
+//
+// The SeekGE and SeekPrefixGE positioning methods take a set of flags
+// [SeekGEFlags] allowing the caller to provide additional context to iterator
+// implementations. The TrySeekUsingNext flag is set when the caller has
+// knowledge that no action has been performed to move this iterator beyond the
+// first key that would be found if this iterator were to honestly do the
+// intended seek. This allows a class of optimizations where an internal
+// iterator may avoid a full naive repositioning if the iterator is already
+// at a proximate position. This also means every caller (including intermediary
+// internal iterators within the iterator stack) must preserve this
+// relationship.
+//
+// For example, if a range deletion deletes the remainder of a prefix, the
+// merging iterator may be able to elide a SeekPrefixGE on level iterators
+// beneath the range deletion. However in doing so, a TrySeekUsingNext flag
+// passed by the merging iterator's client no longer transitively holds for the
+// level iterators in all cases. The merging iterator assumes responsibility for
+// ensuring that SeekPrefixGE is propagated to its constituent iterators only
+// when valid.
+//
+// Instances of TrySeekUsingNext optimizations and interactions:
+//
+//   - The [pebble.Iterator] has a SeekPrefixGE optimization: Above the [InternalIterator]
+//     interface, the [pebble.Iterator]'s SeekGE method detects consecutive seeks
+//     to monotonically increasing keys and examines the current key. If the
+//     iterator is already positioned appropriately (at a key ≥ the seek key), it
+//     elides the entire seek of the internal iterator.
+//
+// The pebble mergingIter does not perform any TrySeekUsingNext optimization
+// itself, but it must preserve the TrySeekUsingNext contract in its calls to
+// its child iterators because it passes the TrySeekUsingNext flag as-is to its
+// child iterators. It can do this because it always translates calls to its SeekGE and
+// SeekPrefixGE methods as equivalent calls to every child iterator. However
+// there are a few subtleties:
+//
+//   - In some cases the calls made to child iterators may only be equivalent
+//     within the context of the iterator's visible sequence number. For example,
+//     if a range deletion tombstone is present on a level, seek keys propagated
+//     to lower-levelled child iterators may be adjusted without violating the
+//     transitivity of the TrySeekUsingNext flag and its invariants so long as
+//     the mergingIter is always reading state at the same visible sequence
+//     number.
+//   - The mergingIter takes care to avoid ever advancing a child iterator that's
+//     already positioned beyond the current iteration prefix.
+//   - When propagating TrySeekUsingNext to its child iterators, the mergingIter
+//     must propagate it to all child iterators or none. This is required because
+//     of the handling of range deletions. Unequal application of TrySeekUsingNext
+//     may cause range deletions that have already been skipped over in a level to
+//     go unseen, despite being relevant to other levels that do not use
+//     TrySeekUsingNext.
+package base
diff --git a/internal/base/iterator.go b/internal/base/iterator.go
index 39e05e1fc7d..3ea90f2b22f 100644
--- a/internal/base/iterator.go
+++ b/internal/base/iterator.go
@@ -227,17 +227,22 @@ const (
 // SeekGEFlagsNone is the default value of SeekGEFlags, with all flags disabled.
 const SeekGEFlagsNone = SeekGEFlags(0)
 
-// TrySeekUsingNext indicates whether a performance optimization was enabled
-// by a caller, indicating the caller has not done any action to move this
-// iterator beyond the first key that would be found if this iterator were to
-// honestly do the intended seek. For example, say the caller did a
-// SeekGE(k1...), followed by SeekGE(k2...) where k1 <= k2, without any
-// intermediate positioning calls. The caller can safely specify true for this
-// parameter in the second call. As another example, say the caller did do one
-// call to Next between the two Seek calls, and k1 < k2. Again, the caller can
-// safely specify a true value for this parameter. Note that a false value is
-// always safe. The callee is free to ignore the true value if its
-// implementation does not permit this optimization.
+// TODO(jackson): Rename TrySeekUsingNext to MonotonicallyForward or something
+// similar that avoids prescribing the implementation of the optimization but
+// instead focuses on the contract expected of the caller.
+
+// TrySeekUsingNext is set when the caller has knowledge that no action has been
+// performed to move this iterator beyond the first key that would be found if
+// this iterator were to honestly do the intended seek. This enables a class of
+// performance optimizations within various internal iterator implementations.
+// For example, say the caller did a SeekGE(k1...), followed by SeekGE(k2...)
+// where k1 <= k2, without any intermediate positioning calls. The caller can
+// safely specify true for this parameter in the second call. As another
+// example, say the caller did do one call to Next between the two Seek calls,
+// and k1 < k2. Again, the caller can safely specify a true value for this
+// parameter. Note that a false value is always safe. The callee is free to
+// ignore the true value if its implementation does not permit this
+// optimization.
 //
 // We make the caller do this determination since a string comparison of k1, k2
 // is not necessarily cheap, and there may be many iterators in the iterator