Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

keyspan: simplify Truncate, Filter #3282

Merged
merged 2 commits into from
Mar 22, 2024

Conversation

RaduBerinde
Copy link
Member

@RaduBerinde RaduBerinde commented Feb 8, 2024

keyspan: simplify and reimplement Truncate

We can simplify the Truncate operator, as we will never truncate
within the same user key. It now only takes user key bounds and is
much easier to reason about.

We also reimplement it so that we can simplify Filter.

keyspan: simplify Filter

The FilterFunc contract is not adequately spelled out; the code
makes certain unverified assumptions about how the bounds can be
changed.

Now that Truncate is a separate iterator, we no longer need Filter to
allow for changing bounds. We simplify FilterFunc to just return the
list of keys that are retained. We no longer need the extra
comparisons in SeekGE/SeekLT.

@RaduBerinde RaduBerinde requested review from itsbilal, jbowens and a team February 8, 2024 18:16
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@RaduBerinde
Copy link
Member Author

Interesting, I stressed the regular meta test but there is a failure in the crossversion one.

@RaduBerinde RaduBerinde marked this pull request as draft February 22, 2024 01:25
@RaduBerinde RaduBerinde marked this pull request as ready for review March 20, 2024 20:38
@RaduBerinde
Copy link
Member Author

I reworked this to use the new UserKeyBounds.

@RaduBerinde RaduBerinde force-pushed the truncate-simplify branch 2 times, most recently from ceeacba to 32db3ff Compare March 20, 2024 21:34
@RaduBerinde
Copy link
Member Author

Friendly ping.

Copy link
Collaborator

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 6 files at r1.
Reviewable status: 6 of 9 files reviewed, 2 unresolved discussions (waiting on @itsbilal and @RaduBerinde)


internal/keyspan/truncate.go line 129 at r1 (raw file):

	for span != nil {
		if i.bounds.End.Kind == base.Inclusive && span.Contains(i.cmp, i.bounds.End.Key) {
			err := errors.AssertionFailedf("inclusive upper bound %q inside span %s", i.bounds.End.Key, span)

I think I must be being dense, but I didn't understand this assertion. If we have bounds [a, b] and call Last() on an sstable iterator, what's to prevent the span from having bounds [a,c) and hitting this condition on the first iteration?

EDIT: Ahh, I think I see—we're expecting the caller to externally guarantee this invariant. I guess I'm still confused as to how we provide that invariant?


internal/keyspan/truncate.go line 135 at r1 (raw file):

			return nil, false, err
		}
		// Intersect [span.Start, span.End) with [lower, upper).

nit: s/[lower, upper)/i.bounds/?

Copy link
Member Author

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 6 of 9 files reviewed, 2 unresolved discussions (waiting on @itsbilal and @jbowens)


internal/keyspan/truncate.go line 129 at r1 (raw file):
Right, it's mentioned for Truncate:

// Note that fragment iterator Spans always have exclusive end-keys; if the
// given bounds have an inclusive end key the input iterator must not produce a
// span that contains that key.

Basically if you hit this case, you are doing something wrong. In practice this currently means that you have and inclusive end bound for a virtual table (i.e. Largest is not a sentinel) and the sst has a RangeDel/RangeKey that contains that key. This is a bug as there's no way to express that the key is "covered" by the RangeDel while staying within the bounds.

Looking at it another way, the only reason a caller would provide an inclusive end bound to Truncate is to perform this check. It's equivalent to the previous panicOnUpperTruncate, just feels more elegant.

We can simplify the Truncate operator, as we will never truncate
within the same user key. It now only takes user key bounds and is
much easier to reason about.

We also reimplement it so that we can simplify Filter.
The `FilterFunc` contract is not adequately spelled out; the code
makes certain unverified assumptions about how the bounds can be
changed.

Now that Truncate is a separate iterator, we no longer need Filter to
allow for changing bounds. We simplify `FilterFunc` to just return the
list of keys that are retained. We no longer need the extra
comparisons in `SeekGE/SeekLT`.
Copy link
Collaborator

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 3 files at r3, 4 of 4 files at r4, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @itsbilal and @RaduBerinde)


internal/keyspan/truncate.go line 129 at r1 (raw file):

Previously, RaduBerinde wrote…

Right, it's mentioned for Truncate:

// Note that fragment iterator Spans always have exclusive end-keys; if the
// given bounds have an inclusive end key the input iterator must not produce a
// span that contains that key.

Basically if you hit this case, you are doing something wrong. In practice this currently means that you have and inclusive end bound for a virtual table (i.e. Largest is not a sentinel) and the sst has a RangeDel/RangeKey that contains that key. This is a bug as there's no way to express that the key is "covered" by the RangeDel while staying within the bounds.

Looking at it another way, the only reason a caller would provide an inclusive end bound to Truncate is to perform this check. It's equivalent to the previous panicOnUpperTruncate, just feels more elegant.

Got it, thanks

@RaduBerinde
Copy link
Member Author

TFTR!

@RaduBerinde RaduBerinde merged commit 4335ae0 into cockroachdb:master Mar 22, 2024
11 checks passed
@RaduBerinde RaduBerinde deleted the truncate-simplify branch March 22, 2024 18:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants