-
Notifications
You must be signed in to change notification settings - Fork 466
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
internal/base: document new Split requirements #1385
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 files reviewed, 1 unresolved discussion
internal/base/comparer.go, line 102 at r1 (raw file):
// If Compare(prefix(a), prefix(b)) == 0, then Compare(suffix(a), suffix(b)) == Compare(a, b) // // 5) If b begins with a, then prefix(b) = prefix(a).
I wasn't sure what this existing requirement means. @sumeerbhola any idea?
I believe there's no reason why you can't have a key a
and a key aaa@t10
such that prefix(a) = 'a'
and prefix(aaa@t10) = 'aaa'
but aaa@t10
begins with a
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @jbowens)
internal/base/comparer.go, line 76 at r1 (raw file):
// the user could potentially store in the database. Typically this means that // the method must only handle valid MVCC encoded keys and should panic on any // other input.
this panic comment is a bit odd. If the uses specifies a Split
function for a DB, it will call it on any key in the DB -- it has no understanding of what is an MVCC key and what is not.
internal/base/comparer.go, line 91 at r1 (raw file):
// that prefix: // // Compare(prefix(a), a) < 0 if suffix(a) > 0
should this say len(suffix(a)) > 0?
internal/base/comparer.go, line 102 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
I wasn't sure what this existing requirement means. @sumeerbhola any idea?
I believe there's no reason why you can't have a key
a
and a keyaaa@t10
such thatprefix(a) = 'a'
andprefix(aaa@t10) = 'aaa'
butaaa@t10
begins witha
.
I agree. This looks incorrect. And one can have two keys that are a@10@10 and a@10, which don't have the same prefix. In the CockroachDB MVCCKey encoding I don't think there is anything that prevents a byte sequence that looks like a timestamp from appearing in the prefix.
Document the new Split requirements necessary for range keys support. See cockroachdb#1341.
705ed75
to
4486b5f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 files reviewed, 3 unresolved discussions (waiting on @sumeerbhola)
internal/base/comparer.go, line 76 at r1 (raw file):
Previously, sumeerbhola wrote…
this panic comment is a bit odd. If the uses specifies a
Split
function for a DB, it will call it on any key in the DB -- it has no understanding of what is an MVCC key and what is not.
yeah, that looks wrong to me too. I updated it.
internal/base/comparer.go, line 91 at r1 (raw file):
Previously, sumeerbhola wrote…
should this say len(suffix(a)) > 0?
Done.
internal/base/comparer.go, line 102 at r1 (raw file):
Previously, sumeerbhola wrote…
I agree. This looks incorrect. And one can have two keys that are a@10@10 and a@10, which don't have the same prefix. In the CockroachDB MVCCKey encoding I don't think there is anything that prevents a byte sequence that looks like a timestamp from appearing in the prefix.
Good point. Removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 files reviewed, 3 unresolved discussions (waiting on @sumeerbhola)
internal/base/comparer.go, line 102 at r1 (raw file):
In the CockroachDB MVCCKey encoding I don't think there is anything that prevents a byte sequence that looks like a timestamp from appearing in the prefix.
Ah, wait, that would make the NextPrefix prefix-sharing optimization more difficult.
I did a little looking around and found this comment in storage/doc.go:
The binary encoding used on the MVCC keys allows arbitrary keys to be
stored in the map (no restrictions on intermediate nil-bytes, for
example), while still sorting lexicographically and guaranteeing that
all timestamp-suffixed MVCC version keys sort consecutively with the
metadata key. We use an escape-based encoding which transforms all nul
("\x00") characters in the key and is terminated with the sequence
"\x00\x01", which is guaranteed to not occur elsewhere in the encoded
value. See util/encoding/encoding.go for more details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 files reviewed, 3 unresolved discussions (waiting on @jbowens and @sumeerbhola)
internal/base/comparer.go, line 102 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
In the CockroachDB MVCCKey encoding I don't think there is anything that prevents a byte sequence that looks like a timestamp from appearing in the prefix.
Ah, wait, that would make the NextPrefix prefix-sharing optimization more difficult.
I did a little looking around and found this comment in storage/doc.go:
The binary encoding used on the MVCC keys allows arbitrary keys to be stored in the map (no restrictions on intermediate nil-bytes, for example), while still sorting lexicographically and guaranteeing that all timestamp-suffixed MVCC version keys sort consecutively with the metadata key. We use an escape-based encoding which transforms all nul ("\x00") characters in the key and is terminated with the sequence "\x00\x01", which is guaranteed to not occur elsewhere in the encoded value. See util/encoding/encoding.go for more details.
-
This comment about escape-based encoding is quite peculiar, and seems disconnected from the rest. We don't need any such guarantees to ensure ordering, since
EngineKeyCompare
has to split and then compare. Do you think we rely on this currently in any place? -
I am not sure we can do the NextPrefix prefix-sharing optimization. An MVCCKey ends with \x00 (sentinel), followed by the WallTime using
binary.BigEndian.PutUint64
. Within theroachpb.Key
we escape \x00 to \x00\xff https://github.com/cockroachdb/cockroach/blob/978df512d26ae5eda7ed4cf8e0d2da9d241b4858/pkg/util/encoding/encoding.go#L495-L504. If we ever had the WallTime starting with \xff we would be in trouble. Because we are casting an int64 WallTime to uint64, it can never be \xff. So that seems to save us. And if \x00\x01 can only occur at the end of the roachpb.Key it adds another layer of protection. But without fully understanding the sql key encoding, I'd be very wary of doing anything, and even then it seems risky. -
The SeekGE we use for NextKey does
p.iter.SeekGE(append(p.keyBuf, 0, 0))
. So \x00\x00 at the end of the roachpb.Key. If the WallTime started with \x00 (which I think is not the case if the system is running at the current time) we would seek to an older version of the same roachpb.Key. I hope I am mistaken in my thinking here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @jbowens and @sumeerbhola)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @sumeerbhola)
internal/base/comparer.go, line 102 at r1 (raw file):
Do you think we rely on this currently in any place?
Not that I know of, but this comment here had me wondering.
I am not sure we can do the NextPrefix prefix-sharing optimization.
It definitely tempers the benefits, but I think it can be adapted: If the shared bytes is greater than the currentPrefixLen
, we Split the new key. If the new key's prefix length equals currentPrefixLen
, we skip past it.
It's unfortunate that we need to Split the key within the blockIter, but it I suspect it still helps by avoiding key comparisons in higher levels.
we would seek to an older version of the same roachpb.Key
😬
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @jbowens and @sumeerbhola)
internal/base/comparer.go, line 102 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
Do you think we rely on this currently in any place?
Not that I know of, but this comment here had me wondering.
I am not sure we can do the NextPrefix prefix-sharing optimization.
It definitely tempers the benefits, but I think it can be adapted: If the shared bytes is greater than the
currentPrefixLen
, we Split the new key. If the new key's prefix length equalscurrentPrefixLen
, we skip past it.It's unfortunate that we need to Split the key within the blockIter, but it I suspect it still helps by avoiding key comparisons in higher levels.
we would seek to an older version of the same roachpb.Key
😬
We use the \0 and SeekGE to advance to the next key in other places too, like https://github.com/cockroachdb/cockroach/blob/d7a362c25f2819077f36807519f39dbdf32dfca0/pkg/storage/mvcc.go#L3663
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @jbowens and @sumeerbhola)
internal/base/comparer.go, line 102 at r1 (raw file):
The SeekGE we use for NextKey does p.iter.SeekGE(append(p.keyBuf, 0, 0)). So \x00\x00 at the end of the roachpb.Key. If the WallTime started with \x00 (which I think is not the case if the system is running at the current time) we would seek to an older version of the same roachpb.Key. I hope I am mistaken in my thinking here.
I was indeed mistaken. There is no bug here, thankfully. The <key>\x00\x00 is of course subject to the same engine key comparison, which will use <key>\x00 as EngineKey.Key
(with no EngineKey.Version
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTRs!
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @jbowens and @sumeerbhola)
Document the new Split requirements necessary for range keys support. See #1341.