-
Notifications
You must be signed in to change notification settings - Fork 476
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
*: shared ingestion fixes #3074
*: shared ingestion fixes #3074
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 9 files reviewed, 2 unresolved discussions (waiting on @itsbilal and @sumeerbhola)
table_cache.go
line 204 at r1 (raw file):
} func createCommonReader(
It would be useful to document what isForeign
does exactly here.
It's kind of unfortunate to add it to this method which is supposed to hide away the virtual vs non-virtual differences.. Maybe this flag belongs in FileMetadata
, alongside Virtual
?
Is there a description of hideObsoletePoints
somewhere?
data_test.go
line 1158 at r1 (raw file):
props := r.Properties.String() if m != nil && m.Virtual { v = sstable.MakeVirtualReader(r, m.VirtualMeta(), false /* isForeign */)
How do we know it's not foreign? I assume it doesn't matter here, but then the flag shouldn't be isForeign
, it should be more descriptive of what it actually does.
Previously, we weren't setting hideObsoletePoints to true for compaction iters created out of shared foreign sstables. This change addresses that by passing that state through virtualState in VirtualReader through to the old Reader. Unblocks cockroachdb#3044.
Previously, we were calling checkVirtualBounds on FileMetadatas before they were installed. This would lead to panics in invariant builds as we would try to load an uninstalled FileMetadata from the table cache. This change removes the checkVirtualBounds call, making it a test-only thing. Adding a check after version installation also doesn't guarantee that refs will be nonzero, as a compaction immediately after ingestApply is possible. Only an issue with invariants builds. Unblocks cockroachdb#3044.
Previously, we were passing range keys that were sorted by trailer descending right from ScanInternal. However, the caller isn't interested in seqnum-based sorting, only kind-based trailer sorting. This change zeroes seqnums before sorting the keys. Unblocks cockroachdb#3044.
02c9ac6
to
a44b6d2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
Reviewable status: 0 of 9 files reviewed, 1 unresolved discussion (waiting on @RaduBerinde and @sumeerbhola)
table_cache.go
line 204 at r1 (raw file):
Previously, RaduBerinde wrote…
It would be useful to document what
isForeign
does exactly here.It's kind of unfortunate to add it to this method which is supposed to hide away the virtual vs non-virtual differences.. Maybe this flag belongs in
FileMetadata
, alongsideVirtual
?Is there a description of
hideObsoletePoints
somewhere?
Added a comment. And I do have a bigger refactor in mind where commonReader just becomes an interface, and this method encapsulates the isForeign determination logic instead of having the caller do it. I'll try to do that later; this is a minimal change to fix the bugs.
And since IsForeign is a state variable that's derived from stuff that's already stored in the catalog, I think it's fair to just leave it as a calculated field instead of storing it in the FileMetadata. Though I'm open to either argument. For the sake of making this backportable, we should probably just pass in isForeign
?
hideObsoletePoints
is sort of defined in reader.go around the NewIter* methods, but not in great detail.
data_test.go
line 1158 at r1 (raw file):
Previously, RaduBerinde wrote…
How do we know it's not foreign? I assume it doesn't matter here, but then the flag shouldn't be
isForeign
, it should be more descriptive of what it actually does.
Yeah it's just a test helper function, and these tests don't create foreign SSTs, only TestIngestShared
and TestScanInternal
etc do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 9 files reviewed, 1 unresolved discussion (waiting on @itsbilal and @sumeerbhola)
table_cache.go
line 204 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Added a comment. And I do have a bigger refactor in mind where commonReader just becomes an interface, and this method encapsulates the isForeign determination logic instead of having the caller do it. I'll try to do that later; this is a minimal change to fix the bugs.
And since IsForeign is a state variable that's derived from stuff that's already stored in the catalog, I think it's fair to just leave it as a calculated field instead of storing it in the FileMetadata. Though I'm open to either argument. For the sake of making this backportable, we should probably just pass in
isForeign
?
hideObsoletePoints
is sort of defined in reader.go around the NewIter* methods, but not in great detail.
I guess this is where I'm coming from: it doesn't feel right that we should tie semantics to whether a file is external or not, even though these are correlated in the current usecases.. It should be something that is provided separately when we attach the file. Maybe one day we will want to attach an external file as a "regular" SST - e.g. starting a "cloned cluster" from saved object files.
This may or may not make sense because I still don't understand the hideObsoletePoints
semantics well.
Anyway this PR is ok to merge for now, just trying to figure out the desired long term state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 9 files reviewed, 1 unresolved discussion (waiting on @RaduBerinde and @sumeerbhola)
table_cache.go
line 204 at r1 (raw file):
Previously, RaduBerinde wrote…
I guess this is where I'm coming from: it doesn't feel right that we should tie semantics to whether a file is external or not, even though these are correlated in the current usecases.. It should be something that is provided separately when we attach the file. Maybe one day we will want to attach an external file as a "regular" SST - e.g. starting a "cloned cluster" from saved object files.
This may or may not make sense because I still don't understand the
hideObsoletePoints
semantics well.Anyway this PR is ok to merge for now, just trying to figure out the desired long term state.
Yeah I can see that. Currently we've tied "should we only read the newest internal key for each user key" (which is what hideObsoletePoints
really does), to "is this file shared foreign" (which btw is not the same as external), and does make sense to have them as two separate things. It also makes sense to have a FileMetadata field for this, seeing as that's usually the place to interpret how to read a file, while the catalog ObjectMetadata mostly speaks about where the file is.
TFTR!
Contains 3 commits, each of which contains some fixes around shared/foreign sstables.
*: set hideObsoletePoints=true for shared foreign sstables
Previously, we weren't setting hideObsoletePoints to true for
compaction iters created out of shared foreign sstables. This
change addresses that by passing that state through virtualState
in VirtualReader through to the old Reader.
Unblocks #3044.
ingest: don't call checkVirtualBounds() in ingestion
Previously, we were calling checkVirtualBounds on FileMetadatas
before they were installed. This would lead to panics in invariant
builds as we would try to load an uninstalled FileMetadata
from the table cache.
This change removes the checkVirtualBounds call, making it a test-only
thing. Adding a check after version installation also doesn't guarantee
that refs will be nonzero, as a compaction immediately after ingestApply
is possible.
Only an issue with invariants builds. Unblocks #3044.
scan_internal: Sort keys by kind before calling visitor
Previously, we were passing range keys that were sorted by trailer
descending right from ScanInternal. However, the caller isn't
interested in seqnum-based sorting, only kind-based trailer sorting.
This change zeroes seqnums before sorting the keys.
Unblocks #3044.