-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt, cat: cleaner API for mutation columns #51400
Conversation
a95abcf
to
3023579
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, with some nits.
pkg/sql/opt/cat/utils.go
Outdated
@@ -122,15 +122,15 @@ func ResolveTableIndex( | |||
} | |||
|
|||
// ConvertColumnIDsToOrdinals converts a list of ColumnIDs (such as from a | |||
// tree.TableRef), to a list of ordinal positions of columns within the given | |||
// table. See tree.Table for more information on column ordinals. | |||
// tree.TableRef), to a list of ordinal positions of non-mutation columns within |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like an API bug that this only looks at non-mutation cols. Why not let it handle mutation cols as well, since the API is generic? I don't think any existing caller calls it with mutation cols, so there should be no panics.
pkg/sql/opt/cat/utils.go
Outdated
@@ -147,8 +147,8 @@ func ConvertColumnIDsToOrdinals(tab Table, columns []tree.ColumnID) (ordinals [] | |||
// FindTableColumnByName returns the ordinal of the non-mutation column having | |||
// the given name, if one exists in the given table. Otherwise, it returns -1. | |||
func FindTableColumnByName(tab Table, name tree.Name) int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we make this return mutation cols as well and require caller to test for mutation/system cols? The more col types we add, the more confusing it is when some generic-sounding APIs filter cols and others don't.
for i, n := 0, tab.ColumnCount(); i < n; i++ { | ||
if private.IsColumnOutput(i) { | ||
for i, n := 0, tab.AllColumnCount(); i < n; i++ { | ||
if !cat.IsMutationColumn(tab, i) && private.IsColumnOutput(i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I think we can just get rid of the comment and the check for IsMutationColumn
. private.IsColumnOutput
should always return false for mutation columns, or else there's a bug somewhere else in the code. We always set private.ReturnCols
entries to zero for mutation cols.
@@ -1289,7 +1289,10 @@ func (b *logicalPropsBuilder) buildMutationProps(mutation RelExpr, rel *props.Re | |||
// null or the corresponding insert and fetch/update columns are not null. In | |||
// other words, if either the source or destination column is not null, then | |||
// the column must be not null. | |||
for i, n := 0, tab.ColumnCount(); i < n; i++ { | |||
for i, n := 0, tab.AllColumnCount(); i < n; i++ { | |||
if cat.IsMutationColumn(tab, i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The OutputCols
check should already take care of this case, since mutation cols will never be output cols by the check above. I think it's good to let the mutation builder decide which cols are output or not, and let this props builder code just blindly process the cols without checking again.
for i, cnt := 0, baseTable.ColumnCount(); i < cnt; i++ { | ||
unfilteredCols.Add(t.Table.ColumnID(i)) | ||
for i, cnt := 0, baseTable.AllColumnCount(); i < cnt; i++ { | ||
if !cat.IsMutationColumn(baseTable, i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any harm in adding mutation cols here. I think Drew just used ColumnCount
b/c it seemed like the right thing. But really we want all columns.
@@ -99,7 +99,7 @@ func TestGetStatsFromConstraint(t *testing.T) { | |||
t.Helper() | |||
|
|||
var cols opt.ColSet | |||
for i := 0; i < tab.ColumnCount(); i++ { | |||
for i := 0; i < tab.AllColumnCount(); i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder how mutation cols play with stats. A good example of a case where using ColumnCount
can cause us to not consider a case that is potentially interesting to think about, or at least document.
@@ -1054,9 +1057,9 @@ func (mb *mutationBuilder) mapColumnNamesToOrdinals(names tree.NameList) util.Fa | |||
var ords util.FastIntSet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add to the mapColumnNamesToOrdinals
to make it clear that it skips mutation columns?
@@ -825,11 +829,14 @@ func (mb *mutationBuilder) makeMutationPrivate(needResults bool) *memo.MutationP | |||
} | |||
|
|||
if needResults { | |||
// Only non-mutation columns are output columns. ReturnCols needs to have | |||
// DeletableColumnCount entries, but only the first ColumnCount entries |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to update the comment to remove ref to DeletableColumnCount
.
pkg/sql/opt/optbuilder/select.go
Outdated
@@ -577,8 +578,8 @@ func (b *Builder) addCheckConstraintsForTable(tabMeta *opt.TableMeta) { | |||
|
|||
// Find the non-nullable table columns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add parenthetical to the comment, saying that mutation cols can be NULL during backfill, so they should be excluded.
@@ -626,9 +627,9 @@ func (b *Builder) addCheckConstraintsForTable(tabMeta *opt.TableMeta) { | |||
func (b *Builder) addComputedColsForTable(tabMeta *opt.TableMeta) { | |||
var tableScope *scope | |||
tab := tabMeta.Table | |||
for i, n := 0, tab.ColumnCount(); i < n; i++ { | |||
for i, n := 0, tab.AllColumnCount(); i < n; i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comment to explain why mutation cols should not be included here.
From an outside perspective, I like this a lot. It's much more explicit about what count to use/what columns are considered in an operation, and will also make it easier for me to catch all of the cases where we should or shouldn't include system columns. |
The catalog currently relies on a particular ordering to separate different kinds of columns (ordinary, write-only, delete-only). This is unfortunate because it restricts the underlying implementation; in particular, it makes it tricky to add a new kind of column (like the `mvcc_timestamp` system column) without potentially breaking existing code in subtle ways. This change reworks the cat API to have a single `ColumnCount` method and a separate `ColumnKind` method used to figure out the column kind. This forces all code that cares about the kind to check it explicitly rather than relying on a potentially faulty column count. Fixes cockroachdb#51323. Release note: None
Also add some tests that fail if we don't ignore mutation columns. Release note: None
3023579
to
a6ad229
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @mgartner, @RaduBerinde, @rohany, and @rytaft)
pkg/sql/opt/cat/utils.go, line 125 at r1 (raw file):
Previously, andy-kimball (Andy Kimball) wrote…
It seems like an API bug that this only looks at non-mutation cols. Why not let it handle mutation cols as well, since the API is generic? I don't think any existing caller calls it with mutation cols, so there should be no panics.
The caller does care - this is used when referencing a table by ID. Without the check, we allow a mutation column (which in practice leads to a more obscure error a bit later). I added some tests and I moved this into the optbuilder as it's specific to building numeric references.
pkg/sql/opt/cat/utils.go, line 149 at r1 (raw file):
Previously, andy-kimball (Andy Kimball) wrote…
Should we make this return mutation cols as well and require caller to test for mutation/system cols? The more col types we add, the more confusing it is when some generic-sounding APIs filter cols and others don't.
Moved to optbuilder, renamed to findPublicTableColumnByName
.
pkg/sql/opt/memo/logical_props_builder.go, line 1272 at r1 (raw file):
Previously, andy-kimball (Andy Kimball) wrote…
Actually, I think we can just get rid of the comment and the check for
IsMutationColumn
.private.IsColumnOutput
should always return false for mutation columns, or else there's a bug somewhere else in the code. We always setprivate.ReturnCols
entries to zero for mutation cols.
Done.
pkg/sql/opt/memo/logical_props_builder.go, line 1293 at r1 (raw file):
Previously, andy-kimball (Andy Kimball) wrote…
The
OutputCols
check should already take care of this case, since mutation cols will never be output cols by the check above. I think it's good to let the mutation builder decide which cols are output or not, and let this props builder code just blindly process the cols without checking again.
Done.
pkg/sql/opt/memo/multiplicity_builder.go, line 122 at r1 (raw file):
Previously, andy-kimball (Andy Kimball) wrote…
I don't see any harm in adding mutation cols here. I think Drew just used
ColumnCount
b/c it seemed like the right thing. But really we want all columns.
Done.
pkg/sql/opt/optbuilder/insert.go, line 1057 at r1 (raw file):
Previously, andy-kimball (Andy Kimball) wrote…
Can you add to the
mapColumnNamesToOrdinals
to make it clear that it skips mutation columns?
Done.
pkg/sql/opt/optbuilder/mutation_builder.go, line 832 at r1 (raw file):
Previously, andy-kimball (Andy Kimball) wrote…
Need to update the comment to remove ref to
DeletableColumnCount
.
Done.
pkg/sql/opt/optbuilder/select.go, line 579 at r1 (raw file):
Previously, andy-kimball (Andy Kimball) wrote…
Add parenthetical to the comment, saying that mutation cols can be NULL during backfill, so they should be excluded.
Done.
pkg/sql/opt/optbuilder/select.go, line 630 at r1 (raw file):
Previously, andy-kimball (Andy Kimball) wrote…
Add comment to explain why mutation cols should not be included here.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! I think we currently do something similar with the Index catalog, so it'd be great to apply similar changes there at some point.
Reviewed 10 of 20 files at r1, 4 of 8 files at r2, 3 of 4 files at r4.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @mgartner, @RaduBerinde, @rohany, and @rytaft)
bors r+ |
Build failed |
bors r+ |
Build succeeded |
This change renames AllColumnCount to ColumnCount. In cockroachdb#51400 I meant to rename before merging but I forgot. Release note: None
51462: storage: optimize MVCCGarbageCollect for large number of undeleted version r=itsbilal a=ajwerner This is a better-tested take-2 of #51184. * The first commit introduces a `SupportsPrev` method to the `Iterator` interface. * The second commit adjusts the benchmark to use a regular batch. * The third commit adjusts the logic in #51184 to fix the bugs there and adds testing. The win is now only pronounced on pebble, as might be expected. 51489: opt, cat: rename Table.AllColumnCount to Table.ColumnCount r=RaduBerinde a=RaduBerinde This change renames AllColumnCount to ColumnCount. In #51400 I meant to rename before merging but I forgot. Release note: None Co-authored-by: Andrew Werner <[email protected]> Co-authored-by: Radu Berinde <[email protected]>
NOTE I used
AllColumnCount
so that the compiler forces me to look at all callsites, and left it like this so they are also apparent in the diff. I intend to rename it toColumnCount
before merging.The catalog currently relies on a particular ordering to separate
different kinds of columns (ordinary, write-only, delete-only). This
is unfortunate because it restricts the underlying implementation; in
particular, it makes it tricky to add a new kind of column (like the
mvcc_timestamp
system column) without potentially breaking existingcode in subtle ways.
This change reworks the cat API to have a single
ColumnCount
methodand a separate
ColumnKind
method used to figure out the column kind.This forces all code that cares about the kind to check it explicitly
rather than relying on a potentially faulty column count.
Fixes #51323.
Release note: None