-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: parser, test catalog, query support for virtual columns #57622
Conversation
Support VIRTUAL keyword in the parser. We still error out if it is ever used. Release note: None
To ensure that PK columns are non-nullable, the test catalog reinitializes them to override the nullable flag. We replace this hacky mechanism with figuring out the PK columns beforehand and marking them as non-nullable in the definition before creating the table columns. Release note: None
32023d0
to
bee0662
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 13 of 13 files at r1, 1 of 1 files at r2, 5 of 5 files at r3, 2 of 2 files at r4.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @RaduBerinde)
pkg/sql/opt/optbuilder/select.go, line 529 at r4 (raw file):
b.addCheckConstraintsForTable(tabMeta) b.addComputedColsForTable(tabMeta)
b.addComputedColsForTable
assumes that all ordinary columns are in mb.outScope
. I think this is a safe assumption because mb.outScope
has all ordinary columns except when building scans for foreign key checks and foreign keys on virtual columns will not be allowed. Does that sound right to you?
We were bit by this for partial indexes (see the long comment below) so I'm hyper-sensitive. Might be worth adding a comment here to explain why this is safe.
pkg/sql/opt/testutils/testcat/create_table.go, line 537 at r3 (raw file):
false, /* hidden */ defaultExpr, computedExpr,
[nit] make InitNonVirtual
accept string
arguments for computedExpr
and defaultExpr
rather than *string
to make it consistent with InitVirtualComputed
.
pkg/sql/opt/testutils/testcat/testdata/index, line 134 at r3 (raw file):
│ └── x int not null ├── INDEX idx2 │ ├── idx_expr_1 string as (lower(z)) virtual [hidden] [virtual-computed]
Is it redundant to have virtual
and [virtual-computed]
?
Release note: None
bee0662
to
6d19420
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner)
pkg/sql/opt/optbuilder/select.go, line 529 at r4 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
b.addComputedColsForTable
assumes that all ordinary columns are inmb.outScope
. I think this is a safe assumption becausemb.outScope
has all ordinary columns except when building scans for foreign key checks and foreign keys on virtual columns will not be allowed. Does that sound right to you?We were bit by this for partial indexes (see the long comment below) so I'm hyper-sensitive. Might be worth adding a comment here to explain why this is safe.
I think at first we won't allow computed/default/check expressions to refer to virtual columns (I have it on my checklist in #57608. to actually check for these).
I am not sure I understand the issue with addComputedColsForTable.. it looks like it creates its own scope that contains all ordinary columns? It doesn't even have access to outScope
..
pkg/sql/opt/testutils/testcat/create_table.go, line 537 at r3 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
[nit] make
InitNonVirtual
acceptstring
arguments forcomputedExpr
anddefaultExpr
rather than*string
to make it consistent withInitVirtualComputed
.
They are optional in the non-virtual case, but computedExpr is not optional for InitVirtualComputed.
pkg/sql/opt/testutils/testcat/testdata/index, line 134 at r3 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Is it redundant to have
virtual
and[virtual-computed]
?
Sure looks like it :) Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @RaduBerinde)
pkg/sql/opt/optbuilder/select.go, line 529 at r4 (raw file):
Previously, RaduBerinde wrote…
I think at first we won't allow computed/default/check expressions to refer to virtual columns (I have it on my checklist in #57608. to actually check for these).
I am not sure I understand the issue with addComputedColsForTable.. it looks like it creates its own scope that contains all ordinary columns? It doesn't even have access to
outScope
..
It will created its own scope with all ordinary columns, but the scan only output tabColIDs
which are a subset of the table columns defined by the ordinals
argument. So these expressions could have outer columns that can't be resolved.
pkg/sql/opt/optbuilder/select.go, line 529 at r4 (raw file): Previously, mgartner (Marcus Gartner) wrote…
Sure.. but this is not the concern of the code here. We are just building these expressions here "generically" to avoid building them in multiple places. They are a property of the table, not really of the scan. It's up to the code which uses them to make sure it uses them in a way that makes sense. |
pkg/sql/opt/optbuilder/select.go, line 529 at r4 (raw file): Previously, RaduBerinde wrote…
Perhaps if we don't scan all the ordinary columns, that is an indication that we won't / shouldn't need these and we shouldn't build them? But it's more error-prone than saying that every tabMeta should have them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 5 files at r5.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @RaduBerinde)
pkg/sql/opt/optbuilder/select.go, line 529 at r4 (raw file):
Previously, RaduBerinde wrote…
Perhaps if we don't scan all the ordinary columns, that is an indication that we won't / shouldn't need these and we shouldn't build them? But it's more error-prone than saying that every tabMeta should have them.
Hmm. Then maybe the fix for the similar issue with partial indexes is not the right fix? See 7ca00cc
I don't feel strongly how we do it, but we have to prevent projecting a virtual computed column expression that references columns that aren't output by the child Scan. Are you suggesting that it's the responsibility of the caller of buildScan not to pass bad ordinals
?
pkg/sql/opt/testutils/testcat/create_table.go, line 537 at r3 (raw file):
Previously, RaduBerinde wrote…
They are optional in the non-virtual case, but computedExpr is not optional for InitVirtualComputed.
Oops. Nevermind then!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner)
pkg/sql/opt/optbuilder/select.go, line 529 at r4 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Hmm. Then maybe the fix for the similar issue with partial indexes is not the right fix? See 7ca00cc
I don't feel strongly how we do it, but we have to prevent projecting a virtual computed column expression that references columns that aren't output by the child Scan. Are you suggesting that it's the responsibility of the caller of buildScan not to pass bad
ordinals
?
Hm I see.. But unlike addComputedColsForTable
, addPartialIndexPredicatesForTable
uses the outScope
so the situation is different.
Now that I'm getting more reacquainted with this code, can't addPartialIndexPredicatesForTable
also create its own scope with all the ordinary table columns? We may be relying on not adding them to tabMeta in certain cases, but if that's the case the check should happen where we would use them (by consulting the outer cols).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner)
pkg/sql/opt/optbuilder/select.go, line 529 at r4 (raw file):
Previously, RaduBerinde wrote…
Hm I see.. But unlike
addComputedColsForTable
,addPartialIndexPredicatesForTable
uses theoutScope
so the situation is different.Now that I'm getting more reacquainted with this code, can't
addPartialIndexPredicatesForTable
also create its own scope with all the ordinary table columns? We may be relying on not adding them to tabMeta in certain cases, but if that's the case the check should happen where we would use them (by consulting the outer cols).
Talking about it in terms of API, ComputedCols
is a member of TableMeta
defined as containing ScalarExprs for computed column expressions which refer to table columns using the table column IDs for this meta entry. The buildScan code just initializes it that way. After that, you do what you want with it. If you don't remap the column IDs or use it in a Project on top of an expression that doesn't produce all the necessary IDs, that's on you - it has nothing to do with the context of why the TableMeta was initialized in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @RaduBerinde)
pkg/sql/opt/optbuilder/select.go, line 529 at r4 (raw file):
Previously, RaduBerinde wrote…
Talking about it in terms of API,
ComputedCols
is a member ofTableMeta
defined as containing ScalarExprs for computed column expressions which refer to table columns using the table column IDs for this meta entry. The buildScan code just initializes it that way. After that, you do what you want with it. If you don't remap the column IDs or use it in a Project on top of an expression that doesn't produce all the necessary IDs, that's on you - it has nothing to do with the context of why the TableMeta was initialized in the first place.
Ahh I see. I like this definition. I'll work on making that change for addPartialIndexPredicatesForTable
and checking the outer cols where it's used. I think the only place is in scan_index_iter.go
. I think that will clean things up and be less error prone.
So nothing needs to change for addComputedColsForTable
, but just below here tabMeta.ComputedCols
is accessed, so I guess that's the place we'd check the outer cols, if it's necessary. And I don't think its necessary unless FKs can reference virtual columns, correct? I guess that's originally what I was trying to call out, but explained my thoughts poorly.
34374de
to
5c6d900
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner)
pkg/sql/opt/optbuilder/select.go, line 529 at r4 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Ahh I see. I like this definition. I'll work on making that change for
addPartialIndexPredicatesForTable
and checking the outer cols where it's used. I think the only place is inscan_index_iter.go
. I think that will clean things up and be less error prone.So nothing needs to change for
addComputedColsForTable
, but just below heretabMeta.ComputedCols
is accessed, so I guess that's the place we'd check the outer cols, if it's necessary. And I don't think its necessary unless FKs can reference virtual columns, correct? I guess that's originally what I was trying to call out, but explained my thoughts poorly.
Ohh, yeah, good point. I added a comment to the function and added an assertion (and constructed the ProjectionsItem properly while I'm at it).
I'll think about how to improve this API. I think maybe the cases where we scan specific ordinals should use a separate simplified path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 13 of 13 files at r1, 1 of 1 files at r2, 2 of 5 files at r3, 5 of 5 files at r5, 2 of 2 files at r6.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @RaduBerinde)
pkg/sql/opt/optbuilder/select.go, line 544 at r6 (raw file):
virtualColIDs.ForEach(func(col opt.ColumnID) { item := b.factory.ConstructProjectionsItem(tabMeta.ComputedCols[col], col) if !item.ScalarProps().OuterCols.SubsetOf(tabColIDs) {
Is it not possible for one virtual column to depend on another? (Maybe you discussed this above, but wouldn't hurt to add an explicit comment or TODO)
This change adds support for scanning tables with virtual columns. The virtual columns are projected on top of the Scan. Release note: None
5c6d900
to
a5ebd6f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner and @rytaft)
pkg/sql/opt/optbuilder/select.go, line 544 at r6 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Is it not possible for one virtual column to depend on another? (Maybe you discussed this above, but wouldn't hurt to add an explicit comment or TODO)
Added a TODO. It will be an internal error for now. The checklist in #57608 tracks either supporting this case, or disallowing the creation of such columns in the first place.
bors r+ |
Build succeeded: |
Previously, partial index predicates were only built and added to `TableMeta.PartialIndexPredicates` if the scope of the scan in `Builder.buildScan` included all ordinary table columns. This behavior prevented errors when building partial index predicate expressions for foriegn key check scans. These scans do not include all columns, so when building a predicate expression that referenced a table column not included, optbuilder would err with "column does not exist". This commit changes this behavior so that partial index predicates are always built and added to `TableMeta.PartialIndexPredicates`. In order to do this, `Builder.addPartialIndexPredicatesForTable` creates its own table scope. It also constructs a new scan rather than relying on the already-built scan. Constructing a scan is required for considering logical properties while normalizing partial index predicates. See discussion at cockroachdb#57622 for more context on this change. Release note: None
Previously, partial index predicates were only built and added to `TableMeta.PartialIndexPredicates` if the scope of the scan in `Builder.buildScan` included all ordinary table columns. This behavior prevented errors when building partial index predicate expressions for foriegn key check scans. These scans do not include all columns, so when building a predicate expression that referenced a table column not included, optbuilder would err with "column does not exist". This commit changes this behavior so that partial index predicates are always built and added to `TableMeta.PartialIndexPredicates`. In order to do this, `Builder.addPartialIndexPredicatesForTable` creates its own table scope. It also constructs a new scan if the provided scan does not output all column in the table scope. A scan on the table and its logical properties are required in order to fully normalize partial index predicates. See discussion at cockroachdb#57622 for more context on this change. Release note: None
57707: opt: build all partial index predicate expressions in TableMeta r=mgartner a=mgartner #### opt: build all partial index predicate expressions in TableMeta Previously, partial index predicates were only built and added to `TableMeta.PartialIndexPredicates` if the scope of the scan in `Builder.buildScan` included all ordinary table columns. This behavior prevented errors when building partial index predicate expressions for foriegn key check scans. These scans do not include all columns, so when building a predicate expression that referenced a table column not included, optbuilder would err with "column does not exist". This commit changes this behavior so that partial index predicates are always built and added to `TableMeta.PartialIndexPredicates`. In order to do this, `Builder.addPartialIndexPredicatesForTable` creates its own table scope. It also constructs a new scan rather than relying on the already-built scan. Constructing a scan is required for considering logical properties while normalizing partial index predicates. See discussion at #57622 for more context on this change. Release note: None #### optbuilder: reduce redundant building of arbiter filter expressions Previously, the arbiter predicate provided in the `WHERE` clause of an `INSERT ... ON CONFLICT` statement was rebuilt as a `memo.FiltersExpr` repetitively while the optbuilder determined arbiter indexes. Now, the expression is built only once, reducing work. Release note: None Co-authored-by: Marcus Gartner <[email protected]>
Previously, partial index predicates were only built and added to `TableMeta.PartialIndexPredicates` if the scope of the scan in `Builder.buildScan` included all ordinary table columns. This behavior prevented errors when building partial index predicate expressions for foriegn key check scans. These scans do not include all columns, so when building a predicate expression that referenced a table column not included, optbuilder would err with "column does not exist". This commit changes this behavior so that partial index predicates are always built and added to `TableMeta.PartialIndexPredicates`. In order to do this, `Builder.addPartialIndexPredicatesForTable` creates its own table scope. It also constructs a new scan if the provided scan does not output all column in the table scope. A scan on the table and its logical properties are required in order to fully normalize partial index predicates. See discussion at cockroachdb#57622 for more context on this change. Release note: None
This set of changes add parser, testcat, and query support for virtual columns. I have come full circle and realized that even though user-defined virtual columns don't help with the optimizer-side work for expression-based indexes, they help with integrating with the existing sql code. In particular, we can carve out pieces of work and write more targeted tests related to virtual columns, without having to define them through an expression-based index.
Informs #57608.
sql: support virtual columns in parser
Support VIRTUAL keyword in the parser. We still error out if it is
ever used.
Release note: None
opt: clean up hack in test catalog
To ensure that PK columns are non-nullable, the test catalog
reinitializes them to override the nullable flag. We replace this
hacky mechanism with figuring out the PK columns beforehand and
marking them as non-nullable in the definition before creating the
table columns.
Release note: None
opt: test catalog support for virtual columns
Release note: None
opt: optbuilder support for scanning virtual columns
This change adds support for scanning tables with virtual columns. The
virtual columns are projected on top of the Scan.
Release note: None