-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: add json{,b}_to_record{,set}; recordtype srfs #82435
Conversation
89b0b3d
to
218cdf5
Compare
e5dbc3f
to
895bc57
Compare
Hi @mgartner, would you please take a look at the optimizer changes required for this? The crux of the difficulty was that we needed to pass in the alias information (e.g. |
1a50c69
to
5995ec1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool! I just looked at the optimizer files for now
Reviewed 2 of 41 files at r1, 4 of 22 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @mgartner, and @rafiss)
pkg/sql/opt/optbuilder/select.go
line 49 at r3 (raw file):
}(inScope.atRoot) inScope.atRoot = false if b.lastAlias != nil {
Did you try using the scope
for this field instead of Builder
? Intuitively I would think scope
should be the right place for it.
pkg/sql/opt/optbuilder/srfs.go
line 146 at r3 (raw file):
// (SRF) such as generate_series() or unnest(). It synthesizes new columns in // outScope for each of the SRF's output columns. func (b *Builder) finishBuildGeneratorFunction(
You should add some tests for these functions in opt/optbuilder/testdata/srfs
that exercise this logic
pkg/sql/opt/optgen/cmd/optgen/metadata.go
line 210 at r3 (raw file):
"WindowFrame": {fullName: "memo.WindowFrame", passByVal: true}, "FKCascades": {fullName: "memo.FKCascades", passByVal: true}, "ColumnDefList": {fullName: "tree.ColumnDefList", passByVal: true},
Why do you need to add this here? Doesn't look like you've updated any opt
files
aaab2a1
to
8f7d7db
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @rafiss, and @rytaft)
pkg/sql/opt/optbuilder/select.go
line 49 at r3 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Did you try using the
scope
for this field instead ofBuilder
? Intuitively I would thinkscope
should be the right place for it.
This is a good point. I've moved the lastAlias field into scope.
pkg/sql/opt/optbuilder/srfs.go
line 146 at r3 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
You should add some tests for these functions in
opt/optbuilder/testdata/srfs
that exercise this logic
Done, added some tests that try to break the scoping logic.
pkg/sql/opt/optgen/cmd/optgen/metadata.go
line 210 at r3 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Why do you need to add this here? Doesn't look like you've updated any
opt
files
Good point, I think this was leftover from a different way I was trying to solve this problem.
8f7d7db
to
50a3dcb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 13 files at r4, 3 of 5 files at r5, 1 of 7 files at r6.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @mgartner, and @rafiss)
pkg/sql/opt/optbuilder/scope.go
line 639 at r6 (raw file):
// statement, or nil if there was none. func (s *scope) getLastAlias() *tree.AliasClause { for ; s != nil; s = s.parent {
When would we want to get the last alias from the parent scope?
pkg/sql/opt/optbuilder/select.go
line 55 at r6 (raw file):
defer func() { b.lastAlias = nil }() } */
Looks like this is dead code now
pkg/sql/opt/optbuilder/select.go
line 72 at r6 (raw file):
outScope = b.buildDataSource(source.Expr, indexFlags, locking, inScope) inScope.lastAlias = previousLastAlias
This is fine, but one reason I suggested using scope
was that I thought you might be able to allocate a new scope for building the data source and only use the lastAlias
from the current scope (i.e., not look in the parent). Then you wouldn't need to save and restore the previous alias. Maybe that doesn't work, though?
50a3dcb
to
747b47d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, PTAL. It would definitely be nice to land this but I know it's a big change.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @rafiss, and @rytaft)
pkg/sql/opt/optbuilder/scope.go
line 639 at r6 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
When would we want to get the last alias from the parent scope?
Done.
pkg/sql/opt/optbuilder/select.go
line 55 at r6 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Looks like this is dead code now
Done.
pkg/sql/opt/optbuilder/select.go
line 72 at r6 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
This is fine, but one reason I suggested using
scope
was that I thought you might be able to allocate a new scope for building the data source and only use thelastAlias
from the current scope (i.e., not look in the parent). Then you wouldn't need to save and restore the previous alias. Maybe that doesn't work, though?
Good point, please take another look.
For what it's worth, the Hasura team recently reached out to us again saying that this change would improve CockroachDB support, though we haven't yet identified the priority. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optimizer stuff
@rafiss will you be able to review the rest? Otherwise I can look in more detail at the rest...
Reviewed 1 of 22 files at r3, 6 of 21 files at r7.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @rafiss)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i reviewed the non-opt parts, and they look good to me with one nit inline, and one question -- do you know if the syntax changes will work OK in a mixed version cluster?
as for merging, my feeling is that as long as the above is fine, this seems like a purely additive change and it's known to be highly beneficial for tools we want to support, so perhaps we should make an exception and merge it during the stability period?
Reviewed 1 of 41 files at r1.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis and @mgartner)
pkg/sql/parser/sql.y
line 365 at r7 (raw file):
return u.val.(*tree.ColumnTableDef) } func (u *sqlSymUnion) colDef() tree.ColumnDef {
should the return values for the two new functions here be pointers?
The new generator functions will fail at runtime with a somewhat cryptic error "unexpected mismatched types/labels list in json record generator %v %v". I can fix this if you like, I don't think it's really important because it's net new functionality.
I don't see a strong reason, do you? Lots of similar methods in the parser return values, and the thing being returned here is quite small a struct. |
@rafiss is it expected that this PR would edit function OIDs? I'm not familiar with the latest here. Example failure:
I assume this is because this PR is adding 4 new functions. I'll just amend the PR with the updates, but hoping for confirmation this is all good. |
747b47d
to
a171033
Compare
I also wanted to make sure the syntax changes didn't affect any other existing commands |
This commit adds support for the family of four json{,b}_to_record{,set} functions, which deconstruct JSON input (either an object or an array) and return a row or set of rows with well-typed elements from the JSON, governed by the *table alias* that the function is invoked with. For example, the query below deconstructs the input JSON, returning each of the keys "requested" by the table alias definition. Missing keys are replaced with `NULL`. ``` SELECT * FROM json_to_record('{"a": "b", "c": true}') AS t(a TEXT, z INT) ---- b NULL ``` The logic that governs the type casting from JSON to SQL mimics the logic in the similar `json_populate_record` family of functions, and should be identical to Postgres's such logic. It's permissible to use the virtual table type to deconstruct a sub-object into a record field within the top level JSON, like this: ``` CREATE TABLE mytable (a INT, b TEXT) SELECT * FROM json_to_record('{"foo": {"a": 3, "b": "bar"}}') AS t(foo mytable) ---- (3,bar) ``` Functions like these ones that return record types in PostgreSQL must be aliased to a named and typed tuple (like `AS t(a INT, b INT)`) to be usable. As a result, this commit: 1. adds parser support for this form of alias, with types 2. pushes the alias information for an aliased expression down the optimizer's call stack so that the aliased expression can access the alias information. Previously, the aliased expression was blind to any aliases that were applied to it. 3. teaches the generator function factories to get the alias information plumbed in as well. Release note (sql change): add the json{,b}_to_record{,set} builtin functions, which transform JSON into structured SQL records.
a171033
to
dfd29e6
Compare
They do not. |
bors r+ |
Build failed (retrying...): |
Build failed (retrying...): |
Build succeeded: |
Closes #70037
This commit adds support for the family of four json{,b}_to_record{,set}
functions, which deconstruct JSON input (either an object or an array)
and return a row or set of rows with well-typed elements from the JSON,
governed by the table alias that the function is invoked with.
For example, the query below deconstructs the input JSON, returning each
of the keys "requested" by the table alias definition. Missing keys are
replaced with
NULL
.The logic that governs the type casting from JSON to SQL mimics the
logic in the similar
json_populate_record
family of functions, andshould be identical to Postgres's such logic. It's permissible to use
the virtual table type to deconstruct a sub-object into a record field
within the top level JSON, like this:
Functions like these ones that return record types in PostgreSQL must be
aliased to a named and typed tuple (like
AS t(a INT, b INT)
) to beusable. As a result, this commit:
optimizer's call stack so that the aliased expression can access the
alias information. Previously, the aliased expression was blind to
any aliases that were applied to it.
plumbed in as well.
Release note (sql change): add the json{,b}_to_record{,set} builtin
functions, which transform JSON into structured SQL records.
Release justification: low risk, high reward change to existing functionality