-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: allow UDFs and builtins with SQL bodies in prepared stmt arguments #107767
Conversation
Informs cockroachdb#99008 Release note (bug fix): A bug has been fixed that caused nodes to crash when attempting to `EXECUTE` a prepared statement with an argument that referenced a user-defined function. This bug was present since user-defined functions were introduces in version 22.2.
User-defined functions and builtin functions defined with SQL bodies can now be used as arguments when executing prepared statements. This was not previously allowed because placeholders could only be assigned from constant values that resulted from evaluating the typed expressions representing the placeholder arguments. This did not work for functions defined with SQL bodies because they cannot simply be evaluated like other AST trees and they must be built by the optimizer and turned into `tree.RoutineExpr`s. This commit lifts the restriction by falling back to building placeholders as `opt.ScalarExpr`s if evaluation of the typed expression fails. Fixes cockroachdb#99008 Release note (sql change): User-defined functions can now be used as placeholder arguments when executing prepared statements with `EXECUTE`.
Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! Should this be backported to 22.2 and 23.1?
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 11 of 11 files at r1, 10 of 10 files at r2, 12 of 12 files at r3, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner)
pkg/sql/plan_opt.go
line 506 at r2 (raw file):
semaCtx := opc.p.SemaCtx() evalCtx := opc.p.EvalContext() buildPlaceholderAsScalar := func(placeholder *tree.Placeholder) (opt.ScalarExpr, error) {
[nit] Is there any way we could combine with the other definition of buildPlaceholderAsScalar
? Maybe make it a method of norm.Factory
?
We could also consider storing this function (or an interface with an equivalent method) on EvalContext
so that it could be accessed from EvalFuncExpr
. That would be more targeted, but maybe this way fixes other cases than just UDFs with SQL body. We could even make that easier by just making a new optimizer+factory instead of using an existing one. That would be inefficient, but it might not matter here.
pkg/sql/opt/norm/factory.go
line 355 at r2 (raw file):
replaceFn = func(e opt.Expr) opt.Expr { if placeholder, ok := e.(*memo.PlaceholderExpr); ok { d, err := eval.Expr(f.ctx, f.evalCtx, e.(*memo.PlaceholderExpr).Value)
Doesn't this existing code evaluate the placeholder expression for every placeholder reference in the query? Maybe we don't want this behavior, but if we decide to keep it, maybe this solution works for volatile functions after all (or at least isn't a regression vs older releases). Alternatively, could we pull the placeholder evaluation out of this replace function so that it only happens once?
pkg/sql/logictest/testdata/logic_test/udf_prepare
line 38 at r2 (raw file):
2 20 statement error pgcode 42601 variable sub-expressions are not allowed in EXECUTE parameter
Could we have a test with a UDF that has variables inside its body?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a major issue with this approach: when we're replacing placholders, we haven't initilized the optbuilder's catalog, so we're unable to resolve any datasources (or any other objects) referenced within the UDF body. I'm not sure that's easily solved, but I'll consider it before closing this.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)
pkg/sql/plan_opt.go
line 506 at r2 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
[nit] Is there any way we could combine with the other definition of
buildPlaceholderAsScalar
? Maybe make it a method ofnorm.Factory
?We could also consider storing this function (or an interface with an equivalent method) on
EvalContext
so that it could be accessed fromEvalFuncExpr
. That would be more targeted, but maybe this way fixes other cases than just UDFs with SQL body. We could even make that easier by just making a new optimizer+factory instead of using an existing one. That would be inefficient, but it might not matter here.
It can't be a method of norm.Factory
because only the optbuilder
knows how to build a scalar and optbuilder
depends on norm
, so we'd have a cycle.
pkg/sql/opt/norm/factory.go
line 355 at r2 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
Doesn't this existing code evaluate the placeholder expression for every placeholder reference in the query? Maybe we don't want this behavior, but if we decide to keep it, maybe this solution works for volatile functions after all (or at least isn't a regression vs older releases). Alternatively, could we pull the placeholder evaluation out of this replace function so that it only happens once?
Correct. But we appear to differ from Postgres in this behavior:
Postgres:
PREPARE p AS SELECT $1::float UNION ALL SELECT $1::FLOAT + 1;
-- PREPARE
EXECUTE p(random());
-- float8
-- ---------------------
-- 0.32451202753803443
-- 1.3245120275380344
-- (2 rows)
CRDB:
PREPARE p AS SELECT $1::float UNION ALL SELECT $1::FLOAT + 1;
-- PREPARE
EXECUTE p(random());
-- float8
-- -----------------------
-- 0.24101213558621915
-- 1.7880754437670263
-- (2 rows)
I think that CRDB's behavior is dubious. For example, you'd expect the double
prepared statement below to always yield an even number, but that's not the case.
CRDB:
PREPARE double AS SELECT $1::INT + $1::INT;
-- PREPARE
EXECUTE double(floor(random()*100));
-- ?column?
-- ------------
-- 133
-- (1 row)
I've create #108027 to track this.
I'm going to close this and break the first commit out to be merged to prevent the panic for now. Supporting UDFs in |
sql: prevent panic when using UDF as EXECUTE argument
Informs #99008
Release note (bug fix): A bug has been fixed that caused nodes to crash
when attempting to
EXECUTE
a prepared statement with an argument thatreferenced a user-defined function. This bug was present since
user-defined functions were introduces in version 22.2.
sql: allow UDFs and builtins with SQL bodies in prepared stmt arguments
User-defined functions and builtin functions defined with SQL bodies can
now be used as arguments when executing prepared statements. This was
not previously allowed because placeholders could only be assigned from
constant values that resulted from evaluating the typed expressions
representing the placeholder arguments. This did not work for functions
defined with SQL bodies because they cannot simply be evaluated like
other AST trees and they must be built by the optimizer and turned into
tree.RoutineExpr
s.This commit lifts the restriction by falling back to building
placeholders as
opt.ScalarExpr
s if evaluation of the typed expressionfails.
Fixes #99008
Release note (sql change): User-defined functions can now be used as
placeholder arguments when executing prepared statements with
EXECUTE
.opt: simplify ScalarBuilder.Build
Release note: None