sql: implement new function resolution to work with UDF #85391
Conversation
Force-pushed a1cc192 to ad8726f
@miretskiy I'd appreciate a look at the new CDC-specific function resolver. I think it works the same as the old cdcCustomFunctionResolver, just used through the function resolver interface via SemaCtx. I also found another CDC-specific hack where we use the search path to replace "WrapFunction"; I replaced that with a new interface that does a similar hack but makes it clearer that it's a hack, and I added a TODO for myself to replace it with the function resolver as well.
I only made it through the first commit but I still think it's worth flushing.
I don't think your optimizations are safe in the general case. It feels to me like what we want to do is make an in-memory cache for the builtin overloads such that when we resolve a builtin and there are no UDFs with the same name that we don't allocate anything. We can optimize the usage of UDFs later.
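A rough, standalone sketch of the kind of init-time cache suggested here (the types are simplified stand-ins, not the real tree.Overload and tree.ResolvedFunctionDefinition):

```go
package main

import "fmt"

// Overload is a simplified stand-in for tree.Overload.
type Overload struct{ ArgTypes []string }

// ResolvedFunctionDefinition is a simplified stand-in for the real type.
type ResolvedFunctionDefinition struct {
	Name      string
	Overloads []Overload
}

// resolvedBuiltins is built once at init time; resolving a builtin that
// has no UDF with the same name then allocates nothing per lookup.
var resolvedBuiltins = map[string]*ResolvedFunctionDefinition{}

func init() {
	resolvedBuiltins["lower"] = &ResolvedFunctionDefinition{
		Name:      "lower",
		Overloads: []Overload{{ArgTypes: []string{"string"}}},
	}
}

// resolveBuiltin returns the cached definition, or nil if unknown.
func resolveBuiltin(name string) *ResolvedFunctionDefinition {
	return resolvedBuiltins[name]
}

func main() {
	fmt.Println(resolveBuiltin("lower").Name) // lower
	fmt.Println(resolveBuiltin("nope") == nil) // true
}
```

Because the map is populated before any query runs and never mutated afterward, concurrent sessions can read it without synchronization.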
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @chengxiong-ruan, @mgartner, and @miretskiy)
-- commits
line 4 at r1:
nit: this commit modifies
-- commits
line 9 at r1:
Can you rework and expand the last sentence to explain where exactly the cache exists?
Code quote:
resolver. A cache of Overload is added to schema layer to
hoepfully can save some effort on protobuf conversions.
pkg/sql/catalog/schemadesc/schema_desc.go
line 59 at r1 (raw file):
// list of overload signatures. This helps to avoid converting a function
// signature protobuf to a ResolvedFunctionDefinition over and over again.
prefixedFuncDefs map[string]*tree.ResolvedFunctionDefinition
I don't think this is kosher. You're going to have multiple different sessions accessing this cache at once. Also, you need to make sure the leasing remains coherent. This is why the hydrated cache goes to such lengths to achieve its goals.
You could do this on the mutable, but I have to imagine it doesn't matter there.
pkg/sql/sem/tree/function_definition.go
line 41 at r1 (raw file):
// overloads prefixed wit schema name.
type ResolvedFunctionDefinition struct {
	Name string
Should Name here be of the Name type?
pkg/sql/sem/tree/function_definition.go
line 54 at r1 (raw file):
// It indicates that the overload is defined with the specified schema.
type PrefixedOverload struct {
	Schema string
should Schema here be of the Name type?
pkg/sql/sem/tree/function_definition.go
line 284 at r1 (raw file):
// error is returned if there is any overload with different Class.
func (fd *ResolvedFunctionDefinition) GetClass() (FunctionClass, error) {
	return getFuncClass(fd.Name, toOverloads(fd.Overloads))
these scare me because they are going to allocate and I have to imagine GetClass gets called in the optimizer under the planning path. Should we change the prefixes to be a parallel slice to the overloads and then access them via methods?
Code quote:
func (fd *ResolvedFunctionDefinition) GetClass() (FunctionClass, error) {
return getFuncClass(fd.Name, toOverloads(fd.Overloads))
}
// GetReturnLabel returns function ReturnLabel by checking each overload and
// returns a ReturnLabel if all overloads have a ReturnLabel of the same length.
// Ambiguous error is returned if there is any overload has ReturnLabel of a
// different length. This is good enough since we don't create UDF with
// ReturnLabel.
func (fd *ResolvedFunctionDefinition) GetReturnLabel() ([]string, error) {
return getFuncReturnLabels(fd.Name, toOverloads(fd.Overloads))
}
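A standalone sketch of the parallel-slice idea being suggested (simplified stand-in types, not the real tree package):

```go
package main

import "fmt"

// Overload is a simplified stand-in for tree.Overload; Class plays the
// role of FunctionClass.
type Overload struct{ Class int }

// ResolvedFunctionDefinition keeps schema prefixes in a slice parallel
// to Overloads, so methods can walk Overloads directly instead of
// building a temporary []*Overload on every call.
type ResolvedFunctionDefinition struct {
	Name      string
	Prefixes  []string // Prefixes[i] qualifies Overloads[i]
	Overloads []*Overload
}

// GetClass returns the homogeneous class of all overloads, or an error
// if they disagree. No slice is allocated here.
func (fd *ResolvedFunctionDefinition) GetClass() (int, error) {
	class := fd.Overloads[0].Class
	for _, o := range fd.Overloads[1:] {
		if o.Class != class {
			return 0, fmt.Errorf("%s: ambiguous function class", fd.Name)
		}
	}
	return class, nil
}

func main() {
	fd := &ResolvedFunctionDefinition{
		Name:      "f",
		Prefixes:  []string{"public", "pg_catalog"},
		Overloads: []*Overload{{Class: 1}, {Class: 1}},
	}
	c, err := fd.GetClass()
	fmt.Println(c, err) // 1 <nil>
}
```

Hiding the two slices behind unexported fields plus accessor methods keeps callers from accidentally mutating one slice without the other.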
pkg/sql/sem/tree/function_definition.go
line 380 at r1 (raw file):
// searchPath are searched. A nil is returned if no function is found. It's
// caller's choice to error out if function not found.
func GetBuiltinFuncDefinition(
is there any reason these functions need to be in the tree package?
Code quote:
// PrefixBuiltinFunctionDefinition prefix all overloads in a function definition
// with a schema name. Note that this function can only be used for builtin
// function. Hence, HasUDF is set to false.
func PrefixBuiltinFunctionDefinition(
def *FunctionDefinition, schema string,
) *ResolvedFunctionDefinition {
ret := &ResolvedFunctionDefinition{
Name: def.Name,
Overloads: make([]*PrefixedOverload, 0, len(def.Definition)),
}
for _, o := range def.Definition {
ret.Overloads = append(
ret.Overloads,
&PrefixedOverload{Schema: schema, Overload: o},
)
}
return ret
}
// GetBuiltinFuncDefinitionOrFail is similar to GetBuiltinFuncDefinition but
// fail if function is not found.
func GetBuiltinFuncDefinitionOrFail(
fName *FunctionName, searchPath SearchPath,
) (*ResolvedFunctionDefinition, error) {
def, err := GetBuiltinFuncDefinition(fName, searchPath)
if err != nil {
return nil, err
}
if def == nil {
return nil, pgerror.Newf(pgcode.UndefinedFunction, "unknown function: %s", fName.String())
}
return def, nil
}
// GetBuiltinFuncDefinition search for a builtin function given a function name
// and a search path. If function name is prefixed, only the builtin functions
// in the specific schema are searched. Otherwise, all schemas on the given
// searchPath are searched. A nil is returned if no function is found. It's
// caller's choice to error out if function not found.
func GetBuiltinFuncDefinition(
I did a first pass over all the commits, but did not look at the schema-y stuff in too much detail.
Reviewed 29 of 29 files at r1, 3 of 3 files at r2, 7 of 7 files at r3, 38 of 38 files at r4, 4 of 4 files at r5, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @chengxiong-ruan and @miretskiy)
pkg/sql/schema_resolver.go
line 406 at r1 (raw file):
if builtinDef == nil && udfDef == nil {
	// Try a little harder.
nit: expand this comment
pkg/sql/catalog/schema.go
line 38 at r1 (raw file):
// GetPrefixedFuncDefinition returns a ResolvedFunctionDefinition given a
// function name.
GetPrefixedFuncDefinition(name string) (*tree.ResolvedFunctionDefinition, bool)
nit: Would "qualified" be a better word than "prefixed" here and in other places where you've used prefix in this PR?
pkg/sql/opt/cat/catalog.go
line 145 at r1 (raw file):
) (*tree.ResolvedFunctionDefinition, error)
// ResolveFunctionByOID resolves a function overload by OID
nit: add . to the end of the sentence.
pkg/sql/opt/exec/execbuilder/scalar.go
line 286 at r4 (raw file):
}
// TODO(mgartner): Consider modifying this to use a function resolver. Or is a
// function definition needed here at all?
This occurs after the query has been optimized. It builds an optimizer expression of a function back into a tree.FuncExpr that can be evaluated during execution. I don't see why we would need to resolve anything at this point - the function overload has to be resolved already.
pkg/sql/opt/testutils/testcat/function.go
line 54 at r1 (raw file):
// ResolveFunctionByOID part of the tree.FunctionReferenceResolver interface.
func (tc *Catalog) ResolveFunctionByOID(ctx context.Context, oid oid.Oid) (*tree.Overload, error) {
	return nil, nil
nit: return an assertion error stating that resolving a function by OID is not supported by the test catalog.
pkg/sql/sem/transform/aggregates.go
line 41 at r4 (raw file):
	return true, expr
}
fd, err := t.Func.Resolve(v.ctx, &v.searchPath, v.funcResolver)
I don't think we need a resolver here until we support user-defined aggregate functions. I'm inclined to leave this TODO here and not make any changes to this file until they can be tested. What do you think?
pkg/sql/sem/transform/expr_transform.go
line 66 at r4 (raw file):
ctx:          ctx,
searchPath:   searchPath,
funcResolver: funcResolver,
This is legacy transformation logic. I'd like to rip it out at some point. Can we just leave as-is for now?
pkg/sql/sem/tree/function_definition.go
line 39 at r1 (raw file):
// ResolvedFunctionDefinition is similar to FunctionDefinition but with all the
// overloads prefixed wit schema name.
nit: wit => with
pkg/sql/sem/tree/function_definition.go
line 48 at r1 (raw file):
ExplicitSchema bool
Overloads      []*PrefixedOverload
nit: []PrefixedOverload - I don't see any need to make this a slice of pointers
pkg/sql/sem/tree/function_definition.go
line 55 at r1 (raw file):
type PrefixedOverload struct {
	Schema string
	// TODO(Chengxiong): make this into a *Overload
nit: remove this TODO
pkg/sql/sem/tree/function_definition.go
line 259 at r1 (raw file):
// MergeWith is used specifically to merge two UDF definitions with
// same name but from different schemas. Currently, the FunctionProperties field
// is not set for UDFs.
Does it have to be from different schemas? Should it error if the schemas are the same and ExplicitSchema is true, or should it merge them successfully and set ExplicitSchema to true?
pkg/sql/sem/tree/function_definition.go
line 282 at r1 (raw file):
// GetClass returns function class by checking each overload's Class and returns
// the homogeneous Class value if all overloads are the same Class. Ambiguous
// error is returned if there is any overload with different Class.
PG has some logic to resolve overloads in this case instead of erroring:
marcus=# select count(10);
count
-------
1
(1 row)
marcus=# create function count(i int) returns int language sql as 'select i';
CREATE FUNCTION
marcus=# select count(10);
count
-------
10
(1 row)
Can you leave some TODOs here?
pkg/sql/sem/tree/function_definition.go
line 341 at r1 (raw file):
}
// PrefixBuiltinFunctionDefinition prefix all overloads in a function definition
nit: prefix => prefixes (or use "qualifies" if you decide to follow my advice above re: using "qualified" instead of "prefixed")
pkg/sql/sem/tree/function_definition.go
line 361 at r1 (raw file):
// GetBuiltinFuncDefinitionOrFail is similar to GetBuiltinFuncDefinition but
// fail if function is not found.
nit: fail => returns an error
pkg/sql/sem/tree/function_definition.go
line 379 at r1 (raw file):
// in the specific schema are searched. Otherwise, all schemas on the given
// searchPath are searched. A nil is returned if no function is found. It's
// caller's choice to error out if function not found.
nit: explain when an error is returned
pkg/sql/sem/tree/function_name.go
line 72 at r3 (raw file):
case *FunctionDefinition:
	// TODO(Chengxiong): get rid of FunctionDefinition entirely.
	parts := strings.Split(t.Name, ".")
FunctionDefinition.Name is of type string. When would it be a qualified name, if ever? And if it can actually be a qualified name, shouldn't we change the type to be tree.ObjectName?
pkg/sql/sem/tree/function_name.go
line 86 at r3 (raw file):
	return fd, nil
case *UnresolvedName:
	if resolver == nil {
nit: replace the deleted comment with something that explains the two cases here
pkg/sql/sem/tree/overload.go
line 181 at r1 (raw file):
// IsUDF is set to true when this is a user-defined function overload.
// Note: Body can be empty string even IsUDF is true.
IsUDF bool
nit: can you move Body to be next to these fields, and move FunctionProperties to the last field.
pkg/sql/sem/tree/overload.go
line 1211 at r4 (raw file):
}
func getFuncSig(expr *FuncExpr, typedInputExprs []TypedExpr, desiredType *types.T) string {
It would be nice to consolidate the logic with Overload.Signature and have the same formatting for both.
pkg/sql/sem/tree/type_check.go
line 1148 at r4 (raw file):
// Sort overloads by the order of their schema in search path.
overloads := make([]*PrefixedOverload, len(fns))
nit: should we name this grouped overloads or something?
pkg/sql/sem/tree/type_check.go
line 1152 at r4 (raw file):
	overloads[i] = fns[i].(*PrefixedOverload)
}
if err := sortOverloadsByPath(overloads, searchPath); err != nil {
Sorting all the overloads seems overkill here. Don't we just need the overloads that match the most significant path in the search path? Can we instead iterate over the paths in the search path, and for each path look for overloads in that schema, and break when we find the first set of overloads? We'd also avoid the map allocation in sortOverloadsByPath.
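Roughly what that iteration could look like, as a standalone sketch (PrefixedOverload here is a simplified stand-in for the tree type, and filterMostSignificant is a hypothetical helper name):

```go
package main

import "fmt"

// PrefixedOverload is a simplified stand-in for tree.PrefixedOverload.
type PrefixedOverload struct {
	Schema string
	Name   string
}

// filterMostSignificant keeps only the overloads from the first schema
// on the search path that has any match. It avoids both the full sort
// and the map allocation that a sort-by-path approach needs.
func filterMostSignificant(path []string, fns []PrefixedOverload) []PrefixedOverload {
	for _, schema := range path {
		var matched []PrefixedOverload
		for _, fn := range fns {
			if fn.Schema == schema {
				matched = append(matched, fn)
			}
		}
		if len(matched) > 0 {
			// First schema with any overloads wins; stop looking.
			return matched
		}
	}
	return nil
}

func main() {
	path := []string{"public", "pg_catalog"}
	fns := []PrefixedOverload{
		{Schema: "pg_catalog", Name: "count"},
		{Schema: "public", Name: "count"},
	}
	fmt.Println(filterMostSignificant(path, fns)[0].Schema) // public
}
```

Any ambiguity check afterwards then only needs to look at the returned slice, since every element is from the same schema.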
pkg/sql/sem/tree/type_check.go
line 1158 at r4 (raw file):
// Throw ambiguity error if there are more than one candidate overloads from
// same schema.
if len(overloads) > 1 && overloads[0].Schema == overloads[1].Schema {
We'll need to check more than the first 2 overloads, right?
pkg/sql/logictest/testdata/logic_test/builtin_function
line 8 at r4 (raw file):
query error pq: schema "foo" does not exist
SELECT foo.bar()
Do we have a test with a known schema but an unknown function? If not, we should add one here.
Force-pushed ad8726f to 239c867
Thanks for the reviews! @ajwerner and @mgartner
@ajwerner I removed the problematic UDF cache from the schema and added an in-memory cache of resolved function definitions for builtin functions.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @mgartner, and @miretskiy)
Previously, ajwerner wrote…
nit: this commit modifies
fixed.
Previously, ajwerner wrote…
Can you rework and expand the last sentence to explain where exactly the cache exists?
fixed.
pkg/sql/schema_resolver.go
line 406 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: expand this comment
added more words
pkg/sql/catalog/schema.go
line 38 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: Would "qualified" be a better word than "prefixed" here and in other places where you've used prefix in this PR?
good point, but I don't have a strong preference; I'm leaving it as is.
pkg/sql/catalog/schemadesc/schema_desc.go
line 59 at r1 (raw file):
Previously, ajwerner wrote…
I don't think this is kosher. You're going to have multiple different sessions accessing this cache at once. Also, you need to make sure the leasing remains coherent. This is why the hydrated cache goes to such lengths to achieve its goals.
You could do this on the mutable, but I have to imagine it doesn't matter there.
this cache is deleted and an in-memory cache is added for builtin functions as suggested. Added a todo for myself to optimize UDF performance here.
pkg/sql/opt/cat/catalog.go
line 145 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: add . to the end of the sentence.
fixed. thanks
pkg/sql/opt/testutils/testcat/function.go
line 54 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: return an assertion error stating the resolving a function by OID is not supported by the test catalog.
ah, good call, fixed.
pkg/sql/sem/transform/aggregates.go
line 41 at r4 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
I don't think we need a resolver here until we support user-defined aggregate functions. I'm inclined to leave this TODO here and not make any changes to this file until they can be tested. What do you think?
Yeah, I agree. removed.
pkg/sql/sem/transform/expr_transform.go
line 66 at r4 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
This is legacy transformation logic. I'd like to rip it out at some point. Can we just leave as-is for now?
ah, nice, removed changes here. thanks for pointing it out.
pkg/sql/sem/tree/function_definition.go
line 39 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: wit => with
fixed.
pkg/sql/sem/tree/function_definition.go
line 41 at r1 (raw file):
Previously, ajwerner wrote…
Should Name here be of the Name type?
good point, though I'm leaving it as string to avoid too many casts.
pkg/sql/sem/tree/function_definition.go
line 48 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: []PrefixedOverload - I don't see any need to make this a slice of pointers
I think it's actually better to use pointers here, since we need to merge functions across different schemas; using pointers avoids copying the overloads.
pkg/sql/sem/tree/function_definition.go
line 54 at r1 (raw file):
Previously, ajwerner wrote…
should Schema here be of the Name type?
good point as well; similarly, I'm leaving it as string to avoid too many casts.
pkg/sql/sem/tree/function_definition.go
line 55 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: remove this TODO
good catch...
pkg/sql/sem/tree/function_definition.go
line 259 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Does it have to be from different schemas? Should it error if the schemas are the same and ExplicitSchema is true, or should it merge them successfully and set ExplicitSchema to true?
good catch. It doesn't have to be from different schemas... I removed ExplicitSchema; it's not actually being used. Reworded the comment.
this thing was from my first version of this...but I forgot to reword the comment enough.
pkg/sql/sem/tree/function_definition.go
line 282 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
PG has some logic to resolve overloads in this case instead of erroring:

marcus=# select count(10);
 count
-------
     1
(1 row)

marcus=# create function count(i int) returns int language sql as 'select i';
CREATE FUNCTION

marcus=# select count(10);
 count
-------
    10
(1 row)
Can you leave some TODOs here?
yes, added a TODO. Let me know if that sounds good to you.
pkg/sql/sem/tree/function_definition.go
line 284 at r1 (raw file):
Previously, ajwerner wrote…
these scare me because they are going to allocate and I have to imagine GetClass gets called in the optimizer under the planning path. Should we change the prefixes to be a parallel slice to the overloads and then access them via methods?
yeah, good point. Added a parallel []*Overload to address this.
pkg/sql/sem/tree/function_definition.go
line 341 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: prefix => prefixes (or use "qualifies" if you decide to follow my advice above re: using "qualified" instead of "prefixed")
changed to prefixes. thanks.
pkg/sql/sem/tree/function_definition.go
line 361 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: fail => returns an error
fixed.
pkg/sql/sem/tree/function_definition.go
line 379 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: explain when an error is returned
added.
pkg/sql/sem/tree/function_definition.go
line 380 at r1 (raw file):
Previously, ajwerner wrote…
is there any reason these functions need to be in the
tree
package?
No, my reason for putting them here is that they're relevant to the "function definition" concept. I thought of putting them in the builtins package, but we can't use structures from the tree package there.
pkg/sql/sem/tree/function_name.go
line 72 at r3 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
FunctionDefinition.Name is of type string. When would it be a qualified name, if ever? And if it can actually be a qualified name, shouldn't we change the type to be tree.ObjectName?
currently, builtin function names are hard-coded as a single string used as a map key, which can be something like "crdb_internal.xxxx" - one string, but qualified. "information_schema.xxx" is another example.
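An illustrative sketch of splitting such a single-string qualified name back into its parts (splitBuiltinName is a hypothetical helper; the pg_catalog default mirrors PostgreSQL's convention for builtins and is an assumption here, not the actual resolution code):

```go
package main

import (
	"fmt"
	"strings"
)

// splitBuiltinName recovers the schema prefix from a builtin name that
// is stored as a single map key, e.g. "crdb_internal.node_id".
// Unqualified names fall back to pg_catalog.
func splitBuiltinName(name string) (schema, fn string) {
	if i := strings.IndexByte(name, '.'); i >= 0 {
		return name[:i], name[i+1:]
	}
	return "pg_catalog", name
}

func main() {
	s, f := splitBuiltinName("crdb_internal.node_id")
	fmt.Println(s, f) // crdb_internal node_id
	s, f = splitBuiltinName("lower")
	fmt.Println(s, f) // pg_catalog lower
}
```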
pkg/sql/sem/tree/function_name.go
line 86 at r3 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: replace the deleted comment with something that explains the two cases here
added. thanks.
pkg/sql/sem/tree/overload.go
line 181 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: can you move Body to be next to these fields, and move FunctionProperties to the last field.
yeah, good point. Also updated the // Only one of the following 6 attributes can be set. comment above.
pkg/sql/sem/tree/overload.go
line 1211 at r4 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
It would be nice to consolidate the logic with Overload.Signature and have the same formatting for both.
ah nice, didn't know about Overload.Signature. I added a TODO for myself for later, since the use cases are a bit different.
pkg/sql/sem/tree/type_check.go
line 1148 at r4 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: should we name this grouped overloads or something?
renamed it to typeCheckedOverloads.
pkg/sql/sem/tree/type_check.go
line 1152 at r4 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Sorting all the overload seems overkill here. Don't we just need the overloads that match the most significat path in the search path? Can we instead iterate over the paths in the search path, and for each path look for overloads in that schema, and break when we find the first set of overloads? We'd also avoid the map allocation in the sortOverloadsByPath.
ah, yeah, that's true...changed to only get overloads from most significant schema.
pkg/sql/sem/tree/type_check.go
line 1158 at r4 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
We'll need to check more than the first 2 overloads, right?
changed to check whether there are more than 2 remaining overloads.
pkg/sql/logictest/testdata/logic_test/builtin_function
line 8 at r4 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Do we have a test with a known schema but an unknown function? If not, we should add one here.
good point! added.
pkg/sql/opt/exec/execbuilder/scalar.go
line 286 at r4 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
This occurs after the query has been optimized. It builds an optimizer expression of a function back into a tree.FuncExpr that can be evaluated during execution. I don't see why we would need to resolve anything at this point - the function overload has to be resolved already.
thanks for checking, removed.
Force-pushed aea4d04 to a7f8878
Reviewed 43 of 43 files at r19, 4 of 4 files at r20, 7 of 7 files at r21, 38 of 38 files at r22, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @chengxiong-ruan, @mgartner, and @miretskiy)
pkg/sql/sem/tree/type_check.go
line 2978 at r22 (raw file):
}
ret := make([]overloadImpl, 0, len(overloads))
nit: I think you can remove this allocation altogether now - you just need a variable for a single overload. If it's already been set, then you can return the ambiguity error.
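A minimal sketch of that single-variable version (overloadImpl here is a hypothetical struct stand-in for the real interface, and pickSingle is an illustrative helper name):

```go
package main

import (
	"errors"
	"fmt"
)

// overloadImpl is a simplified stand-in for the real overloadImpl.
type overloadImpl struct{ name string }

// pickSingle replaces a []overloadImpl accumulator with one variable:
// seeing a second matching candidate means the call is ambiguous, so
// no slice needs to be allocated at all.
func pickSingle(candidates []*overloadImpl, match func(*overloadImpl) bool) (*overloadImpl, error) {
	var found *overloadImpl
	for _, o := range candidates {
		if !match(o) {
			continue
		}
		if found != nil {
			return nil, errors.New("ambiguous overload")
		}
		found = o
	}
	return found, nil
}

func main() {
	cands := []*overloadImpl{{"a"}, {"b"}}
	got, err := pickSingle(cands, func(o *overloadImpl) bool { return o.name == "b" })
	fmt.Println(got.name, err) // b <nil>
	_, err = pickSingle(cands, func(o *overloadImpl) bool { return true })
	fmt.Println(err) // ambiguous overload
}
```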
Force-pushed 679f66b to 68cdc2c
Previously, mgartner (Marcus Gartner) wrote…
Good point, thanks for all the suggestions. I also added more edge-case checking and tests. One obvious edge case I didn't catch: a function resolved with a qualified name (with an explicit schema) whose schema is not on the search path.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner, @mgartner, and @miretskiy)
pkg/sql/sem/tree/function_definition.go
line 284 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
SGTM. I think the parallel slices pattern is useful. The complexity can be hidden from callers by not exporting the slices and providing an API that prevents misusage.
Done.
Previously, chengxiong-ruan (Chengxiong Ruan) wrote…
oops... tried to resolve the discussion, not done...
Reviewed 3 of 3 files at r23, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner, @mgartner, and @miretskiy)
TFTR!
Build failed (retrying...):
looks like one of the jobs test shards kept failing
Canceled. |
It looks like one of the job tests (TestStartableJobMixedVersion) hung and timed out. Added some logging, and I suspect that it's the
We fortunately prevent setting cluster settings in a transaction. That means that we have full control over the access to tables that run during the transaction which runs a cluster version upgrade. One observation is that upgrades can take a long time if we leak leases in some way. In cockroachdb#85391 we add function resolution, which results in descriptors now being leased when you run the following statement, because we need to resolve the function: `SET CLUSTER SETTING version = crdb_internal.node_executable_version()`. Then the test ends up taking 5 minutes +/- 15% for the leases to expire. The downside here is that you don't get a transactional view of UDFs used in expressions inside `SET CLUSTER VERSION`. That seems fine to me. Release note: None
85599: ui: add MVCC range stats to Db Console r=koorosh a=koorosh

This patch extends node details with `rangeKeyBytes` and `rangeValueBytes` stats. Also, it adds the same stats to the Advanced Debug > Range Status section.

Release note (ui change): Added "Range Key Bytes" and "Range Value Bytes" stats on Node details page.

85621: sql: fix recent race on using a random source r=yuzefovich a=yuzefovich

In 9eb5ac3 we introduced a single random source to the DistSQLPlanner that is used by the PartitionSpans logic in the multi-tenant setting. However, we missed the fact that that method can be called concurrently from multiple goroutines, which can lead to a race on the random source. Prior to that commit we were using a single global random source, so there was a desire to reduce contention on that. This commit achieves that desire but also fixes the data race by instantiating a new random source whenever we're creating a "resolver" function where the source is used.

Fixes: #85508.

Release note: None

85622: sql: release leases before running migrations r=ajwerner a=ajwerner

We fortunately prevent setting cluster settings in a transaction. That means that we have full control over the access to tables that run during the transaction which runs a cluster version upgrade. One observation is that upgrades can take a long time if we leak leases in some way. In #85391 we add function resolution, which results in descriptors now being leased when you run the following statement, because we need to resolve the function: `SET CLUSTER SETTING version = crdb_internal.node_executable_version()`. Then the test ends up taking 5 minutes +/- 15% for the leases to expire. The downside here is that you don't get a transactional view of UDFs used in expressions inside `SET CLUSTER VERSION`. That seems fine to me.

Release note: None

85625: bazel: move `bazel run :gazelle` before `generate-logictest` r=dt a=rickystewart

`gazelle` populates the `BUILD.bazel` files for the `staticcheck` analyzers, so we want those populated before `generate-logictest`.

Release note: None

Co-authored-by: Andrii Vorobiov <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
This commit modifies the function resolver interface to have two methods: one to resolve overloads by function name, another to get a function overload by OID. An implementation of the new interface is added by extending the schema resolver. A cache of ResolvedFunctionDefinition for builtin functions, tree.ResolvedBuiltinFuncDefs, is constructed at init time to avoid allocating for builtin functions. Use cases of function resolution will be changed to use the new interface in following commits. Release note: None
This commit implements the new function resolver for CDC purposes. It should mimic the existing function resolution in CDC. In later commits this function resolver will be used as the resolver for CDC expression evaluation. Release note: None
This commit updates the `Resolve` method of the ResolvableFunctionReference to return a ResolvedFunctionDefinition. And then all use cases of UnresolvedName.ResolveFunction are replaced with the new `Resolve` method, so that we can get rid of UnresolvedName.ResolveFunction since it's essentially the same as the builtin function resolution path of the new function resolver. Release note: None
This commit fixes most of the existing use cases of function resolution, except for getting a sequence identifier from an expression, which can be a separate PR itself later. Release note: None
Force-pushed 68cdc2c to b0b54ff
TFTR! And thanks for the fix of the leaked leases!
Build succeeded:
I narrowed down a regression in a micro-benchmark to this PR:
Note that most likely ddabd18 is to blame, but the build fails on that commit - another plug for making sure that builds succeed on each commit in the PR. I took a quick look at that commit, and I didn't see an easy way to reduce the impact. Do we want to do something about that?
Thanks for raising this @yuzefovich. Yeah, looks like the upsert benchmark is calling
And yes... my bad that I didn't make sure each commit built when I broke the PR up into smaller commits. I went too far and found it hard to make each commit pass. But this is not a good excuse. This is a good lesson and I'll definitely keep it in mind.
No worries! I remember that I initially didn't appreciate at all why each of the commits in a multi-commit PR should compile, since all commits within the PR are merged together, but this bisecting process is probably the main reason.
I attacked the resolution side of the allocations in #89317. There's more we can do on the type-checking side. For what it's worth, that's going to be more of the bytes allocated.
There are 4 commits:
(1) This commit modifies the function resolver interface to
have two methods: one to resolve overloads by function name,
another to get a function overload by OID. An implementation
of the new interface is added by extending the schema
resolver. A cache of Overload is added to the schema layer to
hopefully save some effort on protobuf conversions.
Use cases of function resolution will be changed to use the
new interface in following commits.
(2) This commit implements the new function resolver for CDC purposes.
It should mimic the existing function resolution in CDC.
In later commits this function resolver will be used as the resolver
for CDC expression evaluation.
(3) This commit updates the Resolve method of the
ResolvableFunctionReference to return a ResolvedFunctionDefinition.
And then all use cases of UnresolvedName.ResolveFunction are
replaced with the new Resolve method, so that we can
get rid of UnresolvedName.ResolveFunction since it's essentially
the same as the builtin function resolution path of the new
function resolver.
(4) This commit fixes most of the existing use cases of function
resolution, except for getting a sequence identifier from an expression,
which can be a separate PR itself later.