-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: fix type checking code for aggregate functions #46649
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... I'm not sure about this. I think we need to make sure that we behave the same way as Postgres here, unfortunately, which i think means returning NULL
and not an error in these cases.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @nvanbenschoten, and @RaduBerinde)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten, @RaduBerinde, and @rytaft)
pkg/sql/sem/tree/type_check.go, line 854 at r1 (raw file):
// NULL is given as an argument. if !def.NullableArgs && def.FunctionProperties.Class != GeneratorClass && def.FunctionProperties.Class != AggregateClass {
I wonder if we could (in the aggregate case) returnmax(NULL::desired)
(and default desired
to String
if it's Any
).
Agreed that we should try to match Postgres' behavior... but the existing code is definitely incorrect (and performs differently from Postgres). For example, consider the regression test in this PR:
Postgres (correctly) returns a single NULL row. Prior to this commit, Cockroach would return two NULL rows. Obviously it's best to match Postgres' behavior, but I think it's still better to return an error than incorrect results. I was thinking that the change to make Cockroach choose the string overload as the default would be a separate PR, but I can just add that as another commit to this PR if that would be better. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, after digging deep into the Postgres type checking code, I've added another commit based on @RaduBerinde's suggestion of casting the argument to type string. Please see the new commit message for more details.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @RaduBerinde)
pkg/sql/sem/tree/type_check.go, line 854 at r1 (raw file):
Previously, RaduBerinde wrote…
I wonder if we could (in the aggregate case) return
max(NULL::desired)
(and defaultdesired
toString
if it'sAny
).
Defaulting desired
to String
doesn't work if there are no overloads available that match, but I've done something similar...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten and @RaduBerinde)
Prior to this commit, any aggregate function that had an argument with unknown type was replaced with NULL. This is incorrect for scalar aggregates when the input relation has multiple rows, because after replacement, the query result has the same number of rows as the input relation. It should instead be reduced to a single row. This commit fixes the issue by avoiding replacing the aggregate with NULL. Note that for many aggregates, this change results in an ambiguous function error since the type checking code cannot choose which overload is correct for the unknown type. This is different behavior than Postgres, which defaults to type "text". Fixes cockroachdb#46196 Release justification: this is a low risk, high benefit change to existing functionality. Release note (bug fix): fixed an incorrect query result that could occur when a scalar aggregate was called with a null input.
Prior to this commit, the type checking code was not able to choose between different possible aggregate overloads if the arguments had type unknown. This commit changes the logic to match Postgres, which always prefers overloads with arguments of type string if available. Note that this commit still doesn't completely match Postgres' behavior, because it doesn't handle the case when there are no overloads available with string inputs for the arguments with unknown type. If there are no overloads with string arguments, Postgres chooses the overload with preferred type for the given category. For example, float8 is the preferred type for the numeric category in Postgres. Since we don't support the concept of preferred types within type categories, supporting this behavior will be a more involved change. For now, this commit should cover most of our supported aggregates. Release justification: low risk, high benefit change to existing functionality. Release note (sql change): the type checking code now prefers aggregate overloads with string inputs if there are multiple possible candidates due to arguments of unknown type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten and @RaduBerinde)
Build succeeded |
Prior to this commit, any aggregate function that had an argument with
unknown type was replaced with
NULL
. This is incorrect for scalaraggregates when the input relation has multiple rows, because after
replacement, the query result has the same number of rows as the input
relation. It should instead be reduced to a single row.
This commit fixes the issue by avoiding replacing the aggregate with
NULL
.Note that for many aggregates, this change results in an ambiguous function
error since the type checking code cannot choose which overload is correct
for the unknown type. This is different behavior than Postgres, which
defaults to type "text".
Fixes #46196
Release justification: this is a low risk, high benefit change to existing
functionality.
Release note (bug fix): fixed an incorrect query result that could occur
when a scalar aggregate was called with a null input.