-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move Coercion for MakeArray to coerce_arguments_for_signature
and introduce another one for ArrayAppend
#8317
Conversation
coerce_arguments_for_signature
and introduce another one for ArrayAppend
TypeSignature::Exact(valid_types) => vec![valid_types.clone()], | ||
TypeSignature::ArrayAppendLikeSignature => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have a specialized checking for array append at the end. And I think we will need specialized check for other pattern too
@@ -181,7 +239,7 @@ fn coerced_from<'a>( | |||
Int64 | |||
if matches!( | |||
type_from, | |||
Null | Int8 | Int16 | Int32 | Int64 | UInt8 | UInt16 | UInt32 | |||
Null | Int8 | Int16 | Int32 | Int64 | UInt8 | UInt16 | UInt32 | Boolean |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only i64 is fixed for passing existing test in array.slt. More types should be fixed in #8302
0dbfaab
to
8483b76
Compare
if let Some(corced_type) = corced_type { | ||
Ok(corced_type) | ||
} else { | ||
internal_err!("Coercion from {acc:?} to {x:?} failed.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We call unwrap_or previously so select [1, true, null]
unexpectedly correct since true is castable to 1 in arrow-rs but not in datafusion. select [true, 1, null]
failed. It is better that we just return error if not coercible.
Signed-off-by: jayzhan211 <[email protected]>
1467fda
to
3ba90b4
Compare
@alamb Ready for review! I hope this is a better solution for dealing with nulls. I had tried |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jayzhan211 -- this is very much in the right direction I think. I had some suggestions / comments about some of the specific cases but it is looking very close
datafusion/expr/src/signature.rs
Outdated
@@ -95,6 +95,8 @@ pub enum TypeSignature { | |||
VariadicEqual, | |||
/// One or more arguments with arbitrary types | |||
VariadicAny, | |||
/// A function such as `make_array` should be coerced to the same type | |||
VariadicCoerced, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain how this is different than VariadicEqual
? It seems from the comments here they are quite similar 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just noticed that VariadicEqual
is not used in any function. Maybe we can just keep one.
I thought Equal is the one that don't care about coercion. All the type should be equal like (i32, i32, i32).
Coerced is the one that ensuring the final coerced type is the same (all of the type coercible to the same one), like (i32, i64) -> i64.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think current VariadicCoerced
includes VariadicEqual
use case as well. We can just have one signature
@@ -113,6 +115,8 @@ pub enum TypeSignature { | |||
/// Function `make_array` takes 0 or more arguments with arbitrary types, its `TypeSignature` | |||
/// is `OneOf(vec![Any(0), VariadicAny])`. | |||
OneOf(Vec<TypeSignature>), | |||
/// Specialized Signature for ArrayAppend and similar functions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about using a more generic name. Perhaps something like
/// The first argument is an array type ([`DataType::List`], or [`DataType::LargeList`]
/// and the subsequent arguments are coerced to the List's element type
///
/// For example a call to `func(a: List(int64), b: int32, c: utf8)` would attempt to coerce
/// all the arguments to `int64`:
/// ```
/// func(a: List(int64), cast(b as int64): int64, cast(c as int64): int64)
/// ```
ArrayAndElements
There may be more general ways of expressing the array function types too 🤔
@@ -590,26 +590,6 @@ fn coerce_arguments_for_fun( | |||
.collect::<Result<Vec<_>>>()?; | |||
} | |||
|
|||
if *fun == BuiltinScalarFunction::MakeArray { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
@@ -265,10 +265,8 @@ AS VALUES | |||
(make_array([28, 29, 30], [31, 32, 33], [34, 35, 36], [28, 29, 30], [31, 32, 33], [34, 35, 36], [28, 29, 30], [31, 32, 33], [34, 35, 36], [28, 29, 30]), [28, 29, 30], [37, 38, 39], 10) | |||
; | |||
|
|||
query ? | |||
query error | |||
select [1, true, null] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is an error because true
can't be coerced to an integer, right? FWIW I think that is fine and is consistent with the postgres rues:
postgres=# select array[1, true, null];
ERROR: ARRAY types integer and boolean cannot be matched
LINE 1: select array[1, true, null];
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
# [[]] [[4]] | ||
query ??????? | ||
select | ||
array_append(null, 1), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to support array_append(null, ...
)?
Postgres does not allow this:
postgres=# select array_append(null, array[2,3]);
ERROR: could not find array type for data type integer[]
It also does't try to find a fancy type with an empty list:
postgres=# select array_append(array[], array[2,3]);
ERROR: cannot determine type of empty array
LINE 1: select array_append(array[], array[2,3]);
^
HINT: Explicitly cast to the desired type, for example ARRAY[]::integer[].
🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of other sql behavior
Duckdb: [[2, 3]], [[2, 3]]
ClickHouse: null, [[2, 3]]
Postgres: error, error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for array_append([], [2,3]), it is fine to not follow postgres and return [[2, 3]] like clickhouse and duckdb.
For array_append(null, [2, 3]), I think we can follow postrgres.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clickhouse and Duckdb has the same output for array_append(make_array(null, null), 1)
too.
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me -- thank you @jayzhan211 .
I merged this branch up from main to ensure there are no logical conflicts, and if all tests pass I intend to merge it |
Which issue does this PR close?
Ref #7142
Close #7995
Rationale for this change
Fix Null Handing for array function. ArrayAppend first, others later.
What changes are included in this PR?
Move coercion for MakeArray to different place.
Add signature for ArrayAppend
Are these changes tested?
MakeArray - existing test
ArrayAppend - new test
Are there any user-facing changes?