Add `field` trait method to `WindowUDFImpl`, remove `return_type`/`nullable` #12374
impl Expr {
    /// Common method for window functions that applies type coercion
    /// to all arguments of the window function to check if it matches
    /// its signature.
    ///
    /// If successful, this method returns the data type and
    /// nullability of the window function's result.
    ///
    /// Otherwise, returns an error if there's a type mismatch between
    /// the window function's signature and the provided arguments.
    fn data_type_and_nullable_with_window_function(
        &self,
        schema: &dyn ExprSchema,
        window_function: &WindowFunction,
    ) -> Result<(DataType, bool)> {
Extracted a common method to handle type coercion for all window function types (built-in, udaf and udwf), which is then reused by the methods `data_type_and_nullable`, `get_type`, and `nullable`.
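The refactor described in this comment follows a common pattern: compute the (type, nullability) pair in one place and let the narrower accessors delegate to it. A minimal sketch of that pattern, using hypothetical stand-in types (`DataType`, `WindowCall`) and a made-up one-argument numeric signature; none of this is DataFusion's actual code:

```rust
// Simplified stand-in for a data type enum; not Arrow's `DataType`.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Float64,
    Utf8,
}

// Hypothetical stand-in for a window function call whose argument
// types have already been resolved against a schema.
struct WindowCall {
    arg_types: Vec<DataType>,
    result_nullable: bool,
}

impl WindowCall {
    /// The single place where arguments are checked against the signature.
    fn data_type_and_nullable(&self) -> Result<(DataType, bool), String> {
        match self.arg_types.as_slice() {
            // Assumed signature: exactly one numeric argument, result is Float64.
            [DataType::Int64] | [DataType::Float64] => {
                Ok((DataType::Float64, self.result_nullable))
            }
            other => Err(format!("arguments {other:?} do not match the signature")),
        }
    }

    /// Delegates to the common method and keeps only the data type.
    fn get_type(&self) -> Result<DataType, String> {
        self.data_type_and_nullable().map(|(data_type, _)| data_type)
    }

    /// Delegates to the common method and keeps only the nullability.
    fn nullable(&self) -> Result<bool, String> {
        self.data_type_and_nullable().map(|(_, nullable)| nullable)
    }
}

fn main() {
    let ok = WindowCall { arg_types: vec![DataType::Int64], result_nullable: true };
    assert_eq!(ok.get_type(), Ok(DataType::Float64));
    assert_eq!(ok.nullable(), Ok(true));

    // A mismatch surfaces identically through both accessors.
    let bad = WindowCall { arg_types: vec![DataType::Utf8], result_nullable: true };
    assert!(bad.get_type().is_err());
    assert!(bad.nullable().is_err());
}
```

Because both accessors go through the same check, a signature mismatch is reported the same way regardless of which one the caller uses.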
/// Return the type of the function given its input types
///
/// See [`WindowUDFImpl::return_type`] for more details.
pub fn return_type(&self, args: &[DataType]) -> Result<DataType> {
    self.inner.return_type(args)
}
Removed `return_type`.
/// Returns if column values are nullable for this window function.
/// Returns the field of the final result of evaluating this window function.
///
/// See [`WindowUDFImpl::nullable`] for more details.
pub fn nullable(&self) -> bool {
    self.inner.nullable()
Removed `nullable`.
)
)
})?;
let (_, function_name) = self.qualified_name();
Use `Expr::qualified_name`, which also handles `Expr::Column` and `Expr::Alias`.
WindowFunctionDefinition::WindowUDF(fun) => fun
    .field(WindowUDFFieldArgs::new(input_expr_types, display_name))
    .map(|field| field.data_type().clone()),
Return data type for udwf.
👍
Thanks @jayzhan211
EDIT:
If this is "too late" for this kind of comment, please let me know and I'll delete. I hadn't seen the issue / PR work until today.
One performance consideration is that `Field::new` allocates a new string on each invocation.
Could that be why `WindowUDF` (and both `ScalarUDF` and `AggregateUDF`) preferred to keep the methods separate?
I dug into it because needing to add empty `&str` to all the tests in `datafusion/expr/src/expr.rs` made me think it's not the right abstraction.
Lastly, do we want to diverge Aggregate and Window UDFs?
I thought the recent trend had been to unify them, like in this PR #11550 by @timsaucer?
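To make the allocation concern above concrete, here is a minimal sketch. It uses a hypothetical stand-in `Field` type (not Arrow's) and assumes only that the constructor takes the name as `impl Into<String>`, so passing a `&str` copies it into a fresh heap buffer on every call:

```rust
// Hypothetical stand-in for a field type that owns its name.
#[derive(Debug)]
struct Field {
    name: String,
    nullable: bool,
}

impl Field {
    fn new(name: impl Into<String>, nullable: bool) -> Self {
        // Converting `&str` into `String` copies the bytes into a new buffer.
        Self { name: name.into(), nullable }
    }
}

/// Returns true when the two fields own separate heap buffers for their names.
fn names_are_separate_allocations(a: &Field, b: &Field) -> bool {
    a.name.as_ptr() != b.name.as_ptr()
}

fn main() {
    let display_name = "row_number()";
    let first = Field::new(display_name, false);
    let second = Field::new(display_name, false);

    // Same contents, but each call produced its own owned copy of the name.
    assert_eq!(first.name, second.name);
    assert!(!first.nullable && !second.nullable);
    assert!(names_are_separate_allocations(&first, &second));
}
```

Separate `return_type` and `nullable` methods avoid this copy because neither needs a name; whether the cost matters depends on how often the field is constructed.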
I don't think there is any issue, given that we usually require the whole `Field`. Note that
Maybe a historical issue?
I don't think so; they should stay separate. There is no good reason to mix them.
@Michael-J-Ward Appreciate the extra set of eyes. Thank you for reviewing the code 🙌
datafusion/datafusion/expr/src/expr.rs Lines 704 to 724 in 5cc7d06
The When @Michael-J-Ward @jayzhan211 Thank you.
👨🍳 👌
Looks really nice to me -- thank you @jcsherin and @jayzhan211
I also have it on my list to file some more sub task tickets for #8709 to remove the rest of the built in WindowFunctions
Thanks again @jayzhan211 and @jcsherin
fn nullable(&self) -> bool {
    true
}
/// The [`Field`] of the final result of evaluating this window function.
Could be useful to document here how the "name" for the returned field is supposed to be set :)
Agreed. It's a great suggestion. I'll implement in a follow-up PR.
Thanks @Blizzara
…llable` (apache#12374)
* Adds new library `functions-window-common`
* Adds `FieldArgs` struct for field of final result
* Adds `field` method to `WindowUDFImpl` trait
* Minor: fixes formatting
* Fixes: udwf doc test
* Fixes: implements missing trait items
* Updates `datafusion-cli` dependencies
* Fixes: formatting of `Cargo.toml` files
* Fixes: implementation of `field` in udwf example
* Pass `FieldArgs` argument to `field`
* Use `field` in place of `return_type` for udwf
* Update `field` in udwf implementations
* Fixes: implementation of `field` in udwf example
* Revert unrelated change
* Mark `return_type` for udwf as unreachable
* Delete code
* Uses schema name of udwf to construct `FieldArgs`
* Adds deprecated notice to `return_type` trait method
* Add doc comments to `field` trait method
* Reify `input_types` when creating the udwf window expression
* Rename name field to `schema_name` in `FieldArgs`
* Make `FieldArgs` opaque
* Minor refactor
* Removes `nullable` trait method from `WindowUDFImpl`
* Add doc comments
* Rename to `WindowUDFResultArgs`
* Minor: fixes formatting
* Copy edits for doc comments
* Renames field to `function_name`
* Rename struct to `WindowUDFFieldArgs`
* Add comments for unreachable code
* Copy edit for `WindowUDFImpl::field` trait method
* Renames module
* Fix warning: unused doc comment
* Minor: rename bindings
* Minor refactor
* Minor: copy edit
* Fixes: use `Expr::qualified_name` for window function name
* Fixes: apply previous fix to `Expr::nullable`
* Refactor: reuse type coercion for window functions
* Fixes: clippy errors
* Adds name parameter to `WindowFunctionDefinition::return_type`
* Removes `return_type` field from `SimpleWindowUDF`
* Add doc comment for helper method
* Rewrite doc comments
* Minor: remove empty comment
* Remove `WindowUDFImpl::return_type`
* Fixes doc test
I filed a few more tickets to hopefully get this process started.
Which issue does this PR close?
Closes #12373.
Rationale for this change
The result field from evaluating the user-defined window function is composed from the `return_type` and `nullable` trait methods in `WindowUDFImpl`.
This change explores folding both methods into a single trait method. The user-defined window functions have to implement only the `field` trait method, which makes the intent more explicit.
The current implementation for a user-defined window function (without the `field` trait method) looks like this:
The implementation for a user-defined window function after this change:
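As a rough illustration of that before/after shape, here is a self-contained sketch. The types `DataType`, `Field`, and `WindowUDFFieldArgs` below are simplified stand-ins defined locally, and the trait names `WindowUdfBefore`/`WindowUdfAfter`, the accessor `name()`, and the `RowNumber` example are hypothetical; this is not the code from the PR or the actual DataFusion API.

```rust
// Simplified stand-ins for the Arrow/DataFusion types; NOT the real API.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    UInt64,
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

impl Field {
    fn new(name: impl Into<String>, data_type: DataType, nullable: bool) -> Self {
        Self { name: name.into(), data_type, nullable }
    }
}

// Stand-in for `WindowUDFFieldArgs`: input types plus the result column name.
#[allow(dead_code)]
struct WindowUDFFieldArgs<'a> {
    input_types: &'a [DataType],
    name: &'a str,
}

impl<'a> WindowUDFFieldArgs<'a> {
    fn new(input_types: &'a [DataType], name: &'a str) -> Self {
        Self { input_types, name }
    }

    // Hypothetical accessor for the name the result field should carry.
    fn name(&self) -> &str {
        self.name
    }
}

// Before: the result field is composed from two separate trait methods.
trait WindowUdfBefore {
    fn return_type(&self, arg_types: &[DataType]) -> Result<DataType, String>;
    fn nullable(&self) -> bool {
        true
    }
}

// After: a single method states the whole result field.
trait WindowUdfAfter {
    fn field(&self, args: WindowUDFFieldArgs<'_>) -> Result<Field, String>;
}

struct RowNumber;

impl WindowUdfBefore for RowNumber {
    fn return_type(&self, _arg_types: &[DataType]) -> Result<DataType, String> {
        Ok(DataType::UInt64)
    }
    fn nullable(&self) -> bool {
        false
    }
}

impl WindowUdfAfter for RowNumber {
    fn field(&self, args: WindowUDFFieldArgs<'_>) -> Result<Field, String> {
        Ok(Field::new(args.name(), DataType::UInt64, false))
    }
}

// What a caller had to assemble itself in the "before" design.
fn field_before(
    udwf: &impl WindowUdfBefore,
    input_types: &[DataType],
    name: &str,
) -> Result<Field, String> {
    Ok(Field::new(name, udwf.return_type(input_types)?, udwf.nullable()))
}

fn main() {
    let udwf = RowNumber;
    let before = field_before(&udwf, &[], "row_number()").unwrap();
    let after = udwf.field(WindowUDFFieldArgs::new(&[], "row_number()")).unwrap();
    // Both designs describe the same result column.
    assert_eq!(before, after);
}
```

In the sketch, the "after" implementation carries the type, the nullability, and the name in one place, which is the explicitness the description refers to.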
What changes are included in this PR?
* `field` trait method.
* `return_type` trait method (datafusion/datafusion/expr/src/udwf.rs, lines 282 to 284 in a08f923).
* `nullable` trait method, which was added in "Convert built-in `row_number` to user-defined window function" (#12030).
* `WindowUDFFieldArgs`.
Are these changes tested?
Yes, against existing tests in CI.
Are there any user-facing changes?
Yes, this is a breaking change for user-defined window functions API.