Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add PartialOrd for the DF subfields/structs for the WindowFunction expr #12421

Merged
merged 9 commits into from
Sep 12, 2024

Conversation

ngli-me
Copy link
Contributor

@ngli-me ngli-me commented Sep 10, 2024

Which issue does this PR close?

Related #8932 (well, part of it).

Rationale for this change

Smaller PR for changing the Datafusion side subfields/structs relating to the WindowFunction expression type. There are still a few Arrow types that need PartialOrd implementations, looking for some direction on whether those should be added, or if the comparison for those fields can be ignored.

What changes are included in this PR?

BuiltInWindowFunction -> Derives PartialOrd
AggregateUDFImpl -> Manual implementation of PartialOrd based on name + signature fields
AggergateUDF -> Derives PartialOrd from impl (Added tests for this ordering as well)
WindowUDFImpl -> Manual implementation of PartialOrd based on name + signature fields
WindowUDF -> Derives PartialOrd from impl (Added tests for this ordering as well)
TypeSignature -> Derives PartialOrd (Added some sanity assertions for this)

Are these changes tested?

Yes, but not extensively.

Are there any user-facing changes?

Some of the subtypes will have PartialOrd applicable.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @ngli-me -- this looks like a very nice incremental step. I have a question about manually implementing PartialOrd but otherwise this is 👌

@@ -127,7 +129,62 @@ pub enum TypeSignature {
Numeric(usize),
}

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
// manual implementation of `PartialOrd`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could automatically derive PartialOrd for these enums rather than implement them manually.

using derive I think would be easier to maintain over the long term (aka it is easier to add new variants to the enum without having ti update the partial ord implemetnation)

I tried this locally and it seems to work:

andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion$ git diff
diff --git a/datafusion/expr-common/src/signature.rs b/datafusion/expr-common/src/signature.rs
index 2043757a4..3f0f61f2f 100644
--- a/datafusion/expr-common/src/signature.rs
+++ b/datafusion/expr-common/src/signature.rs
@@ -84,7 +84,7 @@ pub enum Volatility {
 ///   DataType::Timestamp(TimeUnit::Nanosecond, Some(TIMEZONE_WILDCARD.into())),
 /// ]);
 /// ```
-#[derive(Debug, Clone, PartialEq, Eq, Hash)]
+#[derive(Debug, Clone, PartialEq, Eq, Hash, PartialOrd)]
 pub enum TypeSignature {
     /// One or more arguments of an common type out of a list of valid types.
     ///
@@ -127,7 +127,7 @@ pub enum TypeSignature {
     Numeric(usize),
 }

-#[derive(Debug, Clone, PartialEq, Eq, Hash)]
+#[derive(Debug, Clone, PartialEq, Eq, Hash, PartialOrd)]
 pub enum ArrayFunctionSignature {
     /// Specialized Signature for ArrayAppend and similar functions
     /// The first argument should be List/LargeList/FixedSizedList, and the second argument should be non-list or list.

Copy link
Contributor Author

@ngli-me ngli-me Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, for some reason I thought that derives wouldn't work with an enum with differing types, but I see it does work with the tests. Added an additional sanity assertion just to check, and removed the manual implementation for TypeSignature, thanks!

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me -- thank you @ngli-me

@alamb
Copy link
Contributor

alamb commented Sep 12, 2024

(I think the description of this PR might need to be udpated)

@alamb
Copy link
Contributor

alamb commented Sep 12, 2024

Also, there appears to be a clippy failure

@ngli-me
Copy link
Contributor Author

ngli-me commented Sep 12, 2024

Made some adjustments to fix that clippy issue and fmt.

@alamb
Copy link
Contributor

alamb commented Sep 12, 2024

🚀
THanks again @ngli-me

@alamb alamb merged commit 389f7f7 into apache:main Sep 12, 2024
26 checks passed
@ngli-me ngli-me deleted the bug/expr_partial_ord_window_expr branch September 13, 2024 00:31
@jayzhan211
Copy link
Contributor

jayzhan211 commented Oct 8, 2024

Can we not implement PartialOrd for function? I don't think it make sense to have ordering concept for the function.
If there is some reason that we can't avoid it, can we compare with the name only? When I try to implement LogicalType for TypeSignature, I forced to implement PartialOrd as well, but I don't think it make sense

@alamb
Copy link
Contributor

alamb commented Oct 8, 2024

If there is some reason that we can't avoid it, can we compare with the name only?

I think one place it is used is to normalize the order of expressions (maybe for CSE or simplification 🤔 )

Comparing on name is probably fine too I would think

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
logical-expr Logical plan and expressions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants