Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial changes to support using udaf min/max for statistics and opti… #11696

Merged
merged 3 commits into from
Jul 30, 2024

Conversation

edmondop
Copy link
Contributor

This PR tries to extract some work necessary for #10943. Min and Max are special functions that are used for statistics and optimizations, and they need a special treatment.

@github-actions github-actions bot added the core Core DataFusion crate label Jul 28, 2024
/// None in all other cases, used in certain optimizations or
/// or aggregate
///
pub fn get_minmax_desc(&self) -> Option<bool> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I suggest using an enum like enum MinMax { Min, Max } instead of a bool?

Copy link
Contributor

@jayzhan211 jayzhan211 Jul 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the code that we need the boolean, maybe name it with is_descending: Option<bool>?

        let (field, desc) = aggr.get_minmax_desc()?;
        if desc != order.options.descending {
            return None;
        }

And, we can get fields separately

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed with eb0ebe0 (#11696) although I am not sure I understood what "is descending" mean for a function?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the same idea like descending in SortOptions

pub struct SortOptions {
    /// Whether to sort in descending order
    pub descending: bool,

Would is_sorted_descending: Option<bool> be much more better?

Copy link
Contributor

@jayzhan211 jayzhan211 Jul 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let field = aggr.field()
let is_sorted_descending = aggr.is_sorted_descending()

And, I think getting field and descending separately is better than get_minmax_desc, so if we can get field only if we don't care about descending

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. The goal of this PR was to simplify #10943 by implementing here the necessary method for udaf, do you think I should proceed in the refactoring now or can we do as a separate improvement?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary

@edmondop edmondop requested a review from jayzhan211 July 30, 2024 00:33
@jayzhan211
Copy link
Contributor

Thanks @edmondop

@jayzhan211 jayzhan211 merged commit 35c2e7e into apache:main Jul 30, 2024
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate logical-expr Logical plan and expressions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants