Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AggregateExec: Take grouping sets into account for InputOrderMode #11301

Merged
merged 3 commits into from
Jul 7, 2024

Conversation

thinkharderdev
Copy link
Contributor

Which issue does this PR close?

Closes #11291

Rationale for this change

Fixes incorrect InputOrderMode when aggregation has grouping sets

What changes are included in this PR?

When deriving indices for ordered preset, explicitly check all expanded grouping sets

Are these changes tested?

Yes

Are there any user-facing changes?

No

No

@alamb
Copy link
Contributor

alamb commented Jul 6, 2024

I think clippy is failing on this PR due to #11302

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes senes to me. Thank you @thinkharderdev

I also double checked test coverage by running the new unit test without the changes in this PR and it indeed fails as expected


thread 'aggregates::tests::test_agg_exec_group_by_const' panicked at datafusion/physical-plan/src/aggregates/mod.rs:2360:9:
assertion `left == right` failed: 

expected:

[
    "+-----+-----+-------+----------+",
    "| a   | b   | const | 1[count] |",
    "+-----+-----+-------+----------+",
    "|     |     | 1     | 32768    |",
    "|     | 0.0 |       | 32768    |",
    "| 0.0 |     |       | 32768    |",
    "+-----+-----+-------+----------+",
]
actual:

[
    "+-----+-----+-------+----------+",
    "| a   | b   | const | 1[count] |",
    "+-----+-----+-------+----------+",
    "|     |     | 1     | 16384    |",
    "|     |     | 1     | 16384    |",
    "|     | 0.0 |       | 16384    |",
    "|     | 0.0 |       | 8192     |",
    "|     | 0.0 |       | 8192     |",
    "| 0.0 |     |       | 16384    |",
    "| 0.0 |     |       | 8192     |",
    "| 0.0 |     |       | 8192     |",
    "+-----+-----+-------+----------+",
]

} else {
InputOrderMode::Linear
};
let indices: Vec<usize> = indices
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend adding a comment here (perhaps copied from the description of the PR) explaining the subtelty involving grouping sets

let aggregates: Vec<Arc<dyn AggregateExpr>> = vec![create_aggregate_expr(
count_udaf().as_ref(),
&[lit(1)],
&[Expr::Literal(ScalarValue::Int32(Some(1)))],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could make this more concise like this (and ensure the type is correct)

Suggested change
&[Expr::Literal(ScalarValue::Int32(Some(1)))],
&[lit(1i32)],

@thinkharderdev thinkharderdev merged commit e693ed7 into apache:main Jul 7, 2024
23 checks passed
@thinkharderdev thinkharderdev deleted the issue-11291 branch July 7, 2024 10:37
thinkharderdev added a commit to coralogix/arrow-datafusion that referenced this pull request Jul 8, 2024
…ache#11301)

* AggregateExec: Take grouping sets into account for InputOrderMode

* pr comments
thinkharderdev added a commit to coralogix/arrow-datafusion that referenced this pull request Jul 8, 2024
…ache#11301)

* AggregateExec: Take grouping sets into account for InputOrderMode

* pr comments
# Conflicts:
#	datafusion/physical-plan/src/aggregates/mod.rs
findepi pushed a commit to findepi/datafusion that referenced this pull request Jul 16, 2024
…ache#11301)

* AggregateExec: Take grouping sets into account for InputOrderMode

* pr comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect results in aggregation queries with grouping sets
2 participants