Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using display_name for Expr::Aggregation #11020

Merged
merged 1 commit into from
Jun 21, 2024
Merged

Conversation

Lordworms
Copy link
Contributor

@Lordworms Lordworms commented Jun 20, 2024

Which issue does this PR close?

Closes #10878

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) labels Jun 20, 2024
@Lordworms
Copy link
Contributor Author

Lordworms commented Jun 20, 2024

Also happened to solve part of #10364

@Lordworms
Copy link
Contributor Author

The reason for the error is that the field name generation rules of AggregationFunction are different in logicalplan and physicalplan. The former uses display_name() and the latter uses create_physical_name.

@alamb alamb requested a review from jayzhan211 June 20, 2024 11:23
@@ -1916,6 +1916,7 @@ pub fn create_aggregate_expr_and_maybe_filter(
// unpack (nested) aliased logical expressions, e.g. "sum(col) as total"
let (name, e) = match e {
Expr::Alias(Alias { expr, name, .. }) => (name.clone(), expr.as_ref()),
Expr::AggregateFunction(_) => (e.display_name().unwrap_or(physical_name(e)?), e),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we call display_name in physical_name like what ScalarFunction does.
And even more, I think there are no such difference between logical name and physical name, maybe we can have one name for these two 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be nice if we also use display_name in impl fmt::Display for Expr

Copy link
Contributor

@jayzhan211 jayzhan211 Jun 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems CastExpr is different in write_name and Expr's Display 🤔

And, there is an issue that expect CastExpr be shown in write name

#10274

CastExpr is removed in #3222, but I didn't find the rationale of doing this

If the issue is due to the schema change, then it will be quite tricky #10276

We can reuse display_name in physical_name to fix the regression first

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems CastExpr is different in write_name and Expr's Display 🤔

And, there is an issue that expect CastExpr be shown in write name

#10274

CastExpr is removed in #3222, but I didn't find the rationale of doing this

If the issue is due to the schema change, then it will be quite tricky #10276

We can reuse display_name in physical_name to fix the regression first

Yes, I think calling display_name like physical name is a good idea, I tried it before, but there are some tests failed, there might be some reason that the display_name is different from physical name, I think for this issue, only change AggregateFunction is enough, I can take other related issue and fix them in the following PR, Thanks for review !


ctx.register_table("test", Arc::new(table))?;

let sql = "SELECT MIN(a) FILTER (WHERE a > 1) AS x FROM test";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what is this test for 🤔

Is it possible to move to slt?

statement ok
set datafusion.sql_parser.dialect = 'Postgres';

statement ok
create table t (a float) as values (1), (2), (3);

query TT
explain select min(a) filter (where a > 1) as x from t;
----
logical_plan
01)Projection: MIN(t.a) FILTER (WHERE t.a > Int64(1)) AS x
02)--Aggregate: groupBy=[[]], aggr=[[MIN(t.a) FILTER (WHERE t.a > Float32(1)) AS MIN(t.a) FILTER (WHERE t.a > Int64(1))]]
03)----TableScan: t projection=[a]
physical_plan
01)ProjectionExec: expr=[MIN(t.a) FILTER (WHERE t.a > Int64(1))@0 as x]
02)--AggregateExec: mode=Single, gby=[], aggr=[MIN(t.a) FILTER (WHERE t.a > Int64(1))]
03)----MemoryExec: partitions=1, partition_sizes=[1]

statement ok
drop table t;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I can do that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what is this test for 🤔

Is it possible to move to slt?

statement ok
set datafusion.sql_parser.dialect = 'Postgres';

statement ok
create table t (a float) as values (1), (2), (3);

query TT
explain select min(a) filter (where a > 1) as x from t;
----
logical_plan
01)Projection: MIN(t.a) FILTER (WHERE t.a > Int64(1)) AS x
02)--Aggregate: groupBy=[[]], aggr=[[MIN(t.a) FILTER (WHERE t.a > Float32(1)) AS MIN(t.a) FILTER (WHERE t.a > Int64(1))]]
03)----TableScan: t projection=[a]
physical_plan
01)ProjectionExec: expr=[MIN(t.a) FILTER (WHERE t.a > Int64(1))@0 as x]
02)--AggregateExec: mode=Single, gby=[], aggr=[MIN(t.a) FILTER (WHERE t.a > Int64(1))]
03)----MemoryExec: partitions=1, partition_sizes=[1]

statement ok
drop table t;

This test is for the original issue #10878

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what is this test for 🤔
Is it possible to move to slt?

statement ok
set datafusion.sql_parser.dialect = 'Postgres';

statement ok
create table t (a float) as values (1), (2), (3);

query TT
explain select min(a) filter (where a > 1) as x from t;
----
logical_plan
01)Projection: MIN(t.a) FILTER (WHERE t.a > Int64(1)) AS x
02)--Aggregate: groupBy=[[]], aggr=[[MIN(t.a) FILTER (WHERE t.a > Float32(1)) AS MIN(t.a) FILTER (WHERE t.a > Int64(1))]]
03)----TableScan: t projection=[a]
physical_plan
01)ProjectionExec: expr=[MIN(t.a) FILTER (WHERE t.a > Int64(1))@0 as x]
02)--AggregateExec: mode=Single, gby=[], aggr=[MIN(t.a) FILTER (WHERE t.a > Int64(1))]
03)----MemoryExec: partitions=1, partition_sizes=[1]

statement ok
drop table t;

This test is for the original issue #10878

I guess the test you write only test if the value is correct or not, didn't test if the name is expected

@Lordworms
Copy link
Contributor Author

An unrelated issue with CI/CD

@jayzhan211
Copy link
Contributor

Thanks @Lordworms

@jayzhan211 jayzhan211 merged commit a4799c0 into apache:main Jun 21, 2024
23 checks passed
xinlifoobar pushed a commit to xinlifoobar/datafusion that referenced this pull request Jun 22, 2024
xinlifoobar pushed a commit to xinlifoobar/datafusion that referenced this pull request Jun 22, 2024
findepi pushed a commit to findepi/datafusion that referenced this pull request Jul 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Aggregate with FILTER clause and alias errors
2 participants