
Add new physical rule CombinePartialFinalAggregate #5837

Merged
merged 12 commits into apache:main on Apr 12, 2023

Conversation

mingmwang
Contributor

@mingmwang mingmwang commented Apr 3, 2023

Which issue does this PR close?

Closes #5836
Closes #5774.

Rationale for this change

Improve the performance of Aggregate

What changes are included in this PR?

  1. Implement PartialEq for AggregateExpr
  2. Add a new aggregate mode: AggregateMode::Single
  3. Add a new rule, CombinePartialFinalAggregate, to combine adjacent Partial and Final AggregateExecs (a simplified sketch of the rule's logic follows this list)
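
To illustrate the idea, here is a minimal sketch of the combining logic. The types are simplified stand-ins (the real rule works on DataFusion's ExecutionPlan tree, PhysicalGroupBy, and Arc<dyn AggregateExpr>), so treat this as an outline of the rule rather than the merged implementation:

// Minimal sketch; `AggregateExec` here is a simplified stand-in type,
// not DataFusion's real operator.
#[derive(Clone, Debug, PartialEq)]
enum AggregateMode {
    Partial,
    Final,
    Single,
}

#[derive(Clone, Debug, PartialEq)]
struct AggregateExec {
    mode: AggregateMode,
    group_by: Vec<String>,  // stand-in for the physical group-by expressions
    aggr_expr: Vec<String>, // stand-in for Vec<Arc<dyn AggregateExpr>>
    input: Option<Box<AggregateExec>>,
}

/// If `plan` is a Final aggregate whose direct child is a Partial aggregate
/// with identical group keys and aggregate expressions (i.e. no
/// RepartitionExec or other operator sits between them), collapse the pair
/// into one mode=Single aggregate.
fn combine_partial_final(plan: AggregateExec) -> AggregateExec {
    if plan.mode == AggregateMode::Final {
        if let Some(child) = plan.input.as_deref() {
            if child.mode == AggregateMode::Partial
                && child.group_by == plan.group_by
                && child.aggr_expr == plan.aggr_expr
            {
                let mut combined = child.clone();
                combined.mode = AggregateMode::Single;
                return combined;
            }
        }
    }
    plan
}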

Are these changes tested?

TPCH-q17

cargo run --bin tpch -- benchmark datafusion --iterations 1 --path ./parquet_data --format parquet --query 17 -n 1 --disable-statistics --debug

Before this PR:

=== Physical plan with metrics ===
ProjectionExec: expr=[CAST(SUM(lineitem.l_extendedprice)@0 AS Float64) / 7 as avg_yearly], metrics=[output_rows=1, elapsed_compute=4.708µs, spill_count=0, spilled_bytes=0, mem_used=0]
  AggregateExec: mode=Final, gby=[], aggr=[SUM(lineitem.l_extendedprice)], metrics=[output_rows=1, elapsed_compute=2.25µs, spill_count=0, spilled_bytes=0, mem_used=0]
    AggregateExec: mode=Partial, gby=[], aggr=[SUM(lineitem.l_extendedprice)], metrics=[output_rows=1, elapsed_compute=8.125µs, spill_count=0, spilled_bytes=0, mem_used=0]
      ProjectionExec: expr=[l_extendedprice@1 as l_extendedprice], metrics=[output_rows=587, elapsed_compute=292ns, spill_count=0, spilled_bytes=0, mem_used=0]
        CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=587, elapsed_compute=23.22µs, spill_count=0, spilled_bytes=0, mem_used=0]
          HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(Column { name: "p_partkey", index: 2 }, Column { name: "l_partkey", index: 0 })], filter=BinaryExpr { left: CastExpr { expr: Column { name: "l_quantity", index: 0 }, cast_type: Decimal128(30, 15), cast_options: CastOptions { safe: false } }, op: Lt, right: CastExpr { expr: Column { name: "__value", index: 1 }, cast_type: Decimal128(30, 15), cast_options: CastOptions { safe: false } } }, metrics=[output_rows=200000, build_input_batches=1, input_rows=200000, output_batches=25, input_batches=25, build_input_rows=6088, build_mem_used=514520, build_time=441.023722ms, join_time=1.100547ms]
            ProjectionExec: expr=[l_quantity@1 as l_quantity, l_extendedprice@2 as l_extendedprice, p_partkey@3 as p_partkey], metrics=[output_rows=6088, elapsed_compute=375ns, spill_count=0, spilled_bytes=0, mem_used=0]
              CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=6088, elapsed_compute=2.5µs, spill_count=0, spilled_bytes=0, mem_used=0]
                HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(Column { name: "l_partkey", index: 0 }, Column { name: "p_partkey", index: 0 })], metrics=[output_rows=204, build_input_batches=733, input_rows=204, output_batches=1, input_batches=1, build_input_rows=6001215, build_mem_used=517375864, build_time=428.277166ms, join_time=4.338959ms]
                  ParquetExec: limit=None, partitions={1 group: [[Users/mingmwang/gitrepo/apache/arrow-datafusion/benchmarks/parquet_data/lineitem/part-0.parquet]]}, projection=[l_partkey, l_quantity, l_extendedprice], metrics=[output_rows=6001215, elapsed_compute=1ns, spill_count=0, spilled_bytes=0, mem_used=0, predicate_evaluation_errors=0, bytes_scanned=45261954, pushdown_rows_filtered=0, row_groups_pruned=0, num_predicate_creation_errors=0, page_index_rows_filtered=0, page_index_eval_time=2ns, time_elapsed_processing=171.43936ms, pushdown_eval_time=2ns, time_elapsed_scanning_total=180.158936ms, time_elapsed_opening=1.045458ms, time_elapsed_scanning_until_data=4.35775ms]
                  ProjectionExec: expr=[p_partkey@0 as p_partkey], metrics=[output_rows=204, elapsed_compute=208ns, spill_count=0, spilled_bytes=0, mem_used=0]
                    CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=204, elapsed_compute=13.712µs, spill_count=0, spilled_bytes=0, mem_used=0]
                      FilterExec: p_brand@1 = Brand#23 AND p_container@2 = MED BOX, metrics=[output_rows=204, elapsed_compute=2.42554ms, spill_count=0, spilled_bytes=0, mem_used=0]
                        ParquetExec: limit=None, partitions={1 group: [[Users/mingmwang/gitrepo/apache/arrow-datafusion/benchmarks/parquet_data/part/part-0.parquet]]}, predicate=p_brand@3 = Brand#23 AND p_container@6 = MED BOX, pruning_predicate=p_brand_min@0 <= Brand#23 AND Brand#23 <= p_brand_max@1 AND p_container_min@2 <= MED BOX AND MED BOX <= p_container_max@3, projection=[p_partkey, p_brand, p_container], metrics=[output_rows=200000, elapsed_compute=1ns, spill_count=0, spilled_bytes=0, mem_used=0, predicate_evaluation_errors=0, bytes_scanned=744742, pushdown_rows_filtered=0, row_groups_pruned=0, num_predicate_creation_errors=0, page_index_rows_filtered=0, page_index_eval_time=2ns, time_elapsed_processing=5.893912ms, pushdown_eval_time=2ns, time_elapsed_scanning_total=8.75517ms, time_elapsed_opening=2.033041ms, time_elapsed_scanning_until_data=2.282542ms]
            ProjectionExec: expr=[l_partkey@0 as l_partkey, 0.2 * CAST(AVG(lineitem.l_quantity)@1 AS Float64) as __value], metrics=[output_rows=200000, elapsed_compute=604.917µs, spill_count=0, spilled_bytes=0, mem_used=0]
              AggregateExec: mode=Final, gby=[l_partkey@0 as l_partkey], aggr=[AVG(lineitem.l_quantity)], metrics=[output_rows=200000, elapsed_compute=59.743373ms, spill_count=0, spilled_bytes=0, mem_used=0]
                AggregateExec: mode=Partial, gby=[l_partkey@0 as l_partkey], aggr=[AVG(lineitem.l_quantity)], metrics=[output_rows=200000, elapsed_compute=2.991533862s, spill_count=0, spilled_bytes=0, mem_used=0]
                  ParquetExec: limit=None, partitions={1 group: [[Users/mingmwang/gitrepo/apache/arrow-datafusion/benchmarks/parquet_data/lineitem/part-0.parquet]]}, projection=[l_partkey, l_quantity], metrics=[output_rows=6001215, elapsed_compute=1ns, spill_count=0, spilled_bytes=0, mem_used=0, predicate_evaluation_errors=0, bytes_scanned=24308735, pushdown_rows_filtered=0, row_groups_pruned=0, num_predicate_creation_errors=0, page_index_rows_filtered=0, page_index_eval_time=2ns, time_elapsed_processing=96.39778ms, pushdown_eval_time=2ns, time_elapsed_scanning_total=3.083229128s, time_elapsed_opening=1.210625ms, time_elapsed_scanning_until_data=2.773209ms]

After this PR:

ProjectionExec: expr=[CAST(SUM(lineitem.l_extendedprice)@0 AS Float64) / 7 as avg_yearly], metrics=[output_rows=1, elapsed_compute=3.292µs, spill_count=0, spilled_bytes=0, mem_used=0]
  AggregateExec: mode=Single, gby=[], aggr=[SUM(lineitem.l_extendedprice)], metrics=[output_rows=1, elapsed_compute=7.041µs, spill_count=0, spilled_bytes=0, mem_used=0]
    ProjectionExec: expr=[l_extendedprice@1 as l_extendedprice], metrics=[output_rows=587, elapsed_compute=209ns, spill_count=0, spilled_bytes=0, mem_used=0]
      CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=587, elapsed_compute=23.677µs, spill_count=0, spilled_bytes=0, mem_used=0]
        HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(Column { name: "p_partkey", index: 2 }, Column { name: "l_partkey", index: 0 })], filter=BinaryExpr { left: CastExpr { expr: Column { name: "l_quantity", index: 0 }, cast_type: Decimal128(30, 15), cast_options: CastOptions { safe: false } }, op: Lt, right: CastExpr { expr: Column { name: "__value", index: 1 }, cast_type: Decimal128(30, 15), cast_options: CastOptions { safe: false } } }, metrics=[output_rows=200000, build_input_rows=6088, input_rows=200000, output_batches=25, build_input_batches=1, input_batches=25, build_mem_used=514520, join_time=1.089208ms, build_time=444.242481ms]
          ProjectionExec: expr=[l_quantity@1 as l_quantity, l_extendedprice@2 as l_extendedprice, p_partkey@3 as p_partkey], metrics=[output_rows=6088, elapsed_compute=375ns, spill_count=0, spilled_bytes=0, mem_used=0]
            CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=6088, elapsed_compute=2.46µs, spill_count=0, spilled_bytes=0, mem_used=0]
              HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(Column { name: "l_partkey", index: 0 }, Column { name: "p_partkey", index: 0 })], metrics=[output_rows=204, build_input_rows=6001215, input_rows=204, output_batches=1, build_input_batches=733, input_batches=1, build_mem_used=517375864, join_time=3.597959ms, build_time=432.300961ms]
                ParquetExec: limit=None, partitions={1 group: [[Users/mingmwang/gitrepo/apache/arrow-datafusion/benchmarks/parquet_data/lineitem/part-0.parquet]]}, projection=[l_partkey, l_quantity, l_extendedprice], metrics=[output_rows=6001215, elapsed_compute=1ns, spill_count=0, spilled_bytes=0, mem_used=0, num_predicate_creation_errors=0, predicate_evaluation_errors=0, pushdown_rows_filtered=0, row_groups_pruned=0, page_index_rows_filtered=0, bytes_scanned=45261954, page_index_eval_time=2ns, time_elapsed_processing=172.17813ms, pushdown_eval_time=2ns, time_elapsed_opening=1.031708ms, time_elapsed_scanning_total=180.707651ms, time_elapsed_scanning_until_data=4.388ms]
                ProjectionExec: expr=[p_partkey@0 as p_partkey], metrics=[output_rows=204, elapsed_compute=375ns, spill_count=0, spilled_bytes=0, mem_used=0]
                  CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=204, elapsed_compute=13.627µs, spill_count=0, spilled_bytes=0, mem_used=0]
                    FilterExec: p_brand@1 = Brand#23 AND p_container@2 = MED BOX, metrics=[output_rows=204, elapsed_compute=2.366041ms, spill_count=0, spilled_bytes=0, mem_used=0]
                      ParquetExec: limit=None, partitions={1 group: [[Users/mingmwang/gitrepo/apache/arrow-datafusion/benchmarks/parquet_data/part/part-0.parquet]]}, predicate=p_brand@3 = Brand#23 AND p_container@6 = MED BOX, pruning_predicate=p_brand_min@0 <= Brand#23 AND Brand#23 <= p_brand_max@1 AND p_container_min@2 <= MED BOX AND MED BOX <= p_container_max@3, projection=[p_partkey, p_brand, p_container], metrics=[output_rows=200000, elapsed_compute=1ns, spill_count=0, spilled_bytes=0, mem_used=0, num_predicate_creation_errors=0, predicate_evaluation_errors=0, pushdown_rows_filtered=0, row_groups_pruned=0, page_index_rows_filtered=0, bytes_scanned=744742, page_index_eval_time=2ns, time_elapsed_processing=5.880331ms, pushdown_eval_time=2ns, time_elapsed_opening=1.377625ms, time_elapsed_scanning_total=8.683539ms, time_elapsed_scanning_until_data=2.290333ms]
          ProjectionExec: expr=[l_partkey@0 as l_partkey, 0.2 * CAST(AVG(lineitem.l_quantity)@1 AS Float64) as __value], metrics=[output_rows=200000, elapsed_compute=585.001µs, spill_count=0, spilled_bytes=0, mem_used=0]
            AggregateExec: mode=Single, gby=[l_partkey@0 as l_partkey], aggr=[AVG(lineitem.l_quantity)], metrics=[output_rows=200000, elapsed_compute=2.682685209s, spill_count=0, spilled_bytes=0, mem_used=0]
              ParquetExec: limit=None, partitions={1 group: [[Users/mingmwang/gitrepo/apache/arrow-datafusion/benchmarks/parquet_data/lineitem/part-0.parquet]]}, projection=[l_partkey, l_quantity], metrics=[output_rows=6001215, elapsed_compute=1ns, spill_count=0, spilled_bytes=0, mem_used=0, num_predicate_creation_errors=0, predicate_evaluation_errors=0, pushdown_rows_filtered=0, row_groups_pruned=0, page_index_rows_filtered=0, bytes_scanned=24308735, page_index_eval_time=2ns, time_elapsed_processing=98.302065ms, pushdown_eval_time=2ns, time_elapsed_opening=1.118791ms, time_elapsed_scanning_total=2.782967665s, time_elapsed_scanning_until_data=2.825542ms]

Before this PR:

Query 17 iteration 0 took 3395.1 ms and returned 1 rows
Query 17 iteration 1 took 3598.1 ms and returned 1 rows
Query 17 iteration 2 took 3554.1 ms and returned 1 rows
Query 17 avg time: 3515.76 ms

After this PR:

Query 17 iteration 0 took 3486.8 ms and returned 1 rows
Query 17 iteration 1 took 3211.4 ms and returned 1 rows
Query 17 iteration 2 took 3201.6 ms and returned 1 rows
Query 17 avg time: 3299.93 ms

Are there any user-facing changes?

@github-actions github-actions bot added core Core DataFusion crate physical-expr Physical Expressions labels Apr 3, 2023
@mingmwang
Contributor Author

Will add some unit tests soon.

@yjshen
Member

yjshen commented Apr 3, 2023

I understand the collapsing rule: it removes the requirement of creating a RecordBatch from the intermediate states and then reading them back for final evaluation.

As for naming this new aggregation mode, I find Complete more descriptive when displayed as output, but I have no strong preference.

ProjectionExec: expr=[l_partkey@0 as l_partkey, ....
  AggregateExec: mode=Single...
      ParquetExec ...
ProjectionExec: expr=[l_partkey@0 as l_partkey, ...
  AggregateExec: mode=Complete...
      ParquetExec ...

@Dandandan
Contributor

As far as I can see, this only works for single partitions as input and not repartitioning in between (e.g. no concurrency), could you confirm?

@mingmwang
Contributor Author

As far as I can see, this only works for single partitions as input and not repartitioning in between (e.g. no concurrency), could you confirm?

Not always. We will see adjacent Partial + Final AggregateExecs for a normal join plus aggregation on the same key.
I will add more unit tests and integration tests tomorrow to show the cases:

select distinct(t1.t1_id) from t1 inner join t2 on t1.t1_id = t2.t2_id;
AggregateExec: mode=Single, gby=[t1_id@0 as t1_id], aggr=[]
  ProjectionExec: expr=[t1_id@0 as t1_id]
    CoalesceBatchesExec: target_batch_size=4096
      HashJoinExec: mode=Partitioned, join_type=Inner, on=[(Column { name: "t1_id", index: 0 }, Column { name: "t2_id", index: 0 })]
        CoalesceBatchesExec: target_batch_size=4096
          RepartitionExec: partitioning=Hash([Column { name: "t1_id", index: 0 }], 2), input_partitions=2
            RepartitionExec: partitioning=RoundRobinBatch(2), input_partitions=1
              MemoryExec: partitions=1, partition_sizes=[1]
        CoalesceBatchesExec: target_batch_size=4096
          RepartitionExec: partitioning=Hash([Column { name: "t2_id", index: 0 }], 2), input_partitions=2
            RepartitionExec: partitioning=RoundRobinBatch(2), input_partitions=1
              MemoryExec: partitions=1, partition_sizes=[1]

@Dandandan
Contributor

Ah yes - in that case the underlying input is already hash-repartitioned on the key. Makes sense, thanks.

@mingmwang
Contributor Author

@Dandandan @yjshen @alamb
Would you please help review and approve this PR?

@alamb
Contributor

alamb commented Apr 10, 2023

I will review this PR carefully today

@alamb alamb added the api change Changes the API exposed to users of the crate label Apr 10, 2023
Contributor

@alamb alamb left a comment

I reviewed the code carefully. I have some suggestions on testing and documentation which I think would improve this PR but are not absolutely required to merge.

Thank you @mingmwang and sorry for the delay in reviewing

.expr
.iter()
.zip(other.expr.iter())
.all(|((expr1, name1), (expr2, name2))| expr1.eq(expr2) && name1 == name2)
Contributor

I wondered why this needed to be manually derived, so I tried removing it and got this error:

error[E0369]: binary operation `==` cannot be applied to type `Vec<(Arc<dyn PhysicalExpr>, std::string::String)>`
  --> datafusion/core/src/physical_plan/aggregates/mod.rs:91:5
   |
88 | #[derive(Clone, Debug, Default, PartialEq)]
   |                                 --------- in this derive macro expansion
...
91 |     expr: Vec<(Arc<dyn PhysicalExpr>, String)>,
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: this error originates in the derive macro `PartialEq` (in Nightly builds, run with -Z macro-backtrace for more info)

Contributor Author

@mingmwang mingmwang Apr 11, 2023

It looks like if a struct contains any boxed trait object, we cannot use the PartialEq derive macro.

rust-lang/rust#39128
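
To see the pattern in isolation, here is a self-contained sketch (the trait and its `eq_expr` method are simplified hypothetical stand-ins, not DataFusion's actual definitions). The derive fails because `==` is not defined for trait objects, so equality is written by hand through a comparison method on the trait:

use std::sync::Arc;

// Hypothetical stand-in: DataFusion's PhysicalExpr carries its own equality
// logic; it is modeled here as a single `eq_expr` method.
trait PhysicalExpr {
    fn eq_expr(&self, other: &dyn PhysicalExpr) -> bool;
}

struct PhysicalGroupBy {
    expr: Vec<(Arc<dyn PhysicalExpr>, String)>,
}

// #[derive(PartialEq)] would hit E0369 here, because `==` does not exist for
// Arc<dyn PhysicalExpr>; instead, compare element-wise through the trait.
impl PartialEq for PhysicalGroupBy {
    fn eq(&self, other: &Self) -> bool {
        self.expr.len() == other.expr.len()
            && self
                .expr
                .iter()
                .zip(other.expr.iter())
                .all(|((e1, n1), (e2, n2))| e1.eq_expr(e2.as_ref()) && n1 == n2)
    }
}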

@@ -65,6 +65,8 @@ pub enum AggregateMode {
/// with Hash repartitioning on the group keys. If a group key is
/// duplicated, duplicate groups would be produced
FinalPartitioned,
/// Single aggregate is a combination of Partial and Final aggregate mode
Contributor

Suggested change
/// Single aggregate is a combination of Partial and Final aggregate mode
/// Applies the entire logical aggregation operation in a single operator,
/// as opposed to Partial / Final modes which apply the logical aggregation using
/// two operators.

let physical_plan = dataframe.create_physical_plan().await?;
let expected = vec![
    "AggregateExec: mode=Single, gby=[t1_id@0 as t1_id], aggr=[]",
Contributor

Is it correct that this plan can use a single aggregate because it is already partitioned on the group key (t1_id) after the join?

Contributor Author

Yes.

@@ -31,3 +34,17 @@ pub fn get_accum_scalar_values_as_arrays(
.map(|s| s.to_array_of_size(1))
.collect::<Vec<_>>())
}

pub fn down_cast_any_ref(any: &dyn Any) -> &dyn Any {
Contributor

Can you please document what this function does (with an example) given it is a new pub function?

Contributor Author

Yes, I will add more comments. There is an example in the Count unit test:

#[test]
fn count_eq() -> Result<()> {
    let count = Count::new(lit(1i8), "COUNT(1)".to_string(), DataType::Int64);
    let arc_count: Arc<dyn AggregateExpr> = Arc::new(Count::new(
        lit(1i8),
        "COUNT(1)".to_string(),
        DataType::Int64,
    ));
    let box_count: Box<dyn AggregateExpr> = Box::new(Count::new(
        lit(1i8),
        "COUNT(1)".to_string(),
        DataType::Int64,
    ));
    let count2 = Count::new(lit(1i8), "COUNT(2)".to_string(), DataType::Int64);

    assert!(arc_count.eq(&box_count));
    assert!(box_count.eq(&arc_count));
    assert!(arc_count.eq(&count));
    assert!(count.eq(&box_count));
    assert!(count.eq(&arc_count));

    assert!(count2.ne(&arc_count));

    Ok(())
}
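
For context, one plausible shape for such a helper, consistent with the signature and the equality behavior exercised above (a sketch, not necessarily the exact merged body): it peels an Arc or Box wrapper off the trait object so the caller can downcast to the concrete aggregate type.

use std::any::Any;
use std::sync::Arc;

// Hypothetical minimal trait standing in for DataFusion's AggregateExpr.
trait AggregateExpr: Any {
    fn as_any(&self) -> &dyn Any;
}

/// If `any` is an `Arc<dyn AggregateExpr>` or `Box<dyn AggregateExpr>`,
/// return a reference to the value inside the wrapper so it can be downcast
/// to its concrete type (e.g. Count); otherwise return `any` unchanged.
fn down_cast_any_ref(any: &dyn Any) -> &dyn Any {
    if let Some(arc) = any.downcast_ref::<Arc<dyn AggregateExpr>>() {
        arc.as_any()
    } else if let Some(boxed) = any.downcast_ref::<Box<dyn AggregateExpr>>() {
        boxed.as_any()
    } else {
        any
    }
}

This unwrapping is what lets `arc_count.eq(&box_count)` and `count.eq(&arc_count)` in the test above compare the underlying Count values regardless of how they are wrapped.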

@mingmwang
Contributor Author

mingmwang commented Apr 12, 2023

Comparing the group expressions between the partial and final aggregations is problematic, because the column indexes might differ: the Final aggregate's group expressions are resolved against the Partial aggregate's output schema, so the same group key can appear at a different index in each node.

@mingmwang
Contributor Author

@yahoNanJing @alamb
Please help move this PR to Draft.

@yahoNanJing yahoNanJing marked this pull request as draft April 12, 2023 03:08
@yahoNanJing
Contributor

Converted it to draft.

) {
    final_input
        .as_any()
        .downcast_ref::<AggregateExec>()
Contributor

Since there's no RepartitionExec between them, the distributions of the Final-mode and Partial-mode AggregateExecs are the same. Therefore, there's no need for two-phase aggregation.

Contributor

Thanks @mingmwang for introducing this rule, which will significantly improve query performance for the SQL patterns shown in the unit tests.

Contributor Author

Actually, the performance improvement will not be that significant, because usually the Final aggregation step is not that heavy.

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Apr 12, 2023
@mingmwang mingmwang marked this pull request as ready for review April 12, 2023 06:37
Contributor

@yahoNanJing yahoNanJing left a comment

LGTM

@yahoNanJing yahoNanJing merged commit 1600a30 into apache:main Apr 12, 2023
korowa pushed a commit to korowa/arrow-datafusion that referenced this pull request Apr 13, 2023
* add CombinePartialFinalAggregate rule

* Implement PartialEq for AggregateExpr

* fix compile error

* refine logic in the rule

* add UT

* resolve review comments

* fix compare grouping columns