Refactor to support recursive unnest in physical plan #11577

Merged
60 commits
cc661ea
chore: poc
duongcongtoai Jul 20, 2024
d7d45b1
fix unnest struct
duongcongtoai Jul 21, 2024
f8aa23a
UT for memoization
duongcongtoai Jul 21, 2024
cf202a8
remove unnessary projection
duongcongtoai Jul 21, 2024
1e74176
chore: temp test case
duongcongtoai Jul 22, 2024
b3ff0d7
Merge remote-tracking branch 'origin/main' into 11198-fix-unnest-mult…
duongcongtoai Jul 23, 2024
d557ff3
multi depth unnest supported
duongcongtoai Jul 27, 2024
cd497fe
chore: add map of original column and transformed col
duongcongtoai Jul 28, 2024
d2cc80c
transformation map to physical layer
duongcongtoai Jul 28, 2024
7efee19
prototype for recursive array length
duongcongtoai Jul 28, 2024
a489502
chore: some compile err
duongcongtoai Jul 30, 2024
80c1cf3
finalize input type in physical layer
duongcongtoai Jul 31, 2024
7acb056
chore: refactor unnest builder
duongcongtoai Aug 1, 2024
4d32187
add unnesting type inferred
duongcongtoai Aug 3, 2024
e41923f
fix compile err
duongcongtoai Aug 3, 2024
eca58f5
fail test in builder
duongcongtoai Aug 3, 2024
e2fae71
Compile err
duongcongtoai Aug 4, 2024
825e270
chore: detect some bugs
duongcongtoai Aug 4, 2024
721c92f
some work
duongcongtoai Aug 12, 2024
d11ed20
support recursive unnest in physical layer
duongcongtoai Aug 13, 2024
5b9ce5e
UT for new build batch function
duongcongtoai Aug 15, 2024
366d8ae
compile err
duongcongtoai Aug 15, 2024
a06dbad
Merge remote-tracking branch 'origin/main' into 11198-fix-unnest-mult…
duongcongtoai Aug 15, 2024
fd01450
fix unnesting into empty arrays
duongcongtoai Aug 17, 2024
2b6e70f
some comment
duongcongtoai Aug 17, 2024
2b49fa0
fix unnest struct
duongcongtoai Aug 17, 2024
222087e
some note
duongcongtoai Aug 20, 2024
f269e2f
chore: fix all test failure
duongcongtoai Sep 1, 2024
39fab44
fix projection pushdown
duongcongtoai Sep 1, 2024
444d741
custom rewriter for recursive unnest
duongcongtoai Sep 1, 2024
bba148a
simplify
duongcongtoai Sep 2, 2024
c10812d
rm unnecessary projection
duongcongtoai Sep 3, 2024
5778058
chore: better comments
duongcongtoai Sep 5, 2024
70bccdb
more comments
duongcongtoai Sep 5, 2024
cc8169a
chore: better comments
duongcongtoai Sep 7, 2024
89e4547
remove breaking api
duongcongtoai Sep 8, 2024
eb215b1
rename
duongcongtoai Sep 8, 2024
1deb566
more unit test
duongcongtoai Sep 8, 2024
2fb9842
remove debug
duongcongtoai Sep 8, 2024
ae163d8
Merge remote-tracking branch 'origin/main' into 11198-fix-unnest-mult…
duongcongtoai Sep 8, 2024
88b8edd
clean up
duongcongtoai Sep 8, 2024
00dab2f
fix proto
duongcongtoai Sep 10, 2024
784bf3e
fix dataframe
duongcongtoai Sep 10, 2024
4d3f508
fix clippy
duongcongtoai Sep 10, 2024
926efd5
cargo fmt
duongcongtoai Sep 10, 2024
bf40688
fix some test
duongcongtoai Sep 10, 2024
1e54422
fix all test
duongcongtoai Sep 10, 2024
abfd4fc
fix unnest in join
duongcongtoai Sep 10, 2024
eb4cb44
fix doc and tests
duongcongtoai Sep 11, 2024
7aee552
chore: better doc
duongcongtoai Sep 11, 2024
df4fee3
better doc
duongcongtoai Sep 11, 2024
f96f6e9
tune comment
duongcongtoai Sep 11, 2024
c695b5e
rm todo
duongcongtoai Sep 11, 2024
f73921f
refactor
duongcongtoai Sep 12, 2024
c3b68c6
chore: reserve test
duongcongtoai Sep 14, 2024
43e7a04
add a basic test
duongcongtoai Sep 14, 2024
6dae034
chore: more document
duongcongtoai Sep 22, 2024
e72bb61
doc on ColumnUnnestType List
duongcongtoai Sep 22, 2024
2e53740
Merge remote-tracking branch 'origin/main' into 11198-fix-unnest-mult…
duongcongtoai Sep 22, 2024
bc3f4bf
chore: add partialord to new types
duongcongtoai Sep 22, 2024
10 changes: 9 additions & 1 deletion datafusion/core/src/physical_planner.rs
@@ -85,6 +85,7 @@ use datafusion_physical_expr::aggregate::{AggregateExprBuilder, AggregateFunctio
 use datafusion_physical_expr::expressions::Literal;
 use datafusion_physical_expr::LexOrdering;
 use datafusion_physical_plan::placeholder_row::PlaceholderRowExec;
+use datafusion_physical_plan::unnest::ListUnnest;
 use datafusion_sql::utils::window_expr_common_partition_keys;

 use async_trait::async_trait;
@@ -848,9 +849,16 @@ impl DefaultPhysicalPlanner {
             }) => {
                 let input = children.one()?;
                 let schema = SchemaRef::new(schema.as_ref().to_owned().into());
+                let list_column_indices = list_type_columns
+                    .iter()
+                    .map(|(index, unnesting)| ListUnnest {
+                        index_in_input_schema: *index,
+                        depth: unnesting.depth,
+                    })
+                    .collect();
                 Arc::new(UnnestExec::new(
                     input,
-                    list_type_columns.clone(),
+                    list_column_indices,
                     struct_type_columns.clone(),
                     schema,
                     options.clone(),
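The planner change in this hunk turns each `(index, unnesting)` pair from the logical plan into a `ListUnnest` that records the column's position in the input schema and the requested unnest depth. A minimal standalone sketch of that mapping follows. It uses local stand-in types rather than the DataFusion ones: the `Unnesting` struct and the `to_list_unnests` helper are hypothetical, only the field names `index_in_input_schema` and `depth` come from the diff, and their `usize` types are an assumption.

```rust
// Local stand-in for the physical-plan descriptor; only the two fields
// visible in the diff are modeled, and `usize` is an assumed type.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ListUnnest {
    index_in_input_schema: usize,
    depth: usize,
}

// Hypothetical logical-side descriptor. The diff only shows that the
// real value exposes a `depth` field.
#[derive(Debug, Clone)]
struct Unnesting {
    depth: usize,
}

// Mirrors the iterator chain in the hunk: one `ListUnnest` per
// (column index, unnesting) pair, preserving order.
fn to_list_unnests(list_type_columns: &[(usize, Unnesting)]) -> Vec<ListUnnest> {
    list_type_columns
        .iter()
        .map(|(index, unnesting)| ListUnnest {
            index_in_input_schema: *index,
            depth: unnesting.depth,
        })
        .collect()
}
```

In this sketch the same input column index may appear more than once with different depths, which is how one column could be unnested at several levels in a single operator.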
2 changes: 1 addition & 1 deletion datafusion/core/tests/dataframe/mod.rs
@@ -1391,7 +1391,7 @@ async fn unnest_with_redundant_columns() -> Result<()> {
     let optimized_plan = df.clone().into_optimized_plan()?;
     let expected = vec![
         "Projection: shapes.shape_id [shape_id:UInt32]",
-        " Unnest: lists[shape_id2] structs[] [shape_id:UInt32, shape_id2:UInt32;N]",
+        " Unnest: lists[shape_id2|depth=1] structs[] [shape_id:UInt32, shape_id2:UInt32;N]",
         " Aggregate: groupBy=[[shapes.shape_id]], aggr=[[array_agg(shapes.shape_id) AS shape_id2]] [shape_id:UInt32, shape_id2:List(Field { name: \"item\", data_type: UInt32, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} });N]",
         " TableScan: shapes projection=[shape_id] [shape_id:UInt32]",
     ];
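The plan display in this test now annotates each list column with the depth to which it is unnested (`depth=1` here). As a rough illustration of what a depth means, the toy model below flattens a nested value one level per depth step. It is a sketch under simplifying assumptions, not the operator's implementation: the `Value` enum and `unnest_to_depth` function are hypothetical, nulls and empty lists are not modeled, and a scalar met before the depth is exhausted is passed through unchanged.

```rust
// Toy nested value: either a scalar integer or a list of values.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    List(Vec<Value>),
}

// Flattens `rows` one nesting level per depth step. Each list element
// becomes its own output row; scalars are kept as they are (a
// simplification made for this sketch only).
fn unnest_to_depth(rows: Vec<Value>, depth: usize) -> Vec<Value> {
    let mut current = rows;
    for _ in 0..depth {
        current = current
            .into_iter()
            .flat_map(|value| match value {
                Value::List(items) => items,
                scalar => vec![scalar],
            })
            .collect();
    }
    current
}
```

Under this model, unnesting a single row holding `[[1, 2], [3]]` at depth 1 yields the two rows `[1, 2]` and `[3]`, and at depth 2 yields the three rows `1`, `2` and `3`.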
3 changes: 1 addition & 2 deletions datafusion/expr/src/expr.rs
@@ -2346,8 +2346,7 @@ impl fmt::Display for Expr {
             },
             Expr::Placeholder(Placeholder { id, .. }) => write!(f, "{id}"),
             Expr::Unnest(Unnest { expr }) => {
-                // TODO: use Display instead of Debug, there is non-unique expression name in projection issue.
-                write!(f, "UNNEST({expr:?})")
+                write!(f, "UNNEST({expr})")
             }
         }
     }
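This hunk switches the inner expression of `UNNEST(...)` from Debug formatting (`{expr:?}`) to Display formatting (`{expr}`). The difference can be seen with a small stand-in type. The `Expr` enum below is a hypothetical two-variant toy, not DataFusion's `Expr`, and the helper names are illustrative only.

```rust
use std::fmt;

// Toy expression type with a derived Debug and a hand-written Display.
#[derive(Debug)]
enum Expr {
    Column(String),
    Unnest(Box<Expr>),
}

impl fmt::Display for Expr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Expr::Column(name) => write!(f, "{name}"),
            // Display is applied recursively to the inner expression,
            // matching the `{expr}` form used after the change.
            Expr::Unnest(expr) => write!(f, "UNNEST({expr})"),
        }
    }
}

// Human-readable name produced through Display.
fn display_name(e: &Expr) -> String {
    e.to_string()
}

// Structural dump produced through the derived Debug.
fn debug_name(e: &Expr) -> String {
    format!("{e:?}")
}
```

For `Expr::Unnest(Box::new(Expr::Column("c".to_string())))`, `display_name` gives `UNNEST(c)` while `debug_name` gives `Unnest(Column("c"))`, which shows why the Debug form makes for noisier names inside a projection.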