Update arrow 47.0.0 in DataFusion #7587
datafusion/substrait/Cargo.toml
@@ -35,7 +35,7 @@ itertools = "0.11"
 object_store = "0.7.0"
 prost = "0.11"
 prost-types = "0.11"
-substrait = "0.13.1"
+substrait = "0.14.0"
Note: unfortunately, even the latest substrait release has not yet been upgraded to prost 0.12
"| logical_plan | Sort: t.a ASC NULLS LAST, t.b DESC NULLS FIRST |",
"| | TableScan: t projection=[a, b] |",
"| physical_plan | SortExec: expr=[a@0 ASC NULLS LAST,b@1 DESC] |",
"| | MemoryExec: partitions=1, partition_sizes=[5], output_ordering=a@0 ASC NULLS LAST,b@1 ASC NULLS LAST |",
this is because we no longer need the small batch size, right?
use std::sync::Arc;

/// Applies a binary [`Datum`] kernel `f` to `lhs` and `rhs`
pub(crate) fn apply(
I know this is likely obvious to you, but I think it would help to provide some context here of why this function exists
Something like:
-pub(crate) fn apply(
+///
+/// This is used to provide a single function implementation that works for all combinations of
+/// 2 `ColumnarValue` inputs, and maps the [`ColumnarValue`] abstraction of DataFusion to
+/// the [`Datum`] abstraction of arrow-rs
+pub(crate) fn apply(
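To make the suggested doc comment concrete, here is a minimal sketch of the idea: one function that accepts every scalar/array combination of two inputs and forwards to a single binary kernel. The `Value` enum and this `apply` signature are hypothetical stand-ins written with the standard library only; they are not DataFusion's `ColumnarValue`, arrow-rs's `Datum`, or the function under review.

```rust
// Toy stand-in for a value that is either a single scalar or a full column.
// Assumption: this is NOT the real DataFusion `ColumnarValue` type.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Scalar(i64),
    Array(Vec<i64>),
}

/// Applies a binary kernel `f` to `lhs` and `rhs`, covering all four
/// scalar/array combinations in one place so callers need no special cases.
fn apply(lhs: &Value, rhs: &Value, f: impl Fn(i64, i64) -> i64) -> Result<Value, String> {
    match (lhs, rhs) {
        // scalar op scalar stays a scalar
        (Value::Scalar(l), Value::Scalar(r)) => Ok(Value::Scalar(f(*l, *r))),
        // array op scalar: the scalar is reused for every element
        (Value::Array(l), Value::Scalar(r)) => {
            Ok(Value::Array(l.iter().map(|x| f(*x, *r)).collect()))
        }
        // scalar op array: same idea with the operands swapped
        (Value::Scalar(l), Value::Array(r)) => {
            Ok(Value::Array(r.iter().map(|x| f(*l, *x)).collect()))
        }
        // array op array: element-wise, lengths must match
        (Value::Array(l), Value::Array(r)) => {
            if l.len() != r.len() {
                return Err(format!("length mismatch: {} vs {}", l.len(), r.len()));
            }
            Ok(Value::Array(l.iter().zip(r).map(|(a, b)| f(*a, *b)).collect()))
        }
    }
}
```

In this toy model, `apply(&Value::Array(vec![1, 2]), &Value::Scalar(10), |a, b| a + b)` yields `Ok(Value::Array(vec![11, 12]))`. The point of the design is that the kernel `f` is written once and the dispatch over input shapes lives in a single helper.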
 /// Converter for the group values
-row_converter: CardinalityAwareRowConverter,
+row_converter: RowConverter,
this was removed as the upstream interning was removed in apache/arrow-rs#4811
@@ -199,6 +203,20 @@ impl GroupValues for GroupValuesRows {
}
};

// TODO: Materialize dictionaries in group keys
Do you know if this is described in a ticket yet? (I think the answer is no, but I am not 100% sure.)
No, a ticket would be a good addition.
Created #7647
Thanks for the ping @alamb. I'm just trying to understand the implications of this PR for dictionaries.
Is this the only thing that has changed, and should we expect expressions operating on dictionary arrays to still behave the same? Thanks!
There should be no externally visible changes, correct.
* Update arrow 47.0.0
* Downgrade prost substrait
* Fix deprecations
* Prost deprecations
* Update pbjson-build
* Format
* Remove CardinalityAwareRowConverter
* Further fixes
* Ignore spill tests
* Fix tests
* Fix Clippy
* Update pin
* Review feedback
* Clippy
Which issue does this PR close?
Closes #.
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?
Removes the dictionary_expressions feature as no longer necessary