feat(planner): Allowing setting sort order of parquet files without specifying the schema #12466
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1136,14 +1136,29 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { | |
schema: &DFSchemaRef, | ||
planner_context: &mut PlannerContext, | ||
) -> Result<Vec<Vec<SortExpr>>> { | ||
// Ask user to provide a schema if schema is empty. | ||
let mut all_results = vec![]; | ||
if !order_exprs.is_empty() && schema.fields().is_empty() { | ||
return plan_err!( | ||
"Provide a schema before specifying the order while creating a table." | ||
); | ||
let mut results = vec![]; | ||
for expr in order_exprs { | ||
for ordered_expr in expr { | ||
Comment: I've noticed that a lot of this codebase prefers
Comment: I think it would be nice to use
It also took me a while to get used to the map/collect pattern. At first I thought it was just functional language hipster stuff, but then I realized that it is often a key optimization (When possible,
Comment: Sounds good. I think I'll modify this to use map/collect so I can be hip (and get a single allocation) 😎
||
let order_expr = ordered_expr.expr.to_owned(); | ||
let order_expr = self.sql_expr_to_logical_expr( | ||
order_expr, | ||
schema, | ||
planner_context, | ||
)?; | ||
let nulls_first = ordered_expr.nulls_first.unwrap_or(true); | ||
devanbenz marked this conversation as resolved.
|
||
let asc = ordered_expr.asc.unwrap_or(true); | ||
let sort_expr = SortExpr::new(order_expr, asc, nulls_first); | ||
results.push(sort_expr); | ||
} | ||
let sort_results = &results; | ||
all_results.push(sort_results.to_owned()); | ||
} | ||
|
||
return Ok(all_results); | ||
} | ||
|
||
let mut all_results = vec![]; | ||
for expr in order_exprs { | ||
// Convert each OrderByExpr to a SortExpr: | ||
let expr_vec = | ||
|
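The review thread above suggests replacing the nested `for` loops and `push` calls with a map/collect pipeline. The following is a minimal, self-contained sketch of that pattern, not the code that was merged. `OrderByItem`, `SortItem`, `resolve`, and `convert_orderings` are hypothetical stand-ins introduced only for illustration; they are not the sqlparser or DataFusion types, and the column lookup is a simplification of `sql_expr_to_logical_expr`.

```rust
/// Hypothetical stand-in for a parsed ORDER BY item (not the sqlparser type).
#[derive(Debug, Clone)]
struct OrderByItem {
    column: String,
    asc: Option<bool>,
    nulls_first: Option<bool>,
}

/// Hypothetical stand-in for a resolved sort expression (not DataFusion's `SortExpr`).
#[derive(Debug, Clone, PartialEq)]
struct SortItem {
    column: String,
    asc: bool,
    nulls_first: bool,
}

/// Resolve one item against the known column names; an unknown column is an error.
fn resolve(item: &OrderByItem, known: &[&str]) -> Result<SortItem, String> {
    if !known.contains(&item.column.as_str()) {
        return Err(format!("unknown column: {}", item.column));
    }
    Ok(SortItem {
        column: item.column.clone(),
        // Same defaults as the diff above: ascending, nulls first.
        asc: item.asc.unwrap_or(true),
        nulls_first: item.nulls_first.unwrap_or(true),
    })
}

/// Convert every ordering with map/collect instead of nested loops and `push`.
/// Collecting into `Result<Vec<_>, _>` stops at the first error.
fn convert_orderings(
    orderings: &[Vec<OrderByItem>],
    known: &[&str],
) -> Result<Vec<Vec<SortItem>>, String> {
    orderings
        .iter()
        .map(|ordering| {
            ordering
                .iter()
                .map(|item| resolve(item, known))
                .collect::<Result<Vec<_>, _>>()
        })
        .collect()
}

fn main() {
    let known = ["id", "bool_col"];
    let orderings = vec![vec![OrderByItem {
        column: "id".to_string(),
        asc: None,
        nulls_first: None,
    }]];
    match convert_orderings(&orderings, &known) {
        Ok(sorts) => println!("{:?}", sorts),
        Err(e) => eprintln!("error: {e}"),
    }
}
```

One design point worth noting: because each inner `Vec` is built by its own `collect`, no mutable accumulator is shared between orderings. In the loop version shown in the diff, `results` is declared outside the outer loop, so a second ordering would also contain the sort expressions of the first.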
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -228,3 +228,13 @@ OPTIONS ( | |
format.delimiter '|', | ||
has_header false, | ||
compression gzip); | ||
|
||
# Create an external parquet table and infer schema to order by | ||
|
||
# query should succeed | ||
statement ok | ||
CREATE EXTERNAL TABLE t STORED AS parquet LOCATION '../../parquet-testing/data/alltypes_plain.parquet' WITH ORDER (id); | ||
Comment: Can you also add a test that shows the table is actually ordered correctly?
Comment: Can do
||
|
||
# query should fail with bad column | ||
statement error | ||
CREATE EXTERNAL TABLE t STORED AS parquet LOCATION '../../parquet-testing/data/alltypes_plain.parquet' WITH ORDER (foo); | ||
Comment: Another reason this will fail is that there is already a table named
Comment: 👍
This looks great. Thank you @devanbenz -- perfect