refactor EliminateDuplicatedExpr optimizer pass to avoid clone #10218
// use this structure to avoid initial clone
#[derive(Eq, Clone, Debug)]
struct SortExprWrapper {
Wrap the Expr in a Wrapper to support specialized comparison
This is very clever 👏
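The wrapper idea discussed above can be sketched with the standard library alone. This is a minimal illustration, not the PR's code: `SortKey` and `SortKeyWrapper` are hypothetical stand-ins for DataFusion's sort expression and `SortExprWrapper`, and the sketch assumes equality should look only at the sorted column while ignoring the sort options.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

// Hypothetical stand-in for a sort expression; not the real DataFusion type.
#[derive(Debug, Clone, PartialEq)]
struct SortKey {
    column: String,
    asc: bool,
    nulls_first: bool,
}

// Wrapper whose equality and hash consider only the sorted column,
// so `a ASC` and `a DESC` count as the same key.
#[derive(Debug)]
struct SortKeyWrapper(SortKey);

impl PartialEq for SortKeyWrapper {
    fn eq(&self, other: &Self) -> bool {
        self.0.column == other.0.column
    }
}

impl Eq for SortKeyWrapper {}

impl Hash for SortKeyWrapper {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash exactly the fields that `eq` compares, to keep Hash and Eq consistent.
        self.0.column.hash(state);
    }
}

// Count distinct sort columns, taking ownership of the keys so nothing is cloned.
fn distinct_columns(keys: Vec<SortKey>) -> usize {
    keys.into_iter()
        .map(SortKeyWrapper)
        .collect::<HashSet<_>>()
        .len()
}
```

Because the wrapper owns the inner value, moving keys in and out of a set needs no `clone`; the custom `Hash` has to agree with the custom `PartialEq`, otherwise set lookups would be unreliable.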
.iter()
.map(|e| match e {
    Expr::Sort(ExprSort { expr, .. }) => {
        Expr::Sort(ExprSort::new(expr.clone(), true, false))
avoid the normalized clone here
    Ok(None)
} else {
    Ok(Some(LogicalPlan::Sort(Sort {
        expr: dedup_expr.into_iter().cloned().collect::<Vec<_>>(),
avoid another clone here
        input: sort.input.clone(),
        fetch: sort.fetch,
    })))
let mut index_set = IndexSet::new(); // use index_set instead of Hashset to preserve order
use index_set to preserve the original order of sort
This is also quite clever
I think you can avoid a Vec here if you skip normalized_sort_keys and create unique_exprs directly, as you did below.
Something like:
let unique_exprs: Vec<Expr> = sort
    .expr
    .into_iter()
    // use SortExpr wrapper to ignore sort options
    .map(|e| SortExprWrapper { expr: e })
    .collect::<IndexSet<_>>()
    .into_iter()
    .map(|wrapper| wrapper.expr)
    .collect();
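For readers without the `indexmap` crate at hand, the order-preserving deduplication in the suggestion can be approximated with the standard library. The sketch below is an assumption-laden stand-in, not the PR's implementation: `SortKey`, `hash_column`, and `dedup_preserving_order` are hypothetical names, and a `HashMap` of indices into the output `Vec` replaces `IndexSet`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hypothetical stand-in for a sort expression; not the real DataFusion type.
#[derive(Debug, Clone, PartialEq)]
struct SortKey {
    column: String,
    asc: bool,
}

// Hash only the column name so that sort options do not affect identity.
fn hash_column(column: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    column.hash(&mut hasher);
    hasher.finish()
}

// Keep the first occurrence of every column, in input order, without cloning.
fn dedup_preserving_order(keys: Vec<SortKey>) -> Vec<SortKey> {
    let mut unique: Vec<SortKey> = Vec::with_capacity(keys.len());
    // Maps a column hash to the positions in `unique` that share that hash.
    let mut buckets: HashMap<u64, Vec<usize>> = HashMap::new();
    for key in keys {
        let bucket = buckets.entry(hash_column(&key.column)).or_default();
        // Compare column names inside the bucket to rule out hash collisions.
        let seen = bucket.iter().any(|&i| unique[i].column == key.column);
        if !seen {
            bucket.push(unique.len());
            unique.push(key);
        }
    }
    unique
}
```

Each key is moved into the output at most once, which mirrors the point of the review comment: consuming the input with `into_iter()` removes the need for an intermediate `Vec` or any `clone`.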
3c31bec to 09cd01d
Thank you @Lordworms -- this is really nice and quite clever
I have a few suggestions on how to make this PR better, but I think we could also do them as a follow-on.
@@ -35,78 +35,107 @@ impl EliminateDuplicatedExpr {
        Self {}
    }
}

// use this structure to avoid initial clone
// use this structure to avoid initial clone
/// Wrap the Expr in a Wrapper to support specialized comparison.
///
/// Ignores the sort options for `SortExpr` because if the expression is the same
/// the subsequent exprs are never matched
///
/// For example, `ORDER BY a ASC, a DESC` is the same
/// as `ORDER BY a ASC` (the second `a DESC` is never compared)
Sure!
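The claim in the suggested doc comment, that a repeated sort key with different options is never compared, can be checked with a small standalone sketch. The row type and function names here are hypothetical and only illustrate the reasoning: the second comparator runs only when the first reports equality, and at that point both values of `a` are equal, so it cannot change the order.

```rust
use std::cmp::Ordering;

// A row is (a, payload); only column `a` takes part in the ordering.
type Row = (i32, char);

// ORDER BY a ASC
fn by_a_asc(l: &Row, r: &Row) -> Ordering {
    l.0.cmp(&r.0)
}

// ORDER BY a ASC, a DESC: the tie-breaker is reached only when the `a`
// values are already equal, so it always returns `Ordering::Equal`.
fn by_a_asc_then_a_desc(l: &Row, r: &Row) -> Ordering {
    l.0.cmp(&r.0).then_with(|| r.0.cmp(&l.0))
}

// Return a sorted copy of `rows` using the given comparator (stable sort).
fn sorted_with(rows: &[Row], cmp: fn(&Row, &Row) -> Ordering) -> Vec<Row> {
    let mut sorted = rows.to_vec();
    sorted.sort_by(cmp);
    sorted
}
```

Sorting the same rows with either comparator gives identical output, which is why dropping the later duplicate key is a safe rewrite.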
Thanks again @Lordworms -- this looks great
Which issue does this PR close?
part of #9637
Closes #.
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?