Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Expr::try_as_col, deprecate Expr::try_into_col (speed up optimizer) #10448

Merged
merged 2 commits into from
May 13, 2024

Conversation

alamb
Copy link
Contributor

@alamb alamb commented May 10, 2024

Which issue does this PR close?

Part of #9637

I ran into this while working on #10444

Rationale for this change

There are places in the code that check if an expr is a column by using Expr::try_into_col() which has two potential performance problems:

  1. It requires cloning the Column (and thus a string) to check if an expr is a Expr::Column
  2. It requires creating and ignoring a DataFusionError to check if an expr is not a Expr::Column

What changes are included in this PR?

  1. Add Expr::try_as_col
  2. Deprecate Expr::try_into_col
  3. Change code to use Expr::try_into_col

Are these changes tested?

By existing tests

Are there any user-facing changes?

A new function / less required copying

I doubt we will see any significant performance changes however it does reduce allocations during planning

@github-actions github-actions bot added logical-expr Logical plan and expressions optimizer Optimizer rules labels May 10, 2024
@alamb alamb marked this pull request as ready for review May 11, 2024 20:03
@alamb
Copy link
Contributor Author

alamb commented May 12, 2024

Thanks @sadboy 🙏

@alamb alamb changed the title Add Expr::try_as_col, deprecate Expr::try_into_col Add Expr::try_as_col, deprecate Expr::try_into_col (speed up optimizer) May 13, 2024
Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most places seem to still be cloning it, but I guess baby steps 😅

@alamb
Copy link
Contributor Author

alamb commented May 13, 2024

Most places seem to still be cloning it, but I guess baby steps 😅

Yes indeed -- baby steps. Thank you for the review @tustvold

Another thing that would likely help non trivially would be to introduce a version of Expr::to_columns that does not requiring cloneing each Column. I'll file at ticket about that too

accumu.insert(l.try_into_col()?);
accumu.insert(r.try_into_col()?);
let Some(l) = l.try_as_col().cloned() else {
return internal_err!(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think making the error here (rather than in try_into_col) results in a more specific error message that will be easier to debug if we ever hit it

@alamb alamb merged commit 53de994 into apache:main May 13, 2024
24 checks passed
@alamb alamb deleted the alamb/as_column branch May 13, 2024 13:10
findepi pushed a commit to findepi/datafusion that referenced this pull request Jul 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
logical-expr Logical plan and expressions optimizer Optimizer rules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants