Add additional algebraic simplifications #1162

Closed
alamb opened this issue Oct 21, 2021 · 4 comments · Fixed by #1208
Labels
enhancement New feature or request

Comments

@alamb
Contributor

alamb commented Oct 21, 2021

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Now that we have a nice ConstantPropagator and Simplifier framework, there are additional predicate rewrites we can do to improve query performance.

Describe the solution you'd like
Add additional algebraic simplification / rewrite rules to constant_folding.rs after #1153 has been merged.

Types of standard rewrites I think it would be good to have:

true OR <expr> --> true
false OR <expr> --> <expr>
true AND <expr> --> <expr>
false AND <expr> --> false
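
As a rough illustration of the four rules above, here is a minimal, self-contained sketch. The `Expr` enum and the `simplify` function are hypothetical stand-ins written for this example; they are not DataFusion's actual `Expr` type or the code in constant_folding.rs. The sketch also assumes the mirrored forms (literal on the right-hand side) should be handled, which the list above does not state.

```rust
// Hypothetical miniature expression type for illustration only;
// DataFusion's real `Expr` has many more variants.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Lit(bool),
    Column(String),
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
}

/// Simplify bottom-up: children first, then apply the boolean identities.
fn simplify(expr: Expr) -> Expr {
    match expr {
        Expr::Or(l, r) => match (simplify(*l), simplify(*r)) {
            // true OR <expr> --> true (and the mirrored form)
            (Expr::Lit(true), _) | (_, Expr::Lit(true)) => Expr::Lit(true),
            // false OR <expr> --> <expr> (and the mirrored form)
            (Expr::Lit(false), e) | (e, Expr::Lit(false)) => e,
            (l, r) => Expr::Or(Box::new(l), Box::new(r)),
        },
        Expr::And(l, r) => match (simplify(*l), simplify(*r)) {
            // false AND <expr> --> false (and the mirrored form)
            (Expr::Lit(false), _) | (_, Expr::Lit(false)) => Expr::Lit(false),
            // true AND <expr> --> <expr> (and the mirrored form)
            (Expr::Lit(true), e) | (e, Expr::Lit(true)) => e,
            (l, r) => Expr::And(Box::new(l), Box::new(r)),
        },
        // Literals and columns are already as simple as they get.
        other => other,
    }
}
```

Because children are simplified before the parent, a nested predicate such as `(false OR a) AND true` collapses to `a` in a single call. These identities also hold under SQL's three-valued logic when `<expr>` evaluates to NULL, since `true OR NULL` is true and `false AND NULL` is false.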

There is also an interesting one in #1153 that would rotate expr trees to isolate volatile functions:

(rand() + 1) + 2 --> rand() + (1 + 2)

The reason for doing that rewrite is that the constant evaluator could then convert it to rand() + 3.
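
To make the rotate-then-fold idea concrete, here is a small sketch under stated assumptions. The `Expr` enum, `rotate`, and `fold` are hypothetical names invented for this example and do not reflect the code in #1153. The sketch uses integer literals and ignores overflow; reassociating floating-point additions can change results, so a real rule would need to consider that.

```rust
// Hypothetical miniature expression type; `Rand` stands in for a
// volatile function call such as rand().
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Lit(i64),
    Rand,
    Add(Box<Expr>, Box<Expr>),
}

/// (v + c1) + c2 --> v + (c1 + c2), where c1 and c2 are literals.
/// Any other shape is returned unchanged.
fn rotate(expr: Expr) -> Expr {
    match expr {
        Expr::Add(left, right) => match (*left, *right) {
            (Expr::Add(v, c1), Expr::Lit(c2)) if matches!(*c1, Expr::Lit(_)) => {
                let constants = Expr::Add(c1, Box::new(Expr::Lit(c2)));
                Expr::Add(v, Box::new(constants))
            }
            (l, r) => Expr::Add(Box::new(l), Box::new(r)),
        },
        other => other,
    }
}

/// A toy constant folder: replaces `Lit + Lit` with a single literal.
fn fold(expr: Expr) -> Expr {
    match expr {
        Expr::Add(l, r) => match (fold(*l), fold(*r)) {
            (Expr::Lit(a), Expr::Lit(b)) => Expr::Lit(a + b),
            (l, r) => Expr::Add(Box::new(l), Box::new(r)),
        },
        other => other,
    }
}
```

Applied to `(rand() + 1) + 2`, `fold` alone leaves the tree unchanged because neither addition has two literal operands. Calling `fold(rotate(expr))` first groups `1 + 2` into its own subtree, which the folder then reduces, leaving `rand() + 3`.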

Describe alternatives you've considered
There are various automated rewriting frameworks such as egg / tokomak (drafted in #1066 and #441 by @Dandandan and @pjmore) which might be better / more powerful than implementing our own rules.

Additional context
I am sure other query optimization systems have a good set of rewrite rules that we could investigate. I bet both Postgres and Spark have interesting optimizations to look at.

@alamb alamb added the enhancement New feature or request label Oct 21, 2021
@alamb
Contributor Author

alamb commented Oct 28, 2021

@matthewmturner I think you might find some interesting work here if you wanted

Specifically, the idea would be to add the simplification rules in the description of this issue to the logic in Simplify here:

https://github.com/apache/arrow-datafusion/blob/2c8e65bcca9fe41ad80116d99ad974c86cb59654/datafusion/src/optimizer/constant_folding.rs#L140-L230

And then write tests following the pattern here: https://github.com/apache/arrow-datafusion/blob/2c8e65bcca9fe41ad80116d99ad974c86cb59654/datafusion/src/optimizer/constant_folding.rs#L267-L279

@matthewmturner
Contributor

@alamb thanks! will check it out.

@matthewmturner
Contributor

@alamb any next steps here that I can work on?

@alamb
Contributor Author

alamb commented Nov 2, 2021

@matthewmturner -- I am not sure -- it might be worth looking at #1066 and #441 to see if the rules in those provide any additional inspiration

There are also transformations of expressions like (2 + (1 + rand())), which could be rewritten as (2 + 1) + rand() and then constant folded, where it wouldn't be folded before.
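
One way to cover this shape without writing a separate rotation rule per tree layout is to flatten the chain of additions, sum the literals, and rebuild. The sketch below takes that approach; it is a different technique from the single rotation described above, and the `Expr` enum, `flatten`, and `regroup` are hypothetical names made up for this example. It assumes integer addition, where reordering operands is safe apart from overflow.

```rust
// Hypothetical miniature expression type; `Rand` stands in for a
// volatile function call such as rand().
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Lit(i64),
    Rand,
    Add(Box<Expr>, Box<Expr>),
}

/// Collect the operands of a nested addition in left-to-right order.
fn flatten(expr: Expr, out: &mut Vec<Expr>) {
    match expr {
        Expr::Add(l, r) => {
            flatten(*l, out);
            flatten(*r, out);
        }
        leaf => out.push(leaf),
    }
}

/// Sum all literal operands into one literal, placed first, and
/// re-attach the remaining operands as a left-deep chain of additions.
fn regroup(expr: Expr) -> Expr {
    let mut operands = Vec::new();
    flatten(expr, &mut operands);

    let mut sum = 0i64;
    let mut saw_literal = false;
    let mut rest = Vec::new();
    for operand in operands {
        match operand {
            Expr::Lit(v) => {
                sum += v;
                saw_literal = true;
            }
            other => rest.push(other),
        }
    }

    let mut terms = Vec::new();
    if saw_literal {
        terms.push(Expr::Lit(sum));
    }
    terms.extend(rest);

    // `flatten` always pushes at least one leaf, so `terms` is non-empty.
    let mut iter = terms.into_iter();
    let first = iter.next().expect("at least one operand");
    iter.fold(first, |acc, term| Expr::Add(Box::new(acc), Box::new(term)))
}
```

For `2 + (1 + rand())` this yields `3 + rand()`, the same result as rotating to `(2 + 1) + rand()` and then folding. The trade-off is that it moves literals ahead of the other operands, which a real optimizer rule might not want for non-commutative or floating-point operations.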

Though I think the most interesting / useful thing might be #1160, but that is likely a bit more involved
