refactor: remove predicate fusion and pushdown #3031
Closed
This PR removes predicate pushdown and fusion in `ibis/expr/analysis.py` in an
effort to greatly simplify expression construction and increase maintainability.
This is the first in a series of backwards-incompatible changes that
will culminate in the simplification of the core relational operators (#2919).
There are a number of benefits to this:
We can revisit adding fusion back later, after designing a more robust and
modular API for optimizing at expression construction time.
There's one major breaking change:
Filter expressions that refer to a non-immediate child table are no longer
valid and will raise an error. Projections are almost all still valid;
removing projection fusion will come in a follow-up.
An example of this is:
Previously this would fuse `t.a == "foo"` and `t.b == 1` into a conjunction
and return `t[(t.a == "foo") & (t.b == 1)]`.
The behavior is still available through other means, such as the `&` operator
or passing a list of predicates. If you must chain filters, you need to use a
lambda.
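To make the rule concrete, here is a toy model of it in plain Python. This is not the ibis implementation: `Table`, `Predicate`, `col_eq`, and `filter` are hypothetical names invented for this sketch, and the only thing it tries to capture is that a predicate must be bound to the table being filtered, which is why a lambda works when chaining.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True, eq=False)
class Predicate:
    """A boolean condition bound to the table it was built from (toy model)."""
    table: "Table"
    text: str

    def __and__(self, other: "Predicate") -> "Predicate":
        # Explicit conjunction: both sides must come from the same table.
        if other.table is not self.table:
            raise ValueError("cannot combine predicates from different tables")
        return Predicate(self.table, f"({self.text}) AND ({other.text})")


@dataclass(frozen=True, eq=False)
class Table:
    """A hypothetical table expression; `parent` is the table it filters."""
    name: str
    parent: Optional["Table"] = None
    where: Optional[str] = None

    def col_eq(self, column: str, value) -> Predicate:
        return Predicate(self, f"{column} = {value!r}")

    def filter(self, *preds) -> "Table":
        texts = []
        for pred in preds:
            if callable(pred):
                # A lambda is resolved against the table being filtered,
                # so it always refers to the immediate child.
                pred = pred(self)
            if pred.table is not self:
                # No fusion: a predicate built on an ancestor is rejected.
                raise ValueError("predicate refers to a non-immediate child table")
            texts.append(pred.text)
        return Table(self.name, parent=self, where=" AND ".join(texts))


t = Table("t")

# Alternative 1: combine predicates explicitly with `&`.
combined = t.filter(t.col_eq("a", "foo") & t.col_eq("b", 1))

# Alternative 2: pass several predicates at once.
listed = t.filter(t.col_eq("a", "foo"), t.col_eq("b", 1))

# Alternative 3: chain, using a lambda so the second predicate binds
# to the already-filtered table rather than to `t`.
chained = t.filter(t.col_eq("a", "foo")).filter(lambda s: s.col_eq("b", 1))
```

In this sketch, chaining with a predicate built on the original `t`, as in `t.filter(t.col_eq("a", "foo")).filter(t.col_eq("b", 1))`, raises `ValueError`, mirroring the breaking change described above.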
Non-breaking changes:
SQL queries contain more nesting. Queries that compose a lot of filters will
contain more subqueries.
I don't view this as problematic, as most database engines have fairly
comprehensive subquery unnesting capabilities.
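As a rough illustration of why the extra nesting is harmless for results, the sketch below compares a nested-subquery filter with a single fused `WHERE` clause using Python's built-in `sqlite3` module. The table, data, and SQL strings are made up for this example and are not the SQL that ibis actually generates; it only shows that the two shapes are semantically equivalent, and says nothing about performance on any particular engine.

```python
import sqlite3

# In-memory database with a small hypothetical table `t`.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a TEXT, b INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("foo", 1), ("foo", 2), ("bar", 1)],
)

# Shape produced when each filter becomes its own subquery.
nested_sql = """
SELECT a, b
FROM (SELECT a, b FROM t WHERE a = 'foo') AS t0
WHERE b = 1
"""

# Shape produced when the two predicates are fused into one conjunction.
fused_sql = "SELECT a, b FROM t WHERE a = 'foo' AND b = 1"

nested_rows = conn.execute(nested_sql).fetchall()
fused_rows = conn.execute(fused_sql).fetchall()
conn.close()
```

Both queries return the same rows. Whether a given engine flattens the subquery into the same plan depends on its optimizer.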