consider: should we use sqlglot's optimizer after compilation? #8789

NickCrews · 2024-03-27T01:21:00Z

Is your feature request related to a problem?

As I worked on #8788, it got me wondering if there might be some optimizations that we don't do, but sqlglot would catch. If we upstream work there too, then others would benefit as well. Also, per #8770, this might help with optimizing queries that are impossible to optimize in the pre-compilation stage.

Describe the solution you'd like

It looks like we already use sqlglot.optimizer.optimize in `ibis/expr/sql.py`, but maybe that only affects SubQueries? Can we apply it in more cases?

What version of ibis are you running?

main

What backend(s) are you using, if any?

No response

Code of Conduct

I agree to follow this project's Code of Conduct


kszucs · 2024-03-27T14:05:15Z

Note that ibis now has a pretty convenient rewrite system which we can use to do various simplifications and optimizations. We haven't been focusing on these tasks yet, but I suggest taking a look at the **/rewrites.py files in the meantime.

NickCrews · 2024-03-27T16:44:58Z

@kszucs sorry I wasn't very clear there. Often, as in #8770, inefficiencies are added back into the SQL during translation ibis->sqlglot, AFTER rewrites.py has done its magic, so rewrites.py isn't useful to us here.

kszucs · 2024-03-27T17:08:33Z

What I mean is that we can and should improve rewrites.py to avoid those inefficiencies.

NickCrews · 2024-03-27T17:34:50Z

I might be missing something here. Take a look at https://github.com/ibis-project/ibis/pull/8788/files#diff-cb487151de645f8616c295a297cf1bb685b77b639163934c40073dc8b692b83eR472-R488.

There, we have a self.if_ to ensure that all indexes are nonnegative, because postgres can't handle negative indexing. Some backends, like duckdb, can handle negative indexing natively, so in our internal representation in ArraySlice we want to allow storing negative start and stop. This means the branching is only added during compilation, and rewrites.py never sees it. So, if we ever want to remove/simplify that branching using static analysis, it can only happen after compilation. Does that make sense?
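To make the point above concrete, here is a minimal, self-contained sketch. It does not use ibis or sqlglot: the node classes (`Lit`, `Col`, `Add`, `Ge`, `If`, `ArrayLength`) and the functions `compile_index` and `fold` are hypothetical stand-ins, not the real compiler. It only illustrates that a branch introduced at compile time can be folded away afterwards when the index is a literal, and that an earlier pass could not have done so because the branch did not exist yet.

```python
from dataclasses import dataclass
from typing import Any


# Hypothetical toy expression nodes; these are NOT ibis or sqlglot classes.
@dataclass(frozen=True)
class Lit:
    value: Any


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class ArrayLength:
    arg: Any


@dataclass(frozen=True)
class Add:
    left: Any
    right: Any


@dataclass(frozen=True)
class Ge:  # left >= right
    left: Any
    right: Any


@dataclass(frozen=True)
class If:
    cond: Any
    then: Any
    otherwise: Any


def compile_index(index, arr):
    """Stand-in for the compile step: always emit a nonnegativity branch."""
    return If(Ge(index, Lit(0)), index, Add(ArrayLength(arr), index))


def fold(expr):
    """Post-compilation simplification: fold branches with literal conditions."""
    if isinstance(expr, Ge):
        left, right = fold(expr.left), fold(expr.right)
        if isinstance(left, Lit) and isinstance(right, Lit):
            return Lit(left.value >= right.value)
        return Ge(left, right)
    if isinstance(expr, Add):
        left, right = fold(expr.left), fold(expr.right)
        if isinstance(left, Lit) and isinstance(right, Lit):
            return Lit(left.value + right.value)
        return Add(left, right)
    if isinstance(expr, If):
        cond = fold(expr.cond)
        if isinstance(cond, Lit):
            # The condition is known statically, so keep only one branch.
            return fold(expr.then if cond.value else expr.otherwise)
        return If(cond, fold(expr.then), fold(expr.otherwise))
    if isinstance(expr, ArrayLength):
        return ArrayLength(fold(expr.arg))
    return expr


xs = Col("xs")
literal_case = fold(compile_index(Lit(2), xs))    # branch disappears
negative_case = fold(compile_index(Lit(-1), xs))  # only the "else" arm remains
dynamic_case = fold(compile_index(Col("i"), xs))  # branch must stay
```

With a literal index the `If` collapses to a single arm; with a column index nothing can be decided statically and the branch is kept.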

NickCrews · 2024-03-27T20:15:41Z

errr, or maybe I'm missing something and the rewrites can happen on a per-backend basis?

kszucs · 2024-03-28T00:05:04Z

> the rewrites can happen on a per-backend basis?

Yes. We can also heavily reorganize the expressions as we do in the pandas backend for example.

NickCrews · 2024-03-28T01:50:38Z

ah, great, sorry for the noise! I'll take a look at solving these problems using that system. We can revisit using sqlglot's optimizer/simplifier if that is not sufficient.

NickCrews · 2024-03-28T15:35:10Z

@kszucs I'm realizing that the rewrites needed would have to be destructive/non-idempotent, e.g. adding one to an index. Does the rewrite system support "apply this rewrite only once", or am I given no guarantees about whether my rewrite is applied once or N times?

kszucs · 2024-03-28T17:27:23Z

We control the number of passes and replacements and currently we do them in a single pass.

On the other hand, if we had multiple passes, idempotency would depend on both the pattern and the replacement. If the pattern doesn't match the rewrite outcome again, there is no need to worry about it; otherwise, a new node type can be introduced to carry the data.
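A rough illustration of the idempotency point, as a toy sketch that is independent of ibis's actual rewrite system: `Index`, `OneBasedIndex`, `shift_naive`, `shift_marked`, and `apply_passes` are all hypothetical names. The naive rule's output still matches its own pattern, so repeated passes keep adding one; the second rule emits a new node type that the pattern no longer matches, so extra passes are harmless.

```python
from dataclasses import dataclass


# Hypothetical nodes for illustration only.
@dataclass(frozen=True)
class Index:
    value: int  # zero-based index, not yet shifted


@dataclass(frozen=True)
class OneBasedIndex:
    value: int  # already shifted; a distinct type so the rule cannot match again


def shift_naive(node):
    """Non-idempotent: the replacement is still an Index, so it matches again."""
    if isinstance(node, Index):
        return Index(node.value + 1)
    return node


def shift_marked(node):
    """Safe under repeated passes: the replacement is a new node type."""
    if isinstance(node, Index):
        return OneBasedIndex(node.value + 1)
    return node


def apply_passes(rule, node, passes):
    """Apply a rewrite rule to a single node a fixed number of times."""
    for _ in range(passes):
        node = rule(node)
    return node


once = apply_passes(shift_naive, Index(0), 1)            # correct with one pass
thrice_naive = apply_passes(shift_naive, Index(0), 3)    # shifted three times
thrice_marked = apply_passes(shift_marked, Index(0), 3)  # shifted exactly once
```

With a single pass, as described above, both rules give the same shift; the marker type only matters if more passes are ever introduced.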

NickCrews added the feature Features or general enhancements label Mar 27, 2024

github-project-automation bot added this to Ibis planning and roadmap Mar 27, 2024

github-project-automation bot moved this to backlog in Ibis planning and roadmap Mar 27, 2024

NickCrews closed this as completed Mar 28, 2024

github-project-automation bot moved this from backlog to done in Ibis planning and roadmap Mar 28, 2024
