Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimization rules related to limit push-downs may cause unexpected optimization behavior #5069

Closed
czpmango opened this issue Dec 16, 2022 · 3 comments
Labels
affects/none PR/issue: this bug affects none version. severity/none Severity of bug type/enhancement Type: make the code neat or more efficient

Comments

@czpmango
Copy link
Contributor

Your Environments (required)
All versions for now.

Describe the bug (required)
Normal limit push-downs don't cause problems, such as PushLimitDownProjectRule.This rule pushes the limit completely down under the project operator, and the plan transforms as follows:
Old plan: ... -> Limit -> Project -> ...
Optimized plan: ... -> Project -> Limit -> ...

However, the operator related to storage access may cause problems, such as PushLimitDownAppendVerticesRule. The plan transforms as follows:
Old plan: ... -> Limit -> AppendVertices -> ...
Optimized plan: ... -> Limit -> AppendVertices(limit) -> ...

This kind of rules is essentially embedding rather than push-down.
The remained limit operator may block the possibility of other optimization rules being applied, and the order of rules loading may result in different execution plans.

For example, we discuss the optimization of rules involving PushLimitDownProjectRule, PushLimitDownAppendVerticesRule and EliminateAppendVerticesRule as follows:

Old plan: ... -> Limit -> Project -> AppendVertices -> ...

If rule EliminateAppendVerticesRule is applied first, the optimized plan looks like this:
Optimized plan: ... -> Project -> Limit -> ...

If rule PushLimitDownProjectRule is applied first, the optimized plan looks like this:
Optimized plan: ... -> Project -> Limit -> AppendVertices(limit) -> ...

As we can see, different orders in which rules are applied can lead to unexpected plan transformations.This can even cause some correctness problems, such as those related to hanging edges.

Expected behavior

  1. Avoid correctness or self-consistency problems caused by optimizations
  2. Avoid situations where one optimization rule blocks the application of other optimization rules
  3. Avoid rule load order impact on final optimization behavior

Additional context
Similar issues may exist for other rules.

@czpmango czpmango added the type/bug Type: something is unexpected label Dec 16, 2022
@github-actions github-actions bot added affects/none PR/issue: this bug affects none version. severity/none Severity of bug labels Dec 16, 2022
@czpmango czpmango changed the title Optimization rules related to limit push-downs may cause unexpected optimizer behavior Optimization rules related to limit push-downs may cause unexpected optimization behavior Dec 16, 2022
@Sophie-Xie Sophie-Xie added this to the v3.4.0 milestone Dec 16, 2022
@xtcyclist xtcyclist added type/enhancement Type: make the code neat or more efficient and removed type/bug Type: something is unexpected labels Dec 21, 2022
@xtcyclist
Copy link
Contributor

xtcyclist commented Dec 21, 2022

This issue looks more like an enhancement. To address it, I think we need to first establish a set of theorems on the types of transformations that the optimizer could safely apply without causing bugs or losing good optimziation opportunities. I think it's ok to have exclusive optimization branches, but false exclusion is to be fixed. The optimization maneuvers we want should be subset of all such safe transformations.

I recommend to address similar issues after 3.4. I added it to v3.5.0 for now.

@xtcyclist xtcyclist modified the milestones: v3.4.0, v3.5.0 Dec 21, 2022
@czpmango
Copy link
Contributor Author

This issue looks more like an enhancement. To address it, I think we need to first establish a set of theorems on the types of transformations that the optimizer could safely apply without causing bugs or losing good optimziation opportunities. I think it's ok to have exclusive optimization branches, but false exclusion is to be fixed. The optimization maneuvers we want should be subset of all such safe transformations.

I recommend to address similar issues after 3.4. I added it to v3.5.0 for now.

Basically agree, but there may be a self-consistency problem for scenarios with hanging edges.

@xtcyclist xtcyclist removed this from the v3.5.0 milestone Mar 29, 2023
@xtcyclist xtcyclist removed their assignment Mar 29, 2023
@xtcyclist
Copy link
Contributor

I'm closing this issue and removing it from the 3.5 milestone due to resource constraints. We welcome anyone from the community to address this issue lateron.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects/none PR/issue: this bug affects none version. severity/none Severity of bug type/enhancement Type: make the code neat or more efficient
Projects
None yet
Development

No branches or pull requests

3 participants