opt: split disjunction in join conditions in more cases #97695
Labels
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-sql-queries
SQL Queries Team
Comments
rytaft added the C-enhancement label on Feb 27, 2023
rytaft added a commit to rytaft/cockroach that referenced this issue on Feb 27, 2023
Prior to this commit, when a join condition included a disjunction (e.g. a OR b), in some cases we could remove the disjunction by splitting the join into a UNION of joins to create a more efficient plan. However, we were only performing this transformation if at least one side of the OR predicate contained an equijoin predicate (e.g., t1.col1 = t2.col1). There were other cases where we could have improved the plan by splitting the disjunction, but we did not do so.

This commit improves our ability to optimize joins with disjunctions in the join condition when there is the possibility to push one or both sides of the disjunction below the join. This commit adds logic to detect these cases and splits the disjunction to make predicate push-down possible.

Fixes cockroachdb#97695

Release note (performance improvement): the optimizer now creates a better query plan in some cases where an inner, semi, or anti join contains a join predicate with a disjunction (OR condition). In cases where one or both sides of the OR condition contain a conjunction with at least one conjunct that references a single table, the optimizer now splits the disjunction so that the conjunct referencing a single table can be pushed below the join.
rytaft added a commit to rytaft/cockroach that referenced this issue on Feb 28, 2023
Prior to this commit, when a join condition included a disjunction (e.g. a OR b), in some cases we could remove the disjunction by splitting the join into a UNION of joins to create a more efficient plan. However, we were only performing this transformation if at least one side of the OR predicate contained an equijoin predicate (e.g., t1.col1 = t2.col1). There were other cases where we could have improved the plan by splitting the disjunction, but we did not do so.

This commit improves our ability to optimize joins with disjunctions in the join condition when there is the possibility to push one or both sides of the disjunction below the join. This commit removes the requirement that the disjunction contains an equijoin predicate, and instead splits the disjunction in all cases where it is possible to do so, thus enabling more optimization opportunities.

Fixes cockroachdb#97695

Release note (performance improvement): if the session setting optimizer_use_improved_split_disjunction_for_joins is true, the optimizer now creates a better query plan in some cases where an inner, semi, or anti join contains a join predicate with a disjunction (OR condition).
craig bot pushed a commit that referenced this issue on Feb 28, 2023
97696: opt: split disjunction in join conditions in more cases r=rytaft a=rytaft

**sql: add session setting `optimizer_use_improved_split_disjunction_for_joins`**

This commit adds a new session setting, `optimizer_use_improved_split_disjunction_for_joins`, which will be used in the next commit.

Release note (sql change): added a new session setting, `optimizer_use_improved_split_disjunction_for_joins`, which enables the optimizer to split disjunctions (`OR` expressions) in more cases in join conditions by building a `UNION` of two join expressions. If this setting is true, all disjunctions in inner, semi, and anti joins will be split. If false, only disjunctions potentially containing an equijoin condition will be split.

**opt: split disjunction in join conditions in more cases**

Prior to this commit, when a join condition included a disjunction (e.g. `a OR b`), in some cases we could remove the disjunction by splitting the join into a `UNION` of joins to create a more efficient plan. However, we were only performing this transformation if at least one side of the `OR` predicate contained an equijoin predicate (e.g., `t1.col1 = t2.col1`). There were other cases where we could have improved the plan by splitting the disjunction, but we did not do so.

This commit improves our ability to optimize joins with disjunctions in the join condition when there is the possibility to push one or both sides of the disjunction below the join. This commit removes the requirement that the disjunction contains an equijoin predicate, and instead splits the disjunction in all cases where it is possible to do so, thus enabling more optimization opportunities.

Fixes #97695

Release note (performance improvement): if the session setting `optimizer_use_improved_split_disjunction_for_joins` is true, the optimizer now creates a better query plan in some cases where an inner, semi, or anti join contains a join predicate with a disjunction (`OR` condition).
Co-authored-by: Rebecca Taft <[email protected]>
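
The release notes above cover inner, semi, and anti joins. As a minimal sketch of the set-level identities behind splitting a disjunction for the latter two, the following uses Python's `sqlite3` module rather than CockroachDB. A semi join on `p OR q` returns the union of the semi joins on `p` and on `q`, and an anti join on `p OR q` returns the rows that survive both anti joins. The tables `t1` and `t2`, their columns, and the predicates are hypothetical, and the `INTERSECT` form for anti joins is only the logical identity, not a claim about the plan the optimizer actually builds.

```python
import sqlite3


def build_db():
    """Create a small in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE t1 (k INTEGER PRIMARY KEY, col1 INTEGER);
        CREATE TABLE t2 (k INTEGER PRIMARY KEY, col1 INTEGER, col2 INTEGER);
    """)
    conn.executemany("INSERT INTO t1 VALUES (?, ?)",
                     [(i, i % 4) for i in range(1, 13)])
    conn.executemany("INSERT INTO t2 VALUES (?, ?, ?)",
                     [(i, i % 7, i % 2) for i in range(1, 9)])
    return conn


# Semi join whose condition is a disjunction with no equijoin predicate.
SEMI_ORIGINAL = """
    SELECT k, col1 FROM t1 WHERE EXISTS (
        SELECT 1 FROM t2
        WHERE (t1.col1 > t2.col1 AND t2.col2 = 0)
           OR (t1.k < 3 AND t2.col1 = 5))
"""
# Split form: one semi join per disjunct, combined with UNION.
# The primary key k is selected, so UNION cannot merge distinct t1 rows.
SEMI_SPLIT = """
    SELECT k, col1 FROM t1 WHERE EXISTS (
        SELECT 1 FROM t2 WHERE t1.col1 > t2.col1 AND t2.col2 = 0)
    UNION
    SELECT k, col1 FROM t1 WHERE EXISTS (
        SELECT 1 FROM t2 WHERE t1.k < 3 AND t2.col1 = 5)
"""
# Anti join: NOT EXISTS (p OR q) equals NOT EXISTS (p) AND NOT EXISTS (q).
ANTI_ORIGINAL = SEMI_ORIGINAL.replace("WHERE EXISTS", "WHERE NOT EXISTS")
ANTI_SPLIT = (SEMI_SPLIT.replace("WHERE EXISTS", "WHERE NOT EXISTS")
              .replace("UNION", "INTERSECT"))


def rows(conn, sql):
    """Run a query and return its rows in a deterministic order."""
    return sorted(conn.execute(sql).fetchall())


conn = build_db()
print("semi join forms agree:", rows(conn, SEMI_ORIGINAL) == rows(conn, SEMI_SPLIT))
print("anti join forms agree:", rows(conn, ANTI_ORIGINAL) == rows(conn, ANTI_SPLIT))
```

Selecting the key column in every branch is what makes the set operations safe here: without it, two different `t1` rows with equal `col1` values would collapse into one.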
rytaft added a commit that referenced this issue on Mar 1, 2023
Prior to this commit, when a join condition included a disjunction (e.g. a OR b), in some cases we could remove the disjunction by splitting the join into a UNION of joins to create a more efficient plan. However, we were only performing this transformation if at least one side of the OR predicate contained an equijoin predicate (e.g., t1.col1 = t2.col1). There were other cases where we could have improved the plan by splitting the disjunction, but we did not do so.

This commit improves our ability to optimize joins with disjunctions in the join condition when there is the possibility to push one or both sides of the disjunction below the join. This commit removes the requirement that the disjunction contains an equijoin predicate, and instead splits the disjunction in all cases where it is possible to do so, thus enabling more optimization opportunities.

Fixes #97695

Release note (performance improvement): if the session setting optimizer_use_improved_split_disjunction_for_joins is true, the optimizer now creates a better query plan in some cases where an inner, semi, or anti join contains a join predicate with a disjunction (OR condition).
This was referenced Mar 1, 2023
Is your feature request related to a problem? Please describe.
When a join condition includes a disjunction (e.g. `a OR b`), in some cases we can remove the disjunction by splitting the join into a `UNION` of joins to create a more efficient plan. However, we currently only perform this transformation if one side of the `OR` predicate contains an equijoin predicate (e.g., `t1.col1 = t2.col1`). There are other cases where we could improve the plan by splitting the disjunction, but we don't currently do so. For example, consider the following:

Note that because we do not split the disjunction, we cannot push any predicate below the join, and we perform full table scans of both `a` and `b`.

Describe the solution you'd like
We should be able to split the disjunction in this case to allow us to push predicates down and perform constrained scans of `a` and `b`. For example, we should be able to produce the following plan for the above query:

Jira issue: CRDB-24828
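
As a rough illustration of the kind of rewrite being requested (not the query or plan from this report), the sketch below uses Python's `sqlite3` module to check that an inner join on a disjunction with no equijoin predicate returns the same rows as a `UNION` of two joins, where each branch filters `a` and `b` before joining. The column names `a1`, `a2`, `b1`, `b2`, the data, and the predicates are all assumptions made up for the example; only the table names `a` and `b` come from the text above.

```python
import sqlite3


def make_tables():
    """Build hypothetical tables a and b in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE a (a1 INTEGER PRIMARY KEY, a2 INTEGER);
        CREATE TABLE b (b1 INTEGER PRIMARY KEY, b2 INTEGER);
    """)
    conn.executemany("INSERT INTO a VALUES (?, ?)",
                     [(i, i % 5) for i in range(1, 21)])
    conn.executemany("INSERT INTO b VALUES (?, ?)",
                     [(i, i % 3) for i in range(1, 21)])
    return conn


# Join condition is a disjunction; neither side contains an equijoin predicate.
JOIN_WITH_OR = """
    SELECT a.a1, a.a2, b.b1, b.b2
    FROM a JOIN b
      ON (a.a2 = 1 AND b.b2 = 2) OR (a.a1 < 4 AND b.b1 < 4)
"""

# Split form: each disjunct becomes its own join, with the single-table
# conjuncts applied as filters on a and b before the join. UNION (not
# UNION ALL) removes the pairs that satisfy both disjuncts; this is safe
# because both primary keys are part of every output row.
UNION_OF_JOINS = """
    SELECT a.a1, a.a2, b.b1, b.b2
    FROM (SELECT a1, a2 FROM a WHERE a2 = 1) AS a
    CROSS JOIN (SELECT b1, b2 FROM b WHERE b2 = 2) AS b
    UNION
    SELECT a.a1, a.a2, b.b1, b.b2
    FROM (SELECT a1, a2 FROM a WHERE a1 < 4) AS a
    CROSS JOIN (SELECT b1, b2 FROM b WHERE b1 < 4) AS b
"""


def fetch_sorted(conn, sql):
    """Run a query and return its rows in a deterministic order."""
    return sorted(conn.execute(sql).fetchall())


db = make_tables()
with_or = fetch_sorted(db, JOIN_WITH_OR)
split = fetch_sorted(db, UNION_OF_JOINS)
print("same result:", with_or == split)
```

In the split form every filter references a single table, which is what would let a planner turn each filter into a constrained scan instead of scanning both tables in full.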