opt: take into account ON conditions for join costing #34810
Comments
I think merge joins have the same issue.
This is a very limited fix for cockroachdb#34810. The core problem is that we don't take into account that if we have an ON condition, not only is there a cost to evaluate it on each row, but we are generating more internal rows to get a given number of output rows.

I attempted to do a more general fix (for all join types), where I tried to estimate the "internal" number of rows using `unknownFilterSelectivity` for each ON condition. There were two problems:

- In some cases (especially with lookup joins) we have an extra ON condition that doesn't actually do anything: `ab JOIN xy ON a=x AND a=10` becomes `ab JOIN xy ON a=x AND a=10 AND x=10`, and `a=10` could remain as an ON condition. This results in bad query plans in important cases (e.g. TPCC) where it prefers to do an extra lookup join (due to a non-covering index) just because of that condition.
- We don't have the equality columns readily available for hash join (and didn't want to extract them each time we cost). In the future we may split the planning into a logical and physical stage, and we should then separate the logical joins from hash join.

For 19.1, we simply add a cost for lookup joins that is proportional to the number of remaining ON conditions. This is the least disruptive method that still fixes the case observed in cockroachdb#34810. I will leave the issue open to address this properly in the next release.

Note that although hash joins and merge joins have the same issue in principle, in practice we always generate these expressions with equality on all possible columns.

Release note: None
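A minimal sketch of the 19.1 adjustment described above, assuming hypothetical cost constants and a simplified signature (not the actual coster code):

```go
package main

import "fmt"

// lookupJoinCost charges a base per-row cost for the lookup, plus one
// extra filter evaluation per ON condition left over after the equality
// columns are extracted. Constants are illustrative, not CockroachDB's.
func lookupJoinCost(lookedUpRows float64, remainingONConds int) float64 {
	const cpuCostFactor = 0.01 // assumed per-row CPU cost unit
	const lookupIOCost = 0.02  // assumed per-row lookup I/O cost unit
	base := lookedUpRows * (lookupIOCost + cpuCostFactor)
	// A plan that absorbs conditions into the index lookup has fewer
	// remaining ON conditions, so it now costs less.
	return base + lookedUpRows*float64(remainingONConds)*cpuCostFactor
}

func main() {
	fmt.Println(lookupJoinCost(1000, 0)) // equality fully covered: 30
	fmt.Println(lookupJoinCost(1000, 2)) // two leftover conditions: 50
}
```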
35321: opt: propagate set operation output types to input columns r=rytaft a=rytaft

This commit updates the `optbuilder` logic for set operations in which the types of the input columns do not match the types of the output columns. This can happen if a column on one side has type Unknown, but the corresponding column on the other side has a known type such as Int. The known type must be propagated to the side with the unknown type to prevent errors in the execution engine related to decoding types. If there are any column types on either side that don't match the output, the `optbuilder` propagates the output types of the set operation down to the input columns by wrapping the side with mismatched types in a Project operation. The Project operation passes through columns that already have the correct type, and creates cast expressions for those that don't. Fixes #34524. Release note (bug fix): Fixed an error that happened when executing some set operations containing only nulls in one of the input columns.

35587: opt: add more cost for lookup joins with more ON conditions r=RaduBerinde a=RaduBerinde

(Same description as the commit above.) Release note: None

35630: storage/tscache: Pick up andy-kimball/arenaskl fix r=nvanbenschoten a=nvanbenschoten

Fixes #31624. Fixes #35557. This commit picks up andy-kimball/arenaskl#4. I strongly suspect that the uint32 overflow fixed in that PR was the cause of the two index out of bounds panics. See that commit for more details. The PR also fixes a bug in memory recycling within the tscache. I confirmed on adriatic that over 900 64MB arenas had been allocated since it was last wiped.

35644: opt: use correct ordering for insert input in execbuilder r=RaduBerinde a=RaduBerinde

We were setting up a projection on the Insert's input, but we were accidentally using the parent Insert's ordering instead of that of the input. Fixes #35564. Release note (bug fix): Fixed a "column not in input" crash when `INSERT ... RETURNING` is used inside a clause that requires an ordering.

35651: jobs, sql, ui: Create `AutoCreateStats` job type r=celiala a=celiala

With #34279, enabling the cluster setting `sql.stats.experimental_automatic_collection.enabled` has the potential to create many CreateStats jobs, which can cause the Jobs view on the AdminUI to become cluttered. This commit creates a new `AutoCreateStats` job type for these auto-created CreateStats jobs, so that users are still able to see their own manual runs of CREATE STATISTICS via the pre-existing `CreateStats` type. cc @danhhz, @piyush-singh, @rolandcrosby ![jobs-auto-create-stats](https://user-images.githubusercontent.com/3051672/54212467-5cea2c80-44b9-11e9-9c11-db749814f019.gif) Release note (admin ui change): AutoCreateStats type added to Jobs page to filter automatic statistics jobs. Fixes #34377.

Co-authored-by: Rebecca Taft <[email protected]>
Co-authored-by: Radu Berinde <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Celia La <[email protected]>
A fix has been merged, but I am keeping the issue open for a more general change.
Pasting from #35587:

> The core problem is that we don't take into account that if we have an ON condition, not only is there a cost to evaluate it on each row, but we are generating more internal rows to get a given number of output rows.
>
> I attempted to do a more general fix (for all join types), where I tried to estimate the "internal" number of rows using `unknownFilterSelectivity` for each ON condition.
>
> For 19.1, we simply add a cost for lookup joins that is proportional to the number of remaining ON conditions.
>
> Note that although hash joins and merge joins have the same issue in principle, in practice we always generate these expressions with equality on all possible columns.
Adding more folks to the issue because this came up recently and we need more brainstorming on this.
A big issue with lookup joins is that we don't track the number of rows scanned during the lookup. I wonder if it makes sense to add an extra field to the …
The problem is that …
Yeah, I think this means we need more logic in the Coster. There may be some shared code between the StatisticsBuilder and the Coster.
Similar to how @ridwanmsharif used …
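A hypothetical sketch of the direction discussed here: back out the rows scanned during the lookup from the output row count by undoing a default selectivity per remaining ON condition, echoing `unknownFilterSelectivity`. The names and constant are illustrative only, not CockroachDB code:

```go
package coster

// estimatedInternalRows guesses how many rows the lookup actually
// scanned before the remaining ON conditions filtered them down to
// outputRows, assuming a fixed selectivity per condition.
func estimatedInternalRows(outputRows float64, remainingONConds int) float64 {
	const unknownFilterSelectivity = 1.0 / 3.0 // assumed default selectivity
	internal := outputRows
	for i := 0; i < remainingONConds; i++ {
		internal /= unknownFilterSelectivity // each filter hid ~2/3 of the rows
	}
	return internal
}
```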
39016: opt: improve per-ON condition cost adjustment for lookup join r=RaduBerinde a=RaduBerinde

Issue #34810 tracks taking into account the internal row count of lookup joins. We currently have a hack in place to always prefer lookup joins using indexes that constrain more columns (see #35587).

We encountered a case where this adjustment doesn't work: when the estimated row count is very small (which happens when there are a lot of conditions), the per-row cost adjustment ends up not making a difference (this is because of limited floating point precision and the "tolerance" built into `Cost.Less()`). To address this, we also add a constant per-ON condition cost which isn't scaled by the row count.

Release note (bug fix): Fixed a bug in the optimizer that caused a bad index to be chosen for lookup joins in some cases.

Co-authored-by: Radu Berinde <[email protected]>
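A worked sketch of the precision problem #39016 fixes, with a fudge factor mimicking the tolerance described for `Cost.Less()` (the exact tolerance value and all cost numbers here are assumptions):

```go
package main

import "fmt"

// less compares two costs with a small tolerance, in the spirit of the
// Cost.Less() tolerance described above; the fudge value is assumed.
func less(a, b float64) bool {
	const fudge = 1.0 + 1e-9
	return a*fudge < b
}

func main() {
	base := 100.0
	rows := 1e-10 // very small estimated row count

	// A per-row adjustment scaled by the row count (3 ON conditions at
	// 0.01 per row) adds only ~3e-12, which vanishes inside the tolerance.
	fmt.Println(less(base, base+rows*3*0.01)) // false: plans look equal

	// A constant per-ON-condition cost survives the comparison.
	fmt.Println(less(base, base+3*0.001)) // true: the cheaper plan wins
}
```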
We've been bitten by this again, this time with FK checks. For TPCC, some of the lookup joins inside FK checks don't use the correct index. @rytaft would you be able to look at fixing this?
… joins

This commit updates the costing of both lookup joins and merge joins to take into account the number of rows processed by the operator. This number may be larger than the number of output rows if an additional filter is applied as part of the ON condition that is not used to determine equality columns for the join.

For example, consider the query `SELECT * FROM abc JOIN def ON a = e AND b = 3;` Assuming there is no index on b, if a lookup join is used to execute this query, the number of rows processed is actually the same as for the query `SELECT * FROM abc JOIN def ON a = e;` The difference is that the filter `b = 3` must also be applied to every row in the first query. The coster now takes this into account when determining the cost of lookup joins and merge joins.

Informs cockroachdb#34810

Release note: None
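A sketch of the distinction this commit draws, with illustrative names (not the actual statistics code): the equality part of the ON condition (`a = e`) determines how many rows the join processes, while the remaining filter (`b = 3`) only shrinks the output:

```go
package coster

// joinRowCounts estimates both quantities: the rows matched by the
// equality columns alone drive the operator's work, while the remaining
// ON filter's selectivity reduces only the output row count.
func joinRowCounts(equalityJoinRows, remainderSelectivity float64) (processed, output float64) {
	processed = equalityJoinRows                     // e.g. rows with a = e
	output = equalityJoinRows * remainderSelectivity // after applying b = 3
	return processed, output
}
```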
Zendesk ticket #3600 has been linked to this issue.
Repro:
Create the schema and statistics shown below, and then run the following query:
EXPECTED: This query should use the `idx_abcd` index, because it's more selective.

ACTUAL: This query uses the `idx_abd` index.

The problem is that the coster does not take into account any remainder ON predicates. While `idx_abcd` has fewer remainder predicates, it is actually considered more expensive than `idx_abd`. To fix this, we'll need to incorporate remainder predicates into the cost estimation somehow.
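A worked example with hypothetical numbers showing how charging remainder ON predicates per processed row would make the more selective index win:

```go
package main

import "fmt"

// cost charges each processed row a base amount plus an assumed
// evaluation cost per remainder ON predicate.
func cost(rowsProcessed float64, remainderPreds int) float64 {
	const perRowCost = 0.01
	const perPredCost = 0.005 // assumed per-predicate evaluation cost
	return rowsProcessed * (perRowCost + float64(remainderPreds)*perPredCost)
}

func main() {
	// idx_abcd constrains more columns: fewer rows scanned, no remainder.
	fmt.Println(cost(10, 0)) // 0.1
	// idx_abd leaves one predicate as a remainder over many more rows.
	fmt.Println(cost(100, 1)) // 1.5
}
```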