Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: take into account ON conditions for join costing #34810

Closed
andy-kimball opened this issue Feb 11, 2019 · 10 comments · Fixed by #40248
Closed

opt: take into account ON conditions for join costing #34810

andy-kimball opened this issue Feb 11, 2019 · 10 comments · Fixed by #40248
Assignees
Labels
C-performance Perf of queries or internals. Solution not expected to change functional behavior.
Milestone

Comments

@andy-kimball
Copy link
Contributor

Repro:
Create the schema and statistics shown below, and then run the following query:

SELECT w, x, y, z
FROM wxyz
INNER JOIN abcde
ON w = a AND x = b AND y = c
WHERE w = 'foo' AND x = '2AB23800-06B1-4E19-A3BB-DF3768B808D2'
ORDER BY d
LIMIT 10;

EXPECTED: This query should use the idx_abcd index, because it's more selective.
ACTUAL: This query uses the idx_abd index.

The problem is that the coster does not take into account any remainder ON predicates. While idx_abcd has fewer remainder predicates, it is actually considered as more expensive than idx_abd. To fix this, we'll need to incorporate remainder predicates into the cost estimation somehow.

CREATE TABLE abcde (
	a TEXT NOT NULL,
	b UUID NOT NULL,
	c UUID NOT NULL,
	d VARCHAR(255) NOT NULL,
	e TEXT NOT NULL,
	CONSTRAINT "primary" PRIMARY KEY (a, b, c),
	UNIQUE INDEX idx_abd (a, b, d),
	UNIQUE INDEX idx_abcd (a, b, c, d)
);

ALTER TABLE abcde INJECT STATISTICS '[
  {
    "columns": ["a"],
    "created_at": "2019-02-08 04:10:40.001179+00:00",
    "row_count": 250000,
    "distinct_count": 1
  },
  {
    "columns": ["b"],
    "created_at": "2019-02-08 04:10:40.119954+00:00",
    "row_count": 250000,
    "distinct_count": 2
  },
  {
    "columns": ["d"],
    "created_at": "2019-02-08 04:10:40.119954+00:00",
    "row_count": 250000,
    "distinct_count": 125000
  }
]';

CREATE TABLE wxyz (
	w TEXT NOT NULL,
	x UUID NOT NULL,
	y UUID NOT NULL,
	z TEXT NOT NULL,
	CONSTRAINT "primary" PRIMARY KEY (w, x, y),
	CONSTRAINT "foreign" FOREIGN KEY (w, x, y) REFERENCES abcde (a, b, c)
);

ALTER TABLE wxyz INJECT STATISTICS '[
  {
    "columns": ["w"],
    "created_at": "2019-02-08 04:10:40.001179+00:00",
    "row_count": 10000,
    "distinct_count": 1
  },
  {
    "columns": ["x"],
    "created_at": "2019-02-08 04:10:40.119954+00:00",
    "row_count": 10000,
    "distinct_count": 1
  },
  {
    "columns": ["y"],
    "created_at": "2019-02-08 04:10:40.119954+00:00",
    "row_count": 10000,
    "distinct_count": 2500
  }
]';
@andy-kimball andy-kimball added the C-performance Perf of queries or internals. Solution not expected to change functional behavior. label Feb 11, 2019
@andy-kimball andy-kimball added this to the 2.2 milestone Feb 11, 2019
@andy-kimball andy-kimball changed the title opt: Wrong index used for LookupJoin when it's a subset of another index opt: Slower index used for LookupJoin Feb 11, 2019
@RaduBerinde
Copy link
Member

I think merge joins have the same issue.

RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Mar 10, 2019
This is a very limited fix for cockroachdb#34810.

The core problem is that we don't take into account that if we have an
ON condition, not only there's a cost to evaluate it on each row, but
we are generating more internal rows to get a given number of output
rows.

I attempted to do a more general fix (for all join types), where I
tried to estimate the "internal" number of rows using
`unknownFilterSelectivity` for each ON condition. There were two
problems:
 - in some cases (especially with lookup joins) we have an extra ON
   condition that doesn't actually do anything:
     `ab JOIN xy ON a=x AND a=10` becomes
     `ab JOIN xy ON a=x AND a=10 AND x=10` becomes
   and `a=10` could remain as an ON condition. This results in bad
   query plans in important cases (e.g. TPCC) where it prefers to do
   an extra lookup join (due to a non-covering index) just because of
   that condition.

 - we don't have the equality columns readily available for hash join
   (and didn't want to extract them each time we cost). In the future
   we may split the planning into a logical and physical stage, and we
   should then separate the logical joins from hash join.

For 19.1, we simply simply add a cost for lookup joins that is
proportional to the number of remaining ON conditions. This is the
least disruptive method that still fixes the case observed in cockroachdb#34810.
I will leave the issue open to address this properly in the next
release.

Note that although hash joins and merge joins have the same issue in
principle, in practice we always generate these expressions with
equality on all possible columns.

Release note: None
RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Mar 11, 2019
This is a very limited fix for cockroachdb#34810.

The core problem is that we don't take into account that if we have an
ON condition, not only there's a cost to evaluate it on each row, but
we are generating more internal rows to get a given number of output
rows.

I attempted to do a more general fix (for all join types), where I
tried to estimate the "internal" number of rows using
`unknownFilterSelectivity` for each ON condition. There were two
problems:
 - in some cases (especially with lookup joins) we have an extra ON
   condition that doesn't actually do anything:
     `ab JOIN xy ON a=x AND a=10` becomes
     `ab JOIN xy ON a=x AND a=10 AND x=10` becomes
   and `a=10` could remain as an ON condition. This results in bad
   query plans in important cases (e.g. TPCC) where it prefers to do
   an extra lookup join (due to a non-covering index) just because of
   that condition.

 - we don't have the equality columns readily available for hash join
   (and didn't want to extract them each time we cost). In the future
   we may split the planning into a logical and physical stage, and we
   should then separate the logical joins from hash join.

For 19.1, we simply simply add a cost for lookup joins that is
proportional to the number of remaining ON conditions. This is the
least disruptive method that still fixes the case observed in cockroachdb#34810.
I will leave the issue open to address this properly in the next
release.

Note that although hash joins and merge joins have the same issue in
principle, in practice we always generate these expressions with
equality on all possible columns.

Release note: None
craig bot pushed a commit that referenced this issue Mar 12, 2019
35321: opt: propagate set operation output types to input columns r=rytaft a=rytaft

This commit updates the `optbuilder` logic for set operations in which
the types of the input columns do not match the types of the output
columns. This can happen if a column on one side has type Unknown,
but the corresponding column on the other side has a known type such
as Int. The known type must be propagated to the side with the unknown
type to prevent errors in the execution engine related to decoding
types.

If there are any column types on either side that don't match the output,
then the `optbuilder` propagates the output types of the set operation down
to the input columns by wrapping the side with mismatched types in a
Project operation. The Project operation passes through columns that
already have the correct type, and creates cast expressions for those
that don't.

Fixes #34524

Release note (bug fix): Fixed an error that happened when executing
some set operations containing only nulls in one of the input columns.

35587: opt: add more cost for lookup joins with more ON conditions r=RaduBerinde a=RaduBerinde

This is a very limited fix for #34810.

The core problem is that we don't take into account that if we have an
ON condition, not only there's a cost to evaluate it on each row, but
we are generating more internal rows to get a given number of output
rows.

I attempted to do a more general fix (for all join types), where I
tried to estimate the "internal" number of rows using
`unknownFilterSelectivity` for each ON condition. There were two
problems:
 - in some cases (especially with lookup joins) we have an extra ON
   condition that doesn't actually do anything:
     `ab JOIN xy ON a=x AND a=10` becomes
     `ab JOIN xy ON a=x AND a=10 AND x=10` becomes
   and `a=10` could remain as an ON condition. This results in bad
   query plans in important cases (e.g. TPCC) where it prefers to do
   an extra lookup join (due to a non-covering index) just because of
   that condition.

 - we don't have the equality columns readily available for hash join
   (and didn't want to extract them each time we cost). In the future
   we may split the planning into a logical and physical stage, and we
   should then separate the logical joins from hash join.

For 19.1, we simply simply add a cost for lookup joins that is
proportional to the number of remaining ON conditions. This is the
least disruptive method that still fixes the case observed in #34810.
I will leave the issue open to address this properly in the next
release.

Note that although hash joins and merge joins have the same issue in
principle, in practice we always generate these expressions with
equality on all possible columns.

Release note: None

35630: storage/tscache: Pick up andy-kimball/arenaskl fix r=nvanbenschoten a=nvanbenschoten

Fixes #31624.
Fixes #35557.

This commit picks up andy-kimball/arenaskl#4.

I strongly suspect that the uint32 overflow fixed in that PR was the
cause of the two index out of bounds panics. See that commit for more
details.

The PR also fixes a bug in memory recylcling within the tscache. I confirmed
on adriatic that over 900 64MB arenas had been allocated since it was last
wiped.

35644: opt: use correct ordering for insert input in execbuilder r=RaduBerinde a=RaduBerinde

We were setting up a projection on the Insert's input but we were
accidentally using the parent Insert's ordering instead of that of the
input.

Fixes #35564.

Release note (bug fix): Fixed a "column not in input" crash when
`INSERT ... RETURNING` is used inside a clause that requires an
ordering.

35651: jobs, sql, ui: Create `AutoCreateStats` job type r=celiala a=celiala

With #34279, enabling the cluster setting
`sql.stats.experimental_automatic_collection.enabled` has the potential
to create many CreateStats jobs, which can cause the Jobs view on the
AdminUI to become cluttered.

This commit creates a new `AutoCreateStats` job type for these auto-created
CreateStats jobs, so that users are able to still see their own manual runs
of CREATE STATISTICS, via the pre-existing `CreateStats` type.

cc @danhhz, @piyush-singh, @rolandcrosby 

![jobs-auto-create-stats](https://user-images.githubusercontent.com/3051672/54212467-5cea2c80-44b9-11e9-9c11-db749814f019.gif)

Release note (admin ui change): AutoCreateStats type added to
Jobs page to filter automatic statistics jobs.

Fixes #34377.

Co-authored-by: Rebecca Taft <[email protected]>
Co-authored-by: Radu Berinde <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Celia La <[email protected]>
@RaduBerinde
Copy link
Member

A fix has been merged but I am keeping the issue open for a more general change.

@RaduBerinde RaduBerinde modified the milestones: 19.1, Later Mar 12, 2019
@RaduBerinde RaduBerinde changed the title opt: Slower index used for LookupJoin opt: take into account ON conditions for join costing Mar 12, 2019
@RaduBerinde
Copy link
Member

RaduBerinde commented Jun 26, 2019

Pasting from #35587:

The core problem is that we don't take into account that if we have an
ON condition, not only there's a cost to evaluate it on each row, but
we are generating more internal rows to get a given number of output
rows.

I attempted to do a more general fix (for all join types), where I
tried to estimate the "internal" number of rows using
unknownFilterSelectivity for each ON condition. There were two
problems:

  • in some cases (especially with lookup joins) we have an extra ON
    condition that doesn't actually do anything:
    ab JOIN xy ON a=x AND a=10 becomes
    ab JOIN xy ON a=x AND a=10 AND x=10
    and a=10 could remain as an ON condition. This results in bad
    query plans in important cases (e.g. TPCC) where it prefers to do
    an extra lookup join (due to a non-covering index) just because of
    that condition.

  • we don't have the equality columns readily available for hash join
    (and didn't want to extract them each time we cost). In the future
    we may split the planning into a logical and physical stage, and we
    should then separate the logical joins from hash join.

For 19.1, we simply simply add a cost for lookup joins that is
proportional to the number of remaining ON conditions. This is the
least disruptive method that still fixes the case observed in #34810.
I will leave the issue open to address this properly in the next
release.

Note that although hash joins and merge joins have the same issue in
principle, in practice we always generate these expressions with
equality on all possible columns.

@RaduBerinde
Copy link
Member

Adding more folks to the issue because this came up recently and we need more brainstorming on this.

@rytaft
Copy link
Collaborator

rytaft commented Jun 26, 2019

A big issue with lookup joins is that we don't track the number of rows scanned during the lookup. I wonder if it makes sense to add an extra field to the Statistics struct. In addition to RowCount (the number of output rows), we could have RowsProcessed (the number of rows processed by the operator). In most cases RowsProcessed is equal to the number of input rows, but it would be different for lookup joins.

@RaduBerinde
Copy link
Member

The problem is that Statistics are per group and RowProcessed changes depending on the particular expression (e.g. a lookup join that constrains columns x,y and one that constrains just column x could be in the same group).

@andy-kimball
Copy link
Contributor Author

Yeah, I think this means we need more logic in the Coster. There may be some shared code between the StatisticsBuilder and the Coster.

@rytaft
Copy link
Collaborator

rytaft commented Jul 1, 2019

Similar to how @ridwanmsharif used RequestColStat, we can add a utility function like GetRowsProcessed which calls into the statisticsBuilder for a given relational expression.

RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Jul 21, 2019
Issue cockroachdb#34810 tracks taking into account the internal row count of
lookup joins. We currently have a hack in place to always prefer
looking indexes that constrain more columns. We encountered a case
where this adjustment doesn't work: when the estimated row count is
very very small (which happens when there are a lot of conditions),
the per-row cost adjustment ends up not making a difference (this is
because of limited floating point precision, and the "tolerance" built
into `Cost.Less()`). To address this, we also add a constant per-ON
condition cost which isn't scaled by the row count.

Release note (bug fix): Fixed bug in the optimizer causing a bad index
for lookup join in some cases.
craig bot pushed a commit that referenced this issue Jul 22, 2019
39016: opt: improve per-ON condition cost adjustment for lookup join r=RaduBerinde a=RaduBerinde

Issue #34810 tracks taking into account the internal row count of
lookup joins. We currently have a hack in place to always prefer
looking indexes that constrain more columns (see #35587). We encountered a case
where this adjustment doesn't work: when the estimated row count is
very very small (which happens when there are a lot of conditions),
the per-row cost adjustment ends up not making a difference (this is
because of limited floating point precision, and the "tolerance" built
into `Cost.Less()`). To address this, we also add a constant per-ON
condition cost which isn't scaled by the row count.

Release note (bug fix): Fixed bug in the optimizer causing a bad index
for lookup join in some cases.

Co-authored-by: Radu Berinde <[email protected]>
RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Jul 22, 2019
Issue cockroachdb#34810 tracks taking into account the internal row count of
lookup joins. We currently have a hack in place to always prefer
looking indexes that constrain more columns. We encountered a case
where this adjustment doesn't work: when the estimated row count is
very very small (which happens when there are a lot of conditions),
the per-row cost adjustment ends up not making a difference (this is
because of limited floating point precision, and the "tolerance" built
into `Cost.Less()`). To address this, we also add a constant per-ON
condition cost which isn't scaled by the row count.

Release note (bug fix): Fixed bug in the optimizer causing a bad index
for lookup join in some cases.
RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Jul 23, 2019
Issue cockroachdb#34810 tracks taking into account the internal row count of
lookup joins. We currently have a hack in place to always prefer
looking indexes that constrain more columns. We encountered a case
where this adjustment doesn't work: when the estimated row count is
very very small (which happens when there are a lot of conditions),
the per-row cost adjustment ends up not making a difference (this is
because of limited floating point precision, and the "tolerance" built
into `Cost.Less()`). To address this, we also add a constant per-ON
condition cost which isn't scaled by the row count.

Release note (bug fix): Fixed bug in the optimizer causing a bad index
for lookup join in some cases.
@RaduBerinde
Copy link
Member

We've been bit by this again, this time with FK checks. For TPCC, some of the lookup joins inside FK checks don't use the correct index. @rytaft would you be able to look at fixing this?

rytaft added a commit to rytaft/cockroach that referenced this issue Aug 27, 2019
… joins

This commit updates the costing of both lookup joins and merge joins to
take into account the number of rows processed by the operator. This number
may be larger than the number of output rows if an additional filter is
applied as part of the ON condition that is not used to determine equality
columns for the join.

For example, consider the query
  `SELECT * FROM abc JOIN def ON a = e AND b = 3;`

Assuming there is no index on b, if a lookup join is used to execute this
query, the number of rows processed is actually the same as the query
  `SELECT * FROM abc JOIN def ON a = e;`

The difference is that the filter b=3 must also be applied to every row in
the first query. The coster now takes this into account when determining
the cost of lookup joins and merge joins.

Informs cockroachdb#34810

Release note: None
rytaft added a commit to rytaft/cockroach that referenced this issue Aug 28, 2019
… joins

This commit updates the costing of both lookup joins and merge joins to
take into account the number of rows processed by the operator. This number
may be larger than the number of output rows if an additional filter is
applied as part of the ON condition that is not used to determine equality
columns for the join.

For example, consider the query
  `SELECT * FROM abc JOIN def ON a = e AND b = 3;`

Assuming there is no index on b, if a lookup join is used to execute this
query, the number of rows processed is actually the same as the query
  `SELECT * FROM abc JOIN def ON a = e;`

The difference is that the filter b=3 must also be applied to every row in
the first query. The coster now takes this into account when determining
the cost of lookup joins and merge joins.

Informs cockroachdb#34810

Release note: None
rytaft added a commit to rytaft/cockroach that referenced this issue Sep 1, 2019
… joins

This commit updates the costing of both lookup joins and merge joins to
take into account the number of rows processed by the operator. This number
may be larger than the number of output rows if an additional filter is
applied as part of the ON condition that is not used to determine equality
columns for the join.

For example, consider the query
  `SELECT * FROM abc JOIN def ON a = e AND b = 3;`

Assuming there is no index on b, if a lookup join is used to execute this
query, the number of rows processed is actually the same as the query
  `SELECT * FROM abc JOIN def ON a = e;`

The difference is that the filter b=3 must also be applied to every row in
the first query. The coster now takes this into account when determining
the cost of lookup joins and merge joins.

Informs cockroachdb#34810

Release note: None
rytaft added a commit to rytaft/cockroach that referenced this issue Sep 3, 2019
… joins

This commit updates the costing of both lookup joins and merge joins to
take into account the number of rows processed by the operator. This number
may be larger than the number of output rows if an additional filter is
applied as part of the ON condition that is not used to determine equality
columns for the join.

For example, consider the query
  `SELECT * FROM abc JOIN def ON a = e AND b = 3;`

Assuming there is no index on b, if a lookup join is used to execute this
query, the number of rows processed is actually the same as the query
  `SELECT * FROM abc JOIN def ON a = e;`

The difference is that the filter b=3 must also be applied to every row in
the first query. The coster now takes this into account when determining
the cost of lookup joins and merge joins.

Informs cockroachdb#34810

Release note: None
@craig craig bot closed this as completed in 48ed5f0 Sep 3, 2019
@RoachietheSupportRoach
Copy link
Collaborator

Zendesk ticket #3600 has been linked to this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-performance Perf of queries or internals. Solution not expected to change functional behavior.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants