
opt, sql: improve automatic statistics for small tables #56615

Open
rytaft opened this issue Nov 12, 2020 · 9 comments
Labels
A-sql-optimizer SQL logical planning and optimizations. A-sql-table-stats Table statistics (and their automatic refresh). C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) C-performance Perf of queries or internals. Solution not expected to change functional behavior. O-support Would prevent or help troubleshoot a customer escalation - bugs, missing observability/tooling, docs P-3 Issues/test failures with no fix SLA T-sql-queries SQL Queries Team

Comments

@rytaft (Collaborator) commented Nov 12, 2020

We have recently had reports of the optimizer choosing a bad plan due to inaccurate or misleading statistics on small tables. This can happen for a couple of reasons:

  • The table is brand new. We always collect statistics when a table is created, so unless it was created with data already in the table using CREATE TABLE AS ..., the first statistics collection will show that there are 0 rows. To avoid constantly refreshing small tables, we have a setting that prevents refreshing until at least 500 rows have changed (see the sketch after this list). So we won't refresh the stats for this table again until at least 500 rows have been inserted. Therefore, the table could have 499 rows, but the optimizer will still think it has 0 rows. (We actually assume that all tables have at least one row to prevent creating really bad plans, but an efficient plan for 1 row could be very different from an efficient plan for 499 rows).
  • The size of the table is reflected correctly by the statistics, but scanning the table is much more expensive than we estimate it to be because there are many tombstones in the storage layer. This can happen when a small table is very frequently updated.
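
For concreteness, here is a simplified sketch of the kind of staleness check described in the first bullet. The function name, the 0.2 fraction, and the exact formula are illustrative assumptions, not the actual implementation:

```go
package main

import "fmt"

// shouldRefreshStats is an illustrative sketch (not the actual CockroachDB
// code) of the staleness check described above: a refresh is considered only
// once the number of changed rows crosses a threshold that combines a fixed
// floor of 500 rows with a fraction of the table's last known size.
func shouldRefreshStats(rowsChanged, statsRowCount int64) bool {
	const minStaleRows = 500  // the 500-row floor discussed above
	const fractionStale = 0.2 // assumed fraction of the current row count
	threshold := int64(float64(statsRowCount)*fractionStale) + minStaleRows
	return rowsChanged >= threshold
}

func main() {
	// A brand new table whose stats say 0 rows: 499 inserts are not enough.
	fmt.Println(shouldRefreshStats(499, 0)) // false
	fmt.Println(shouldRefreshStats(500, 0)) // true
}
```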

To fix the first issue, we have discussed a couple of ideas:

  • Always assume that every table has at least 500 rows. This has the advantage of being simple, and will probably do the right thing most of the time. It would also help ensure plan stability for small tables (regardless of whether they were created recently or not), and might help fix the second issue above related to tombstones. It could have some unintended consequences, though (e.g., not using a lookup/index join when we should). See the sketch after this list.
  • We could disable the 500-row check when triggering automatic statistics refreshes until we have at least 4-5 stats refreshes completed. This would give us more accurate stats for brand new tables, but might not help in the case where a table has been around for a while. It would not help with the tombstone issue above either.
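
A minimal sketch of the first idea; the clamp helper and the constant are hypothetical and only illustrate the intended behavior:

```go
package main

import "fmt"

// minAssumedRows and clampRowCount sketch the "always assume at least 500
// rows" idea: the optimizer would never plan for fewer rows than this,
// regardless of what the collected statistics say. Purely illustrative.
const minAssumedRows = 500

func clampRowCount(statsRowCount float64) float64 {
	if statsRowCount < minAssumedRows {
		return minAssumedRows
	}
	return statsRowCount
}

func main() {
	fmt.Println(clampRowCount(0))   // 500
	fmt.Println(clampRowCount(499)) // 500
	fmt.Println(clampRowCount(1e6)) // 1e+06
}
```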

To fix the second issue, there are some additional options:

  • We could actually include the number of tombstones per table as an additional statistic that we collect, and/or collect the average number of keys/bytes scanned or skipped over per table row
  • We could have some notion of "plan stability", which we could get by modifying the plan cache somehow. For example, instead of just invalidating the plan cache when stats change, we could save the plan for comparison against a new plan, and select the old plan if it's within some small cost factor (see the sketch after this list)
  • We could also just increase the cost of unconstrained, unlimited scans, but that seems like it could cause other problems and lead us to pick worse plans in cases where the table doesn't actually have a lot of tombstones
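
As a rough illustration of the plan-stability idea above, assuming a hypothetical cached-plan comparison with a made-up 10% tolerance:

```go
package main

import "fmt"

// memoizedPlan stands in for a cached plan and its estimated cost; the type
// and the 10% threshold below are illustrative, not an existing API.
type memoizedPlan struct {
	name string
	cost float64
}

// choosePlan sketches the plan-stability idea: when stats change, re-plan,
// but keep the previously cached plan unless the new plan is cheaper by more
// than a small factor.
func choosePlan(cached, fresh memoizedPlan) memoizedPlan {
	const stabilityFactor = 1.10
	if cached.cost <= fresh.cost*stabilityFactor {
		return cached // within tolerance: keep the old plan for stability
	}
	return fresh
}

func main() {
	cached := memoizedPlan{name: "lookup-join", cost: 105}
	fresh := memoizedPlan{name: "hash-join", cost: 100}
	fmt.Println(choosePlan(cached, fresh).name) // lookup-join: not worth switching
}
```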

To deal effectively with this issue, we'll probably want to implement some combination of the above ideas.

cc @RaduBerinde, @nvanbenschoten

Epic: CRDB-16930

Jira issue: CRDB-2921

Jira issue: CRDB-13904

@rytaft rytaft added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) C-performance Perf of queries or internals. Solution not expected to change functional behavior. A-sql-optimizer SQL logical planning and optimizations. labels Nov 12, 2020
@rytaft (Collaborator, Author) commented Nov 12, 2020

Always assuming 500 rows seems like a nice and simple solution, but it does have a few complications/drawbacks in addition to the ones mentioned above. In particular, we need to figure out what to do about the column statistics. Column statistics consist of histograms, distinct counts, and null counts, and we use them in the optimizer to estimate the selectivity of predicates, estimate the cardinality of GROUP BY operations, and more. How will we change the histograms, distinct counts, and null counts to match up with the row count of 500? If the stats show that we have X rows, do we simply multiply everything by 500/X?
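
To make the question concrete, here is a sketch of the proportional scaling being discussed. The struct and field names are illustrative (the real column statistics also include histograms, which raise the same question), and the cap at the target row count shows why distinct counts can't just be multiplied blindly:

```go
package main

import (
	"fmt"
	"math"
)

// columnStats is a simplified stand-in for per-column statistics.
type columnStats struct {
	distinctCount float64
	nullCount     float64
}

// scaleStats sketches the "multiply everything by 500/X" option: null counts
// scale linearly, but distinct counts cannot simply be multiplied, since they
// are bounded by the new row count (and by the column's actual domain).
func scaleStats(rowCount, target float64, cols []columnStats) (float64, []columnStats) {
	if rowCount <= 0 {
		return target, cols // nothing meaningful to scale from
	}
	factor := target / rowCount
	scaled := make([]columnStats, len(cols))
	for i, c := range cols {
		scaled[i] = columnStats{
			distinctCount: math.Min(c.distinctCount*factor, target), // a guess at best
			nullCount:     c.nullCount * factor,
		}
	}
	return target, scaled
}

func main() {
	rows, cols := scaleStats(5, 500, []columnStats{{distinctCount: 3, nullCount: 1}})
	fmt.Println(rows, cols) // 500 [{300 100}]
}
```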

Presumably we would want to ignore any existing column statistics if the row count is very small (e.g., 0 or 1 rows), but at what point should we actually use those stats? When there are at least 5 rows? 10 rows?

This complication could be one reason to prefer the alternative idea listed above of allowing more frequent refreshes of small tables if there are fewer than 4-5 recent refreshes.

@nvanbenschoten (Member) commented

It could have some unintended consequences, though (e.g., not using a lookup/index join when we should)

Could you explain why always assuming a table has at least 500 rows would result in not using a lookup join when we should? If anything, I would expect it to have the opposite effect – using a lookup join when we shouldn't.

We could also just increase the cost of unconstrained, unlimited scans, but that seems like it could cause other problems and lead us to pick worse plans in cases where the table doesn't actually have a lot of tombstones

Are there cases that we know about where this would result in worse plans?

@rytaft (Collaborator, Author) commented Nov 12, 2020

Could you explain why always assuming a table has at least 500 rows would result in not using a lookup join when we should? If anything, I would expect it to have the opposite effect – using a lookup join when we shouldn't.

That could happen if the small table is the input to the lookup join. We might choose a hash or merge join because we think the lookup join will be too expensive, when in fact that would be the better plan.

Are there cases that we know about where this would result in worse plans?

I don't know of a particular issue/example, but I would think that if we had to choose between doing a full table scan of a small table v. doing a bunch of index/lookup joins so that we could do a constrained scan of a larger table, that would not be worth it.

There is definitely some tuning of the overall cost model that we can and should do, but it's dangerous to change the costs of different operators in isolation since it changes the relative costs of everything else. I'm not saying that increasing the cost of unconstrained scans is a bad idea, I just don't think we should do it without considering all the plan changes that will inevitably result.

@RaduBerinde (Member) commented

Could you explain why always assuming a table has at least 500 rows would result in not using a lookup join when we should? If anything, I would expect it to have the opposite effect – using a lookup join when we shouldn't.

That could happen if the small table is the input to the lookup join. We might choose a hash or merge join because we think the lookup join will be too expensive, when in fact that would be the better plan.

I see, like joining a 10 row table with a 10000 row table. 10 lookup joins is clearly better than a full scan of the big table, but 500 lookup joins could be too expensive. Maybe 500 is too much, maybe it should be around the smallest number where we prefer index joins for single-key accesses (over full table scans).
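
A back-of-the-envelope version of that tradeoff, with purely hypothetical per-row and per-lookup costs:

```go
package main

import "fmt"

func main() {
	// Hypothetical unit costs: sequential scan cost per row, and the larger
	// per-row cost of a lookup join (a random seek into the other table).
	const scanCostPerRow = 1.0
	const lookupCostPerRow = 25.0

	const bigTableRows = 10000.0
	fullScanCost := bigTableRows * scanCostPerRow // 10000

	for _, smallRows := range []float64{10, 500} {
		lookupCost := smallRows * lookupCostPerRow
		fmt.Printf("%4.0f input rows: lookup join ~%6.0f vs full scan ~%6.0f\n",
			smallRows, lookupCost, fullScanCost)
	}
	// With 10 input rows the lookup join (~250) easily beats the scan (~10000),
	// but at an assumed 500 rows (~12500) the scan looks cheaper, which is how
	// a blanket 500-row assumption could push the optimizer away from lookup joins.
}
```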

We could actually include the number of tombstones per table as an additional statistic that we collect, and/or collect the average number of keys/bytes scanned or skipped over per table row

One problem with this is that the tombstones can expire without any mutations to the table, and we won't know to refresh the stats. Maybe it should be a historical average of what we've seen rather than what was there when the stats last ran.

Increasing the cost of unconstrained scans seems too arbitrary to me. A constrained scan can have the same problem (especially something like (/NULL - ])

@rytaft (Collaborator, Author) commented Nov 12, 2020

Maybe 500 is too much, maybe it should be around the smallest number where we prefer index joins for single-key accesses (over full table scans).

That could work -- we could do a hybrid solution for the first issue: pick a minimum value that's larger than 1 but smaller than 500, and also trigger stats refreshes more frequently for small tables

Maybe it should be a historical average of what we've seen rather than what was there when the stats last ran.

That makes sense. We're already keeping the last 4-5 stats refreshes, so we could calculate the average over all of those refreshes.
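
A sketch of that kind of historical averaging, using a hypothetical per-row keys-scanned metric and made-up field names:

```go
package main

import "fmt"

// tableStatistic is a stand-in for one retained stats collection; the
// avgKeysPerRow field is the hypothetical "keys scanned or skipped per row"
// metric discussed above.
type tableStatistic struct {
	avgKeysPerRow float64
}

// historicalAvgKeysPerRow averages the metric over the last few collections
// we keep (roughly 4-5), so that a single collection taken just after the
// tombstones expired does not dominate the estimate.
func historicalAvgKeysPerRow(history []tableStatistic) float64 {
	if len(history) == 0 {
		return 1 // no information: assume one key per row
	}
	var sum float64
	for _, s := range history {
		sum += s.avgKeysPerRow
	}
	return sum / float64(len(history))
}

func main() {
	history := []tableStatistic{{8}, {6}, {1}, {1}}
	fmt.Println(historicalAvgKeysPerRow(history)) // 4
}
```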

To add another data point about how it's difficult to find the correct relative cost between operators: Here's an example where a full table scan is a better plan than doing a constrained scan + index join: #46677. This PR was an attempt to fix similar issues by making index joins more expensive: #54768.

@rytaft (Collaborator, Author) commented Jan 25, 2021

Another idea for the problem of no stats right when a new cluster starts up: determine some crude stats after data is first loaded based on the sizes of the tables on disk. The import process should also know how many rows were created.
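
A crude version of that estimate, assuming we know the table's approximate on-disk size and an average row width (both hypothetical inputs here):

```go
package main

import "fmt"

// estimateRowCount derives a rough row count from on-disk size, as a
// placeholder until the first real stats collection runs. Compression and
// MVCC overhead make this very approximate, which is fine for this purpose.
func estimateRowCount(tableBytes, avgRowBytes int64) int64 {
	if avgRowBytes <= 0 {
		return 0
	}
	return tableBytes / avgRowBytes
}

func main() {
	fmt.Println(estimateRowCount(64<<20, 128)) // 524288 rows for a 64 MiB table at 128 B/row
}
```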

@RaduBerinde (Member) commented

I have been playing around with various changes. Increasing the row count artificially is problematic - you have to figure out a way to do that to the distinct counts as well somehow, otherwise it doesn't work in many cases (e.g. it won't help with choosing lookup joins). A change like this causes many plan changes and seems pretty risky. I am also worried that it would lead to bad plans when tables are legitimately small.

A more promising change seems to be adding a small penalty for full table scans. It is enough to dissuade unnecessary full scans on small tables (e.g. for FKs). It still changes a few plans that will need to be checked.

@RaduBerinde RaduBerinde self-assigned this Feb 25, 2021
@rytaft (Collaborator, Author) commented Feb 26, 2021

A change like this causes many plan changes and seems pretty risky.

This is what I was finding as well when I was working on the fix to #60493 (I initially tried to fix it by artificially inflating the row count).

I think increasing the cost of full scans could help.

I also want to explore the option of triggering more frequent stats refreshes for small tables, perhaps until there are at least 4 stats refreshes (which is how much history we keep around). I might have time to try that out next week.

RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Mar 9, 2021
The main motivation for this change is to avoid seemingly bad plans
when the tables are empty or very small, in particular FK checks doing
full scans instead of using an index. This has caused confusion with
customers and users on multiple occasions.

In addition, even if the plans are good for small tables, they are
risky as the tables could get bigger than our stats show. Also, there
are other hidden costs to scanning more than we need (like contention).

This change makes a small cost adjustment - it adds the cost of 10
rows to all full scans. This is enough to discourage full scans on
small tables, and to prefer constrained scans even when the row count
is the same.

Informs cockroachdb#56615.

Release justification: low risk, high benefit changes to existing
functionality.

Release note (performance improvement): fixed cases where the
optimizer was doing unnecessary full table scans when the table was
very small (according to the last collected statistics).
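
The adjustment described in this commit message amounts to something like the following sketch; the names and unit costs are illustrative, and the real cost model has many more terms:

```go
package main

import "fmt"

const (
	rowCost            = 1.0 // hypothetical per-row scan cost unit
	fullScanRowPenalty = 10  // the flat "cost of 10 rows" described above
)

// scanCost adds a flat 10-row penalty to full scans, so a constrained scan
// is preferred even when both scans would return the same number of rows.
func scanCost(rows float64, isFullScan bool) float64 {
	cost := rows * rowCost
	if isFullScan {
		cost += fullScanRowPenalty * rowCost
	}
	return cost
}

func main() {
	fmt.Println(scanCost(3, true))  // 13: full scan of a tiny 3-row table
	fmt.Println(scanCost(3, false)) // 3: constrained scan of the same 3 rows
}
```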
craig bot pushed a commit that referenced this issue Mar 10, 2021
61680: opt: increase cost of full scans r=RaduBerinde a=RaduBerinde

The main motivation for this change is to avoid seemingly bad plans
when the tables are empty or very small, in particular FK checks doing
full scans instead of using an index. This has caused confusion with
customers and users on multiple occasions.

In addition, even if the plans are good for small tables, they are
risky as the tables could get bigger than our stats show. Also, there
are other hidden costs to scanning more than we need (like contention).

This change makes a relatively small cost adjustment - it adds the
cost of 10 rows to full scans. This is enough to discourage full scans
on small tables, and to prefer constrained scans even when the row
count is the same.

Informs #56615.
Fixes #56661.

Release justification: low risk, high benefit changes to existing
functionality.

Release note (performance improvement): fixed cases where the
optimizer was doing unnecessary full table scans when the table was
very small (according to the last collected statistics).

Co-authored-by: Radu Berinde <[email protected]>
@RaduBerinde (Member) commented

#61680 is the change that we wanted in 21.1 for this issue. Moving out of the 21.1 bucket.

@michae2 michae2 added the A-sql-table-stats Table statistics (and their automatic refresh). label Mar 27, 2023
@mgartner mgartner moved this to Backlog (DO NOT ADD NEW ISSUES) in SQL Queries Jul 24, 2023
@rytaft rytaft added O-support Would prevent or help troubleshoot a customer escalation - bugs, missing observability/tooling, docs P-3 Issues/test failures with no fix SLA labels Feb 5, 2024
@exalate-issue-sync exalate-issue-sync bot removed the P-3 Issues/test failures with no fix SLA label Feb 5, 2024
@rytaft rytaft added the P-3 Issues/test failures with no fix SLA label Feb 5, 2024
@mgartner mgartner moved this from Backlog (DO NOT ADD NEW ISSUES) to New Backlog in SQL Queries Apr 23, 2024