You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The automatic stats docs (https://www.cockroachlabs.com/docs/v19.1/cost-based-optimizer.html#controlling-automatic-statistics) mention two of the settings used to control automatic stats: sql.stats.automatic_collection.fraction_stale_rows and sql.stats.automatic_collection.min_stale_rows, but they don't give enough info to really understand how they work. One thing that would be helpful would be a direct link to those settings on the settings page, but it would also help to add a bit more info in the "Controlling automatic statistics" section. Some points we should emphasize:
The target number of stale rows in a particular table before a refresh is actually equal to target_stale_rows = fraction_stale_rows * <size of table> + min_stale_rows.
We don't necessarily perform a refresh when exactly that target number of rows are stale, because the decision to start a refresh is probabilistic. After each mutation operation (INSERT, UPDATE, or DELETE), the probability of a refresh is P(refresh) = (# rows updated, inserted, deleted)/target_stale_rows. The result is that on average, we will perform a refresh after target_stale_rows are stale, but it may be higher or lower.
To ensure that we always have stats and they aren't too stale, we also always perform a refresh if there are no stats or if it's been a long time since the last refresh (a "long time" is defined as 2x the average time between refreshes over the last 4 refreshes).
I don't think we necessarily need all this detail, but I've put it here so you can judge what to include or not. Please let me know if you have any questions -- thanks!
cc @RaduBerinde
The text was updated successfully, but these errors were encountered:
... and node restarts after stats deletion to clear caches.
Summary of changes:
- Add a new subsection to CBO page, 'Controlling statistics refresh
rate', where we describe the cases when stats are refreshed in more
detail.
- To match the structure of the above, we break the instructions for
deleting stats into a new section 'Turning off statistics'
- Finally, tweak stats deletion instructions on CBO page and CREATE
STATS page so both say that nodes must be restarted
post-stats-deletion to clear caches.
Fixes#4809, #4872.
... and node restarts after stats deletion to clear caches.
Summary of changes:
- Add a new subsection to CBO page, 'Controlling statistics refresh
rate', where we describe the cases when stats are refreshed in more
detail.
- To match the structure of the above, we break the instructions for
deleting stats into a new section 'Turning off statistics'
- Finally, tweak stats deletion instructions on CBO page and CREATE
STATS page so both say that nodes must be restarted
post-stats-deletion to clear caches.
Fixes#4809, #4872.
... and node restarts after stats deletion to clear caches.
Summary of changes:
- Add a new subsection to CBO page, 'Controlling statistics refresh
rate', where we describe the cases when stats are refreshed in more
detail.
- To match the structure of the above, we break the instructions for
deleting stats into a new section 'Turning off statistics'
- Finally, tweak stats deletion instructions on CBO page and CREATE
STATS page so both say that nodes must be restarted
post-stats-deletion to clear caches.
Fixes#4809, #4872.
The automatic stats docs (https://www.cockroachlabs.com/docs/v19.1/cost-based-optimizer.html#controlling-automatic-statistics) mention two of the settings used to control automatic stats:
sql.stats.automatic_collection.fraction_stale_rows
andsql.stats.automatic_collection.min_stale_rows
, but they don't give enough info to really understand how they work. One thing that would be helpful would be a direct link to those settings on the settings page, but it would also help to add a bit more info in the "Controlling automatic statistics" section. Some points we should emphasize:target_stale_rows = fraction_stale_rows * <size of table> + min_stale_rows
.P(refresh) = (# rows updated, inserted, deleted)/target_stale_rows
. The result is that on average, we will perform a refresh aftertarget_stale_rows
are stale, but it may be higher or lower.I don't think we necessarily need all this detail, but I've put it here so you can judge what to include or not. Please let me know if you have any questions -- thanks!
cc @RaduBerinde
The text was updated successfully, but these errors were encountered: