Update docs on automatic stats refresh rate
... and node restarts after stats deletion to clear caches.

Summary of changes:

- Add a new subsection to CBO page, 'Controlling statistics refresh
  rate', where we describe the cases when stats are refreshed in more
  detail.

- To match the structure of the above, we break the instructions for
  deleting stats into a new section, 'Turning off statistics'.

- Finally, tweak stats deletion instructions on CBO page and CREATE
  STATS page so both say that nodes must be restarted
  post-stats-deletion to clear caches.

Fixes #4809, #4872.
rmloveland committed Jun 17, 2019
1 parent 514a08f commit ee681b7
Showing 4 changed files with 42 additions and 8 deletions.
2 changes: 2 additions & 0 deletions _includes/v19.1/misc/delete-statistics.md
@@ -12,4 +12,6 @@ To delete a named set of statistics (e.g, one named "my_stats"), run a query lik
 > DELETE FROM system.table_statistics WHERE name = 'my_stats';
 ~~~

+After deleting statistics, restart the nodes in your cluster to clear the statistics caches.
+
 For more information about the `DELETE` statement, see [`DELETE`](delete.html).
2 changes: 2 additions & 0 deletions _includes/v19.2/misc/delete-statistics.md
@@ -12,4 +12,6 @@ To delete a named set of statistics (e.g, one named "my_stats"), run a query lik
 > DELETE FROM system.table_statistics WHERE name = 'my_stats';
 ~~~

+After deleting statistics, restart the nodes in your cluster to clear the statistics caches.
+
 For more information about the `DELETE` statement, see [`DELETE`](delete.html).
23 changes: 19 additions & 4 deletions v19.1/cost-based-optimizer.md
@@ -67,10 +67,20 @@ The cost-based optimizer can often find more performant query plans if it has ac

 For best query performance, most users should leave automatic statistics enabled with the default settings. The information provided in this section is useful for troubleshooting or performance tuning by advanced users.

-To control how often the automatic statistics jobs run on your cluster, adjust the following [cluster settings](cluster-settings.html). They define the target number of rows in a table that should be stale before statistics on that table are refreshed.
+#### Controlling statistics refresh rate

-- `sql.stats.automatic_collection.fraction_stale_rows`
-- `sql.stats.automatic_collection.min_stale_rows`
+Statistics are refreshed in the following cases:
+
+1. When there are no statistics.
+2. When it's been a long time since the last refresh, where "long time" is defined according to a moving average of the time across the last several refreshes.
+3. After each mutation operation ([`INSERT`](insert.html), [`UPDATE`](update.html), or [`DELETE`](delete.html)), the probability of a refresh is calculated using a formula that takes the [cluster settings](cluster-settings.html) shown below as inputs. These settings define the target number of rows in a table that should be stale before statistics on that table are refreshed.
+
+| Setting | Details |
+|------------------------------------------------------+--------------------------------------------------------------------------------------|
+| `sql.stats.automatic_collection.fraction_stale_rows` | Target fraction of stale rows per table that will trigger a statistics refresh |
+| `sql.stats.automatic_collection.min_stale_rows` | Target minimum number of stale rows per table that will trigger a statistics refresh |
+
+#### Turning off statistics

 If you need to turn off automatic statistics collection, follow the steps below:

@@ -83,7 +93,12 @@ If you need to turn off automatic statistics collection, follow the steps below:

 2. Use the [`SHOW STATISTICS`](show-statistics.html) statement to view automatically generated statistics.

-3. Delete the automatically generated statistics using the instructions in [Delete statistics](create-statistics.html#delete-statistics).
+3. Delete the automatically generated statistics using the following statement:
+
+{% include copy-clipboard.html %}
+~~~ sql
+> DELETE FROM system.table_statistics WHERE true;
+~~~

 4. Restart the nodes in your cluster to clear the statistics caches.
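
Reading aid (not part of the commit): the "Controlling statistics refresh rate" text names two cluster settings but does not show how to inspect or change them. The following is a minimal sketch, assuming a SQL shell connected as a user allowed to change cluster settings. The values `0.1` and `1000` are illustrative placeholders chosen for this example, not recommendations from the docs.

~~~ sql
-- Inspect the current values before changing anything.
> SHOW CLUSTER SETTING sql.stats.automatic_collection.fraction_stale_rows;
> SHOW CLUSTER SETTING sql.stats.automatic_collection.min_stale_rows;

-- Illustrative values only (assumptions for this sketch).
> SET CLUSTER SETTING sql.stats.automatic_collection.fraction_stale_rows = 0.1;
> SET CLUSTER SETTING sql.stats.automatic_collection.min_stale_rows = 1000;
~~~

Because both settings describe a target number of stale rows, lowering them should make refreshes more likely after a given number of mutations, and raising them less likely. The exact formula is not given in this diff, so treat that as a qualitative expectation only.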

23 changes: 19 additions & 4 deletions v19.2/cost-based-optimizer.md
@@ -67,10 +67,20 @@ By default, CockroachDB generates table statistics automatically as tables are u

 For best query performance, most users should leave automatic statistics enabled with the default settings. The information provided in this section is useful for troubleshooting or performance tuning by advanced users.

-To control how often the automatic statistics jobs run on your cluster, adjust the following [cluster settings](cluster-settings.html). They define the target number of rows in a table that should be stale before statistics on that table are refreshed.
+#### Controlling statistics refresh rate

-- `sql.stats.automatic_collection.fraction_stale_rows`
-- `sql.stats.automatic_collection.min_stale_rows`
+Statistics are refreshed in the following cases:
+
+1. When there are no statistics.
+2. When it's been a long time since the last refresh, where "long time" is defined according to a moving average of the time across the last several refreshes.
+3. After each mutation operation ([`INSERT`](insert.html), [`UPDATE`](update.html), or [`DELETE`](delete.html)), the probability of a refresh is calculated using a formula that takes the [cluster settings](cluster-settings.html) shown below as inputs. These settings define the target number of rows in a table that should be stale before statistics on that table are refreshed.
+
+| Setting | Details |
+|------------------------------------------------------+--------------------------------------------------------------------------------------|
+| `sql.stats.automatic_collection.fraction_stale_rows` | Target fraction of stale rows per table that will trigger a statistics refresh |
+| `sql.stats.automatic_collection.min_stale_rows` | Target minimum number of stale rows per table that will trigger a statistics refresh |
+
+#### Turning off statistics

 If you need to turn off automatic statistics collection, follow the steps below:

@@ -83,7 +93,12 @@ If you need to turn off automatic statistics collection, follow the steps below:

 2. Use the [`SHOW STATISTICS`](show-statistics.html) statement to view automatically generated statistics.

-3. Delete the automatically generated statistics using the instructions in [Delete statistics](create-statistics.html#delete-statistics).
+3. Delete the automatically generated statistics using the following statement:
+
+{% include copy-clipboard.html %}
+~~~ sql
+> DELETE FROM system.table_statistics WHERE true;
+~~~

 4. Restart the nodes in your cluster to clear the statistics caches.
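
Reading aid (not part of the commit): steps 2 through 4 of "Turning off statistics" can be followed by a quick check that the deletion took effect. The sketch below is one possible way to do that. The table name `my_table` is hypothetical, and the check is an assumption of this note, not a step from the docs.

~~~ sql
-- Count the statistics rows that remain after the DELETE in step 3.
> SELECT count(*) FROM system.table_statistics;

-- View the statistics reported for one table (my_table is a hypothetical name).
> SHOW STATISTICS FOR TABLE my_table;
~~~

The diff states that nodes must be restarted to clear the statistics caches, so a check like this is most meaningful after step 4.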
