-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
release-22.1: stats: table setting to turn auto stats collection on/off #81019
Conversation
Fixes cockroachdb#40989 Previously, there was no way to enable or disable automatic statistics collection at the table level. It could only be turned on or off via the `sql.stats.automatic_collection.enabled` cluster setting. This was inadequate because statistics collection can be expensive for large tables, and it would be desirable to defer collection until after data is finished loading, or in off hours. Also, small tables which are frequently updated may trigger statistics collection leading to unnecessary overhead and/or unpredictable query plan changes. To address this, this patch adds support for setting of the following cluster settings at the table level: ``` sql_stats_automatic_collection_enabled sql_stats_automatic_collection_min_stale_rows sql_stats_automatic_collection_fraction_stale_rows ``` which correspond to the similarly-named cluster settings: ``` sql.stats.automatic_collection.enabled sql.stats.automatic_collection.min_stale_rows sql.stats.automatic_collection.fraction_stale_rows ``` for example: ``` ALTER TABLE t1 SET (sql_stats_automatic_collection_enabled = true); ALTER TABLE t1 SET (sql_stats_automatic_collection_fraction_stale_rows = 0.1, sql_stats_automatic_collection_min_stale_rows = 2000); ``` The table-level setting takes precedence over the cluster setting. Release justification: Low risk fix for missing fine-grained control over automatic statistics collection. Release note (sql change): Automatic statistics collection can now be enabled or disabled for individual tables, taking precedence over the cluster setting, for example: ``` ALTER TABLE t1 SET (sql_stats_automatic_collection_enabled = true); ALTER TABLE t1 SET (sql_stats_automatic_collection_enabled = false); ALTER TABLE t1 RESET (sql_stats_automatic_collection_enabled); ``` RESET removes the setting value entirely, in which case the similarly-name cluster setting, `sql.stats.automatic_collection.enabled`, is in effect for the table. Cluster settings `sql.stats.automatic_collection.fraction_stale_rows` and `sql.stats.automatic_collection.min_stale_rows` now also have table setting counterparts: ``` sql_stats_automatic_collection_fraction_stale_rows sql_stats_automatic_collection_min_stale_rows ``` The table settings may be set at table creation time, or later via ALTER TABLE ... SET, independent of whether auto stats is enabled: ``` ALTER TABLE t1 SET (sql_stats_automatic_collection_fraction_stale_rows = 0.1, sql_stats_automatic_collection_min_stale_rows = 2000); CREATE TABLE t1 (a INT, b INT) WITH (sql_stats_automatic_collection_enabled = true, sql_stats_automatic_collection_min_stale_rows = 1000000, sql_stats_automatic_collection_fraction_stale_rows= 0.05 ); ``` The current table settings (storage parameters) are shown in the `WITH` clause output of `SHOW CREATE TABLE`. Note, any row mutations which have occurred a minute or two before disabling auto stats collection via `ALTER TABLE ... SET` may trigger stats collection, though DML submitted after the setting change will not.
Thanks for opening a backport. Please check the backport criteria before merging:
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
Add a brief release justification to the body of your PR to justify this backport. Some other things to consider:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 19 of 19 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner)
Backport 1/1 commits from #78110.
/cc @cockroachdb/release
Fixes #40989
Previously, there was no way to enable or disable automatic statistics
collection at the table level. It could only be turned on or off via the
sql.stats.automatic_collection.enabled
cluster setting.This was inadequate because statistics collection can be expensive for
large tables, and it would be desirable to defer collection until after
data is finished loading, or in off hours. Also, small tables which are
frequently updated may trigger statistics collection leading to
unnecessary overhead and/or unpredictable query plan changes.
To address this, this patch adds support for setting of the following
cluster settings at the table level:
which correspond to the similarly-named cluster settings:
for example:
The table-level setting takes precedence over the cluster setting.
Release justification: Low risk fix for missing fine-grained control
over automatic statistics collection.
Release note (sql change): Automatic statistics collection can now be
enabled or disabled for individual tables, taking precedence over the
cluster setting, for example:
RESET removes the setting value entirely, in which case the
similarly-name cluster setting,
sql.stats.automatic_collection.enabled
, is in effect for the table.Cluster settings
sql.stats.automatic_collection.fraction_stale_rows
and
sql.stats.automatic_collection.min_stale_rows
now also have tablesetting counterparts:
The table settings may be set at table creation time, or later via ALTER
TABLE ... SET, independent of whether auto stats is enabled:
The current table settings (storage parameters) are shown in the
WITH
clause output of
SHOW CREATE TABLE
.Note, any row mutations which have occurred a minute or two before
disabling auto stats collection via
ALTER TABLE ... SET
may triggerstats collection, though DML submitted after the setting change will
not.