Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: explicit zone configuration for the timeseries range on a new cluster #123762

Closed
andrewbaptist opened this issue May 7, 2024 · 1 comment · Fixed by #127034 or #128032
Closed

sql: explicit zone configuration for the timeseries range on a new cluster #123762

andrewbaptist opened this issue May 7, 2024 · 1 comment · Fixed by #127034 or #128032
Assignees
Labels
A-zone-configs C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) O-testcluster Issues found or occurred on a test cluster, i.e. a long-running internal cluster T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@andrewbaptist
Copy link
Collaborator

andrewbaptist commented May 7, 2024

Is your feature request related to a problem? Please describe.
When a cluster is created, we create a zone configuration for the default zone as well as for most of the "special" ranges that are documented here: https://www.cockroachlabs.com/docs/v23.2/configure-replication-zones#create-a-replication-zone-for-a-system-range. The only system range we don't explicitly create a zone config for is the timeseries range.

After a test cluster is created, it has the following configurations defined:

> select target from [show all zone configurations];
                       target
----------------------------------------------------
  RANGE default
  DATABASE system
  TABLE system.public.lease
  RANGE meta
  RANGE system
  RANGE liveness
  TABLE system.public.replication_constraint_stats
  TABLE system.public.replication_stats
  TABLE system.public.statement_statistics
  TABLE system.public.transaction_statistics
  TABLE system.public.tenant_usage
  TABLE system.public.span_stats_tenant_boundaries
  TABLE system.public.statement_activity
  TABLE system.public.transaction_activity
(14 rows)

Note that the 3 ranges for meta, system, and liveness have explicit zone configs, but timeseries doesn't`.

Describe the solution you'd like
On a system creation, we should create an explicit timeseries zone config that exactly matches the RANGE default zone config. This would prevent confusion and allow users to make the right decisions on what RF they want. Otherwise users who don't carefully read the documentation are not likely to consider this range.

Describe alternatives you've considered

  1. We could just leave the system as it is. The default zone config change and the documentation may be sufficient.
  2. We could have the timeseries range use the system configuration if there isn't a more explicit one set.

Additional context
In a test cluster where all the "visible" zone configurations were set with RF=5 and two nodes were brought down, we had a LoQ event for some of the timeseries ranges. This was confusing to debug since it required figuring out that the timeseries range was not included.

Jira issue: CRDB-38531

Epic CRDB-40419

@andrewbaptist andrewbaptist added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) O-testcluster Issues found or occurred on a test cluster, i.e. a long-running internal cluster A-zone-configs T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) labels May 7, 2024
@craig craig bot closed this as completed in 6154ebf Jul 18, 2024
@arulajmani
Copy link
Collaborator

@andrewbaptist @rafiss @nvanbenschoten and I discussed this internally, and came to the conclusion that adding an explicit entry to system.zones for TIMESERIES may break user expectations in certain cases. In particular, users that configure a cluster wide replication factor of 5x by changing RANGE DEFAULT might be surprised that it no longer applies to the TIMESERIES range. At the least, we don't have compelling evidence to make a backwards incompatible change here. As a result, we're reverting the patch that addressed this issue #127853.

@rafiss had a great suggestion on how we might be able to address the intention of this issue without making a backwards incompatible change. Instead of copying over RANGE DEFAULT on RANGE TIMESERIES entirely, we could instead just copy over a single field (such as the GC TTL). That way, it no longer runs into the 5x replication factor change problem described above.

Another alternative would be to always include an entry for all named zones in the output for SHOW ZONE CONFIGURATIONS, even if there isn't an explicit zone configuration set in system.zones. In cases where there is no zone configuration explicitly set, we could just leave the raw_config_sql column empty (or say ALTER RANGE TIMESERIES DISCARD instead).

Extending the above, one could also imagine making a distinction between SHOW ZONE CONFIGURATIONS and SHOW ALL ZONE CONFIGURATIONS. Today, they both behave the same, but maybe we can re-purpose the latter to include zone configurations for all ranges/schema objects in the system.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-zone-configs C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) O-testcluster Issues found or occurred on a test cluster, i.e. a long-running internal cluster T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)
Projects
No open projects
Status: Done
3 participants