server: sync on SQL setting overrides in testserver tenant init #110789

knz · 2023-09-17T16:44:52Z

Prerequisite for tests in #110758.
Fixes #110560.
Epic: CRDB-6671

Prior to this patch, if a test was running SET CLUSTER SETTING or ALTER VIRTUAL CLUSTER SET CLUSTER SETTING prior to starting the service for a virtual cluster, it wasn't guaranteed that the setting update had propagated (because it propagates via a rangefeed).

This commit fixes that.

Release note: None


yuzefovich

Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DarrylWong, @rachitgsrivastava, and @stevendanna)

knz · 2023-09-19T03:20:53Z

TFYR

bors r=yuzefovich

craig · 2023-09-19T03:56:50Z

Build succeeded:

Bazel Essential CI (Cockroach)

110805: sql: limit statistics discard log message r=j82w a=j82w

Problem: The discard log message is emitted at the end of every transaction once the limit is hit. This floods the log with discard messages, which is not useful for users and can cause issues with telemetry pipelines.

Solution: The discard message is now logged only once per minute. The log rate is controlled by a cluster setting, so the interval can be set to a very large value if this is expected behavior for a cluster.

Refactored: SQLStats creates and holds the reference to the counts, and each container (one per app name) is passed the counts by reference. It was not obvious that the counts are shared between the containers. The code was refactored to use a single object that holds the counts and passes all the related content together. This makes the code easier to read and to extend if other values need to be added.

Fixes: #110454

Release note (sql change): The discard log message is now limited to once per minute by default. The message was also changed to include both the number of transactions and the number of statements that were discarded.

110947: server: fix sync on setting overrides for secondary tenants r=yuzefovich a=knz

Improves/completes #110789.

Prerequisite for tests in #110758.
Fixes #110560.
Epic: CRDB-6671

The previous patch in this area merely restarted the rangefeed but did not actually wait for the initial update event to be received. This patch fixes it.

Release note: None

Co-authored-by: j82w <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>

111150: settings: make `.Override` set the value origin r=yuzefovich,stevendanna a=knz

Previous in sequence:

- [x] #110789 and #110947
- [x] #110676
- [x] #111008
- [x] #111145
- [x] #111147
- [x] #111149

Needed for #110758.
Epic: CRDB-6671

Prior to this patch, setting overrides in tests did not have their value origin set properly. This patch fixes it.

Co-authored-by: Raphael 'kena' Poss <[email protected]>

111153: settings: simple refactors r=yuzefovich,stevendanna a=knz

See the last 7 commits for details.

Previous in sequence:

- [x] #110789 and #110947
- [x] #110676
- [x] #111008
- [x] #111145
- [x] #111147
- [x] #111149
- [x] #111150

Needed for #110758.
Epic: CRDB-6671

Co-authored-by: Raphael 'kena' Poss <[email protected]>

110758: server,settings: properly cascade defaults for TenantReadOnly r=stevendanna,yuzefovich a=knz

Previous in sequence:

- [x] #110789 and #110947
- [x] #110676
- [x] #111008
- [x] #111145
- [x] #111147
- [x] #111149
- [x] #111150
- [x] #111153
- [x] #111475
- [x] #111210
- [x] #111212

Fixes #108677.
Fixes #85729.
Fixes #91825.
Completes the work described in the settings RFC.
Epic: CRDB-6671

TLDR: this patch ensures that virtual cluster servers observe changes made to TenantReadOnly settings via SET CLUSTER SETTING in the system interface, even when there is no override set via ALTER VIRTUAL CLUSTER SET CLUSTER SETTING.

For example, after `SET CLUSTER SETTING kv.closed_timestamp.target_duration = '10s'` in the system interface, this value will show up via `SHOW CLUSTER SETTING` in a virtual cluster SQL session.

This changes the way that settings are picked up in a virtual cluster, as follows:

1. if there is an override specifically for this tenant's ID (in `tenant_settings`), use that.
2. otherwise, if there is an override for the pseudo-ID 0 (in `tenant_settings` still, set via `ALTER TENANT ALL SET CLUSTER SETTING`), then use that.
3. **NEW** otherwise, if the class is TenantReadOnly and there is a custom value in `system.settings`, set via a regular `SET CLUSTER SETTING` in the system tenant, then use that.
4. otherwise, use the global default set via the setting's `Register()` call.

----

Prior to this patch, TenantReadOnly settings as observed from virtual clusters were defined by the following priority order:

1. if there is an override specifically for this tenant's ID (in `tenant_settings`), use that.
2. otherwise, if there is an override for the pseudo-ID 0 (in `tenant_settings` still, set via `ALTER TENANT ALL SET CLUSTER SETTING`), then use that.
3. otherwise, use the global default set via the setting's `Register()` call.
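
The four-step lookup order introduced by #110758 can be sketched in Go as below. This is an illustrative model only: `settingSources`, `resolve`, and the `settingClass` constants are hypothetical names, and plain maps stand in for `tenant_settings`, `system.settings`, and the registered defaults. The PR itself describes achieving this through synthetic overrides in a watcher, not through a lookup function like this one.

```go
package main

// settingClass is a hypothetical, simplified model of a setting's class.
type settingClass int

const (
	otherClass settingClass = iota
	tenantReadOnly
)

// settingSources models the three places a value can come from.
type settingSources struct {
	// tenantOverrides models tenant_settings, keyed by tenant ID.
	// The pseudo-ID 0 holds overrides that apply to all tenants.
	tenantOverrides map[uint64]map[string]string
	// systemSettings models custom values stored in system.settings.
	systemSettings map[string]string
	// registeredDefaults models the defaults given at registration time.
	registeredDefaults map[string]string
}

// resolve returns the value a virtual cluster would observe for a setting.
func (s settingSources) resolve(tenantID uint64, name string, class settingClass) string {
	// 1. Override specific to this tenant.
	if v, ok := s.tenantOverrides[tenantID][name]; ok {
		return v
	}
	// 2. Override for all tenants (pseudo-ID 0).
	if v, ok := s.tenantOverrides[0][name]; ok {
		return v
	}
	// 3. New step: custom value in system.settings, TenantReadOnly only.
	if class == tenantReadOnly {
		if v, ok := s.systemSettings[name]; ok {
			return v
		}
	}
	// 4. Registered global default.
	return s.registeredDefaults[name]
}
```

Removing step 3 from `resolve` gives the three-step order that applied before the patch.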
Remarkably, the previous priority order did not pick up any changes made via a plain `SET CLUSTER SETTING` statement in the system interface, which only modifies the setting's value in `system.settings` (and thus not `tenant_settings`).

This situation was problematic in two ways.

To start, settings like `kv.closed_timestamp.target_duration` cannot be set solely in `system.tenant_settings`; they are also used in the storage layer and so must also be picked up from changes in `system.settings`. For these settings, it is common for operators to just issue the plain `SET CLUSTER SETTING` statement (to update `system.settings`) and simply forget to _also_ run `ALTER TENANT ALL SET CLUSTER SETTING`. This mistake is nearly unavoidable and would result in incoherent behavior, where the storage layer would use the customized value and virtual clusters would use the registered global default.

The second problem is in mixed-version configurations, where the storage layer runs version N+1 and the SQL service runs version N of the executable. If the registered global default changes from version N to N+1, the SQL service would not properly pick up the new default defined in version N+1 of the storage layer.

This patch fixes both problems as follows:

- it integrates changes to TenantReadOnly settings observed in `system.settings` into the watcher that tracks changes to `system.tenant_settings`. When a TenantReadOnly setting is present in the former but not the latter, a synthetic override is added.
- it also initializes synthetic overrides for all the TenantReadOnly settings upon server initialization, from the registered global default, so that virtual cluster servers always pick up the storage layer's default as override.

111383: *: simplify tests r=yuzefovich a=knz

All commits except the last are from #110758.

Epic: CRDB-6671

Now that "tenant-ro" settings take their default from the system tenant's value, we do not need `ALTER TENANT ALL` for them any more. This patch simplifies test code accordingly.
Recommended by `@yuzefovich` in the review for #110758.

111440: cluster-ui: fix db page stories r=THardy98 a=THardy98

Epic: none

This change fixes the stories for the database pages.

Release note: None

111512: kv: correctly handle shared lock replays in KVNemesis r=nvanbenschoten a=arulajmani

Previously, we'd return an AssertionFailed error if a SKIP LOCKED request discovered another request from its own transaction waiting in a lock's wait queue. In SQL's use of KV, this can only happen if the SKIP LOCKED request is being replayed -- so returning an error here is fine. However, this tripped KVNemesis up.

This patch marks such errors, for the benefit of KVNemesis, and doesn't call them assertion failed errors.

Fixes #111426
Fixes #111506
Fixes #111513

Release note: None

Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Thomas Hardy <[email protected]>
Co-authored-by: Arul Ajmani <[email protected]>

knz requested review from stevendanna and yuzefovich September 17, 2023 16:44

knz requested review from a team as code owners September 17, 2023 16:44

knz requested review from rachitgsrivastava and DarrylWong and removed request for a team September 17, 2023 16:44

knz mentioned this pull request Sep 17, 2023

settings: assert that SystemOnly settings are not accessed in virtual clusters #110676

Merged

yuzefovich approved these changes Sep 18, 2023

craig bot merged commit 19d0030 into cockroachdb:master Sep 19, 2023

knz deleted the 20230917-setting-restart branch September 19, 2023 14:32

knz mentioned this pull request Sep 20, 2023

server: fix sync on setting overrides for secondary tenants #110947

Merged
