Skip to content

Commit

Permalink
server,settings: properly cascade defaults for TenantReadOnly
Browse files Browse the repository at this point in the history
TLDR: this patch ensures that virtual cluster servers observe changes
made to TenantReadOnly settings via SET CLUSTER SETTING in the system
interface, even when there is no override set via ALTER VIRTUAL
CLUSTER SET CLUSTER SETTING.

For example, after `SET CLUSTER SETTING
kv.closed_timestamp.target_duration = '10s'` in the system interface,
this value will show up via `SHOW CLUSTER SETTING` in a virtual
cluster SQL session.

This changes the way that settings are picked up in virtual cluster,
as follows:

1. if there is an override specifically for this tenant's ID
   (in `tenant_settings`), use that.
2. otherwise, if there is an override for the pseudo-ID 0
   (in `tenant_settings` still, set via `ALTER TENANT ALL SET CLUSTER
   SETTING`), then use that.
3. **NEW**  otherwise, if the class is TenantReadOnly and there
   is a custom value in `system.settings`, set via a regular
   `SET CLUSTER SETTING` in the system tenant, then use that.
4. otherwise, use the global default set via the setting's
   `Register()` call.

----

Prior to this patch, TenantReadOnly settings as observed from virtual
clusters were defined as the following priority order:

1. if there is an override specifically for this tenant's ID
   (in `tenant_settings`), use that.
2. otherwise, if there is an override for the pseudo-ID 0
   (in `tenant_settings` still, set via `ALTER TENANT ALL SET CLUSTER
   SETTING`), then use that.
3. otherwise, use the global default set via the setting's
   `Register()` call.

Remarkably, this did not pick up any changes made via a plain `SET
CLUSTER SETTING` statement via the system interface, which only
modifies this setting's value in `system.settings` (thus not
`tenant_settings`).

This situation was problematic in two ways.

To start, settings like `kv.closed_timestamp.target_duration` cannot
be set solely in `system.tenant_settings`; they are also used in the
storage layer and so must also be picked up from changes in
`system.settings`.

For these settings, it is common for operators to just issue the plain
`SET CLUSTER SETTING` statement (to update `system.settings`) and
simply forget to _also_ run `ALTER TENANT ALL SET CLUSTER SETTING`.

This mistake is nearly unavoidable and would result in incoherent
behavior, where the storage layer would use the customized value
and virtual clusters would use the registered global default.

The second problem is in mixed-version configurations, where
the storage layer runs version N+1 and the SQL service runs version N
of the executable. If the registered global default changes from
version N to N+1, the SQL service would not properly pick up
the new default defined in version N+1 of the storage layer.

This patch fixes both problems as follows:

- it integrates changes to TenantReadOnly settings observed
  in `system.settings`, to the watcher that tracks changes
  to `system.tenant_settings`. When a TenantReadOnly setting
  is present in the former but not the latter, a synthetic
  override is added.

- it also initializes synthetic overrides for all the
  TenantReadOnly settings upon server initialization,
  from the registered global default, so that virtual
  cluster servers always pick up the storage layer's
  default as override.

Release note: None
  • Loading branch information
knz committed Sep 29, 2023
1 parent 7a163cd commit 654bd80
Show file tree
Hide file tree
Showing 8 changed files with 359 additions and 14 deletions.
2 changes: 2 additions & 0 deletions pkg/server/server.go
Original file line number Diff line number Diff line change
Expand Up @@ -1110,6 +1110,8 @@ func NewServer(cfg Config, stopper *stop.Stopper) (serverctl.ServerStartupInterf
spanConfigKVAccessor: spanConfig.kvAccessorForTenantRecords,
kvStoresIterator: kvserver.MakeStoresIterator(node.stores),
inspectzServer: inspectzServer,

notifyChangeToTenantReadOnlySettings: tenantSettingsWatcher.SetAlternateDefaults,
},
SQLConfig: &cfg.SQLConfig,
BaseConfig: &cfg.BaseConfig,
Expand Down
12 changes: 10 additions & 2 deletions pkg/server/server_sql.go
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@ import (
"github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvtenant"
"github.com/cockroachdb/cockroach/pkg/kv/kvclient/rangefeed"
"github.com/cockroachdb/cockroach/pkg/kv/kvclient/rangestats"
"github.com/cockroachdb/cockroach/pkg/kv/kvpb"
"github.com/cockroachdb/cockroach/pkg/kv/kvserver"
"github.com/cockroachdb/cockroach/pkg/kv/kvserver/kvserverbase"
"github.com/cockroachdb/cockroach/pkg/kv/kvserver/liveness/livenesspb"
Expand Down Expand Up @@ -257,6 +258,13 @@ type sqlServerOptionalKVArgs struct {
// inspectzServer is used to power various crdb_internal vtables, exposing
// the equivalent of /inspectz but through SQL.
inspectzServer inspectzpb.InspectzServer

// notifyChangeToTenantReadOnlySettings is called by the settings
// watcher when one or more TenandReadOnly setting is updated via
// SET CLUSTER SETTING (i.e. updated in system.settings).
//
// The second argument must be sorted by setting key already.
notifyChangeToTenantReadOnlySettings func(context.Context, []kvpb.TenantSetting)
}

// sqlServerOptionalTenantArgs are the arguments supplied to newSQLServer which
Expand Down Expand Up @@ -568,8 +576,8 @@ func newSQLServer(ctx context.Context, cfg sqlServerArgs) (*SQLServer, error) {

var settingsWatcher *settingswatcher.SettingsWatcher
if codec.ForSystemTenant() {
settingsWatcher = settingswatcher.New(
cfg.clock, codec, cfg.Settings, cfg.rangeFeedFactory, cfg.stopper, cfg.settingsStorage,
settingsWatcher = settingswatcher.NewWithNotifier(ctx,
cfg.clock, codec, cfg.Settings, cfg.rangeFeedFactory, cfg.stopper, cfg.notifyChangeToTenantReadOnlySettings, cfg.settingsStorage,
)
} else {
// Create the tenant settings watcher, using the tenant connector as the
Expand Down
14 changes: 14 additions & 0 deletions pkg/server/tenantsettingswatcher/watcher.go
Original file line number Diff line number Diff line change
Expand Up @@ -265,3 +265,17 @@ func (w *Watcher) GetAllTenantOverrides(
) (overrides []kvpb.TenantSetting, changeCh <-chan struct{}) {
return w.GetTenantOverrides(ctx, allTenantOverridesID)
}

// SetAternateDefault configures a custom default value
// for a setting when there is no stored value picked up
// from system.tenant_settings.
//
// The second argument must be sorted by setting key already.
//
// At the time of this writing, this is used for TenantReadOnly
// settings, so that the values from the system tenant's
// system.settings table are used when there is no override
// in .tenant_settings.
func (w *Watcher) SetAlternateDefaults(ctx context.Context, payloads []kvpb.TenantSetting) {
w.store.setAlternateDefaults(ctx, payloads)
}
32 changes: 22 additions & 10 deletions pkg/server/testserver.go
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,7 @@ import (
"github.com/cockroachdb/cockroach/pkg/server/authserver"
"github.com/cockroachdb/cockroach/pkg/server/serverpb"
"github.com/cockroachdb/cockroach/pkg/server/status"
"github.com/cockroachdb/cockroach/pkg/settings"
"github.com/cockroachdb/cockroach/pkg/settings/cluster"
"github.com/cockroachdb/cockroach/pkg/sql"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/bootstrap"
Expand Down Expand Up @@ -1537,16 +1538,27 @@ func (ts *testServer) StartTenant(
baseCfg.HeapProfileDirName = ts.Cfg.BaseConfig.HeapProfileDirName
baseCfg.GoroutineDumpDirName = ts.Cfg.BaseConfig.GoroutineDumpDirName

// TODO(knz): Once https://github.com/cockroachdb/cockroach/issues/96512 is
// resolved, we could override this cluster setting for the secondary tenant
// using SQL instead of reaching in using this testing knob. One way to do
// so would be to perform the override for all tenants and only then
// initializing our test tenant; However, the linked issue above prevents
// us from being able to do so.
sql.SecondaryTenantScatterEnabled.Override(ctx, &baseCfg.Settings.SV, true)
sql.SecondaryTenantSplitAtEnabled.Override(ctx, &baseCfg.Settings.SV, true)
sql.SecondaryTenantZoneConfigsEnabled.Override(ctx, &baseCfg.Settings.SV, true)
sql.SecondaryTenantsMultiRegionAbstractionsEnabled.Override(ctx, &baseCfg.Settings.SV, true)
for _, setting := range []settings.Setting{
sql.SecondaryTenantScatterEnabled,
sql.SecondaryTenantSplitAtEnabled,
sql.SecondaryTenantZoneConfigsEnabled,
sql.SecondaryTenantsMultiRegionAbstractionsEnabled,
} {
// Update the override for this setting. We need to do this
// instead of calling .Override() on the setting directly: certain
// tests expect to be able to change the value afterwards using
// another ALTER VC SET CLUSTER SETTING statement, which is not
// possible with regular overrides.
_, err := ie.Exec(ctx, "testserver-alter-tenant-cap", nil,
fmt.Sprintf("ALTER VIRTUAL CLUSTER [$1] SET CLUSTER SETTING %s = true", setting.Name()), params.TenantID.ToUint64())
if err != nil {
if params.SkipTenantCheck {
log.Infof(ctx, "ignoring error changing setting because SkipTenantCheck is true: %v", err)
} else {
return nil, err
}
}
}

// Waiting for capabilities can time To avoid paying this cost in all
// cases, we only set the nodelocal storage capability if the caller has
Expand Down
3 changes: 3 additions & 0 deletions pkg/settings/integration_tests/BUILD.bazel
Original file line number Diff line number Diff line change
Expand Up @@ -4,9 +4,11 @@ go_test(
name = "integration_tests_test",
srcs = [
"main_test.go",
"propagation_test.go",
"settings_test.go",
],
args = ["-test.timeout=295s"],
shard_count = 6,
deps = [
"//pkg/base",
"//pkg/security/securityassets",
Expand All @@ -15,6 +17,7 @@ go_test(
"//pkg/settings",
"//pkg/testutils",
"//pkg/testutils/serverutils",
"//pkg/testutils/skip",
"//pkg/testutils/sqlutils",
"//pkg/util/leaktest",
"//pkg/util/log",
Expand Down
206 changes: 206 additions & 0 deletions pkg/settings/integration_tests/propagation_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,206 @@
// Copyright 2023 The Cockroach Authors.
//
// Use of this software is governed by the Business Source License
// included in the file licenses/BSL.txt.
//
// As of the Change Date specified in that file, in accordance with
// the Business Source License, use of this software will be governed
// by the Apache License, Version 2.0, included in the file
// licenses/APL.txt.

package integration_tests

import (
"context"
"fmt"
"testing"

"github.com/cockroachdb/cockroach/pkg/base"
"github.com/cockroachdb/cockroach/pkg/settings"
"github.com/cockroachdb/cockroach/pkg/testutils"
"github.com/cockroachdb/cockroach/pkg/testutils/serverutils"
"github.com/cockroachdb/cockroach/pkg/testutils/skip"
"github.com/cockroachdb/cockroach/pkg/testutils/sqlutils"
"github.com/cockroachdb/cockroach/pkg/util/leaktest"
"github.com/cockroachdb/cockroach/pkg/util/log"
"github.com/cockroachdb/errors"
"github.com/stretchr/testify/require"
)

// TestSettingDefaultPropagationReadOnly1 runs one of 4 invocations of
// `RunSettingDefaultPropagationTest`. The test is split 4-ways to
// avoid timeouts in CI and increased test parallelism.
func TestSettingDefaultPropagationReadOnly1(t *testing.T) {
defer leaktest.AfterTest(t)()

runSettingDefaultPropagationTest(t, roS, true)
}

// TestSettingDefaultPropagationReadOnly2 runs one of 4 invocations of
// `RunSettingDefaultPropagationTest`. The test is split 4-ways to
// avoid timeouts in CI and increased test parallelism.
func TestSettingDefaultPropagationReadOnly2(t *testing.T) {
defer leaktest.AfterTest(t)()

runSettingDefaultPropagationTest(t, roS, false)
}

// TestSettingDefaultPropagationReadWrite1 runs one of 4 invocations
// of `RunSettingDefaultPropagationTest`. The test is split 4-ways to
// avoid timeouts in CI and increased test parallelism.
func TestSettingDefaultPropagationReadWrite1(t *testing.T) {
defer leaktest.AfterTest(t)()

runSettingDefaultPropagationTest(t, rwS, true)
}

// TestSettingDefaultPropagationReadWrite2 runs one of 4 invocations
// of `RunSettingDefaultPropagationTest`. The test is split 4-ways to
// avoid timeouts in CI and increased test parallelism.
func TestSettingDefaultPropagationReadWrite2(t *testing.T) {
defer leaktest.AfterTest(t)()

runSettingDefaultPropagationTest(t, rwS, false)
}

var roS = settings.RegisterStringSetting(settings.TenantReadOnly, "tenant.read.only", "desc", "initial")
var rwS = settings.RegisterStringSetting(settings.TenantWritable, "tenant.writable", "desc", "initial")

// runSettingDefaultPropagationTest is a test helper.
func runSettingDefaultPropagationTest(
t *testing.T, setting *settings.StringSetting, setSystemBefore bool,
) {
defer log.Scope(t).Close(t)

// This test currently takes >1 minute.
//
// TODO(multi-tenant): make it faster. Currently most of the
// overhead is in tenant service initialization and shutdown
// (strangely, especially shutdown).
skip.UnderShort(t)

skip.UnderRace(t, "slow test")
skip.UnderStress(t, "slow test")

ctx := context.Background()
s := serverutils.StartServerOnly(t, base.TestServerArgs{
DefaultTestTenant: base.TestControlsTenantsExplicitly,
})
defer s.Stopper().Stop(ctx)

sysDB := sqlutils.MakeSQLRunner(s.SystemLayer().SQLConn(t, ""))
sysDB.Exec(t, "SELECT crdb_internal.create_tenant($1, 'test')", serverutils.TestTenantID().ToUint64())
// Speed up the tests.
sysDB.Exec(t, "SET CLUSTER SETTING kv.closed_timestamp.target_duration = '10ms'")

expectation := func(setting settings.Setting, sysOverride, tenantOverride, tenantAllOverride string) string {
if tenantOverride != "" {
// ALTER TENANT xxx SET CLUSTER SETTING -> highest precedence.
return tenantOverride
}
if tenantAllOverride != "" {
// ALTER TENANT ALL SET CLUSTER SETTING, override is used
// only if there is no tenant-specific override.
return tenantAllOverride
}
// No tenant override. What is the default?
// For TenantReadOnly, if there is a custom value in the
// system interface, that becomes the default.
if setting.Class() == settings.TenantReadOnly && sysOverride != "" {
return sysOverride
}
// Otherwise, fall back to the default.
return "initial"
}

key := setting.InternalKey()
testutils.RunTrueAndFalse(t, "set-override-before", func(t *testing.T, setOverrideBefore bool) {
testutils.RunTrueAndFalse(t, "set-override-all-before", func(t *testing.T, setOverrideAllBefore bool) {
testutils.RunTrueAndFalse(t, "set-system-after", func(t *testing.T, setSystemAfter bool) {
testutils.RunTrueAndFalse(t, "set-override-after", func(t *testing.T, setOverrideAfter bool) {
testutils.RunTrueAndFalse(t, "set-override-all-after", func(t *testing.T, setOverrideAllAfter bool) {
var sysOverride string
if setSystemBefore {
t.Logf("before start: system custom value via SET CLUSTER SETTING")
sysOverride = "before-sys"
sysDB.Exec(t, fmt.Sprintf("SET CLUSTER SETTING %s = '%s'", key, sysOverride))
} else {
sysDB.Exec(t, fmt.Sprintf("RESET CLUSTER SETTING %s", key))
}
var tenantAllOverride string
if setOverrideAllBefore {
t.Logf("before start: override via ALTER VIRTUAL CLUSTER ALL SET CLUSTER SETTING")
tenantAllOverride = "before-all"
sysDB.Exec(t, fmt.Sprintf("ALTER VIRTUAL CLUSTER ALL SET CLUSTER SETTING %s = '%s'", key, tenantAllOverride))
} else {
sysDB.Exec(t, fmt.Sprintf("ALTER VIRTUAL CLUSTER ALL RESET CLUSTER SETTING %s", key))
}
var tenantOverride string
if setOverrideBefore {
t.Logf("before start: override via ALTER VIRTUAL CLUSTER test SET CLUSTER SETTING")
tenantOverride = "before-specific"
sysDB.Exec(t, fmt.Sprintf("ALTER VIRTUAL CLUSTER test SET CLUSTER SETTING %s = '%s'", key, tenantOverride))
} else {
sysDB.Exec(t, fmt.Sprintf("ALTER VIRTUAL CLUSTER test RESET CLUSTER SETTING %s", key))
}

t.Logf("starting secondary tenant service")
ten, err := s.TenantController().StartTenant(ctx, base.TestTenantArgs{
TenantID: serverutils.TestTenantID(),
})
require.NoError(t, err)
defer ten.AppStopper().Stop(ctx)

expected := expectation(setting, sysOverride, tenantOverride, tenantAllOverride)
t.Logf("expecting setting set to %q", expected)
val := setting.Get(&ten.ClusterSettings().SV)
t.Logf("setting currenly at %q", val)
if val != expected {
t.Errorf("expected %q, got %q", expected, val)
}

t.Logf("changing configuration")
if setSystemAfter {
t.Logf("after start: setting system custom value via SET CLUSTER SETTING")
sysOverride = "after-sys"
sysDB.Exec(t, fmt.Sprintf("SET CLUSTER SETTING %s = '%s'", key, sysOverride))
} else {
t.Logf("after start: resetting system custom value with RESET CLUSTER SETTING")
sysOverride = ""
sysDB.Exec(t, fmt.Sprintf("RESET CLUSTER SETTING %s", key))
}
if setOverrideAllAfter {
t.Logf("after start: override via ALTER VIRTUAL CLUSTER ALL SET CLUSTER SETTING")
tenantAllOverride = "after-all"
sysDB.Exec(t, fmt.Sprintf("ALTER VIRTUAL CLUSTER ALL SET CLUSTER SETTING %s = '%s'", key, tenantAllOverride))
} else {
t.Logf("after start: resetting system custom value with ALTER VIRTUAL CLUSTER ALL RESET CLUSTER SETTING")
tenantAllOverride = ""
sysDB.Exec(t, fmt.Sprintf("ALTER VIRTUAL CLUSTER ALL RESET CLUSTER SETTING %s", key))
}
if setOverrideAfter {
t.Logf("after start: override via ALTER VIRTUAL CLUSTER test SET CLUSTER SETTING")
tenantOverride = "after-specific"
sysDB.Exec(t, fmt.Sprintf("ALTER VIRTUAL CLUSTER test SET CLUSTER SETTING %s = '%s'", key, tenantOverride))
} else {
t.Logf("after start: resetting system custom value with ALTER VIRTUAL CLUSTER test RESET CLUSTER SETTING")
tenantOverride = ""
sysDB.Exec(t, fmt.Sprintf("ALTER VIRTUAL CLUSTER test RESET CLUSTER SETTING %s", key))
}

expected = expectation(setting, sysOverride, tenantOverride, tenantAllOverride)
t.Logf("expecting setting set to %q", expected)
testutils.SucceedsSoon(t, func() error {
val := setting.Get(&ten.ClusterSettings().SV)
t.Logf("setting currenly at %q", val)
if val != expected {
return errors.Newf("expected %q, got %q", expected, val)
}
return nil
})
})
})
})
})
})
}
1 change: 0 additions & 1 deletion pkg/settings/integration_tests/settings_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -185,7 +185,6 @@ func TestSettingsRefresh(t *testing.T) {
}
return nil
})

}

func TestSettingsSetAndShow(t *testing.T) {
Expand Down
Loading

0 comments on commit 654bd80

Please sign in to comment.