settings,spanconfig: introduce a protobuf setting type #92466
Conversation
The new settings type sounds like it stores a protobuf, but it looks like it stores JSON, which is problematic, since proto field names can change without affecting the proto encoding but this will break JSON compat. So this really is a JSON field in disguise?
Actually storing the protobuf seems better from that POV, though it's pretty garbled when looking at the table directly (not sure how much this matters).
In any case, since this is primarily intended for a backport, why not use a more surgical approach: for instance, injecting the fallback via an env var (JSON), or using a string setting that we parse as JSON? That's almost what you have here, except for the protobuf name.
Personally I would skew heavily towards the simplest backport here.
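(A hedged aside on the renaming point above: the binary protobuf encoding identifies fields by tag number, so a field rename is invisible on the wire, whereas JSON is keyed by field name and breaks. The field name below is hypothetical, not from this PR; the only assumption is the standard protobuf-go protowire package.)

package main

import (
	"fmt"

	"google.golang.org/protobuf/encoding/protowire"
)

func main() {
	// Hand-encode field 1 (say, num_replicas) with value 5. Renaming the
	// field to replica_count leaves these bytes unchanged, since only the
	// tag number (1) appears on the wire; a JSON payload keyed by
	// "numReplicas" would stop parsing after the same rename.
	var b []byte
	b = protowire.AppendTag(b, 1, protowire.VarintType)
	b = protowire.AppendVarint(b, 5)
	fmt.Printf("% x\n", b) // prints: 08 05
}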
force-pushed 6653300 to cd872c1
The new settings type sounds like it stores a protobuf, but it looks like it stores JSON, which is problematic, since proto field names can change without affecting the proto encoding but this will break JSON compat.
Completely forgot about the name thing. Changed; now:
- what's stored in the table is the byte form of the protobuf itself,
- what's held in memory is the protoutil.Message,
- what's rendered/accepted as input is the JSON representation (see the sketch below).
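A minimal sketch of that three-way flow using the standard protobuf-go APIs, with durationpb.Duration standing in for the actual message type (the PR itself works in terms of protoutil.Message and the gogo/protobuf JSON codec, so treat this as illustrative):

package main

import (
	"fmt"
	"time"

	"google.golang.org/protobuf/encoding/protojson"
	"google.golang.org/protobuf/proto"
	"google.golang.org/protobuf/types/known/durationpb"
)

func main() {
	// In memory: the proto message itself.
	msg := durationpb.New(3600 * time.Second)

	// In system.settings: the raw proto bytes (compact, but opaque when
	// reading the table directly).
	stored, _ := proto.Marshal(msg)

	// At the SQL boundary: JSON, used both to render state and to accept
	// operator input.
	rendered, _ := protojson.Marshal(msg)
	fmt.Printf("stored:   %x\n", stored)
	fmt.Printf("rendered: %s\n", rendered)

	// Input follows the reverse path: JSON is parsed back into the
	// message before being re-encoded as bytes for storage.
	var parsed durationpb.Duration
	if err := protojson.Unmarshal(rendered, &parsed); err != nil {
		panic(err)
	}
}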
garbled when looking at the table directly
Things are already garbled for the cluster version setting; this is no worse. And it's not garbled when using SHOW CLUSTER SETTING <setting name>:
root@localhost:26257/defaultdb> show cluster setting spanconfig.store.fallback_config_override;
spanconfig.store.fallback_config_override
--------------------------------------------------------------------------------------------------------------------
{"gcPolicy": {"ttlSeconds": 3600}, "numReplicas": 5, "rangeMaxBytes": "536870912", "rangeMinBytes": "134217728"}
(1 row)
why not use a more surgical approach,
I'm hoping the protobuf setting type is useful beyond this PR; I can imagine it being so now that the legwork is done. Since this fallback config is a per-node one, using an env var would require a full cluster restart, which can be painful given how easy it is to alter RANGE DEFAULT. Hopefully this is still small enough to backport.
force-pushed cd872c1 to 619de82
For this setting type:
- the protoutil.Message is held in memory,
- the byte representation is stored in system.settings, and
- the JSON representation is used when accepting input and rendering state (through SHOW CLUSTER SETTING <setting-name>; the raw form is visible when looking directly at system.settings).

We also use this setting type to power a spanconfig.store.fallback_config_override, which overrides the fallback config used for ranges with no explicit span configs set. Previously we used a hardcoded value -- this makes it a bit more configurable. This is a partial and backportable workaround (read: hack) for cockroachdb#91238 and cockroachdb#91239.

In an internal escalation we observed "orphaned" ranges from dropped tables that were not being referenced by span configs (by virtue of them originating from now-dropped tables/configs). Typically ranges of this sort are short-lived; they're emptied out through GC and merged into LHS neighbors. But if the neighboring ranges are large enough, or load just high enough, the merge does not occur. For such orphaned ranges we were using a hardcoded "fallback config" with a replication factor of three. This made for confusing semantics: if RANGE DEFAULT was configured to have a replication factor of five, our replication reports indicated there were under-replicated ranges. This is partly because replication reports today are powered by zone configs (thus looking at RANGE DEFAULT) -- this will change shortly as part of cockroachdb#89987, where we'll instead consider span config data. In any case, we were warning users of under-replicated ranges, but within KV we were not taking any action to upreplicate them -- KV was respecting the hard-coded fallback config.

The issues above describe that we should apply each tenant's RANGE DEFAULT config to all such orphaned ranges, which is probably the right fix. This was alluded to in an earlier TODO but is still left for future work.

// TODO(irfansharif): We're using a static[1] fallback span config
// here, we could instead have this directly track the host tenant's
// RANGE DEFAULT config, or go a step further and use the tenant's own
// RANGE DEFAULT instead if the key is within the tenant's keyspace.
// We'd have to thread that through the KVAccessor interface by
// reserving special keys for these default configs.
//
// [1]: Modulo the private spanconfig.store.fallback_config_override, which
// applies globally.

So this PR instead takes a shortcut -- it makes the static config configurable through a cluster setting. We can now do the following, which alters what fallback config is applied to orphaned ranges and, in our example above, forces such ranges to also have a replication factor of five.

SET CLUSTER SETTING spanconfig.store.fallback_config_override = '
{
  "gcPolicy": {"ttlSeconds": 3600},
  "numReplicas": 5,
  "rangeMaxBytes": "536870912",
  "rangeMinBytes": "134217728"
}';

Release note: None
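A minimal sketch of the lookup described above -- prefer the operator-set override when present, else the static fallback. All names here are illustrative stand-ins (the real code lives in pkg/spanconfig and uses roachpb.SpanConfig):

package main

import "fmt"

// spanConfig stands in for roachpb.SpanConfig.
type spanConfig struct {
	NumReplicas   int32
	GCTTLSeconds  int32
	RangeMinBytes int64
	RangeMaxBytes int64
}

// staticFallback mirrors the previously hardcoded fallback config and
// its replication factor of three.
var staticFallback = spanConfig{
	NumReplicas:   3,
	GCTTLSeconds:  90000,     // 25h, the default GC TTL
	RangeMinBytes: 128 << 20, // 134217728
	RangeMaxBytes: 512 << 20, // 536870912
}

// fallbackConfig returns the config applied to ranges with no explicit
// span config: the cluster-setting override if one was set, else the
// static default.
func fallbackConfig(override *spanConfig) spanConfig {
	if override != nil {
		return *override
	}
	return staticFallback
}

func main() {
	override := &spanConfig{NumReplicas: 5, GCTTLSeconds: 3600,
		RangeMinBytes: 128 << 20, RangeMaxBytes: 512 << 20}
	fmt.Println(fallbackConfig(nil).NumReplicas)      // 3
	fmt.Println(fallbackConfig(override).NumReplicas) // 5
}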
force-pushed 619de82 to b0cfdc9
I agree that this is useful, but I'm not sure how to square this with our backport policy. We try to introduce minimal changes and adding a whole new settings type doesn't check that box.
What happens in a mixed cluster in which some nodes have picked up the patch and others didn't? We have never created a new settings type in a patch release. It looks safe from the code that I've seen but there is still risk that isn't matched by an important benefit.
Unfortunately I'm going to have to continue disagreeing that this is backport material. It's great for master though! Let's trim down the backport when it comes. I agree that having to do a restart (to set the env var) isn't great, but for that particular customer specifically, they can set the env var and pick it up when migrating into the patch release, so it's free. The set of customers combining "global nonstandard replication" with "cares about unreferenced ranges' replication factor" is likely small.
Some env-var-only version of this for a backport sounds good to me. bors r+
Build failed.
CI failure is unrelated, some DNS thing (internal slack thread). Will re-bors once resolved.
bors r+
Build succeeded.
It controls what replication factor is used for ranges with no explicit span configs set. This is a backportable form of the spanconfig.store.fallback_config_override we added in cockroachdb#92466. Release note: None
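A hedged sketch of what such an env-var-only override might look like; the variable name and parsing below are assumptions for illustration, not the actual backport's code:

package main

import (
	"fmt"
	"os"
	"strconv"
)

// fallbackNumReplicas returns the replication factor applied to ranges
// with no explicit span config: a value read from a hypothetical
// environment variable if set and valid, else the hardcoded default of
// three. Being env-var-based, changing it requires a node restart.
func fallbackNumReplicas() int32 {
	const defaultReplicas = 3
	v, ok := os.LookupEnv("COCKROACH_FALLBACK_NUM_REPLICAS") // assumed name
	if !ok {
		return defaultReplicas
	}
	n, err := strconv.ParseInt(v, 10, 32)
	if err != nil || n <= 0 {
		return defaultReplicas
	}
	return int32(n)
}

func main() {
	fmt.Println("fallback replication factor:", fallbackNumReplicas())
}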