settings: make host cluster or KV settings accessible to tenants #108677

Closed
erikgrinaker opened this issue Aug 13, 2023 · 0 comments · Fixed by #110758
Labels
A-multitenancy Related to multi-tenancy C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-multitenant Issues owned by the multi-tenant virtual team

erikgrinaker commented Aug 13, 2023

We have some settings that it only makes sense to set for the host cluster, but that SQL code still needs to be able to read. There's currently no way to do this, and I see indications that a lot of code thinks it does but actually doesn't. We have three setting classes:

  • SystemOnly: can only be set and read by the host cluster.
  • TenantReadOnly: tenants can read the setting, but the setting must be set individually for each tenant. The host cluster setting is invisible; the tenant will see the default value unless an individual setting is set.
  • TenantWritable: the setting is tenant-specific.

Consider e.g. kv.closed_timestamp.side_transport_interval, which controls how often we send closed timestamp updates from leaseholders to followers. This is currently TenantWritable, which doesn't make any sense because the side transport doesn't run in a tenant, it runs below KV. However, SQL code (in this case changefeeds) still needs to access this setting, for example because it needs to know how often to expect closed timestamps to be emitted. Currently, changefeeds won't see the KV setting, they'll see the tenant setting, which can lead to changefeeds breaking completely in tenants with certain configurations. But if I change it to SystemOnly, the setting isn't visible at all (well, code can read it because of a bug, but it will only see the default value, not the host value), and if I change it to TenantReadOnly, it will only read the tenant's version of the setting, not the host cluster's.

This seems pretty broken, and it isn't clear to me how to proceed here. Do we need a new class, e.g. TenantVisible, or alternatively make SystemOnly visible to tenants (at least from code)?
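
To make the visibility gap concrete, here is a minimal toy model in Go. It is only a sketch of the behavior described above, not CockroachDB's settings API: the `Setting` struct, the `ReadFromTenant` method, and the `TenantVisible` class are hypothetical, and the `200ms`/`50ms` values are purely illustrative.

```go
package main

import "fmt"

// Class is a toy stand-in for the setting classes described in this issue.
// It is not CockroachDB's actual settings API.
type Class int

const (
	SystemOnly Class = iota
	TenantReadOnly
	TenantWritable
	// TenantVisible is the hypothetical class floated in this issue.
	TenantVisible
)

// Setting holds the values known to each party in this simplified model.
type Setting struct {
	Class       Class
	Default     string
	HostValue   *string // value set on the host cluster, if any
	TenantValue *string // value set for this specific tenant, if any
}

// ReadFromTenant returns what SQL code running in a tenant would observe.
func (s Setting) ReadFromTenant() string {
	switch s.Class {
	case TenantVisible:
		// Hypothetical behavior: the tenant sees the host cluster's value.
		if s.HostValue != nil {
			return *s.HostValue
		}
	case TenantReadOnly, TenantWritable:
		// The tenant sees only its own value, never the host's.
		if s.TenantValue != nil {
			return *s.TenantValue
		}
	case SystemOnly:
		// Code that reads the setting anyway only sees the default.
	}
	return s.Default
}

func main() {
	host := "50ms" // illustrative host-cluster value
	names := []string{"SystemOnly", "TenantReadOnly", "TenantWritable", "TenantVisible"}
	for i, name := range names {
		s := Setting{Class: Class(i), Default: "200ms", HostValue: &host}
		fmt.Printf("%-15s tenant reads %s\n", name, s.ReadFromTenant())
	}
}
```

In this model only the hypothetical `TenantVisible` class lets tenant code observe the host value; all three existing classes fall back to the default when no tenant-specific value is set.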

See also Slack thread.

This has tripped us up in #91824 and #108678, and also applies to e.g. kv.rangefeed.enabled, follower_read_timestamp(), and many other settings.

Jira issue: CRDB-30584
Epic: CRDB-6671

@erikgrinaker erikgrinaker added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-multitenancy Related to multi-tenancy T-multitenant Issues owned by the multi-tenant virtual team labels Aug 13, 2023
craig bot pushed a commit that referenced this issue Aug 21, 2023
…109142 #109152 #109156 #109157 #109161 #109165 #109166 #109172

107957: asim: convert randomized testing to data-driven r=kvoli a=wenyihu6

**asim: remove extra parsing for []float64, float64, time.Duration**

In cockroachdb/datadriven#45, we upstreamed the
scanning implementation in `datadriven` library. We can now handle parsing of
[]float64, float64, and time.Duration without additional handling.

Release Note: none
Epic: none

---

**asim: enable user-defined repliFactor, placement in rand range_gen**

This patch introduces two additional options for randomized range generation,
letting users define the replication factor and placement type. Although some
aspects of range configs are randomly generated (ranges and keyspace), these
two configurations are not randomized. Once set by the user, the configuration
will persist across iterations.

Release Note: none
Part Of: #106311

---

**asim: convert randomized testing to data-driven**
Previously, the randomized testing framework depended on default settings
hardcoded in the tests, requiring users to change code-configured
parameters to change the settings. This patch converts the framework to a
data-driven approach, enabling more dynamic user inputs, more testing examples,
and greater visibility into what each iteration is testing.

TestRandomized is a randomized data-driven testing framework that validates
allocators by creating randomized configurations. It is designed for
regression and exploratory testing.

**There are three modes for every aspect of randomized generation.**
- Static Mode:
1. If randomization options are disabled (e.g. no rand_ranges command is
used), the system uses the default configurations (defined in
default_settings.go) with no randomization.
- Randomized: two scenarios occur:
2. Use default settings for randomized generation (e.g. rand_ranges)
3. Use settings specified with commands (e.g. rand_ranges
range_gen_type=zipf)

**The following commands are provided:**
```
1. "rand_cluster" [cluster_gen_type=(single_region|multi_region|any_region)]
	e.g. rand_cluster cluster_gen_type=(multi_region)
	- rand_cluster: randomly picks a predefined cluster configuration
   according to the specified type.
	- cluster_gen_type (default value is multi_region) is cluster
   configuration type. On the next eval, the cluster is generated as the
   initial state of the simulation.

2. "rand_ranges" [placement_type=(even|skewed|random|weighted_rand)]
	[replication_factor=<int>] [range_gen_type=(uniform|zipf)]
	[keyspace_gen_type=(uniform|zipf)] [weighted_rand=(<[]float64>)]
	e.g. rand_ranges placement_type=weighted_rand weighted_rand=(0.1,0.2,0.7)
	e.g. rand_ranges placement_type=skewed replication_factor=1
		 range_gen_type=zipf keyspace_gen_type=uniform
	- rand_ranges: randomly generate a distribution of ranges across stores
   based on the specified parameters. On the next call to eval, ranges and
   their replica placement are generated and loaded to initial state.
	- placement_type(default value is even): defines the type of range placement
	  distribution across stores. Once set, it remains constant across
	  iterations with no randomization involved.
	- replication_factor(default value is 3): represents the replication factor
	  of each range. Once set, it remains constant across iterations with no
	  randomization involved.
	- range_gen_type(default value is uniform): represents the type of
	  distribution used to yield the range parameter as ranges are generated
   across iterations (range ∈[1, 1000]).
	- keyspace_gen_type: represents the type of distribution used to yield the
   keyspace parameter as ranges are generated across iterations
   (keyspace ∈[1000,200000]).
	- weighted_rand: specifies the weighted random distribution among stores.
	  Requirements (will panic otherwise): 1. weighted_rand should only be
   used with placement_type=weighted_rand and vice versa. 2. Must specify a
   weight between [0.0, 1.0] for each element in the array, with each element
   corresponding to a store 3. len(weighted_rand) cannot be greater than
   number of stores 4. sum of weights in the array should be equal to 1

3. "eval" [seed=<int64>] [num_iterations=<int>] [duration=<time.Duration>]
	[verbose=<bool>]
	e.g. eval seed=20 duration=30m2s verbose=true
	- eval: generates a simulation based on the configuration set with the given
	  commands.
	- seed(default value is int64(42)): used to create a new random number
	  generator which will then be used to create a new seed for each iteration.
	- num_iterations(default value is 3): specifies the number of simulations to
	  run.
	- duration(default value is 10m): defines duration of each iteration.
	- verbose(default value is false): if set to true, plots all stat (as
	  specified by defaultStat) history.
```
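
The four `weighted_rand` requirements listed above can be sketched as a small Go check. This is an illustration only: `validateWeightedRand`, its error messages, and the floating-point tolerance are assumptions, and it returns an error where the commit message says the framework panics.

```go
package main

import (
	"fmt"
	"math"
)

// validateWeightedRand checks the listed weighted_rand requirements.
// The name, messages, and tolerance are assumptions for illustration.
func validateWeightedRand(placementType string, weights []float64, numStores int) error {
	// 1. weighted_rand and placement_type=weighted_rand must be used together.
	if (placementType == "weighted_rand") != (len(weights) > 0) {
		return fmt.Errorf("weighted_rand requires placement_type=weighted_rand and vice versa")
	}
	if len(weights) == 0 {
		return nil // another placement type, nothing else to check
	}
	// 3. There cannot be more weights than stores.
	if len(weights) > numStores {
		return fmt.Errorf("got %d weights for %d stores", len(weights), numStores)
	}
	sum := 0.0
	for i, w := range weights {
		// 2. Each weight must lie in [0.0, 1.0]; this form also rejects NaN.
		if !(w >= 0 && w <= 1) {
			return fmt.Errorf("weight %d is %v, want a value in [0.0, 1.0]", i, w)
		}
		sum += w
	}
	// 4. Weights must sum to 1, allowing for floating-point rounding.
	const epsilon = 1e-9
	if math.Abs(sum-1) > epsilon {
		return fmt.Errorf("weights sum to %v, want 1", sum)
	}
	return nil
}

func main() {
	fmt.Println(validateWeightedRand("weighted_rand", []float64{0.1, 0.2, 0.7}, 3))
	fmt.Println(validateWeightedRand("weighted_rand", []float64{0.5, 0.6}, 3))
	fmt.Println(validateWeightedRand("skewed", []float64{1}, 3))
}
```

The sum is compared with a tolerance rather than `==` because values such as 0.1, 0.2, and 0.7 are not exactly representable as `float64`.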

RandTestingFramework is initialized with specified testSetting and maintains
its state across all iterations. It repeats the test with different random
configurations. Each iteration in RandTestingFramework executes the following
steps:
1. Generates a random configuration: based on whether randOption is on and
the specific settings for randomized generation.
2. Executes the simulation and checks the assertions on the final state.
3. Stores any outputs and assertion failures in a buffer.
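
The per-iteration steps above, together with the seed handling described for `eval`, can be sketched roughly as follows. This is not the asim implementation: `runIterations` and the `simulate` callback are hypothetical names, and the callback stands in for steps 1 and 2.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"math/rand"
)

// runIterations is a hypothetical sketch of the loop described above.
// A top-level seed creates one generator, which then yields a fresh seed
// for every iteration, so a run is reproducible from the top-level seed.
func runIterations(seed int64, numIterations int, simulate func(iterSeed int64) error) string {
	seedGen := rand.New(rand.NewSource(seed))
	var buf bytes.Buffer // collects outputs and assertion failures (step 3)
	for i := 1; i <= numIterations; i++ {
		iterSeed := seedGen.Int63()
		// simulate stands in for steps 1 and 2: generate a random
		// configuration from iterSeed, run it, and check the assertions.
		if err := simulate(iterSeed); err != nil {
			fmt.Fprintf(&buf, "iteration %d: failed: %v\n", i, err)
			continue
		}
		fmt.Fprintf(&buf, "iteration %d: ok\n", i)
	}
	return buf.String()
}

func main() {
	out := runIterations(42, 3, func(iterSeed int64) error {
		if iterSeed%2 == 0 {
			return errors.New("assertion failed (toy condition)")
		}
		return nil
	})
	fmt.Print(out)
}
```

Deriving the per-iteration seeds from a single generator means that rerunning with the same `seed` and `num_iterations` replays the same sequence of configurations.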

Release note: None
Part Of: #106311

108185: server: remove support for sticky engines r=itsbilal a=jbowens

Remove support for reusing engines from the StickyVFSRegistry. Tests should not
depend on ephemeral, in-memory engine state between server restarts, or read
closed Engine state.

Close #108119.

108467: sql: implement oidvectortypes builtin r=fqazi a=fqazi

Previously, the oidvectortypes builtin wasn't implemented, causing a compatibility gap for tools
that need to format oidvectors. To address this, this patch adds the oidvectortypes builtin.

Fixes: #107942

Release note (sql change): The oidvectortypes built-in has been implemented, which can format oidvector.

108678: closedts: make settings TenantReadOnly and public r=erikgrinaker a=erikgrinaker

It doesn't make sense for these to be `TenantWritable`, since the side transport runs below KV. Furthermore, these settings are referenced throughout our documentation, so make them public.

These should really be set only for the system tenant, and secondary tenants could simply read the system tenant's setting. This functionality runs in the host cluster below KV and it doesn't make any sense to set individual settings for tenants here. Unfortunately, this isn't currently possible with the existing settings classes, there is no way for secondary tenants to access the host's settings.

Touches #108677.

Epic: none
Release note (ops change): The following closed timestamp side-transport settings can no longer be set from secondary tenants (they did not have an effect in secondary tenants): kv.closed_timestamp.target_duration, kv.closed_timestamp.side_transport_interval, and kv.closed_timestamp.lead_for_global_reads_override.

108845: sql: add last_updated column to crdb_internal.kv_protected_ts_records r=jayshrivastava a=jayshrivastava

This change adds a `last_updated` column to the protected timestamps virtual table. This column contains the mvcc timestamp of the row. Having this column present in this table, which is included in debug zips, improves observability when debugging issues.

Informs: #104161
Release note: None
Epic: None

109029: sql: fix TestCreateStatisticsCanBeCancelled txn retry hang r=fqazi a=fqazi

Previously, this test could hang if an automatic stats collection
came in concurrently with a manual stats collection,
where the request filter would end up hanging and being called twice.
To address this, this patch disables automatic stats collection
on the table.


Fixes: #109007

Release note: None

109049: concurrency: allow multiple transactions to hold locks on a single key  r=nvanbenschoten a=arulajmani

Locks on a single key are stored in the `lockState` struct. Prior to
this patch, the lock table only expected a single transaction to hold
a lock on a given key at any point in time. This restriction needs to
be lifted for shared locks, whose semantics allow multiple transactions
to hold locks on a single key.

This patch changes the `lockState` datastructure so that it can be
generalized in the future. We don't actually allow multiple transactions
to acquire locks on a single key just yet -- that'll come in a subsequent
patch.
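
As a rough illustration of why a single-holder structure is too narrow for shared locks, here is a toy Go model. It is not the `lockState` struct from the lock table: the `toyLockState` type, its fields, and `tryAcquire` are hypothetical, and it models the eventual shared-lock semantics rather than what this patch enables.

```go
package main

import "fmt"

// Strength is a simplified lock strength for this toy model.
type Strength int

const (
	Shared Strength = iota
	Exclusive
)

// toyLockState tracks every transaction holding a lock on one key.
// It is a hypothetical sketch, not CockroachDB's lockState.
type toyLockState struct {
	holders map[string]Strength // transaction ID -> strength held
}

// tryAcquire grants the lock unless it conflicts with another holder.
// Shared locks are compatible with each other; Exclusive conflicts with all.
func (l *toyLockState) tryAcquire(txnID string, s Strength) bool {
	for id, held := range l.holders {
		if id == txnID {
			continue // a transaction never conflicts with itself
		}
		if s == Exclusive || held == Exclusive {
			return false
		}
	}
	if l.holders == nil {
		l.holders = make(map[string]Strength)
	}
	// Record the lock, keeping the stronger strength on re-acquisition.
	if cur, ok := l.holders[txnID]; !ok || s > cur {
		l.holders[txnID] = s
	}
	return true
}

func main() {
	var l toyLockState
	fmt.Println(l.tryAcquire("txn1", Shared))    // granted
	fmt.Println(l.tryAcquire("txn2", Shared))    // granted alongside txn1
	fmt.Println(l.tryAcquire("txn3", Exclusive)) // conflicts with both
}
```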

Informs #91545

Release note: None

109087: storage: defer putBuffer release in all cases r=nvanbenschoten a=nvanbenschoten

Minor cleanup.

This commit switches the remainder of the calls to putBuffer.release to be deferred, instead of being manually called at the end of their function. The comments mentioning that the defer was "measurably slower" were introduced in 4444618, which was before Go 1.14 optimized the performance of defer. Most of these, including the more performance-sensitive calls, were already switched over to use defer in fbe8852.
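
The pattern being adopted looks roughly like the sketch below. It uses a `sync.Pool` of `bytes.Buffer` as a stand-in; `bufPool` and `encode` are hypothetical names, and this is not the storage package's `putBuffer` code.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"sync"
)

// bufPool is a stand-in for a pooled scratch buffer.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// encode borrows a buffer and releases it with defer, so the release
// runs on every return path, including the early error return.
func encode(key, value string) (string, error) {
	buf := bufPool.Get().(*bytes.Buffer)
	defer func() {
		buf.Reset()
		bufPool.Put(buf)
	}()

	if key == "" {
		return "", errors.New("empty key") // buffer is still released
	}
	buf.WriteString(key)
	buf.WriteByte('=')
	buf.WriteString(value)
	// String copies the bytes, so the result stays valid after the reset.
	return buf.String(), nil
}

func main() {
	fmt.Println(encode("a", "1"))
	fmt.Println(encode("", "2"))
}
```

With a manual release at the end of the function, the early return would need its own release call; the deferred form removes that duplication.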

Epic: None
Release note: None

109142: roachtest: Cast snapshot-recd bytes to int in disagg-rebalance r=jbowens a=itsbilal

Previously we were reading a float value as an int, which would trip up the Scan() method if the float value was large enough to be sent over the wire in scientific notation, e.g. `2.3456E7`. This change ensures that Cockroach prints out the value as an integer to avoid the scan-time error in the roachtest.
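
The failure mode can be reproduced with the standard library alone. The sketch below uses `strconv` as a stand-in for the Scan() call mentioned above; `parseAsInt` and `parseViaFloat` are hypothetical helpers. The actual fix casts the value on the SQL side, while `parseViaFloat` only shows a client-side alternative.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseAsInt mimics reading the textual value directly as an integer.
func parseAsInt(s string) (int64, error) {
	return strconv.ParseInt(s, 10, 64)
}

// parseViaFloat parses the text as a float first and then truncates.
func parseViaFloat(s string) (int64, error) {
	f, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return 0, err
	}
	return int64(f), nil
}

func main() {
	for _, s := range []string{"23456", "2.3456E7"} {
		n, err := parseAsInt(s)
		fmt.Printf("parseAsInt(%q) = %d, err = %v\n", s, n, err)
		m, err := parseViaFloat(s)
		fmt.Printf("parseViaFloat(%q) = %d, err = %v\n", s, m, err)
	}
}
```

A plain integer string parses either way, but the scientific-notation form only parses through the float path.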

Fixes #109114.

Epic: none

Release note: None

109152: build: update some configurations for remote build execution r=rail a=rickystewart

1. Use the `large` pool of executors for `enormous` test targets
2. Add (temporary) network access to the following tests: `amazon_test`,
   `base_test`, `cloudprivilege_test`, `externalconn_test`, and
   `cockroach-go-testserver-upgrade-to-master` logictests. These
   erroneously have a dependency on network assets; bugs have been
   filed for each of these.

Epic: CRDB-8308
Release note: None

109156: sql: version gate UNIQUE constraint with json column r=rafiss a=rafiss

This prevents the usage of a json column in a unique constraint, until after the upgrade is finalized.

fixes #108978
Release note: None

109157:  ci,ui: don't lint `e2e-tests` r=sjbarag a=rickystewart

This workspace has a huge download of `cypress` which was causing
CI to flake.

Epic: none
Release note: None

109161: workload: add background qos to kv workload r=bananabrick a=bananabrick

A --background-qos flag can be used in the kv workload to ensure that the generated work is treated as low priority by admission control.

Epic: none
Release note: None

109165: Revert "rangefeed/changefeed: Enable mux rangefeeds by default." r=erikgrinaker a=erikgrinaker

This reverts commit de65c54.

We decided to keep these disabled for another release, to get more real-world experience with it first.

Touches #95781.
Touches #105270.

Release note (performance improvement): The following release note no longer applies: "mux range feeds reuse connection and workers across multiple range feeds.  This mode is now enabled by default."

109166: build: more resources for building AWS dependency r=rail a=rickystewart

This is a huge package with apparently a lot of auto-generated code that was causing OOMs on EngFlow RBE. This fixes it.

Epic: none
Release note: CRDB-8308

109172: storage: Fix panic in MVCCHistories test r=jbowens a=itsbilal

storage_test.intentPrintingReadWriter previously did not support ReaderWithMustIterators.

Epic: none

Release note: None

Co-authored-by: wenyihu6
Co-authored-by: Jackson Owens
Co-authored-by: Faizan Qazi
Co-authored-by: Erik Grinaker
Co-authored-by: Jayant Shrivastava
Co-authored-by: Arul Ajmani
Co-authored-by: Nathan VanBenschoten
Co-authored-by: Bilal Akhtar
Co-authored-by: Ricky Stewart
Co-authored-by: Rafi Shamim
Co-authored-by: Arjun Nair
@knz knz self-assigned this Sep 27, 2023
@craig craig bot closed this as completed in 1787e21 Sep 29, 2023