spanconfig: enable range coalescing by default #98820
Conversation
050822f to c58ff8c
Hm, looks like #97942 is another dependency. Our data distribution page in the advanced debug page is making the same minimum-one-table-per-range assumption here (+cc @dhartunian). Womp womp. (See line 3081 at b6e0f76.)
c58ff8c to 914d273
CI failure is #98555.
914d273 to b19d25c
98979: sql: backward-compatibility for SHOW RANGES / crdb_internal.ranges r=cucaroach,fqazi a=knz

Requested by `@mwang1026` and `@nvanbenschoten`. Fixes #97861. Fixes #99073. Needed for #98820. Epic: CRDB-24928

See the individual commits for details; excerpt from the 2nd commit:

TLDR: the pre-v23.1 behavior of SHOW RANGES and `crdb_internal.ranges{_no_leases}` is made available conditional on a new cluster setting `sql.show_ranges_deprecated_behavior.enabled`. It is set to true by default in v23.1; however, any use of the feature will prompt a SQL notice with the following message:

```
NOTICE: attention! the pre-23.1 behavior of SHOW RANGES and crdb_internal.ranges{,_no_leases} is deprecated!
HINT: Consider enabling the new functionality by setting 'sql.show_ranges_deprecated_behavior.enabled' to 'false'. The new SHOW RANGES statement has more options. Refer to the online documentation or execute 'SHOW RANGES ??' for details.
```

The deprecated behavior is also disabled in the test suite and by default in `cockroach demo`, i.e. the tests continue to exercise the new behavior.

Release note (backward-incompatible change): The pre-v23.1 output produced by `SHOW RANGES`, `crdb_internal.ranges`, and `crdb_internal.ranges_no_leases` is deprecated. The new interface and functionality for `SHOW RANGES` can be enabled by changing the cluster setting `sql.show_ranges_deprecated_behavior.enabled` to `false`. It will become the default in v23.2.

Co-authored-by: Raphael 'kena' Poss <[email protected]>
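To make the toggle concrete, here is a minimal sketch of opting into the new behavior; the setting name comes from the commit message above, while the database name is just an illustration:

```sql
-- While sql.show_ranges_deprecated_behavior.enabled is true (the v23.1
-- default), SHOW RANGES emits the deprecation NOTICE quoted above.
-- Opting into the new, coalescing-aware output:
SET CLUSTER SETTING sql.show_ranges_deprecated_behavior.enabled = false;

-- The new statement scopes ranges to a database or table.
SHOW RANGES FROM DATABASE defaultdb;
```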
99663: sql: update connExecutor logic for pausable portals r=ZhouXing19 a=ZhouXing19

This PR replaces #96358 and is part of the initial implementation of multiple active portals.

This PR adds limited support for multiple active portals. Portals satisfying all of the following restrictions can now be paused and resumed (i.e., with other queries interleaving them):

1. Not an internal query;
2. Read-only query;
3. No sub-queries or post-queries.

Such a portal will only have its statement executed with a _non-distributed_ plan. This feature is gated by a session variable `multiple_active_portals_enabled`. When it's set to `true`, all portals that satisfy the restrictions above automatically become "pausable" when created via the pgwire `Bind` stmt.

The core idea of this implementation is:

1. Add a `switchToAnotherPortal` status to the result-consumption state machine. When we receive an `ExecPortal` message for a different portal, we simply return control to the connExecutor. (#99052)
2. Persist `flow`, `queryID`, `span`, and `instrumentationHelper` for the portal, and reuse them when we re-execute the portal. This ensures we _continue_ the fetching rather than starting all over. (#99173)
3. To enable 2, we need to delay the clean-up of resources till we close the portal. For this we introduced stacks of cleanup functions. (This PR)

Note that we kept the implementation of the original "un-pausable" portal, as we'd like to limit this new functionality to a small set of statements. Eventually some of the old code (e.g. the limitedCommandResult's lifecycle) should be replaced with the new code. Also, we don't support distributed plans yet, as they involve much more complicated changes. See the `Start with an entirely local plan` section in the [design doc](https://docs.google.com/document/d/1SpKTrTqc4AlGWBqBNgmyXfTweUUsrlqIaSkmaXpznA8/edit). Support for this will come as a follow-up.

Epic: CRDB-17622

Release note (sql change): initial support for multiple active portals. With the session variable `multiple_active_portals_enabled` set to true, portals satisfying all of the following restrictions can be executed in an interleaving manner: 1. Not an internal query; 2. Read-only query; 3. No sub-queries or post-queries. Such a portal will only have its statement executed with an entirely local plan.

99947: ui: small fixes to DB Console charts shown for secondary tenants r=dhartunian a=abarganier

#97995 updated the DB Console to filter out KV-specific charts from the metrics page when viewing DB Console as a secondary application tenant. The PR missed a couple of small details. This patch cleans those up with the following:

- Removes KV latency charts for app tenants
- Adds a single storage graph for app tenants showing livebytes
- Removes the "Capacity" chart on the Overview dashboard for app tenants

Release note: none

Epic: https://cockroachlabs.atlassian.net/browse/CRDB-12100

NB: Please only review the final commit. The 1st commit is being reviewed separately @ #99860

100188: changefeedccl: pubsub sink refactor to batching sink r=rickystewart a=samiskin

Epic: https://cockroachlabs.atlassian.net/browse/CRDB-13237

This change is a followup to #99086 which moves the Pubsub sink to the batching sink framework. The changes involve:

1. Moving the Pubsub code to match the `SinkClient` interface, using the lower-level v1 pubsub API that lets us publish batches manually
2. Removing the extra call to json.Marshal
3. Moving to the `pstest` package for validating results in unit tests
4. Adding topic handling to the batching sink, where batches are created per-topic
5. Adding a pubsub_sink_config since it can now handle Retry and Flush config settings
6. Adding metrics to the old pubsub as well, for the purpose of comparing the two versions

At default settings, this resulted in a peak of 90k messages per second on a single node with throughput at 27.6% cpu usage, putting it at a similar level to kafka.

Running pubsub v2 across all of TPCC (nodes ran out of ranges at different speeds): (screenshot: https://user-images.githubusercontent.com/6236424/229863386-edaee27d-9762-4806-bab6-e18b8a6169d6.png)

Running pubsub v1 (barely visible, 2k messages per second) followed by v2 on tpcc.order_line (in v2 only 2 nodes ended up having ranges assigned to them): (screenshot: https://user-images.githubusercontent.com/6236424/229863507-1883ea45-d8ce-437b-9b9c-550afec68752.png)

In the following graphs from the cloud console, where v1 was run followed by v2, you can see that the main reason v1 was slow is that it wasn't able to batch different keys together: (screenshot: https://user-images.githubusercontent.com/6236424/229864083-758c0814-d53c-447e-84c3-471cf5d56c44.png)

Publish requests remained the same despite far more messages in v2: (screenshot: https://user-images.githubusercontent.com/6236424/229875314-6e07177e-62c4-4c15-b13f-f75e8143e011.png)

Release note (performance improvement): pubsub sink changefeeds can now support higher throughputs by enabling the changefeed.new_pubsub_sink_enabled cluster setting.

100620: pkg/server: move DataDistribution to systemAdminServer r=dhartunian a=abarganier

The DataDistribution endpoint reports replica counts by database and table. When it was built, it operated on the assumption that a range would only ever contain a single table's data. Now that we have coalesced ranges, a single range can span multiple tables. Unfortunately, the DataDistribution endpoint does not take this fact into account, meaning it reports garbled and inaccurate data unless the `spanconfig.storage_coalesce_adjacent.enabled` setting is set to false (see #98820).

For secondary tenants, ranges are *always* coalesced, so this endpoint in its current state could never report meaningful data for a tenant. Given all of this, we have decided to make this endpoint available only for the system tenant. This patch accomplishes that by moving the endpoint away from the adminServer and into the systemAdminServer, making it effectively unimplemented for secondary tenants.

Release note: none

Informs: #97942

Co-authored-by: Jane Xing <[email protected]>
Co-authored-by: Alex Barganier <[email protected]>
Co-authored-by: Shiranka Miskin <[email protected]>
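For reference, a hedged sketch of the user-facing switches named in the messages above; both setting names are quoted from the release notes, and the statements are illustrative rather than a prescribed workflow:

```sql
-- Session-level opt-in for pausable portals (99663); only read-only,
-- non-internal queries without sub-queries or post-queries qualify.
SET multiple_active_portals_enabled = true;

-- Cluster-level opt-in for the batching pubsub sink (100188).
SET CLUSTER SETTING changefeed.new_pubsub_sink_enabled = true;
```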
b19d25c to b2c550f
Discussed offline with David H. + Alex B., who are ok with just the following warning in the advanced debug page. @arulajmani, can I get a review?
b2c550f to 494b561
@@ -182,6 +182,15 @@ export class DataDistributionPage extends React.Component<
  <BackToAdvanceDebug history={this.props.history} />
  <section className="section">
    <h1 className="base-heading">Data Distribution</h1>
    <p style={{ maxWidth: 300, paddingTop: 10 }}>
Is there a tracking issue to fix the data distribution screen or remove it outright?
There was #97942, but the Obs-Inf folks closed it. I'll defer to them on whether they want something to fix this and/or to remove it outright.
Reviewed 10 of 11 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @irfansharif)
pkg/spanconfig/spanconfigstore/span_store.go
line 159 at r1 (raw file):
if roachpb.Key(rem).Compare(systemTableUpperBound) < 0 || !(StorageCoalesceAdjacentSetting.Get(&s.settings.SV) && s.settings.Version.IsActive(ctx, clusterversion.V23_2_EnableRangeCoalescingForSystemTenant)) {
Seems like the range coalescing cluster setting would be a no-op in 23.1 <-> 23.2 mixed version clusters; is the thinking that this is okay because it was a hidden cluster setting previously?
Fixes cockroachdb#81008.

We built the basic infrastructure to coalesce ranges across table boundaries back in 22.2 as part of cockroachdb#66063. We've enabled this optimization for secondary tenants since then, but hadn't for the system tenant because of two primary blockers:

- cockroachdb#93617: SHOW RANGES was broken by coalesced ranges.
- cockroachdb#84105: APIs to compute sizes for schema objects (used in our UI) were broken by coalesced ranges.

In both these cases we baked in assumptions about there being a minimum of one {table,index,partition} per range. These blockers didn't apply to secondary tenants at the time since they didn't have access to SHOW RANGES, nor the UI pages where these schema statistics were displayed. We've addressed both blockers in the 23.1 cycle as part of bridging the compatibility between secondary tenants and yesteryear's system tenant:

- cockroachdb#93644 revised SHOW RANGES and crdb_internal.ranges{,_no_leases}, both internally and in their external UX, to accommodate ranges straddling table/database boundaries.
- cockroachdb#96223 re-worked our SpanStats API to work in the face of coalesced ranges, addressing cockroachdb#84105.

Release note (general change): CockroachDB would previously use separate ranges for each table, index, or partition. This is no longer true -- it's now possible for multiple tables, indexes, and partitions to get packed into the same range. For users with many such "schema objects", this will reduce the total range count in their clusters. This is especially true if individual tables, indexes, or partitions are smaller than the default configured maximum range size (controlled using zone configs, specifically the range_max_bytes parameter). We made this change to improve scalability with respect to the number of schema objects, since the underlying range count is no longer a bottleneck. Users upgrading from 22.2, once finalizing their upgrade, may observe a round of range merges and snapshot transfers (to power said range merges) as a result of this change. If users want to opt out of this optimization, they can use the following cluster setting:

SET CLUSTER SETTING spanconfig.storage_coalesce_adjacent.enabled = false;
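A small sketch of the opt-out path from the release note above; `defaultdb` is illustrative, and exact range counts depend on data volume and zone configs:

```sql
-- With coalescing enabled (the new default), adjacent small tables can
-- share a range, so this count may be lower than the table count.
SELECT count(*) FROM [SHOW RANGES FROM DATABASE defaultdb];

-- Opt out to restore splits at table/index/partition boundaries.
SET CLUSTER SETTING spanconfig.storage_coalesce_adjacent.enabled = false;
```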
494b561 to 7a57e7e
TFTR.
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @arulajmani, and @knz)
pkg/spanconfig/spanconfigstore/span_store.go
line 36 at r1 (raw file):
Previously, knz (Raphael 'kena' Poss) wrote…
Can you make this one public too then? Thanks!
Done.
pkg/spanconfig/spanconfigstore/span_store.go
line 159 at r1 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
Seems like the range coalescing cluster setting would be a no-op in 23.1 <-> 23.2 mixed version clusters; is the thinking that this is okay because it was a hidden cluster setting previously?
Yes.
Build succeeded.
101825: sql: add crdb_internal.pretty_value r=Xiang-Gu,msbutler a=stevendanna

This composes nicely with crdb_internal.scan to allow inspection of table data, complementing other command line tools like cockroach debug pebble.

Epic: none
Release note: None

102907: workload: remove initial prefix from bank workload payload r=herkolategan,srosenberg a=renatolabs

An `initial-` prefix is added to the payload column of the `bank` table when the workload is initialized. It was introduced about 6 years ago [1] and its purpose at the time is not clear. There are two main problems with it:

* the `initial-` prefix suggests the payload may be updated, but that actually never happens.
* as currently implemented, it assumes that the `payload-bytes` command line flag is at least `len([]byte("initial-"))`. Passing a lower value to that flag leads to a panic. This is an implicit assumption that should not exist.

This changes the row generation function so that `payload-bytes` bytes are randomly generated and inserted into the `payload` column, without the `initial-` prefix.

[1] d49d535

Epic: none
Release note: None

102961: sql: default sql.show_ranges_deprecated_behavior.enabled to true r=irfansharif a=irfansharif

We enabled range coalescing by default in 23.2 as part of #98820. Alongside that change, we want to flip the SHOW RANGES behavior to be compatible with coalesced ranges, which this commit does. See #93644 and the accompanying release notes.

Release note (backward-incompatible change): The pre-v23.1 output produced by SHOW RANGES, crdb_internal.ranges, and crdb_internal.ranges_no_leases was deprecated in 23.1 and is now replaced by default with output that's compatible with coalesced ranges (i.e. ranges that pack multiple tables/indexes/partitions into individual ranges). See the 23.1 release notes for SHOW RANGES for more details.

Co-authored-by: Steven Danna <[email protected]>
Co-authored-by: Renato Costa <[email protected]>
Co-authored-by: irfan sharif <[email protected]>
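As a rough illustration of the composition mentioned in #101825 -- the exact signatures of `crdb_internal.scan`, `crdb_internal.table_span`, and `crdb_internal.pretty_key` are assumptions here, not confirmed by this thread, and `t` is a hypothetical table:

```sql
-- Hypothetical usage: pretty-print raw KV pairs for a table's span.
-- (Assumes crdb_internal.table_span(table_id) returns a [start, end)
-- span and crdb_internal.scan(span) yields (key, value) byte pairs.)
SELECT crdb_internal.pretty_key(key, 0) AS k,
       crdb_internal.pretty_value(value) AS v
FROM crdb_internal.scan(crdb_internal.table_span('t'::regclass::oid::int))
LIMIT 10;
```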