sql: investigate split and scattering the temporary index in mvcc index backfiller #76686

Closed
stevendanna opened this issue Feb 16, 2022 · 2 comments · Fixed by #77497
Labels
branch-master Failures and bugs on the master branch. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@stevendanna
Collaborator

stevendanna commented Feb 16, 2022

Currently, sharded indexes are automatically split and scattered before being backfilled. We may want to do the same for the related temporary index in the index backfiller.

Jira issue: CRDB-13231

Epic CRDB-7363

@stevendanna stevendanna added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. branch-master Failures and bugs on the master branch. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Feb 16, 2022
@chengxiong-ruan chengxiong-ruan self-assigned this Mar 8, 2022
@blathers-crl blathers-crl bot added the T-sql-schema-deprecated Use T-sql-foundations instead label Mar 8, 2022
@postamar
Contributor

postamar commented Mar 8, 2022

@stevendanna @chengxiong-ruan can you epic-link this?

@chengxiong-ruan
Contributor

@postamar done!

craig bot pushed a commit that referenced this issue Mar 9, 2022
77403: kv/bulk: fix initial splits r=dt a=dt

Previously, the initial splits were missing a split at the first key because of
how the loop was initialized. This meant that a processor Y that was ingesting data into the
span `[k, p)` would write keys `k`, `l`, etc. but not split until, say, key `m`.
When another processor X was ingesting into the prior span `[a, k)`, it regularly
splits and scatters the "empty" RHS of its span before it fills some prefix of it and
then splits and scatters the remaining span again. However, the fact that processor
Y did not split at `k` meant that the "empty" range containing the empty suffix of `[a, k)`
was not actually empty and instead had `k`, `l`, etc. in it, which then had
to be moved every time X scattered that range.

Now the first key is used as a split key to avoid this. This requires using a
different predicate key than usual. Typically we check whether the key we are
splitting at is still in the same range as the prior split, to detect whether
another node has already split the span, but there is no prior split for
the first one. Instead, we use the next split key as the first split's predicate,
as this can also serve to show that the span is still 'wide' enough to split.
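
The predicate choice described above can be sketched roughly as follows. This is only an illustration, not the code from the PR: the `splitPlan` type and `planInitialSplits` function are hypothetical names, keys are plain strings rather than encoded KV keys, and no split or scatter request is actually issued.

```go
package main

import "fmt"

// splitPlan pairs a split key with the predicate key that must still be in
// the same range as the split key for the split to go ahead.
// Hypothetical type, for illustration only.
type splitPlan struct {
	SplitKey     string
	PredicateKey string
}

// planInitialSplits assigns a predicate to every split key, including the
// first one. Later splits use the prior split key as their predicate. The
// first split has no prior split, so it uses the next split key instead.
func planInitialSplits(splitKeys []string) []splitPlan {
	plans := make([]splitPlan, 0, len(splitKeys))
	for i, k := range splitKeys {
		pred := "" // assumption: an empty predicate means "no check available"
		switch {
		case i > 0:
			pred = splitKeys[i-1]
		case len(splitKeys) > 1:
			pred = splitKeys[1]
		}
		plans = append(plans, splitPlan{SplitKey: k, PredicateKey: pred})
	}
	return plans
}

func main() {
	for _, p := range planInitialSplits([]string{"k", "m", "o"}) {
		fmt.Printf("split at %q, predicate %q\n", p.SplitKey, p.PredicateKey)
	}
}
```

With the span `[k, p)` from the example, the first split lands at `k` itself, so processor X's "empty" suffix of `[a, k)` no longer shares a range with Y's keys.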

Release note: none.

Release justification: low-risk fix of new functionality.

77443: admission: add support for tenant weights r=ajwerner,cucaroach a=sumeerbhola

The weights are used in ordering tenants, such that tenant i is
preferred over tenant j if used_i/weight_i < used_j/weight_j,
where the used values represent usage of slots or tokens. This
allows for a form of weighted fair sharing. This fair sharing
is quite primitive, since there is no history for slots, and only
1s of history for token consumption (with a sharp reset, instead
of a rolling one).

The weighting can be useful when a node (or store) has a
significantly different number of ranges for two tenants, so
giving them an equal share of resources (like CPU) would not be
reasonable.

The minimum weight is 1 and the maximum weight is currently
capped at 20. Note that a tenant using 0 slots/tokens will always
be preferred over one that is using a non-zero amount, regardless
of weight. This reduces the likelihood of starvation, though with
a large enough number of waiting tenants, both the unweighted
(weight of 1) and weighted scheme can have starvation since
ties between tenants that are using 0 slots/tokens are broken
non-deterministically (and not by preferring the longer waiting
tenant).
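
As a rough sketch of the ordering rule and the weight bounds described above (not the admission package's actual code): the `tenant` type, `clampWeight`, and `preferred` are hypothetical names, and the comparison is done by cross-multiplying to avoid floating point, assuming the usage values are small enough not to overflow.

```go
package main

import "fmt"

const (
	minWeight = 1  // minimum tenant weight stated in the commit message
	maxWeight = 20 // current cap stated in the commit message
)

// tenant is a hypothetical record of one tenant's usage and weight.
type tenant struct {
	ID     uint64
	Used   uint64 // slots or tokens currently used
	Weight uint32
}

// clampWeight forces a weight into [minWeight, maxWeight].
func clampWeight(w uint32) uint64 {
	if w < minWeight {
		return minWeight
	}
	if w > maxWeight {
		return maxWeight
	}
	return uint64(w)
}

// preferred reports whether a is ordered before b, that is, whether
// used_a/weight_a < used_b/weight_b. Cross-multiplying by the (positive)
// weights gives the equivalent test used_a*weight_b < used_b*weight_a.
// A tenant with zero usage therefore beats any tenant with non-zero usage,
// and two tenants with zero usage tie (neither is preferred).
func preferred(a, b tenant) bool {
	return a.Used*clampWeight(b.Weight) < b.Used*clampWeight(a.Weight)
}

func main() {
	heavy := tenant{ID: 1, Used: 10, Weight: 5}
	light := tenant{ID: 2, Used: 4, Weight: 1}
	idle := tenant{ID: 3, Used: 0, Weight: 1}
	fmt.Println(preferred(heavy, light), preferred(idle, heavy), preferred(idle, idle))
}
```

In this sketch the tie between two idle tenants is left to the caller, which mirrors the non-deterministic tie-breaking noted above.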

This will be used for KV admission control, both for kv and
kv-stores queues, which use slots and tokens respectively. The
integration code that periodically sets the weights based on
the range count per tenant will be in a later PR.

Informs #77358

Release justification: Low-risk update to new functionality.
Even though inter-tenant isolation was included in v21.2,
it has only been used in CockroachDB serverless recently,
and there is consensus to include weighting for v22.1.
The integration code (in a later PR) will be gated on a
cluster setting, and will default to not using weights.

Release note: None

77497: sql: presplit temp hash sharded index r=chengxiong-ruan a=chengxiong-ruan

fixes #76686

We presplit a hash-sharded index before backfilling it. With the new
MVCC index backfiller, we also create a temporary index to take care of write
traffic while the original index is being backfilled. We need to presplit
the temporary index as well.
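
A minimal sketch of the idea, assuming a presplit means one split point per shard bucket per index: the `splitPoint` type, the `presplitPoints` function, and the index IDs are all hypothetical, and the sketch only enumerates the points instead of encoding keys or issuing split and scatter requests.

```go
package main

import "fmt"

// splitPoint identifies the start of one shard bucket of one index.
// Hypothetical type, for illustration only.
type splitPoint struct {
	IndexID uint32
	Shard   int
}

// presplitPoints lists one split point per shard bucket for every index
// passed in, so the temporary index gets the same treatment as the index
// being backfilled.
func presplitPoints(indexIDs []uint32, buckets int) []splitPoint {
	points := make([]splitPoint, 0, len(indexIDs)*buckets)
	for _, id := range indexIDs {
		for shard := 0; shard < buckets; shard++ {
			points = append(points, splitPoint{IndexID: id, Shard: shard})
		}
	}
	return points
}

func main() {
	const newIndexID, tempIndexID = 2, 3 // assumed IDs for the example
	for _, p := range presplitPoints([]uint32{newIndexID, tempIndexID}, 4) {
		fmt.Printf("split index %d at shard %d\n", p.IndexID, p.Shard)
	}
}
```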

Release note: None
Release justification: this is needed by the MVCC index backfiller

Co-authored-by: David Taylor <[email protected]>
Co-authored-by: sumeerbhola <[email protected]>
Co-authored-by: Chengxiong Ruan <[email protected]>
@craig craig bot closed this as completed in 6dca2c9 Mar 9, 2022
@exalate-issue-sync exalate-issue-sync bot added T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) and removed T-sql-schema-deprecated Use T-sql-foundations instead labels May 10, 2023