
admission,kvserver: broadcast per-store IO overload status #83720

Merged
merged 1 commit into from
Jul 7, 2022

Conversation

tbg
Member

@tbg tbg commented Jul 1, 2022

The plan for #79215 is to have stores not send to followers that are
experiencing IO overload, unless this is necessary for quorum.

This commit teaches admission control to relay the current status of the
IO overload signals to the Store, which in turn gossips it. We add a
type `IOThreshold` that encapsulates the input signals to I/O admission
control to present them as a unified score, which hopefully leads to
less ad-hoc use of the underlying signals.

Together, this paves the way for using the signal to inform distribution
decisions (#83490) and (bluntly) shape raft traffic (#79215).

Touches #79215.
Touches #77604.
Touches #82611.
Touches #83490.

Release note: None

@cockroach-teamcity
Member

This change is Reviewable

@tbg tbg force-pushed the adm-ctrl-signal-gossip branch 5 times, most recently from 6412ae8 to 1797b48 Compare July 5, 2022 14:33
@tbg tbg requested review from kvoli and erikgrinaker July 5, 2022 14:35
@tbg tbg marked this pull request as ready for review July 5, 2022 14:36
@tbg tbg requested review from a team as code owners July 5, 2022 14:36
@tbg tbg removed request for a team July 5, 2022 14:36
@tbg tbg changed the title [dnm] kvserver: drop messages to overloaded followers admission,kvserver: broadcast per-store IO overload status Jul 5, 2022
Collaborator

@kvoli kvoli left a comment


LGTM - on the 10s interval change, see my comment on #83808.

@tbg tbg force-pushed the adm-ctrl-signal-gossip branch from 1797b48 to aece96b Compare July 6, 2022 16:37
@tbg tbg removed the request for review from erikgrinaker July 6, 2022 16:38
@tbg
Member Author

tbg commented Jul 6, 2022

TFTR! I removed the 10s commit (which has its own PR #83808 anyway).

@tbg
Member Author

tbg commented Jul 7, 2022

bors r=kvoli

@craig
Contributor

craig bot commented Jul 7, 2022

Build failed (retrying...):

@craig
Contributor

craig bot commented Jul 7, 2022

Build succeeded:

@craig craig bot merged commit e67e47f into cockroachdb:master Jul 7, 2022
@tbg tbg deleted the adm-ctrl-signal-gossip branch July 7, 2022 19:53
kvoli added a commit to kvoli/cockroach that referenced this pull request Feb 14, 2023
L0-sublevels used to be checked as a gauge of store health in order to
exclude unhealthy stores from being allocation targets. In cockroachdb#83720,
`IOThreshold` was included in gossip. `IOThreshold` incorporates the
L0-sublevel and L0-filecount signals and can accommodate additional
measures of store health, making it more informative than
`L0-sublevels` alone.

This commit stops gossiping the L0-sublevels and replaces uses of
L0-sublevels in the allocator and storepool with `IOThreshold`.

The prior cluster settings which controlled how store health was
considered in allocation are now deprecated:

`kv.allocator.l0_sublevels_threshold`
`kv.allocator.l0_sublevels_threshold_enforce`

The new cluster settings perform an identical function:

`kv.allocator.io_overload_threshold`
follows the same concept as `l0_sublevels_threshold`, but is adjusted
for `IOThreshold`, where a value above 1 indicates overload. The
default is `0.8`, intentionally below 1 so that allocation is curbed
before overload occurs.

`kv.allocator.io_overload_threshold_enforce`
is identical to `l0_sublevels_threshold_enforce`.

Resolves: cockroachdb#85084

Release note (ops change): Deprecate cluster settings
`kv.allocator.l0_sublevels_threshold` and
`kv.allocator.l0_sublevels_threshold_enforce` and introduce
`kv.allocator.io_overload_threshold` and
`kv.allocator.io_overload_threshold_enforce`. The enforcement cluster
setting is unchanged apart from the rename. The threshold cluster
setting is adapted to `IOThreshold`, where a value greater than or
equal to 1 indicates IO overload. The default is set to 0.8 to curb
allocation before overload occurs.
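A rough sketch of how such a normalized score and enforcement setting might interact follows. The enforcement modes, constant, and function names here are illustrative assumptions, not the actual allocator code or the real values of the `*_enforce` setting:

```go
package main

import "fmt"

// enforcement sketches the idea of an *_enforce cluster setting; the
// real setting's modes may differ.
type enforcement int

const (
	enforceOff     enforcement = iota // ignore the signal entirely
	enforceLogOnly                    // log, but still allocate
	enforceExclude                    // exclude overloaded stores
)

// defaultIOOverloadThreshold mirrors the 0.8 default described above:
// below the overload point of 1 so allocation is curbed early.
const defaultIOOverloadThreshold = 0.8

// excludeStore reports whether a store with the given normalized IO
// overload score (>= 1 means overloaded) should be excluded as an
// allocation target under the given enforcement mode.
func excludeStore(score float64, mode enforcement) bool {
	if mode != enforceExclude {
		return false
	}
	return score >= defaultIOOverloadThreshold
}

func main() {
	for _, s := range []float64{0.5, 0.8, 1.2} {
		fmt.Printf("score=%.1f excluded=%t\n", s, excludeStore(s, enforceExclude))
	}
}
```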
kvoli added a commit to kvoli/cockroach that referenced this pull request Feb 16, 2023
kvoli added a commit to kvoli/cockroach that referenced this pull request Feb 22, 2023
We previously checked stores' L0-sublevels to exclude IO overloaded
stores from being allocation targets (cockroachdb#78608). This commit replaces
that signal with the normalized IO overload score, which also factors in
the L0-filecount. We started gossiping this value as of cockroachdb#83720. We
continue gossiping L0-sublevels for mixed-version compatibility; we can
stop doing this in 23.2.

Resolves: cockroachdb#85084

Release note (ops change): We've deprecated two cluster settings:
- kv.allocator.l0_sublevels_threshold
- kv.allocator.l0_sublevels_threshold_enforce.
Both were used to control rebalancing and upreplication behavior in the
face of IO overloaded stores. This has now been replaced by other
internal mechanisms.
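The mixed-version story above can be sketched as follows. The struct and function here are illustrative assumptions, not CockroachDB's actual `StoreCapacity` proto: the point is simply that both the new score and the legacy field are populated until no older nodes remain.

```go
package main

import "fmt"

// storeCapacity sketches gossiping the new normalized IO overload score
// alongside the legacy L0-sublevels field for older nodes. Field names
// are illustrative.
type storeCapacity struct {
	IOThresholdScore float64 // new signal: >= 1 means IO overload
	L0Sublevels      int64   // legacy signal, kept for compatibility
}

func makeCapacityToGossip(ioScore float64, l0Sublevels int64) storeCapacity {
	return storeCapacity{
		IOThresholdScore: ioScore,
		// Still gossiped so that older nodes, which only understand
		// L0-sublevels, can keep making allocation decisions; this
		// field can be dropped once mixed-version support ends.
		L0Sublevels: l0Sublevels,
	}
}

func main() {
	c := makeCapacityToGossip(0.9, 12)
	fmt.Printf("score=%.1f l0Sublevels=%d\n", c.IOThresholdScore, c.L0Sublevels)
}
```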
craig bot pushed a commit that referenced this pull request Feb 22, 2023
96914: sql: add spaces to delimiter in uniqueness violation error r=rharding6373 a=michae2

Whenever a mutation statement violates a uniqueness constraint we return a specific error which shows the affected row. This error is formatted to match the corresponding error in Postgres.

We create this error in two places. In one place (`pkg/sql/opt/exec/execbuilder/mutation.go`) we were using ", " as the delimiter between values, which matches Postgres. But in the other place (`pkg/sql/row/errors.go`) we were using "," (without a space).

This patch adds the space.

Epic: None

Release note (bug fix): Fix formatting of uniqueness violation errors to match the corresponding errors from PostgreSQL.
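A minimal sketch of the formatting in question follows. The function name and error text are illustrative, not the actual code in `pkg/sql/row/errors.go`; it only demonstrates joining with ", " (comma plus space), as Postgres does:

```go
package main

import (
	"fmt"
	"strings"
)

// uniquenessViolationDetail joins the conflicting column names and
// values with ", " to match Postgres's duplicate-key detail format.
// Before the fix described above, one code path joined with a bare ",".
func uniquenessViolationDetail(cols, vals []string) string {
	return fmt.Sprintf("Key (%s)=(%s) already exists.",
		strings.Join(cols, ", "), strings.Join(vals, ", "))
}

func main() {
	fmt.Println(uniquenessViolationDetail([]string{"a", "b"}, []string{"1", "2"}))
	// Prints: Key (a, b)=(1, 2) already exists.
}
```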

97065: schemachanger: Support dropping index cascade with dependent inbound FK r=healthy-pod a=Xiang-Gu

Dropping a unique index with CASCADE must also drop any inbound FK
constraint for which this index is the only uniqueness provider.

This commit enables the declarative schema changer to support this behavior.

Fixes: #96731
Epic: None

97142: allocator: replace read amp with io thresh r=irfansharif a=kvoli

We previously checked stores' L0-sublevels to exclude IO overloaded
stores from being allocation targets (#78608). This commit replaces that
signal with the normalized IO overload score, which also factors in the
L0-filecount. We started gossiping this value as of #83720. We continue
gossiping L0-sublevels for mixed-version compatibility; we can stop
doing this in 23.2.

Resolves: #85084

Release note (ops change): We've deprecated two cluster settings:
- kv.allocator.l0_sublevels_threshold
- kv.allocator.l0_sublevels_threshold_enforce.
Both were used to control rebalancing and upreplication behavior in the
face of IO overloaded stores. This has now been replaced by other
internal mechanisms.

97495: roachprod: run scheduled backup init without timeout r=renatolabs a=msbutler

Previously, several roachtests failed during a cluster restart because the node serving the default scheduled backup command was not ready to serve requests; when roachprod start returns, not every node may be ready yet.

To prevent this failure mode, this patch changes the scheduled backup cmd during roachprod.Start() to run with an infinite timeout and only on the first node in the cluster.

Fixes #97010, #97232

Release note: None

Epic: none

Co-authored-by: Michael Erickson <[email protected]>
Co-authored-by: Xiang Gu <[email protected]>
Co-authored-by: Austen McClernon <[email protected]>
Co-authored-by: Michael Butler <[email protected]>