kvserver: store with high read amplification should not be a target of rebalancing #73714
Closed
Labels
A-kv-replication-constraints
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv
KV Team
Comments
sumeerbhola
added
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
A-kv-replication-constraints
labels
Dec 11, 2021
aayushshah15
added a commit
to aayushshah15/cockroach
that referenced
this issue
Dec 12, 2021
This commit introduces two new cluster settings:

```
kv.snapshot_decline.read_amp_threshold
server.declined_snapshot_timeout
```

With this commit, stores with a read amplification level higher than `kv.snapshot_decline.read_amp_threshold` will decline all `REBALANCE` snapshots. Upon receiving a `DECLINED` response, the senders of these snapshots will consider these receivers `throttled` for `server.declined_snapshot_timeout`. This means that stores with poor LSM health will not be considered as valid candidates for replica rebalancing.

Fixes cockroachdb#73714
Related to cockroachdb#62168

Release note: None
aayushshah15
added a commit
to aayushshah15/cockroach
that referenced
this issue
Dec 13, 2021
This commit introduces two new cluster settings:

```
kv.snapshot_decline.read_amp_threshold
server.declined_snapshot.timeout
```

With this commit, stores with a read amplification level higher than `kv.snapshot_decline.read_amp_threshold` will decline all `REBALANCE` snapshots. Upon receiving a `DECLINED` response, the senders of these snapshots will consider these receivers `throttled` for `server.declined_snapshot.timeout`. This means that stores with poor LSM health will not be considered as valid candidates for replica rebalancing.

Fixes cockroachdb#73714
Related to cockroachdb#62168

Release note: None
aayushshah15
added a commit
to aayushshah15/cockroach
that referenced
this issue
Mar 8, 2022
This commit introduces two new cluster settings:

```
kv.snapshot_decline.read_amp_threshold
server.declined_snapshot.timeout
```

With this commit, stores with a read amplification level higher than `kv.snapshot_decline.read_amp_threshold` will decline all `REBALANCE` snapshots. Upon receiving a `DECLINED` response, the senders of these snapshots will consider these receivers `throttled` for `server.declined_snapshot.timeout`. This means that stores with poor LSM health will not be considered as valid candidates for replica rebalancing.

Fixes cockroachdb#73714
Related to cockroachdb#62168

Release note: None

Release justification: This patch adds a tunable guardrail that could prevent or mitigate cluster instability
kvoli
added a commit
to kvoli/cockroach
that referenced
this issue
Mar 31, 2022
Previously, the only store health signal used as a hard allocation and rebalancing constraint was disk capacity. This patch introduces L0 sub-levels as an additional constraint, to avoid allocation and rebalancing of replicas to stores which are unhealthy, as indicated by a high number of L0 sub-levels.

A store's sub-level count must exceed both (1) the threshold and (2) the cluster average in order to be considered unhealthy. The average check ensures that a cluster full of stores with moderately high read amplification can still make progress, while still ensuring that positively skewed distributions exclude the positive tail.

A simulation of the effect on candidate exclusion under different L0 sub-level distributions, using the mean as an additional check versus percentiles, can be found here: https://gist.github.com/kvoli/be27efd4662e89e8918430a9c7117858

The threshold corresponds to the cluster setting `kv.allocator.l0_sublevels_threshold`: the number of L0 sub-levels above which a candidate store may be excluded as a target for rebalancing, or for both rebalancing and allocation of replicas.

The threshold can be enforced at 4 different levels of strictness, configured by the cluster setting `kv.allocator.l0_sublevels_threshold_enforce`. The 4 levels are:

`block_none`: L0 sub-levels are ignored entirely.
`block_none_log`: L0 sub-levels are logged if the threshold is exceeded.

Both levels below also log as above.

`block_rebalance_to`: L0 sub-levels are considered when excluding stores for rebalance targets.
`block_all`: L0 sub-levels are considered when excluding stores for rebalance targets and allocation targets.

By default, `kv.allocator.l0_sublevels_threshold` is `20`, which corresponds to admission control's threshold, above which it begins limiting admission of work to a store based on store health. The default enforcement level of `kv.allocator.l0_sublevels_threshold_enforce` is `block_none_log`.

resolves cockroachdb#73714

Release justification: low risk, high benefit during high read amplification scenarios where an operator may limit rebalancing to high read amplification stores, to stop fueling the flame.

Release note (ops change): introduce cluster settings `kv.allocator.l0_sublevels_threshold` and `kv.allocator.l0_sublevels_threshold_enforce`, which enable excluding stores as targets for allocation and rebalancing of replicas when they have high read amplification, indicated by the number of L0 sub-levels in level 0 of the store's LSM. When both `kv.allocator.l0_sublevels_threshold` and the cluster average are exceeded, the action corresponding to `kv.allocator.l0_sublevels_threshold_enforce` is taken. `block_none` will exclude no candidate stores, `block_none_log` will exclude no candidates but log an event, `block_rebalance_to` will exclude candidate stores from being targets of rebalance actions, `block_all` will exclude candidate stores from being targets of both allocation and rebalancing. Default `kv.allocator.l0_sublevels_threshold` is set to `20` and `kv.allocator.l0_sublevels_threshold_enforce` is set to `block_none_log`.
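The health check and the four enforcement levels described above can be sketched as follows. This is an illustrative reading of the commit message, not the allocator code itself: every identifier here (`enforcement`, `action`, `unhealthy`, `excluded`, `mean`) is hypothetical, and the strict `>` comparisons are an assumption based on the word "exceed".

```go
package main

// enforcement mirrors the four levels of the enforce setting (names hypothetical).
type enforcement int

const (
	blockNone        enforcement = iota // block_none
	blockNoneLog                        // block_none_log
	blockRebalanceTo                    // block_rebalance_to
	blockAll                            // block_all
)

// action is the kind of decision a candidate store is being considered for.
type action int

const (
	actionAllocate action = iota
	actionRebalance
)

// mean returns the average L0 sub-level count across stores (0 if empty).
func mean(sublevels []int) float64 {
	if len(sublevels) == 0 {
		return 0
	}
	sum := 0
	for _, s := range sublevels {
		sum += s
	}
	return float64(sum) / float64(len(sublevels))
}

// unhealthy applies both checks: the store must exceed the threshold AND
// the cluster average to count as unhealthy.
func unhealthy(sublevels, threshold int, clusterMean float64) bool {
	return sublevels > threshold && float64(sublevels) > clusterMean
}

// excluded reports whether the candidate is excluded for the given action,
// and whether an event would be logged, under the given enforcement level.
func excluded(
	level enforcement, a action, sublevels, threshold int, clusterMean float64,
) (exclude, logged bool) {
	if level == blockNone || !unhealthy(sublevels, threshold, clusterMean) {
		return false, false
	}
	switch level {
	case blockRebalanceTo:
		return a == actionRebalance, true
	case blockAll:
		return true, true
	default: // blockNoneLog: log only, never exclude
		return false, true
	}
}
```

With a threshold of 20, a store at 40 sub-levels in a cluster of `{2, 3, 4, 40}` (mean 12.25) is unhealthy, whereas in a cluster where every store sits at 30 no store exceeds the mean, so none is excluded and the cluster can still make progress.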
kvoli
added a commit
to kvoli/cockroach
that referenced
this issue
Apr 1, 2022
kvoli
added a commit
to kvoli/cockroach
that referenced
this issue
Apr 4, 2022
kvoli
added a commit
to kvoli/cockroach
that referenced
this issue
Apr 4, 2022
kvoli
added a commit
to kvoli/cockroach
that referenced
this issue
Apr 4, 2022
kvoli
added a commit
to kvoli/cockroach
that referenced
this issue
Apr 5, 2022
kvoli
added a commit
to kvoli/cockroach
that referenced
this issue
Apr 5, 2022
kvoli
added a commit
to kvoli/cockroach
that referenced
this issue
Apr 6, 2022
kvoli
added a commit
to kvoli/cockroach
that referenced
this issue
Apr 7, 2022
kvoli
added a commit
to kvoli/cockroach
that referenced
this issue
Apr 8, 2022
kvoli
added a commit
to kvoli/cockroach
that referenced
this issue
Apr 9, 2022
Looking at a 6-hour interval on a node with a single store that had consistently high read amplification (> 1600), there are 3055 log entries containing "applying snapshot of type INITIAL".
The allocator should not add replicas to a store that is unhealthy in this manner.
There are also 11804 "removing replica" log statements, so the allocator probably has some signal it is using to shed load.
(the Cockroach Labs internal link for these logs https://upload.cockroachlabs.com/receive/?thread=J0LX-VAFS&packageCode=3V8ZJ2nVEB0zprhPefmADBbjcHJ1YEnIerx9xAyeqdE#keyCode=q1HsA9iW08Ftw0eGrEXYMckv635vIQbmw8s-Y5i3FxI)
cc: @aayushshah15
Jira issue: CRDB-11703