Revisit default disk watermark on different data tiers #81406

Closed
Leaf-Lin opened this issue Dec 7, 2021 · 19 comments · Fixed by #88639
Leaf-Lin commented Dec 7, 2021

Today, different data tiers all share the same cluster.routing.allocation.disk.watermark.low|high|flood_stage settings. These settings are relative (percentage-based) values by default, which means that on larger disks the defaults can leave a significant amount of space unused.

Do we want to revisit the default disk watermark values, especially since different data tiers may have different space requirements?
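
To illustrate the wastage, here is a rough sketch (assuming the currently documented defaults of 85% low / 90% high / 95% flood_stage; the numbers and names are only for illustration, not actual Elasticsearch code):

```java
// Illustrative only: space held back by a purely percentage-based watermark as disks grow.
// Assumes the documented defaults of 85% (low), 90% (high) and 95% (flood_stage) used space.
public class WatermarkWastage {
    public static void main(String[] args) {
        long[] diskSizesGb = {100, 1_000, 10_000};  // 100 GB, 1 TB and 10 TB nodes
        double highWatermarkUsed = 0.90;            // default high watermark (fraction of disk used)
        for (long totalGb : diskSizesGb) {
            // Free space that must remain unused for the node to stay below the high watermark.
            long reservedGb = Math.round(totalGb * (1.0 - highWatermarkUsed));
            System.out.println(totalGb + " GB disk -> " + reservedGb
                    + " GB must stay free under the default 90% high watermark");
        }
    }
}
```

So a 10 TB node keeps roughly 1 TB unused purely because the threshold is relative.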

Leaf-Lin added the >enhancement, Team:Data Management, and needs:triage labels on Dec 7, 2021
@elasticmachine

Pinging @elastic/es-data-management (Team:Data Management)

not-napoleon added the :Data Management/Other label and removed the needs:triage label on Dec 7, 2021
dakrone commented Dec 7, 2021

One thing that would make this difficult is that these are dynamic settings, and we currently don't have the Settings functionality to specify a particular value for only a subset of nodes. This means we could have separate defaults, but if the user ever dynamically changed one of the settings, it would change for all tiers.

I'm adding the core/infra label for this, since they'd need to help us with the Settings infrastructure if we wanted to support something like this fully.

dakrone added the :Core/Infra/Settings label on Dec 7, 2021
elasticmachine added the Team:Core/Infra label on Dec 7, 2021
@elasticmachine

Pinging @elastic/es-core-infra (Team:Core/Infra)

Leaf-Lin commented Dec 7, 2021

There was a discussion that this would be an operator-only setting on the cloud once the default settings are sensible.

dakrone commented Dec 7, 2021

There was a discussion that this would be an operator-only setting on the cloud once the default settings are sensible.

That helps a bit. I think it'd still be good to think about where we want to go with tier-specific settings in the future, both for on-prem users and for other settings (which may be non-operator-only) that we want to set to different values per tier (for example, we already do this with a recovery setting: #68480).

Leaf-Lin added the :Distributed Coordination/Allocation label on Jun 15, 2022
elasticmachine added the Team:Distributed (Obsolete) label on Jun 15, 2022
@elasticmachine

Pinging @elastic/es-distributed (Team:Distributed)

@DaveCTurner

I wonder if we really need more control here, as opposed to changing how Elasticsearch computes its disk watermarks to better utilise larger nodes. Today ES (roughly) aims to keep 10% of disk free on each node, but if we, say, changed this to target min(10%,100GB) of free space, then larger nodes would be better utilised automatically without needing to adjust any settings.
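
Roughly, the rule I have in mind would look like this (illustrative sketch only, not actual Elasticsearch code):

```java
// Illustrative sketch of the suggested rule: keep min(10% of the disk, 100 GB) free.
// Names and constants are for illustration only.
public class FreeSpaceTarget {
    private static final double FREE_RATIO = 0.10;              // 10% of total disk
    private static final long MAX_HEADROOM_BYTES = 100L << 30;  // 100 GB absolute cap

    /** Bytes that should remain free on a node with the given total disk size. */
    static long requiredFreeBytes(long totalBytes) {
        long fromRatio = (long) (totalBytes * FREE_RATIO);
        return Math.min(fromRatio, MAX_HEADROOM_BYTES);
    }

    public static void main(String[] args) {
        long gb = 1L << 30;
        System.out.println("500 GB node: keep " + requiredFreeBytes(500 * gb) / gb + " GB free");
        System.out.println("10 TB node:  keep " + requiredFreeBytes(10_240 * gb) / gb + " GB free");
    }
}
```

Small nodes keep behaving as today (the 10% term wins), while large nodes are capped at 100GB of reserved space.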

@henningandersen

I agree with David here; doing something similar to what we did for the frozen tier on all tiers would, in my mind, effectively solve this problem.

Making it tier-specific is not enough/ideal during a rolling up-size anyway.

dakrone commented Jun 15, 2022

Supporting a rudimentary syntax of either 0.75, 10%, 100gb, min(10%,100gb), or max(10%,100gb) (and then changing the defaults for our needs) seems like a reasonable approach to me as well.
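
For illustration, one (purely hypothetical, not necessarily the parser we would ship) way such expressions could be resolved against a node's total disk size:

```java
import java.util.Locale;

// Illustrative sketch only: one way a rudimentary min()/max()/percentage/bytes syntax
// could be resolved. This is not the actual Elasticsearch parser.
public class WatermarkExpressionParser {

    /** Resolves an expression like "0.75", "10%", "100gb", "min(10%,100gb)" or
     *  "max(10%,100gb)" to a number of bytes for a disk of the given total size. */
    static long resolveBytes(String expression, long totalBytes) {
        String expr = expression.trim().toLowerCase(Locale.ROOT);
        if (expr.startsWith("min(") && expr.endsWith(")")) {
            String[] parts = expr.substring(4, expr.length() - 1).split(",");
            return Math.min(resolveBytes(parts[0], totalBytes), resolveBytes(parts[1], totalBytes));
        }
        if (expr.startsWith("max(") && expr.endsWith(")")) {
            String[] parts = expr.substring(4, expr.length() - 1).split(",");
            return Math.max(resolveBytes(parts[0], totalBytes), resolveBytes(parts[1], totalBytes));
        }
        if (expr.endsWith("%")) {
            double ratio = Double.parseDouble(expr.substring(0, expr.length() - 1)) / 100.0;
            return (long) (totalBytes * ratio);
        }
        if (expr.endsWith("gb")) {
            return Long.parseLong(expr.substring(0, expr.length() - 2)) * (1L << 30);
        }
        // Bare ratio such as "0.75".
        return (long) (totalBytes * Double.parseDouble(expr));
    }

    public static void main(String[] args) {
        long tenTb = 10L << 40;
        System.out.println(resolveBytes("min(10%,100gb)", tenTb) / (1L << 30) + " GB");
    }
}
```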

kingherc commented Jul 5, 2022

From conversing with @henningandersen, we could use a 20GB headroom for the flood_stage watermark, 100GB for high, and 150GB for low.
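
To make the effect concrete, here is a rough sketch assuming those headroom values, the documented 85%/90%/95% watermark defaults, and the interpretation that a max headroom caps the absolute free space a percentage watermark would otherwise require (illustrative only, not real code):

```java
// Illustrative only: how the proposed max headrooms would cap the free space
// required by the default percentage watermarks on a 4 TiB node.
public class MaxHeadroomExample {
    public static void main(String[] args) {
        long totalBytes = 4L << 40; // 4 TiB node
        printRequirement("low (85% used)",   0.15, 150L << 30, totalBytes);
        printRequirement("high (90% used)",  0.10, 100L << 30, totalBytes);
        printRequirement("flood (95% used)", 0.05,  20L << 30, totalBytes);
    }

    static void printRequirement(String stage, double freeRatio, long maxHeadroomBytes, long totalBytes) {
        long fromPercent = (long) (totalBytes * freeRatio);   // free space implied by the percentage
        long effective = Math.min(fromPercent, maxHeadroomBytes); // capped by the max headroom
        System.out.println(stage + ": " + gb(fromPercent) + " GB free by percentage, capped to "
                + gb(effective) + " GB by the max headroom");
    }

    static long gb(long bytes) {
        return bytes / (1L << 30);
    }
}
```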

kingherc commented Jul 11, 2022

@henningandersen , as we said, we could add a max_headroom setting (similar to the existing cluster.routing.allocation.disk.watermark.flood_stage.frozen.max_headroom) for the low, high and flood_stage watermarks. A couple of notes/questions that come to mind:

  • Looking at the logic of the existing max_headroom, it applies a default value (of 20GB for the frozen flood stage) only if the frozen threshold itself is not set. If the frozen threshold is set, the user needs to explicitly set the max_headroom as well. I am thinking of following the same logic (see the sketch after this list), although it initially felt more natural to me to apply the default value even when the frozen threshold is set.
  • If we introduce the max headrooms, will it be weird that there is still a special watermark and headroom setting just for the frozen tier? Would it make sense to deprecate it and tell people to use the new cluster.routing.allocation.disk.watermark.flood_stage.max_headroom setting (which would apply irrespective of data tier)? We would not need to throw an error, just ignore it and produce a warning (telling users to use the new headroom settings).
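
For the first point above, a minimal sketch of the fallback logic I mean (hypothetical names, not the real implementation):

```java
// Illustrative only: the max headroom default applies only when the corresponding
// watermark threshold has been left at its default; if the user overrides the
// watermark, they must set the max headroom explicitly too.
public class HeadroomDefaultLogic {

    static long effectiveMaxHeadroomBytes(String userWatermark, Long userMaxHeadroomBytes,
                                          long defaultMaxHeadroomBytes) {
        if (userMaxHeadroomBytes != null) {
            return userMaxHeadroomBytes;        // explicit headroom always wins
        }
        if (userWatermark == null) {
            return defaultMaxHeadroomBytes;     // default headroom only if the watermark is unset
        }
        return Long.MAX_VALUE;                  // user set the watermark: no implicit headroom
    }

    public static void main(String[] args) {
        long twentyGb = 20L << 30;
        // Watermark left at default -> the 20 GB default headroom kicks in.
        System.out.println(effectiveMaxHeadroomBytes(null, null, twentyGb));
        // Watermark explicitly set -> no headroom unless also set explicitly.
        System.out.println(effectiveMaxHeadroomBytes("97%", null, twentyGb));
    }
}
```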

@kingherc

Team, as mentioned in my comment above, I plan to start looking into how to implement the max_headroom idea I discussed with @henningandersen. Feel free to send me feedback / questions / objections if you have any, including on the questions I raised above. cc @DaveCTurner

@kingherc

In implementing this, I am thinking of adopting @henningandersen's new RelativeByteSizeValue.java. It was only introduced for the frozen-tier flood stage watermark and headroom, but I believe it would make sense to adopt it in this issue for all disk watermarks and to introduce the max headrooms in a similar way. I believe this will simplify the code (which currently checks separately for thresholds expressed as exact bytes or as percentages) without breaking backwards compatibility. Please speak up if you have any concerns about this.
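
For anyone not familiar with it, the rough idea behind RelativeByteSizeValue is a value that is either a ratio or an absolute byte size and resolves to bytes against a given total. Sketched here with hypothetical names; see the real class in the code base for the actual API:

```java
// Rough conceptual sketch of a "relative byte size": either a ratio of the total
// disk or an absolute number of bytes. Hypothetical names only; the real
// RelativeByteSizeValue in Elasticsearch is the authoritative version.
public final class RelativeOrAbsoluteSize {
    private final Double ratio;       // e.g. 0.90 for "90%"; null if absolute
    private final Long absoluteBytes; // e.g. 100 GiB in bytes; null if ratio-based

    private RelativeOrAbsoluteSize(Double ratio, Long absoluteBytes) {
        this.ratio = ratio;
        this.absoluteBytes = absoluteBytes;
    }

    static RelativeOrAbsoluteSize ofRatio(double ratio) {
        return new RelativeOrAbsoluteSize(ratio, null);
    }

    static RelativeOrAbsoluteSize ofBytes(long bytes) {
        return new RelativeOrAbsoluteSize(null, bytes);
    }

    /** Resolves this value to bytes for a disk of the given total size. */
    long resolveBytes(long totalBytes) {
        return ratio != null ? (long) (totalBytes * ratio) : absoluteBytes;
    }

    public static void main(String[] args) {
        long tenTb = 10L << 40;
        System.out.println(ofRatio(0.90).resolveBytes(tenTb));       // 90% of the disk, in bytes
        System.out.println(ofBytes(100L << 30).resolveBytes(tenTb)); // always 100 GiB
    }
}
```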

@kingherc

Also cc'ing @gmarouli, who I saw recently authored "Add disk thresholds in the cluster state" and could provide feedback as well.

@kingherc

Hi! I have been working on this issue for some days now, making several changes across DiskThresholdSettings.java, DiskThresholdMonitor.java, DiskThresholdDecider.java, HealthMetadata.java, and relevant tests. To double-check that I am heading in the right direction, I would like at least one more opinion. Since a couple of people are away, @fcofdez, would you mind taking a look at my comments above and seeing whether I'm on the right track?

In short:

  • I'm introducing some "max_headroom" settings for the low/high/flood disk watermark settings
  • I'm converting the low/high/flood settings to RelativeByteSizeValue.java classes
  • I believe what I'm doing is backwards compatible, but I do need to change some Allocation "reasons" and some log messages every now and then.

Feedback is welcome

fcofdez commented Jul 14, 2022

I'll take a look at this next Monday, @kingherc, if that's OK.

fcofdez commented Jul 18, 2022

I'm introducing some "max_headroom" settings for the low/high/flood disk watermark settings

I think this makes sense.

I'm converting the low/high/flood settings to RelativeByteSizeValue.java classes

Maybe we can leave this for a follow-up PR? Since I think this is just a refactoring to clean up some old code, right?

I believe what I'm doing is backwards compatible, but I do need to change some Allocation "reasons" and some log messages every now and then.

I would expect to keep the old behaviour when the new max_headroom settings are not specified. That way we won't break any existing cluster settings. If the changes are just about the new max headroom, I think you're on the right path.

If we introduce the max headrooms, will it be weird that there will be still a special watermark and headroom setting just for the frozen tier?

Yes, that's confusing. But we need to behave correctly in that case, i.e. use the frozen max_headroom when both the general and the frozen-specific settings are specified. I think we should deprecate that setting, but I'm not sure if this needs to go through the deprecation approval process.
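
A tiny sketch of the precedence I would expect (illustrative names only, not the actual implementation):

```java
// Illustrative only: on frozen-tier nodes, a frozen-specific max headroom, if set,
// should take precedence over the general max headroom.
public class HeadroomPrecedence {
    static Long pickMaxHeadroom(Long generalHeadroomBytes, Long frozenHeadroomBytes, boolean frozenNode) {
        if (frozenNode && frozenHeadroomBytes != null) {
            return frozenHeadroomBytes;   // frozen-specific setting wins on frozen nodes
        }
        return generalHeadroomBytes;      // otherwise fall back to the general setting (may be null)
    }

    public static void main(String[] args) {
        System.out.println(pickMaxHeadroom(100L << 30, 20L << 30, true));  // frozen node -> 20 GiB
        System.out.println(pickMaxHeadroom(100L << 30, 20L << 30, false)); // non-frozen -> 100 GiB
    }
}
```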

cc @kingherc

@kingherc

Awesome, thanks for the pointers @fcofdez! The most important thing was to confirm I'm on the right path, which your answer suggests I am, so I will continue down it.

Maybe we can leave this for a follow-up PR?

I did not go that way because converting them to RelativeByteSizeValue would simplify the code (it was already done for the frozen max_headroom setting, so I could adopt similar code). But it turns out I need to touch a lot of test code to make sure everything works appropriately, so if I see it needs a lot more work, I may then do it in a separate PR.

I also believe I am not introducing backwards-incompatible changes or behaviour: if the new max_headroom settings are not set, the old settings behave as they did before.

For the frozen setting, I found that I can leave it as it is. That way, we do not need to deprecate anything or make something backwards incompatible. However, I do agree that it may be confusing, and I would welcome especially @henningandersen (who introduced the frozen setting that I'm mimicking) to comment on whether we should do anything.

kingherc added a commit to kingherc/elasticsearch that referenced this issue Jul 20, 2022
Introduce max headroom settings for the low, high, and
flood disk watermark stages, similar to the existing
max headroom setting for the flood stage of the frozen
tier. Also, convert the disk watermarks to
RelativeByteSizeValue, similar to the existing setting
for the flood stage of the frozen tier.

Introduce new max headrooms in HealthMetadata and in
ReactiveStorageDeciderService.

Add multiple tests in DiskThresholdDeciderUnitTests,
DiskThresholdDeciderTests and DiskThresholdMonitorTests.

Fixes elastic#81406
henningandersen commented Jul 27, 2022

I think we should deprecate that setting, but I'm not sure if this needs to go through the deprecation approval process.

We cannot deprecate it since it has a lower default value. For frozen there is no merging and we can go much closer to the limit.

For the frozen setting, I found that I can leave it as it is

👍

kingherc added a commit to kingherc/elasticsearch that referenced this issue Oct 10, 2022
kingherc added a commit that referenced this issue Oct 20, 2022
kingherc added a commit to kingherc/elasticsearch that referenced this issue Oct 20, 2022