Disk watermark validation rejects apparently valid sequence of updates #28309

Closed
DaveCTurner opened this issue Jan 19, 2018 · 13 comments
Assignees
Labels
>bug :Core/Infra/Settings Settings infrastructure and APIs help wanted adoptme v6.1.1

Comments

@DaveCTurner
Contributor

$ elasticsearch-6.1.1/bin/elasticsearch --version
Java HotSpot(TM) 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
Version: 6.1.1, Build: bd92e7f/2017-12-17T20:23:25.338Z, JVM: 9.0.1
$ java -version
java version "9.0.1"
Java(TM) SE Runtime Environment (build 9.0.1+11)
Java HotSpot(TM) 64-Bit Server VM (build 9.0.1+11, mixed mode)
$ uname -a
Darwin Davids-MacBook-Pro.local 17.3.0 Darwin Kernel Version 17.3.0: Thu Nov  9 18:09:22 PST 2017; root:xnu-4570.31.3~1/RELEASE_X86_64 x86_64

On a fresh one-node cluster, the first of these operations succeeds but the second fails:

PUT _cluster/settings
{"transient":{"cluster.routing.allocation.disk.watermark.flood_stage":"99%"}}

PUT _cluster/settings
{"transient":{"cluster.routing.allocation.disk.watermark.high":"97%"}}

The result is:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "high disk watermark [97%] more than flood stage disk watermark [95%]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "high disk watermark [97%] more than flood stage disk watermark [95%]"
  },
  "status": 400
}

This is surprising because the flood stage watermark was set to 99% but the error message claims it is still 95%. As expected, GET _cluster/settings yields:

{
  "persistent": {},
  "transient": {
    "cluster": {
      "routing": {
        "allocation": {
          "disk": {
            "watermark": {
              "flood_stage": "99%"
            }
          }
        }
      }
    }
  }
}

Setting both watermarks together succeeds:

PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.flood_stage": "99%",
    "cluster.routing.allocation.disk.watermark.high": "97%"
  }
}

I see the same effect when replacing transient with persistent throughout.

@DaveCTurner DaveCTurner added >bug :Core/Infra/Settings Settings infrastructure and APIs v6.1.1 labels Jan 19, 2018
@DaveCTurner DaveCTurner assigned jasontedor and s1monw and unassigned jasontedor Jan 19, 2018
@jasontedor jasontedor assigned jasontedor and unassigned s1monw Jan 19, 2018
@rjernst rjernst added the help wanted adoptme label Mar 14, 2018
@dylanPowers

This also happens with high versus low, and it is still present in v6.2.3:

GET _cluster/settings
{
  "persistent": {
    "cluster": {
      "routing": {
        "allocation": {
          "disk": {
            "watermark": {
              "low": "50%"
            }
          }
        }
      }
    }
  }
}

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.high": "66%"
  }
}

Result:
{
  "error": {
    "root_cause": [
      {
        "type": "remote_transport_exception",
        "reason": "[tw-r03-s01.master-elasticsearch-logging-staging.service.chcg01.consul][172.23.73.194:9300][cluster:admin/settings/update]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "low disk watermark [85%] more than high disk watermark [66%]"
  },
  "status": 400
}

@faxm0dem

I can confirm this is still present in 6.3.2. Thanks to @Xylakant for pointing me here.

@cbismuth
Contributor

This looks interesting; I'd like to work on it if no one has already started.

@cbismuth
Contributor

cbismuth commented Sep 28, 2018

Here is a complete cURL recreation script, see Gist here.

@cbismuth
Contributor

cbismuth commented Sep 28, 2018

Issue found and reproduced on the master branch.

It turns out that when only the high value is set, it is validated against the other settings (i.e. low and flood_stage) read not from the cluster state but from their default values. The actual cluster state is held in the target parameter at line 719 below, but it isn't passed to the validate API.

To me, the real issue here is that we have dependencies between three settings (low, high and flood_stage), yet the actual runtime cluster values aren't passed in to check that these dependencies hold.

private boolean updateSettings(Settings toApply, Settings.Builder target, Settings.Builder updates, String type, boolean onlyDynamic) {
    boolean changed = false;
    final Set<String> toRemove = new HashSet<>();
    Settings.Builder settingsBuilder = Settings.builder();
    final Predicate<String> canUpdate = (key) -> (
        isFinalSetting(key) == false && // it's not a final setting
        ((onlyDynamic == false && get(key) != null) || isDynamicSetting(key)));
    for (String key : toApply.keySet()) {
        boolean isDelete = toApply.hasValue(key) == false;
        if (isDelete && (isValidDelete(key, onlyDynamic) || key.endsWith("*"))) {
            // this either accepts null values that suffice the canUpdate test OR wildcard expressions (key ends with *)
            // we don't validate if there is any dynamic setting with that prefix yet we could do in the future
            toRemove.add(key);
            // we don't set changed here it's set after we apply deletes below if something actually changed
        } else if (get(key) == null) {
            throw new IllegalArgumentException(type + " setting [" + key + "], not recognized");
        } else if (isDelete == false && canUpdate.test(key)) {
            validate(key, toApply, false); // we might not have a full picture here due to a dependency validation
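
The behaviour can be modelled outside Elasticsearch. Here is a toy Python sketch (hypothetical names and values, not the actual Elasticsearch code) of the bug: the validator's view of the other watermarks is built from the built-in defaults plus the incoming delta, while the already-applied cluster state is never consulted.

```python
# Toy model of the reported behaviour (hypothetical values; not actual
# Elasticsearch code). The buggy validator builds its view of the other
# watermarks from the built-in defaults plus the incoming delta only;
# the 'current' cluster state is accepted but never consulted.

DEFAULTS = {"low": 85, "high": 90, "flood_stage": 95}

def validate(update, current=None):
    view = {**DEFAULTS, **update}  # bug: 'current' is ignored
    if view["low"] > view["high"]:
        raise ValueError(
            f"low disk watermark [{view['low']}%] more than "
            f"high disk watermark [{view['high']}%]")
    if view["high"] > view["flood_stage"]:
        raise ValueError(
            f"high disk watermark [{view['high']}%] more than "
            f"flood stage disk watermark [{view['flood_stage']}%]")

state = {}

# First update: flood_stage to 99%. Passes, and is applied to the state.
validate({"flood_stage": 99}, state)
state["flood_stage"] = 99

# Second update: high to 97%. Rejected, because validation compares
# against the default flood_stage of 95%, not the 99% held in 'state'.
try:
    validate({"high": 97}, state)
except ValueError as e:
    print(e)  # high disk watermark [97%] more than flood stage disk watermark [95%]

# Both values in a single update pass, since both appear in the delta.
validate({"flood_stage": 99, "high": 97}, state)
```

This matches the workaround observed above: setting both watermarks in one request works because both values are present in the delta being validated.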

@cbismuth
Contributor

cbismuth commented Sep 28, 2018

@jasontedor @rjernst @s1monw

I've reworked my comment on this issue to make it easier to read. Could you please have a look at it and tell me what you think? Thanks a lot.

@DaveCTurner
Contributor Author

Hi @cbismuth, you can reproduce this in a unit test as follows:

diff --git a/server/src/test/java/org/elasticsearch/cluster/routing/allocation/DiskThresholdSettingsTests.java b/server/src/test/java/org/elasticsearch/cluster/routing/allocation/DiskThresholdSettingsTests.java
index 342fcea7dde..b99c13e2d99 100644
--- a/server/src/test/java/org/elasticsearch/cluster/routing/allocation/DiskThresholdSettingsTests.java
+++ b/server/src/test/java/org/elasticsearch/cluster/routing/allocation/DiskThresholdSettingsTests.java
@@ -203,4 +203,10 @@ public class DiskThresholdSettingsTests extends ESTestCase {
         assertThat(cause, hasToString(containsString("low disk watermark [85%] more than high disk watermark [75%]")));
     }

+    public void testSequenceOfUpdates() {
+        final ClusterSettings clusterSettings = new ClusterSettings(Settings.EMPTY, ClusterSettings.BUILT_IN_CLUSTER_SETTINGS);
+        new DiskThresholdSettings(Settings.EMPTY, clusterSettings); // this has the effect of registering the settings updater
+        clusterSettings.applySettings(Settings.builder().put(DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_DISK_FLOOD_STAGE_WATERMARK_SETTING.getKey(), "99%").build());
+        clusterSettings.applySettings(Settings.builder().put(DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_HIGH_DISK_WATERMARK_SETTING.getKey(), "97%").build());
+    }
 }

I'm not sure what you mean by "these dependencies aren't expressed in the code" - e.g. the LowDiskWatermarkValidator declares its dependency on the high and flood_stage settings here:

public Iterator<Setting<String>> settings() {
    return Arrays.asList(
        CLUSTER_ROUTING_ALLOCATION_HIGH_DISK_WATERMARK_SETTING,
        CLUSTER_ROUTING_ALLOCATION_DISK_FLOOD_STAGE_WATERMARK_SETTING)
        .iterator();
}

@cbismuth
Contributor

cbismuth commented Sep 28, 2018

Sorry, I'll reword. You're right @DaveCTurner, the static dependencies are expressed in the code. But when validation is triggered, the runtime cluster-state values (which contain 99% for flood_stage) aren't passed to the validator, so the validator validates the input against the default values.

As noted in my previous comment, the runtime cluster-state values are held in the target parameter of the AbstractScopedSettings#updateSettings API, and as you can see in the snippet I referenced, this parameter isn't passed (line 719) to the AbstractScopedSettings#validate API.

Is it clearer?
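
If that diagnosis is right, a fix direction can be sketched in Python (hypothetical, not the actual AbstractScopedSettings API): overlay the incoming delta on top of the currently applied settings before validating, so cross-setting dependency checks see runtime values instead of defaults.

```python
# Hypothetical sketch of a fix direction (Python; not the actual
# AbstractScopedSettings API): build the validator's view from the
# defaults, overlaid with the currently applied cluster settings
# (the 'target' builder), overlaid with the incoming delta.

DEFAULTS = {"low": 85, "high": 90, "flood_stage": 95}

def validate_with_state(update, current):
    view = {**DEFAULTS, **current, **update}
    if view["high"] > view["flood_stage"]:
        raise ValueError(
            f"high disk watermark [{view['high']}%] more than "
            f"flood stage disk watermark [{view['flood_stage']}%]")

# With the applied state visible, the second PUT from the original
# report now passes: high 97% is checked against flood_stage 99%.
validate_with_state({"high": 97}, {"flood_stage": 99})
```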

@DaveCTurner
Contributor Author

Therefore, the validator validates input against default values.

Right, that makes sense. I tried changing AbstractScopedSettings in a few sensible-looking ways and all sorts of other tests broke, but I did not put much time into working out whether the tests or the change were at fault.

@cbismuth
Contributor

Yes, this API has a lot of usages. Thank you for the unit test, I’ll play with it 👍

@cbismuth
Contributor

cbismuth commented Oct 1, 2018

I had a look at the available validation APIs in the AbstractScopedSettings class, and I've opened PR #34184 suggesting an enriched version of the current validation.

No tests appear to break on my laptop.

There may be a small overhead; I'll let you form your own opinion on this.

The reported issue is fixed by these changes; here is a sample output when high is set higher than flood_stage.

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "high disk watermark [98%] more than flood stage disk watermark [97%]"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "high disk watermark [98%] more than flood stage disk watermark [97%]",
    "suppressed" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "high disk watermark [98%] more than flood stage disk watermark [97%]"
      },
      {
        "type" : "illegal_argument_exception",
        "reason" : "high disk watermark [98%] more than flood stage disk watermark [97%]"
      }
    ]
  },
  "status" : 400
}

@cbismuth
Contributor

cbismuth commented Oct 4, 2018

@DaveCTurner I've added the unit test you suggested. I had to modify it slightly to reuse the settings returned by the AbstractScopedSettings#applySettings API; otherwise we couldn't have kept track of the updated settings in a unit test.

Does that make sense to you, or did I miss something?

@DaveCTurner
Contributor Author

Closed by #34184.
