Disk watermark validation rejects apparently valid sequence of updates #28309

Closed
DaveCTurner opened this issue Jan 19, 2018 · 13 comments
Assignees
Labels
>bug :Core/Infra/Settings Settings infrastructure and APIs help wanted adoptme v6.1.1

Comments

@DaveCTurner
Contributor

$ elasticsearch-6.1.1/bin/elasticsearch --version
Java HotSpot(TM) 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
Version: 6.1.1, Build: bd92e7f/2017-12-17T20:23:25.338Z, JVM: 9.0.1
$ java -version
java version "9.0.1"
Java(TM) SE Runtime Environment (build 9.0.1+11)
Java HotSpot(TM) 64-Bit Server VM (build 9.0.1+11, mixed mode)
$ uname -a
Darwin Davids-MacBook-Pro.local 17.3.0 Darwin Kernel Version 17.3.0: Thu Nov  9 18:09:22 PST 2017; root:xnu-4570.31.3~1/RELEASE_X86_64 x86_64

On a fresh one-node cluster, the first of these operations succeeds but the second fails:

PUT _cluster/settings
{"transient":{"cluster.routing.allocation.disk.watermark.flood_stage":"99%"}}

PUT _cluster/settings
{"transient":{"cluster.routing.allocation.disk.watermark.high":"97%"}}

The result is:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "high disk watermark [97%] more than flood stage disk watermark [95%]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "high disk watermark [97%] more than flood stage disk watermark [95%]"
  },
  "status": 400
}

This is surprising because the flood stage watermark was set to 99% but the error message claims it is still 95%. As expected, GET _cluster/settings yields:

{
  "persistent": {},
  "transient": {
    "cluster": {
      "routing": {
        "allocation": {
          "disk": {
            "watermark": {
              "flood_stage": "99%"
            }
          }
        }
      }
    }
  }
}

Setting both watermarks together succeeds:

PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.flood_stage": "99%",
    "cluster.routing.allocation.disk.watermark.high": "97%"
  }
}

I see the same effect when replacing transient with persistent throughout.

@DaveCTurner DaveCTurner added >bug :Core/Infra/Settings Settings infrastructure and APIs v6.1.1 labels Jan 19, 2018
@DaveCTurner DaveCTurner assigned jasontedor and s1monw and unassigned jasontedor Jan 19, 2018
@jasontedor jasontedor assigned jasontedor and unassigned s1monw Jan 19, 2018
@rjernst rjernst added the help wanted adoptme label Mar 14, 2018
@dylanPowers

This also happens with high versus low, and it is still present in v6.2.3:

GET _cluster/settings
{
  "persistent": {
    "cluster": {
      "routing": {
        "allocation": {
          "disk": {
            "watermark": {
              "low": "50%"
            }
          }
        }
      }
    }
  }
}

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.high": "66%"
  }
}

Result:
{
  "error": {
    "root_cause": [
      {
        "type": "remote_transport_exception",
        "reason": "[tw-r03-s01.master-elasticsearch-logging-staging.service.chcg01.consul][172.23.73.194:9300][cluster:admin/settings/update]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "low disk watermark [85%] more than high disk watermark [66%]"
  },
  "status": 400
}

@faxm0dem

I can confirm this is still present in 6.3.2. Thanks to @Xylakant for pointing me here.

@cbismuth
Contributor

This looks interesting; I'd like to work on it if no one has already started.

@cbismuth
Contributor

cbismuth commented Sep 28, 2018

Here is a complete cURL recreation script, see Gist here.

@cbismuth
Contributor

cbismuth commented Sep 28, 2018

Issue found and reproduced on the master branch.

It turns out that when only the high value is set, it is validated against the other settings (i.e. low and flood_stage) read not from the cluster state but from their default values. The actual cluster state is held in the target parameter at line 719 below, but it isn't passed to the validate API.

To me, the real issue here is that we have dependencies between three settings (low, high and flood_stage), yet the actual runtime cluster values aren't passed in to check that these dependencies hold.

private boolean updateSettings(Settings toApply, Settings.Builder target, Settings.Builder updates, String type, boolean onlyDynamic) {
    boolean changed = false;
    final Set<String> toRemove = new HashSet<>();
    Settings.Builder settingsBuilder = Settings.builder();
    final Predicate<String> canUpdate = (key) -> (
        isFinalSetting(key) == false && // it's not a final setting
        ((onlyDynamic == false && get(key) != null) || isDynamicSetting(key)));
    for (String key : toApply.keySet()) {
        boolean isDelete = toApply.hasValue(key) == false;
        if (isDelete && (isValidDelete(key, onlyDynamic) || key.endsWith("*"))) {
            // this either accepts null values that suffice the canUpdate test OR wildcard expressions (key ends with *)
            // we don't validate if there is any dynamic setting with that prefix yet we could do in the future
            toRemove.add(key);
            // we don't set changed here it's set after we apply deletes below if something actually changed
        } else if (get(key) == null) {
            throw new IllegalArgumentException(type + " setting [" + key + "], not recognized");
        } else if (isDelete == false && canUpdate.test(key)) {
            validate(key, toApply, false); // we might not have a full picture here due to a dependency validation
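
The behaviour can be modelled outside Elasticsearch. Here is a toy Python sketch (hypothetical names and values, not the actual Elasticsearch code) of the bug: the validator's view of the other watermarks is built from the built-in defaults plus the incoming delta, while the already-applied cluster state is never consulted.

```python
# Toy model of the reported behaviour (hypothetical values; not actual
# Elasticsearch code). The buggy validator builds its view of the other
# watermarks from the built-in defaults plus the incoming delta only;
# the 'current' cluster state is accepted but never consulted.

DEFAULTS = {"low": 85, "high": 90, "flood_stage": 95}

def validate(update, current=None):
    view = {**DEFAULTS, **update}  # bug: 'current' is ignored
    if view["low"] > view["high"]:
        raise ValueError(
            f"low disk watermark [{view['low']}%] more than "
            f"high disk watermark [{view['high']}%]")
    if view["high"] > view["flood_stage"]:
        raise ValueError(
            f"high disk watermark [{view['high']}%] more than "
            f"flood stage disk watermark [{view['flood_stage']}%]")

state = {}

# First update: flood_stage to 99%. Passes, and is applied to the state.
validate({"flood_stage": 99}, state)
state["flood_stage"] = 99

# Second update: high to 97%. Rejected, because validation compares
# against the default flood_stage of 95%, not the 99% held in 'state'.
try:
    validate({"high": 97}, state)
except ValueError as e:
    print(e)  # high disk watermark [97%] more than flood stage disk watermark [95%]

# Both values in a single update pass, since both appear in the delta.
validate({"flood_stage": 99, "high": 97}, state)
```

This matches the workaround observed above: setting both watermarks in one request works because both values are present in the delta being validated.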

@cbismuth
Contributor

cbismuth commented Sep 28, 2018

@jasontedor @rjernst @s1monw

I've reworked my comment on this issue to make it easier to read. Could you please have a look at it and tell me what you think? Thanks a lot.

@DaveCTurner
Contributor Author

Hi @cbismuth, you can reproduce this in a unit test as follows:

diff --git a/server/src/test/java/org/elasticsearch/cluster/routing/allocation/DiskThresholdSettingsTests.java b/server/src/test/java/org/elasticsearch/cluster/routing/allocation/DiskThresholdSettingsTests.java
index 342fcea7dde..b99c13e2d99 100644
--- a/server/src/test/java/org/elasticsearch/cluster/routing/allocation/DiskThresholdSettingsTests.java
+++ b/server/src/test/java/org/elasticsearch/cluster/routing/allocation/DiskThresholdSettingsTests.java
@@ -203,4 +203,10 @@ public class DiskThresholdSettingsTests extends ESTestCase {
         assertThat(cause, hasToString(containsString("low disk watermark [85%] more than high disk watermark [75%]")));
     }

+    public void testSequenceOfUpdates() {
+        final ClusterSettings clusterSettings = new ClusterSettings(Settings.EMPTY, ClusterSettings.BUILT_IN_CLUSTER_SETTINGS);
+        new DiskThresholdSettings(Settings.EMPTY, clusterSettings); // this has the effect of registering the settings updater
+        clusterSettings.applySettings(Settings.builder().put(DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_DISK_FLOOD_STAGE_WATERMARK_SETTING.getKey(), "99%").build());
+        clusterSettings.applySettings(Settings.builder().put(DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_HIGH_DISK_WATERMARK_SETTING.getKey(), "97%").build());
+    }
 }

I'm not sure what you mean by "these dependencies aren't expressed in the code" - e.g. the LowDiskWatermarkValidator declares its dependency on the high and flood_stage settings here:

public Iterator<Setting<String>> settings() {
    return Arrays.asList(
        CLUSTER_ROUTING_ALLOCATION_HIGH_DISK_WATERMARK_SETTING,
        CLUSTER_ROUTING_ALLOCATION_DISK_FLOOD_STAGE_WATERMARK_SETTING)
        .iterator();
}

@cbismuth
Contributor

cbismuth commented Sep 28, 2018

Sorry, I'll reword. You're right @DaveCTurner, the static dependencies are expressed in the code. But when validation is triggered, the runtime cluster-state values (which contain 99% for flood_stage) aren't passed to the validator, so the validator validates the input against the default values.

As noted in my previous comment, the runtime cluster-state values are held in the target parameter of the AbstractScopedSettings#updateSettings API, and as you can see in the snippet I referenced, this parameter isn't passed (line 719) to the AbstractScopedSettings#validate API.

Is it clearer?
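
If that diagnosis is right, a fix direction can be sketched in Python (hypothetical, not the actual AbstractScopedSettings API): overlay the incoming delta on top of the currently applied settings before validating, so cross-setting dependency checks see runtime values instead of defaults.

```python
# Hypothetical sketch of a fix direction (Python; not the actual
# AbstractScopedSettings API): build the validator's view from the
# defaults, overlaid with the currently applied cluster settings
# (the 'target' builder), overlaid with the incoming delta.

DEFAULTS = {"low": 85, "high": 90, "flood_stage": 95}

def validate_with_state(update, current):
    view = {**DEFAULTS, **current, **update}
    if view["high"] > view["flood_stage"]:
        raise ValueError(
            f"high disk watermark [{view['high']}%] more than "
            f"flood stage disk watermark [{view['flood_stage']}%]")

# With the applied state visible, the second PUT from the original
# report now passes: high 97% is checked against flood_stage 99%.
validate_with_state({"high": 97}, {"flood_stage": 99})
```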

@DaveCTurner
Contributor Author

Therefore, the validator validates input against default values.

Right, that makes sense. I tried changing AbstractScopedSettings in a few sensible-looking ways and all sorts of other tests broke, but I did not put much time into working out whether the tests or the change were at fault.

@cbismuth
Contributor

Yes, this API has a lot of usages. Thank you for the unit test, I’ll play with it 👍

@cbismuth
Contributor

cbismuth commented Oct 1, 2018

I had a look at the available validation APIs in the AbstractScopedSettings class, and I've opened PR #34184 suggesting an enriched version of the current validation.

No tests appear to break on my laptop.

There may be a small overhead; I'll let you form your own opinion on this.

The reported issue is fixed by these changes; here is a sample output when high is set higher than flood_stage.

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "high disk watermark [98%] more than flood stage disk watermark [97%]"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "high disk watermark [98%] more than flood stage disk watermark [97%]",
    "suppressed" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "high disk watermark [98%] more than flood stage disk watermark [97%]"
      },
      {
        "type" : "illegal_argument_exception",
        "reason" : "high disk watermark [98%] more than flood stage disk watermark [97%]"
      }
    ]
  },
  "status" : 400
}

@cbismuth
Contributor

cbismuth commented Oct 4, 2018

@DaveCTurner I've added the unit test you suggested. I had to modify it slightly to reuse the settings returned by the AbstractScopedSettings#applySettings API; otherwise we couldn't have kept track of the updated settings in a unit test.

Does that make sense to you, or did I miss something?

@DaveCTurner
Contributor Author

Closed by #34184.
