Breaking change for single data node setting #73737

19 changes: 19 additions & 0 deletions docs/reference/migration/migrate_8_0/settings.asciidoc
@@ -280,3 +280,22 @@ per index data path settings.

*Impact* +
Discontinue use of the deprecated settings.

[[single-data-node-watermark-setting]]
.Single data node watermark setting only accepts `true` and is deprecated
[%collapsible]
====
*Details* +
In 7.14, setting `cluster.routing.allocation.disk.watermark.enable_for_single_data_node`
to `false` was deprecated. Starting in 8.0, the only legal value is
`true`. In a future release, the setting will be removed completely, and the
behavior will be the same as if it were set to `true`.

If the old behavior is desired for a single data node cluster, disk-based
allocation can be disabled by setting
`cluster.routing.allocation.disk.threshold_enabled: false`.

*Impact* +
Discontinue use of the deprecated setting.
====

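Not part of the diff: to make the upgrade impact concrete, here is a minimal, hedged sketch of what happens when an 8.0 node keeps the deprecated setting pinned to `false` — construction of the decider fails during settings validation, mirroring the new testSingleDataNodeDeprecationWarning test further down. The class name SingleDataNodeSettingDemo is illustrative only.

import org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDecider;
import org.elasticsearch.common.settings.ClusterSettings;
import org.elasticsearch.common.settings.Settings;

public class SingleDataNodeSettingDemo {
    public static void main(String[] args) {
        // A node configured with the now-illegal value.
        Settings settings = Settings.builder()
            .put(DiskThresholdDecider.ENABLE_FOR_SINGLE_DATA_NODE.getKey(), false)
            .build();
        try {
            new DiskThresholdDecider(settings,
                new ClusterSettings(settings, ClusterSettings.BUILT_IN_CLUSTER_SETTINGS));
        } catch (IllegalArgumentException e) {
            // The cause is the SettingsException thrown by the validator:
            // "setting [cluster.routing.allocation.disk.watermark.enable_for_single_data_node=false]
            //  is not allowed, only true is valid"
            System.err.println(e.getCause().getMessage());
        }
    }
}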
8 changes: 4 additions & 4 deletions docs/reference/modules/cluster/disk_allocator.asciidoc
@@ -84,10 +84,10 @@ Controls the high watermark. It defaults to `90%`, meaning that {es} will attemp

`cluster.routing.allocation.disk.watermark.enable_for_single_data_node`::
(<<static-cluster-setting,Static>>)
For a single data node, the default is to disregard disk watermarks when
making an allocation decision. This is deprecated behavior and will be
changed in 8.0. This setting can be set to `true` to enable the
disk watermarks for a single data node cluster (will become default in 8.0).
In earlier releases, the default behavior was to disregard disk watermarks for a single
data node cluster when making an allocation decision. This behavior has been deprecated
since 7.14 and was removed in 8.0. The only valid value for this setting
is now `true`. The setting will be removed in a future release.

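Not part of the diff: a hedged sketch of the setting's observable behavior after this change. The class name WatermarkSettingDefaults is illustrative, and the deprecation warning itself is only logged on a real node.

import org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDecider;
import org.elasticsearch.common.settings.Settings;

public class WatermarkSettingDefaults {
    public static void main(String[] args) {
        // Unset: the default is now `true`, so watermarks apply even with one data node.
        boolean byDefault = DiskThresholdDecider.ENABLE_FOR_SINGLE_DATA_NODE.get(Settings.EMPTY);

        // Explicitly `true`: accepted, but deprecated (a warning is logged when the
        // setting is present in the node configuration).
        Settings explicit = Settings.builder()
            .put(DiskThresholdDecider.ENABLE_FOR_SINGLE_DATA_NODE.getKey(), true)
            .build();
        boolean explicitTrue = DiskThresholdDecider.ENABLE_FOR_SINGLE_DATA_NODE.get(explicit);

        System.out.println(byDefault + " " + explicitTrue); // true true
    }
}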
[[cluster-routing-flood-stage]]
// tag::cluster-routing-flood-stage-tag[]
server/src/main/java/org/elasticsearch/cluster/routing/allocation/decider/DiskThresholdDecider.java
@@ -26,9 +26,11 @@
import org.elasticsearch.cluster.routing.allocation.RoutingAllocation;
import org.elasticsearch.common.Strings;
import org.elasticsearch.common.collect.ImmutableOpenMap;
import org.elasticsearch.common.logging.DeprecationLogger;
import org.elasticsearch.common.settings.ClusterSettings;
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.settings.SettingsException;
import org.elasticsearch.common.unit.ByteSizeValue;
import org.elasticsearch.index.Index;
import org.elasticsearch.index.shard.ShardId;
@@ -66,23 +68,35 @@
public class DiskThresholdDecider extends AllocationDecider {

private static final Logger logger = LogManager.getLogger(DiskThresholdDecider.class);
private static final DeprecationLogger deprecationLogger = DeprecationLogger.getLogger(DiskThresholdDecider.class);

public static final String NAME = "disk_threshold";

public static final Setting<Boolean> ENABLE_FOR_SINGLE_DATA_NODE =
Review comment (Contributor): I think this means the enableForSingleDataNode field is now always true and so can be inlined.

Setting.boolSetting("cluster.routing.allocation.disk.watermark.enable_for_single_data_node", false, Setting.Property.NodeScope);
Setting.boolSetting("cluster.routing.allocation.disk.watermark.enable_for_single_data_node", true,
new Setting.Validator<>() {
@Override
public void validate(Boolean value) {
if (value == Boolean.FALSE) {
throw new SettingsException("setting [{}=false] is not allowed, only true is valid",
ENABLE_FOR_SINGLE_DATA_NODE.getKey());
}
}
},
Setting.Property.NodeScope, Setting.Property.Deprecated);
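// Editorial note (not part of the diff): the validator runs whenever the setting
// is parsed (for example during node startup settings validation), so `false` is
// rejected before the decider is ever constructed; the `Deprecated` property
// additionally logs a warning whenever the setting is explicitly present.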

public static final Setting<Boolean> SETTING_IGNORE_DISK_WATERMARKS =
Setting.boolSetting("index.routing.allocation.disk.watermark.ignore", false,
Setting.Property.IndexScope, Setting.Property.PrivateIndex);

private final DiskThresholdSettings diskThresholdSettings;
private final boolean enableForSingleDataNode;

public DiskThresholdDecider(Settings settings, ClusterSettings clusterSettings) {
this.diskThresholdSettings = new DiskThresholdSettings(settings, clusterSettings);
assert Version.CURRENT.major < 9 : "remove enable_for_single_data_node in 9";
this.enableForSingleDataNode = ENABLE_FOR_SINGLE_DATA_NODE.get(settings);
// Read the setting so its deprecation warning is emitted when it is explicitly configured; the value is always true.
boolean enabledForSingleDataNode = ENABLE_FOR_SINGLE_DATA_NODE.get(settings);
assert enabledForSingleDataNode;
}

/**
@@ -430,9 +444,6 @@ DiskUsage averageUsage(RoutingNode node, ImmutableOpenMap<String, DiskUsage> usa

private static final Decision YES_DISABLED = Decision.single(Decision.Type.YES, NAME, "the disk threshold decider is disabled");

private static final Decision YES_SINGLE_DATA_NODE =
Decision.single(Decision.Type.YES, NAME, "there is only a single data node present");

private static final Decision YES_USAGES_UNAVAILABLE = Decision.single(Decision.Type.YES, NAME, "disk usages are unavailable");

private Decision earlyTerminate(RoutingAllocation allocation, ImmutableOpenMap<String, DiskUsage> usages) {
@@ -441,12 +452,6 @@ private Decision earlyTerminate(RoutingAllocation allocation, ImmutableOpenMap<S
return YES_DISABLED;
}

// Allow allocation regardless if only a single data node is available
if (enableForSingleDataNode == false && allocation.nodes().getDataNodes().size() <= 1) {
logger.trace("only a single data node is present, allowing allocation");
return YES_SINGLE_DATA_NODE;
}
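// Editorial note (not part of the diff): with this removed block gone,
// earlyTerminate() no longer short-circuits to YES for one-data-node clusters,
// so the watermark checks now run unconditionally; see
// testWatermarksEnabledForSingleDataNode in the test changes below.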

// Fail open if there are no disk usages available
if (usages.isEmpty()) {
logger.trace("unable to determine disk usages for disk-aware allocation, allowing allocation");
server/src/test/java/org/elasticsearch/cluster/routing/allocation/decider/DiskThresholdDeciderTests.java
@@ -44,6 +44,7 @@
import org.elasticsearch.common.UUIDs;
import org.elasticsearch.common.collect.ImmutableOpenMap;
import org.elasticsearch.common.settings.ClusterSettings;
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.index.Index;
import org.elasticsearch.index.shard.ShardId;
@@ -925,141 +926,15 @@ Settings.EMPTY, new ClusterSettings(Settings.EMPTY, ClusterSettings.BUILT_IN_CLU
assertThat(result.routingTable().index("test").getShards().get(1).primaryShard().relocatingNodeId(), equalTo("node2"));
}

public void testForSingleDataNode() {
// remove test in 9.0
Settings diskSettings = Settings.builder()
.put(DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_DISK_THRESHOLD_ENABLED_SETTING.getKey(), true)
.put(DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_LOW_DISK_WATERMARK_SETTING.getKey(), "60%")
.put(DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_HIGH_DISK_WATERMARK_SETTING.getKey(), "70%").build();

ImmutableOpenMap.Builder<String, DiskUsage> usagesBuilder = ImmutableOpenMap.builder();
usagesBuilder.put("node1", new DiskUsage("node1", "n1", "/dev/null", 100, 100)); // 0% used
usagesBuilder.put("node2", new DiskUsage("node2", "n2", "/dev/null", 100, 20)); // 80% used
usagesBuilder.put("node3", new DiskUsage("node3", "n3", "/dev/null", 100, 100)); // 0% used
ImmutableOpenMap<String, DiskUsage> usages = usagesBuilder.build();

// We have an index with 1 primary shards each taking 40 bytes. Each node has 100 bytes available
ImmutableOpenMap.Builder<String, Long> shardSizes = ImmutableOpenMap.builder();
shardSizes.put("[test][0][p]", 40L);
shardSizes.put("[test][1][p]", 40L);
final ClusterInfo clusterInfo = new DevNullClusterInfo(usages, usages, shardSizes.build());

DiskThresholdDecider diskThresholdDecider = makeDecider(diskSettings);
Metadata metadata = Metadata.builder()
.put(IndexMetadata.builder("test").settings(settings(Version.CURRENT)).numberOfShards(2).numberOfReplicas(0))
.build();

RoutingTable initialRoutingTable = RoutingTable.builder()
.addAsNew(metadata.index("test"))
.build();

logger.info("--> adding one master node, one data node");
DiscoveryNode discoveryNode1 = new DiscoveryNode("", "node1", buildNewFakeTransportAddress(), emptyMap(),
singleton(DiscoveryNodeRole.MASTER_ROLE), Version.CURRENT);
DiscoveryNode discoveryNode2 = new DiscoveryNode("", "node2", buildNewFakeTransportAddress(), emptyMap(),
singleton(DiscoveryNodeRole.DATA_ROLE), Version.CURRENT);

DiscoveryNodes discoveryNodes = DiscoveryNodes.builder().add(discoveryNode1).add(discoveryNode2).build();
ClusterState baseClusterState = ClusterState.builder(ClusterName.CLUSTER_NAME_SETTING.getDefault(Settings.EMPTY))
.metadata(metadata)
.routingTable(initialRoutingTable)
.nodes(discoveryNodes)
.build();

// Two shards consumes 80% of disk space in data node, but we have only one data node, shards should remain.
ShardRouting firstRouting = TestShardRouting.newShardRouting("test", 0, "node2", null, true, ShardRoutingState.STARTED);
ShardRouting secondRouting = TestShardRouting.newShardRouting("test", 1, "node2", null, true, ShardRoutingState.STARTED);
RoutingNode firstRoutingNode = new RoutingNode("node2", discoveryNode2, firstRouting, secondRouting);

RoutingTable.Builder builder = RoutingTable.builder().add(
IndexRoutingTable.builder(firstRouting.index())
.addIndexShard(new IndexShardRoutingTable.Builder(firstRouting.shardId())
.addShard(firstRouting)
.build()
)
.addIndexShard(new IndexShardRoutingTable.Builder(secondRouting.shardId())
.addShard(secondRouting)
.build()
)
);
ClusterState clusterState = ClusterState.builder(baseClusterState).routingTable(builder.build()).build();
RoutingAllocation routingAllocation = new RoutingAllocation(null, new RoutingNodes(clusterState), clusterState, clusterInfo,
null, System.nanoTime());
routingAllocation.debugDecision(true);
Decision decision = diskThresholdDecider.canRemain(firstRouting, firstRoutingNode, routingAllocation);

// Two shards should start happily
assertThat(decision.type(), equalTo(Decision.Type.YES));
assertThat(decision.getExplanation(), containsString("there is only a single data node present"));
ClusterInfoService cis = () -> {
logger.info("--> calling fake getClusterInfo");
return clusterInfo;
};

AllocationDeciders deciders = new AllocationDeciders(new HashSet<>(Arrays.asList(
new SameShardAllocationDecider(
Settings.EMPTY, new ClusterSettings(Settings.EMPTY, ClusterSettings.BUILT_IN_CLUSTER_SETTINGS)
),
diskThresholdDecider
)));

AllocationService strategy = new AllocationService(deciders, new TestGatewayAllocator(),
new BalancedShardsAllocator(Settings.EMPTY), cis, EmptySnapshotsInfoService.INSTANCE);
ClusterState result = strategy.reroute(clusterState, "reroute");

assertThat(result.routingTable().index("test").getShards().get(0).primaryShard().state(), equalTo(STARTED));
assertThat(result.routingTable().index("test").getShards().get(0).primaryShard().currentNodeId(), equalTo("node2"));
assertThat(result.routingTable().index("test").getShards().get(0).primaryShard().relocatingNodeId(), nullValue());
assertThat(result.routingTable().index("test").getShards().get(1).primaryShard().state(), equalTo(STARTED));
assertThat(result.routingTable().index("test").getShards().get(1).primaryShard().currentNodeId(), equalTo("node2"));
assertThat(result.routingTable().index("test").getShards().get(1).primaryShard().relocatingNodeId(), nullValue());

// Add another datanode, it should relocate.
logger.info("--> adding node3");
DiscoveryNode discoveryNode3 = new DiscoveryNode("", "node3", buildNewFakeTransportAddress(), emptyMap(),
singleton(DiscoveryNodeRole.DATA_ROLE), Version.CURRENT);
ClusterState updateClusterState = ClusterState.builder(clusterState).nodes(DiscoveryNodes.builder(clusterState.nodes())
.add(discoveryNode3)).build();

firstRouting = TestShardRouting.newShardRouting("test", 0, "node2", null, true, ShardRoutingState.STARTED);
secondRouting = TestShardRouting.newShardRouting("test", 1, "node2", "node3", true, ShardRoutingState.RELOCATING);
firstRoutingNode = new RoutingNode("node2", discoveryNode2, firstRouting, secondRouting);
builder = RoutingTable.builder().add(
IndexRoutingTable.builder(firstRouting.index())
.addIndexShard(new IndexShardRoutingTable.Builder(firstRouting.shardId())
.addShard(firstRouting)
.build()
)
.addIndexShard(new IndexShardRoutingTable.Builder(secondRouting.shardId())
.addShard(secondRouting)
.build()
)
);

clusterState = ClusterState.builder(updateClusterState).routingTable(builder.build()).build();
routingAllocation = new RoutingAllocation(null, new RoutingNodes(clusterState), clusterState, clusterInfo, null,
System.nanoTime());
routingAllocation.debugDecision(true);
decision = diskThresholdDecider.canRemain(firstRouting, firstRoutingNode, routingAllocation);
assertThat(decision.type(), equalTo(Decision.Type.YES));
assertThat(((Decision.Single) decision).getExplanation(), containsString(
"there is enough disk on this node for the shard to remain, free: [60b]"));

result = strategy.reroute(clusterState, "reroute");
assertThat(result.routingTable().index("test").getShards().get(0).primaryShard().state(), equalTo(STARTED));
assertThat(result.routingTable().index("test").getShards().get(0).primaryShard().currentNodeId(), equalTo("node2"));
assertThat(result.routingTable().index("test").getShards().get(0).primaryShard().relocatingNodeId(), nullValue());
assertThat(result.routingTable().index("test").getShards().get(1).primaryShard().state(), equalTo(RELOCATING));
assertThat(result.routingTable().index("test").getShards().get(1).primaryShard().currentNodeId(), equalTo("node2"));
assertThat(result.routingTable().index("test").getShards().get(1).primaryShard().relocatingNodeId(), equalTo("node3"));
}

public void testWatermarksEnabledForSingleDataNode() {
Settings diskSettings = Settings.builder()
.put(DiskThresholdDecider.ENABLE_FOR_SINGLE_DATA_NODE.getKey(), true)
Settings.Builder builder = Settings.builder()
.put(DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_DISK_THRESHOLD_ENABLED_SETTING.getKey(), true)
.put(DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_LOW_DISK_WATERMARK_SETTING.getKey(), "60%")
.put(DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_HIGH_DISK_WATERMARK_SETTING.getKey(), "70%").build();
.put(DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_HIGH_DISK_WATERMARK_SETTING.getKey(), "70%");
if (randomBoolean()) {
builder.put(DiskThresholdDecider.ENABLE_FOR_SINGLE_DATA_NODE.getKey(), true);
}
Settings diskSettings = builder.build();

ImmutableOpenMap.Builder<String, DiskUsage> usagesBuilder = ImmutableOpenMap.builder();
usagesBuilder.put("data", new DiskUsage("data", "data", "/dev/null", 100, 20)); // 80% used
@@ -1131,6 +1006,25 @@ Settings.EMPTY, new ClusterSettings(Settings.EMPTY, ClusterSettings.BUILT_IN_CLU
"the shard cannot remain on this node because it is above the high watermark cluster setting" +
" [cluster.routing.allocation.disk.watermark.high=70%] and there is less than the required [30.0%] free disk on node," +
" actual free: [20.0%]"));

if (DiskThresholdDecider.ENABLE_FOR_SINGLE_DATA_NODE.exists(diskSettings)) {
assertSettingDeprecationsAndWarnings(new Setting<?>[] { DiskThresholdDecider.ENABLE_FOR_SINGLE_DATA_NODE });
}
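// Editorial note (not part of the diff): the deprecated setting is only added
// under randomBoolean() above, so the warning is asserted only when the setting
// actually exists; the test framework fails on unasserted deprecation warnings.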
}

public void testSingleDataNodeDeprecationWarning() {
Settings settings = Settings.builder()
.put(DiskThresholdDecider.ENABLE_FOR_SINGLE_DATA_NODE.getKey(), false)
.build();

IllegalArgumentException e = expectThrows(IllegalArgumentException.class,
() -> new DiskThresholdDecider(settings, new ClusterSettings(settings, ClusterSettings.BUILT_IN_CLUSTER_SETTINGS)));

assertThat(e.getCause().getMessage(),
equalTo("setting [cluster.routing.allocation.disk.watermark.enable_for_single_data_node=false] is not allowed," +
" only true is valid"));

assertSettingDeprecationsAndWarnings(new Setting<?>[]{ DiskThresholdDecider.ENABLE_FOR_SINGLE_DATA_NODE });
}

public void testDiskThresholdWithSnapshotShardSizes() {
x-pack/plugin/deprecation/src/main/java/org/elasticsearch/xpack/deprecation/DeprecationChecks.java
@@ -33,7 +33,8 @@ private DeprecationChecks() {
Collections.emptyList();

static List<BiFunction<Settings, PluginsAndModules, DeprecationIssue>> NODE_SETTINGS_CHECKS = List.of(
NodeDeprecationChecks::checkSharedDataPathSetting
NodeDeprecationChecks::checkSharedDataPathSetting,
NodeDeprecationChecks::checkSingleDataNodeWatermarkSetting
);

static List<Function<IndexMetadata, DeprecationIssue>> INDEX_SETTINGS_CHECKS = List.of(
x-pack/plugin/deprecation/src/main/java/org/elasticsearch/xpack/deprecation/NodeDeprecationChecks.java
@@ -8,6 +8,7 @@
package org.elasticsearch.xpack.deprecation;

import org.elasticsearch.action.admin.cluster.node.info.PluginsAndModules;
import org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDecider;
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.env.Environment;
@@ -125,4 +126,18 @@ static DeprecationIssue checkSharedDataPathSetting(final Settings settings, fina
}
return null;
}

static DeprecationIssue checkSingleDataNodeWatermarkSetting(final Settings settings, final PluginsAndModules pluginsAndModules) {
if (DiskThresholdDecider.ENABLE_FOR_SINGLE_DATA_NODE.exists(settings)) {
String key = DiskThresholdDecider.ENABLE_FOR_SINGLE_DATA_NODE.getKey();
return new DeprecationIssue(DeprecationIssue.Level.CRITICAL,
String.format(Locale.ROOT, "setting [%s] is deprecated and will not be available in a future version", key),
"https://www.elastic.co/guide/en/elasticsearch/reference/7.14/" +
"breaking-changes-7.14.html#deprecate-single-data-node-watermark",
String.format(Locale.ROOT, "found [%s] configured. Discontinue use of this setting.", key)
);
}

return null;
}
}
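Not part of the diff: a hedged, test-style sketch of exercising the new check. It assumes same-package access (the check method is package-private) and that DeprecationIssue lives in org.elasticsearch.xpack.core.deprecation; passing null for pluginsAndModules is safe here because the check does not consult it.

package org.elasticsearch.xpack.deprecation;

import org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDecider;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.xpack.core.deprecation.DeprecationIssue; // assumed package

public class SingleDataNodeCheckDemo {
    public static void main(String[] args) {
        Settings nodeSettings = Settings.builder()
            .put(DiskThresholdDecider.ENABLE_FOR_SINGLE_DATA_NODE.getKey(), true)
            .build();

        // The check fires whenever the setting is present, regardless of its value.
        DeprecationIssue issue =
            NodeDeprecationChecks.checkSingleDataNodeWatermarkSetting(nodeSettings, null);
        System.out.println(issue.getLevel());   // CRITICAL
        System.out.println(issue.getMessage()); // names the deprecated setting key
    }
}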