Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ILM migrate data between tiers #61377

Merged
merged 42 commits into from
Sep 17, 2020
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
Show all changes
42 commits
Select commit Hold shift + click to select a range
f4a37a9
Enable the _tier attribute as a node filter
andreidan Aug 20, 2020
024a0de
ILM: inject a migrate step to migrate data between data tiers
andreidan Aug 20, 2020
ca9800f
Merge branch 'master' into ilm-migrate-data-between-tiers
andreidan Sep 1, 2020
07dc8de
Add a DataTierMigrationRoutedStep and cleanup
andreidan Sep 3, 2020
42900b2
Fix tests
andreidan Sep 4, 2020
5d6729b
Fix test
andreidan Sep 4, 2020
28ccee9
Remove invalid tier name setting validation from ILM step
andreidan Sep 4, 2020
45d2eb3
Add test for DataTierMigrationRoutedStep
andreidan Sep 4, 2020
6c05591
Disable data tier migration in the full policy we use for timeseries ITS
andreidan Sep 4, 2020
783061d
Remove unused import
andreidan Sep 4, 2020
ab3ac33
Revert "Disable data tier migration in the full policy we use for tim…
andreidan Sep 6, 2020
6b74d7c
Add MigrateActionTest
andreidan Sep 6, 2020
61ed9eb
Merge branch 'master' into ilm-migrate-data-between-tiers
elasticmachine Sep 6, 2020
6f83f91
Don't randomly pick a disabled migrate action as it has no steps
andreidan Sep 7, 2020
4bef9b2
Add ILM tier migratino IT
andreidan Sep 7, 2020
5024595
Use 0 replicas in docs test as `check-migration` waits for replicas t…
andreidan Sep 7, 2020
41dc480
Fix ILMHistoryTests
andreidan Sep 7, 2020
d9b6390
Merge branch 'master' into ilm-migrate-data-between-tiers
elasticmachine Sep 7, 2020
8e9a4f3
Disable migrate action in test that waits for shard allocations
andreidan Sep 7, 2020
996c747
Add ILM client MigrateAction
andreidan Sep 7, 2020
83c31b6
Remove unused import
andreidan Sep 7, 2020
c1746d4
Fix HLRC tests
andreidan Sep 7, 2020
fe1fb8f
Add tests for conflicting auto allocate to hot nodes and MigrateActio…
andreidan Sep 7, 2020
e435a2f
Use valid tier name in test
andreidan Sep 7, 2020
656afe5
Implement hashcode
andreidan Sep 8, 2020
a24b0fc
Merge branch 'master' into ilm-migrate-data-between-tiers
andreidan Sep 8, 2020
8d806e2
Make method static
andreidan Sep 8, 2020
2133715
Check for "data" role in `check-migration` step
andreidan Sep 10, 2020
991fdc0
Document AllocationInfo and rename `actual_replicas` to `number_of_re…
andreidan Sep 10, 2020
be65929
Disallow the migrate action in the hot phase
andreidan Sep 10, 2020
39b518e
Remove rogue ;
andreidan Sep 10, 2020
f3a64d3
Fix HLRC tests
andreidan Sep 10, 2020
d119797
Merge branch 'master' into ilm-migrate-data-between-tiers
elasticmachine Sep 10, 2020
8f855c5
Fix ILMHistoryTests
andreidan Sep 10, 2020
0636d0e
Merge branch 'master' into ilm-migrate-data-between-tiers
elasticmachine Sep 14, 2020
500a275
Merge branch 'master' into ilm-migrate-data-between-tiers
elasticmachine Sep 16, 2020
9b90555
Remove frozen phase from HLRC
andreidan Sep 16, 2020
eddd223
Execute the allocate action before migrate
andreidan Sep 16, 2020
d733d72
Adjust test to wait until ILM completes in the cold phase
andreidan Sep 16, 2020
4da7379
Log and exception messages
andreidan Sep 16, 2020
be17159
Merge branch 'master' into ilm-migrate-data-between-tiers
andreidan Sep 16, 2020
b837718
Drop equals and hashcode (in favour of the ones in Step)
andreidan Sep 17, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -181,6 +181,9 @@ public boolean match(DiscoveryNode node) {
}
}
}
} else if ("_tier".equals(attr)) {
// Always allow _tier as an attribute, will be handled elsewhere
return true;
} else {
String nodeAttributeValue = node.getAttributes().get(attr);
if (nodeAttributeValue == null) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -53,6 +53,7 @@
import org.elasticsearch.xpack.core.ilm.IndexLifecycleMetadata;
import org.elasticsearch.xpack.core.ilm.LifecycleAction;
import org.elasticsearch.xpack.core.ilm.LifecycleType;
import org.elasticsearch.xpack.core.ilm.MigrateAction;
import org.elasticsearch.xpack.core.ilm.ReadOnlyAction;
import org.elasticsearch.xpack.core.ilm.RolloverAction;
import org.elasticsearch.xpack.core.ilm.SearchableSnapshotAction;
Expand Down Expand Up @@ -484,6 +485,7 @@ public List<NamedWriteableRegistry.Entry> getNamedWriteables() {
new NamedWriteableRegistry.Entry(LifecycleAction.class, UnfollowAction.NAME, UnfollowAction::new),
new NamedWriteableRegistry.Entry(LifecycleAction.class, WaitForSnapshotAction.NAME, WaitForSnapshotAction::new),
new NamedWriteableRegistry.Entry(LifecycleAction.class, SearchableSnapshotAction.NAME, SearchableSnapshotAction::new),
new NamedWriteableRegistry.Entry(LifecycleAction.class, MigrateAction.NAME, MigrateAction::new),
// Transforms
new NamedWriteableRegistry.Entry(XPackFeatureSet.Usage.class, XPackField.TRANSFORM, TransformFeatureSetUsage::new),
new NamedWriteableRegistry.Entry(PersistentTaskParams.class, TransformField.TASK_NAME, TransformTaskParams::new),
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,6 @@
package org.elasticsearch.xpack.core.ilm;

import com.carrotsearch.hppc.cursors.ObjectCursor;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.elasticsearch.action.support.ActiveShardCount;
Expand All @@ -22,15 +21,19 @@
import org.elasticsearch.common.Strings;
import org.elasticsearch.common.collect.ImmutableOpenIntMap;
import org.elasticsearch.common.settings.ClusterSettings;
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.xcontent.ConstructingObjectParser;
import org.elasticsearch.common.xcontent.ToXContentObject;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.index.Index;
import org.elasticsearch.xpack.cluster.routing.allocation.DataTierAllocationDecider;

import java.io.IOException;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

/**
* Checks whether all shards have been correctly routed in response to an update to the allocation rules for an index.
Expand All @@ -40,8 +43,23 @@ public class AllocationRoutedStep extends ClusterStateWaitStep {

private static final Logger logger = LogManager.getLogger(AllocationRoutedStep.class);

private static final AllocationDeciders ALLOCATION_DECIDERS = new AllocationDeciders(Collections.singletonList(
new FilterAllocationDecider(Settings.EMPTY, new ClusterSettings(Settings.EMPTY, ClusterSettings.BUILT_IN_CLUSTER_SETTINGS))));
private static final Set<Setting<?>> ALL_CLUSTER_SETTINGS;

static {
Set<Setting<?>> allSettings = new HashSet<>(ClusterSettings.BUILT_IN_CLUSTER_SETTINGS);
allSettings.add(DataTierAllocationDecider.CLUSTER_ROUTING_REQUIRE_SETTING);
allSettings.add(DataTierAllocationDecider.CLUSTER_ROUTING_INCLUDE_SETTING);
allSettings.add(DataTierAllocationDecider.CLUSTER_ROUTING_EXCLUDE_SETTING);
ALL_CLUSTER_SETTINGS = allSettings;
}

private static final AllocationDeciders ALLOCATION_DECIDERS = new AllocationDeciders(
List.of(
new FilterAllocationDecider(Settings.EMPTY, new ClusterSettings(Settings.EMPTY,
ClusterSettings.BUILT_IN_CLUSTER_SETTINGS)),
new DataTierAllocationDecider(new ClusterSettings(Settings.EMPTY, ALL_CLUSTER_SETTINGS))
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we make this configurable and serializable for the step? (ie. use just the one allocation decider depending on where it is used)
I'd say we want a fully allocated index so verifying both always is alright. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I actually think that we may want to split checking the allocation for the migrate action into a separate step. For example, the allocation routed step currently has a pretty generic message (Waiting for [n] shards to be allocated to nodes matching the given filters). I think if we split this into a new MigrationRouted step we could give it a much better explanation, for example, something like:

waiting [23m] for [3] shards to be allocated on nodes with the [data_warm] tier

additionally, I think we could also even throw an error to signal to the user when things were in a bad state, something like:

exception waiting for index to be moved to the [data_cold] tier, there are currently no [data_cold] nodes in the cluster

Then the step could be retryable (so we check every 10 minutes) and it at least gives us a way of signaling to a user (alerting on the ilm-history index for example) when they are in an irreconcilable position and need to adjust their cluster.

What do you think?

Copy link
Contributor Author

@andreidan andreidan Aug 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You make a great point on validating if the cluster has any node with a particular role available. I'll create another step for the migrate action (the nicer messages will be a great UX improvement as well)

);

AllocationRoutedStep(StepKey key, StepKey nextStepKey) {
super(key, nextStepKey);
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,112 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License;
* you may not use this file except in compliance with the Elastic License.
*/
package org.elasticsearch.xpack.core.ilm;

import org.elasticsearch.client.Client;
import org.elasticsearch.common.ParseField;
import org.elasticsearch.common.Strings;
import org.elasticsearch.common.io.stream.StreamInput;
import org.elasticsearch.common.io.stream.StreamOutput;
import org.elasticsearch.common.xcontent.ConstructingObjectParser;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.XContentParser;
import org.elasticsearch.xpack.core.ilm.Step.StepKey;

import java.io.IOException;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/**
* A {@link LifecycleAction} which enables or disables the automatic migration of data between
* {@link org.elasticsearch.xpack.core.DataTier}s.
*/
public class MigrateAction implements LifecycleAction {
public static final String NAME = "migrate";
public static final ParseField ENABLED_FIELD = new ParseField("enabled");

private static final ConstructingObjectParser<MigrateAction, Void> PARSER = new ConstructingObjectParser<>(NAME,
a -> new MigrateAction((boolean) a[0]));

static {
PARSER.declareBoolean(ConstructingObjectParser.constructorArg(), ENABLED_FIELD);
andreidan marked this conversation as resolved.
Show resolved Hide resolved
}

private final boolean enabled;
andreidan marked this conversation as resolved.
Show resolved Hide resolved

public static MigrateAction parse(XContentParser parser) {
return PARSER.apply(parser, null);
}

public MigrateAction(boolean enabled) {
this.enabled = enabled;
}

public MigrateAction(StreamInput in) throws IOException {
this(in.readBoolean());
}

@Override
public void writeTo(StreamOutput out) throws IOException {
out.writeBoolean(enabled);
}

@Override
public String getWriteableName() {
return NAME;
}

public boolean isEnabled() {
return enabled;
}

@Override
public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
builder.startObject();
builder.field(ENABLED_FIELD.getPreferredName(), enabled);
builder.endObject();
return builder;
}

@Override
public boolean isSafeAction() {
return true;
}

@Override
public List<Step> toSteps(Client client, String phase, StepKey nextStepKey) {
if (enabled) {
Map<String, String> include = Map.of("_tier", "data_" + phase);
andreidan marked this conversation as resolved.
Show resolved Hide resolved
AllocateAction migrateDataAction = new AllocateAction(null, include, null, null);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we remove the possible_require and _exclude settings (which might've been set manually before) to make sure they don't invalidate the "migrate to the next data tier" goal?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question, I'm not sure we'd want to piecemeal remove all of them, because they could be something for an additional attribute that we want to preserve between phases.

I will have to think on it a bit, it also makes me wonder whether we should make it configurable whether all other index-level allocation filtering settings are removed/preserved when doing the migration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an interesting one - we do not allow to have the migrate action enabled in a phase that has the allocate action (configuring allocation), so maybe it makes sense for the migrate action to invalidate the possible allocation filterings the index might have (for eg. from a previous phase where the migrate action was disabled and the allocation action was configured)

return migrateDataAction.toSteps(client, phase, nextStepKey);
} else {
return List.of();
}
}

@Override
public int hashCode() {
return Objects.hash(enabled);
}

@Override
public boolean equals(Object obj) {
if (obj == null) {
return false;
}
if (obj.getClass() != getClass()) {
return false;
}
MigrateAction other = (MigrateAction) obj;
return Objects.equals(enabled, other.enabled);
}

@Override
public String toString() {
return Strings.toString(this);
}

}
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,7 @@
import java.util.Objects;

/**
* A {@link LifecycleAction} which deletes the index.
* A {@link LifecycleAction} which rolls over the index.
andreidan marked this conversation as resolved.
Show resolved Hide resolved
*/
public class RolloverAction implements LifecycleAction {
public static final String NAME = "rollover";
Expand Down
Loading