Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not restore autoscaling policy from snapshots #66023

Merged
merged 9 commits into from
Dec 11, 2020
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -66,6 +66,7 @@
import org.elasticsearch.common.lucene.Lucene;
import org.elasticsearch.common.regex.Regex;
import org.elasticsearch.common.settings.ClusterSettings;
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.index.Index;
import org.elasticsearch.index.IndexSettings;
Expand Down Expand Up @@ -161,6 +162,8 @@ public class RestoreService implements ClusterStateApplier {

private final ClusterSettings clusterSettings;

private final boolean operatorPrivilegesEnabled;

private static final CleanRestoreStateTaskExecutor cleanRestoreStateTaskExecutor = new CleanRestoreStateTaskExecutor();

public RestoreService(ClusterService clusterService, RepositoriesService repositoriesService,
Expand All @@ -177,6 +180,8 @@ public RestoreService(ClusterService clusterService, RepositoriesService reposit
}
this.clusterSettings = clusterService.getClusterSettings();
this.shardLimitValidator = shardLimitValidator;
final Setting<?> operatorPrivilegesSetting = clusterSettings.get("xpack.security.operator_privileges.enabled");
operatorPrivilegesEnabled = operatorPrivilegesSetting != null && (Boolean.TRUE == clusterSettings.get(operatorPrivilegesSetting));
}

/**
Expand Down Expand Up @@ -454,7 +459,8 @@ restoreUUID, snapshot, overallState(RestoreInProgress.State.INIT, shards),
if (metadata.customs() != null) {
for (ObjectObjectCursor<String, Metadata.Custom> cursor : metadata.customs()) {
if (RepositoriesMetadata.TYPE.equals(cursor.key) == false
&& DataStreamMetadata.TYPE.equals(cursor.key) == false) {
&& DataStreamMetadata.TYPE.equals(cursor.key) == false
&& (operatorPrivilegesEnabled == false || "autoscaling".equals(cursor.key) == false)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should just skip the operatorPrivilegesEnabled check. It is fine to not restore autoscaling policies for now and we can revisit once we have the proper framework around operator privileges and restore in general.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing the check makes both the logic and code dependency simpler. Another practical advantage is that Cloud probably will not enable opereator privileges straight away after 7.11 release. Therefore removing this check makes sure the restore behavioiur of autoscaling policy stay the same before and after operator privileges get enabled in Cloud. Having a consistent behaviour is one of the motivations for this work to be included in 7.11. So it seems to make sense. But it would be different than what we have agreed during the meeting. @colings86 Could you please verify whether removing this check is desirable?

Copy link
Contributor

@henningandersen henningandersen Dec 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it would be different than what we have agreed during the meeting

I believe we discussed that approach at the meeting and I came out of the meeting with the impression that this was the direction we would take (to just disable restore for autoscaling).

Copy link
Contributor

@colings86 colings86 Dec 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with changing this so we never restore autoscaling policies regardless of whether operator privileges are enabled. At least for this first version

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks both. I updated the code to drop the operator privileges check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than switch on the "autoscaling" key I would prefer to add a marker interface "NotRestorableCustomMetadata", implemented by AutoscalingMetadata and use instanceof to omit it instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. Thanks!

// Don't restore repositories while we are working with them
// TODO: Should we restore them at the end?
// Also, don't restore data streams here, we already added them to the metadata builder above
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License;
* you may not use this file except in compliance with the Elastic License.
*/

package org.elasticsearch.xpack.autoscaling;

import org.elasticsearch.action.admin.cluster.snapshots.create.CreateSnapshotResponse;
import org.elasticsearch.action.admin.cluster.snapshots.restore.RestoreSnapshotResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.rest.RestStatus;
import org.elasticsearch.xpack.autoscaling.action.DeleteAutoscalingPolicyAction;
import org.elasticsearch.xpack.autoscaling.action.GetAutoscalingPolicyAction;
import org.elasticsearch.xpack.autoscaling.action.PutAutoscalingPolicyAction;
import org.elasticsearch.xpack.autoscaling.policy.AutoscalingPolicy;
import org.junit.Before;

import java.nio.file.Path;

import static org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertAcked;
import static org.elasticsearch.xpack.autoscaling.AutoscalingTestCase.randomAutoscalingPolicy;
import static org.hamcrest.Matchers.equalTo;

public class AutoscalingSnapshotsIT extends AutoscalingIntegTestCase {

public static final String REPO = "repo";
public static final String SNAPSHOT = "snap";

@Before
public void setup() throws Exception {
Path location = randomRepoPath();
logger.info("--> creating repository [{}] [{}]", REPO, "fs");
assertAcked(clusterAdmin().preparePutRepository(REPO).setType("fs").setSettings(Settings.builder().put("location", location)));
}

public void testAutoscalingPolicyWillBeRestored() {
final Client client = client();

final AutoscalingPolicy policy = randomAutoscalingPolicy();
final PutAutoscalingPolicyAction.Request request = new PutAutoscalingPolicyAction.Request(
policy.name(),
policy.roles(),
policy.deciders()
);
assertAcked(client.execute(PutAutoscalingPolicyAction.INSTANCE, request).actionGet());

CreateSnapshotResponse createSnapshotResponse = client.admin()
.cluster()
.prepareCreateSnapshot(REPO, SNAPSHOT)
.setWaitForCompletion(true)
.setIncludeGlobalState(true)
.get();

RestStatus status = createSnapshotResponse.getSnapshotInfo().status();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why not just:

RestStatus status = createSnapshotResponse.status();

assertEquals(RestStatus.OK, status);

final DeleteAutoscalingPolicyAction.Request deleteRequest = new DeleteAutoscalingPolicyAction.Request(policy.name());
assertAcked(client().execute(DeleteAutoscalingPolicyAction.INSTANCE, deleteRequest).actionGet());

RestoreSnapshotResponse restoreSnapshotResponse = client.admin()
.cluster()
.prepareRestoreSnapshot(REPO, SNAPSHOT)
.setWaitForCompletion(true)
.setRestoreGlobalState(true)
.get();
assertEquals(RestStatus.OK, restoreSnapshotResponse.status());

final GetAutoscalingPolicyAction.Request getRequest = new GetAutoscalingPolicyAction.Request(policy.name());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should extend the test to also have the case where a policy does exist when restoring to check that it survives the restore. To protect us from "restore-global-state" clearing everything during restore. Could be just randomly doing that or the delete case.

Copy link
Member Author

@ywangd ywangd Dec 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I have expanded the test to:

  • Randomly deleting or mutating the existing policy after snapshot is taken
  • Add another policy after snapshot is taken

Then assert things stay the same after restore.

final AutoscalingPolicy restoredPolicy = client().execute(GetAutoscalingPolicyAction.INSTANCE, getRequest).actionGet().policy();
assertThat(policy, equalTo(restoredPolicy));
}

}
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,13 @@ dependencies {
javaRestTestImplementation project.sourceSets.main.runtimeClasspath
}

File repoDir = file("$buildDir/testclusters/repo")

javaRestTest {
/* To support taking snapshots, we have to set path.repo setting */
systemProperty 'tests.path.repo', repoDir
}

testClusters.all {
testDistribution = 'DEFAULT'
numberOfNodes = 3
Expand All @@ -27,6 +34,7 @@ testClusters.all {
setting 'xpack.security.enabled', 'true'
setting 'xpack.security.http.ssl.enabled', 'false'
setting 'xpack.security.operator_privileges.enabled', "true"
setting 'path.repo', repoDir.absolutePath

user username: "test_admin", password: 'x-pack-test-password', role: "superuser"
user username: "test_operator", password: 'x-pack-test-password', role: "limited_operator"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -8,10 +8,12 @@
import org.elasticsearch.client.Request;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.ResponseException;
import org.elasticsearch.common.Strings;
import org.elasticsearch.common.settings.SecureString;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.util.concurrent.ThreadContext;
import org.elasticsearch.common.util.set.Sets;
import org.elasticsearch.common.xcontent.json.JsonXContent;
import org.elasticsearch.test.rest.ESRestTestCase;

import java.io.IOException;
Expand All @@ -28,6 +30,9 @@

public class OperatorPrivilegesIT extends ESRestTestCase {

private static final String OPERATOR_AUTH_HEADER = "Basic "
+ Base64.getEncoder().encodeToString("test_operator:x-pack-test-password".getBytes(StandardCharsets.UTF_8));

@Override
protected Settings restClientSettings() {
String token = basicAuthHeaderValue("test_admin", new SecureString("x-pack-test-password".toCharArray()));
Expand All @@ -46,17 +51,13 @@ public void testNonOperatorSuperuserWillFailToCallOperatorOnlyApiWhenOperatorPri

public void testOperatorUserWillSucceedToCallOperatorOnlyApi() throws IOException {
final Request postVotingConfigExclusionsRequest = new Request("POST", "_cluster/voting_config_exclusions?node_names=foo");
final String authHeader = "Basic "
+ Base64.getEncoder().encodeToString("test_operator:x-pack-test-password".getBytes(StandardCharsets.UTF_8));
postVotingConfigExclusionsRequest.setOptions(RequestOptions.DEFAULT.toBuilder().addHeader("Authorization", authHeader));
postVotingConfigExclusionsRequest.setOptions(RequestOptions.DEFAULT.toBuilder().addHeader("Authorization", OPERATOR_AUTH_HEADER));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also looks unrelated to at least the PR description, perhaps move to separate PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed. See also above.

client().performRequest(postVotingConfigExclusionsRequest);
}

public void testOperatorUserWillFailToCallOperatorOnlyApiIfRbacFails() throws IOException {
final Request deleteVotingConfigExclusionsRequest = new Request("DELETE", "_cluster/voting_config_exclusions");
final String authHeader = "Basic "
+ Base64.getEncoder().encodeToString("test_operator:x-pack-test-password".getBytes(StandardCharsets.UTF_8));
deleteVotingConfigExclusionsRequest.setOptions(RequestOptions.DEFAULT.toBuilder().addHeader("Authorization", authHeader));
deleteVotingConfigExclusionsRequest.setOptions(RequestOptions.DEFAULT.toBuilder().addHeader("Authorization", OPERATOR_AUTH_HEADER));
final ResponseException responseException = expectThrows(
ResponseException.class,
() -> client().performRequest(deleteVotingConfigExclusionsRequest)
Expand All @@ -67,9 +68,7 @@ public void testOperatorUserWillFailToCallOperatorOnlyApiIfRbacFails() throws IO

public void testOperatorUserCanCallNonOperatorOnlyApi() throws IOException {
final Request mainRequest = new Request("GET", "/");
final String authHeader = "Basic "
+ Base64.getEncoder().encodeToString("test_operator:x-pack-test-password".getBytes(StandardCharsets.UTF_8));
mainRequest.setOptions(RequestOptions.DEFAULT.toBuilder().addHeader("Authorization", authHeader));
mainRequest.setOptions(RequestOptions.DEFAULT.toBuilder().addHeader("Authorization", OPERATOR_AUTH_HEADER));
client().performRequest(mainRequest);
}

Expand Down Expand Up @@ -100,4 +99,85 @@ public void testOperatorPrivilegesXpackUsage() throws IOException {
assertTrue((boolean) operatorPrivileges.get("available"));
assertTrue((boolean) operatorPrivileges.get("enabled"));
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just remove this and the associated build file changes etc. from this PR? Once the real framework for avoiding to restore autoscaling policies on operator privilege enabled setups goes in, this might look different anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed and all unrelated changes removed.

A bit more context: I thought about deleting these changes in last commit. But I got a bit paranoid and was trying to assert that operator privileges do not interfere with the restore behaviour. Looking back now, this thinking was an artefact of how this PR evolved, which started with the assumption that operator privileges would impact the behaviour. If it had started purely from the autoscaling side, I don't think I would add anything related to operator privileges. In short, it's the right move to delete these unrelated code.

public void testAutoscalingPolicyWillNotBeRestored() throws IOException {
final String repoName = "repo";
final String snapshotName = "snap";
final String policyName = "policy";
createSnapshotRepo(repoName);
createAutoscalingPolicy(policyName);
takeSnapshot(repoName, snapshotName);
deleteAutoscalingPolicy(policyName);
restoreSnapshot(repoName, snapshotName);

final Request getPolicyRequest = new Request("GET", "/_autoscaling/policy/" + policyName);
getPolicyRequest.setOptions(RequestOptions.DEFAULT.toBuilder().addHeader("Authorization", OPERATOR_AUTH_HEADER));
final ResponseException e = expectThrows(ResponseException.class, () -> client().performRequest(getPolicyRequest));
assertThat(e.getMessage(), containsString("autoscaling policy with name [" + policyName + "] does not exist"));
}

private void createSnapshotRepo(String repoName) throws IOException {
Request request = new Request("PUT", "/_snapshot/" + repoName);
request.setJsonEntity(
Strings.toString(
JsonXContent.contentBuilder()
.startObject()
.field("type", "fs")
.startObject("settings")
.field("location", System.getProperty("tests.path.repo"))
.endObject()
.endObject()
)
);
assertOK(client().performRequest(request));
}

private void createAutoscalingPolicy(String policyName) throws IOException {
final Request request = new Request("PUT", "/_autoscaling/policy/" + policyName);
request.setOptions(RequestOptions.DEFAULT.toBuilder().addHeader("Authorization", OPERATOR_AUTH_HEADER));
request.setJsonEntity(
Strings.toString(
JsonXContent.contentBuilder()
.startObject()
.array("roles", "master")
.startObject("deciders")
.startObject("fixed")
.endObject()
.endObject()
.endObject()
)
);
assertOK(client().performRequest(request));
}

private void takeSnapshot(String repoName, String snapshotName) throws IOException {
final Request request = new Request("POST", "/_snapshot/" + repoName + "/" + snapshotName);
if (randomBoolean()) {
request.setOptions(RequestOptions.DEFAULT.toBuilder().addHeader("Authorization", OPERATOR_AUTH_HEADER));
}
request.addParameter("wait_for_completion", "true");
request.setJsonEntity(
Strings.toString(JsonXContent.contentBuilder().startObject().field("include_global_state", true).endObject())
);
assertOK(client().performRequest(request));
}

private void deleteAutoscalingPolicy(String policyName) throws IOException {
final Request request = new Request("DELETE", "/_autoscaling/policy/" + policyName);
request.setOptions(RequestOptions.DEFAULT.toBuilder().addHeader("Authorization", OPERATOR_AUTH_HEADER));
assertOK(client().performRequest(request));
}

private void restoreSnapshot(String repoName, String snapshotName) throws IOException {
final Request request = new Request("POST", "/_snapshot/" + repoName + "/" + snapshotName + "/_restore");
if (randomBoolean()) {
request.setOptions(RequestOptions.DEFAULT.toBuilder().addHeader("Authorization", OPERATOR_AUTH_HEADER));
}
request.addParameter("wait_for_completion", "true");
request.setJsonEntity(
Strings.toString(JsonXContent.contentBuilder().startObject().field("include_global_state", true).endObject())
);
assertOK(client().performRequest(request));
}

}
Original file line number Diff line number Diff line change
Expand Up @@ -2,3 +2,5 @@ limited_operator:
cluster:
- "cluster:admin/voting_config/add_exclusions"
- "monitor"
- "manage_autoscaling"
- "cluster:admin/snapshot/*"