Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Register CcrRepository based on settings update #36086

Merged
merged 26 commits into from
Dec 4, 2018

Conversation

Tim-Brooks
Copy link
Contributor

@Tim-Brooks Tim-Brooks commented Nov 29, 2018

This commit adds an empty CcrRepository snapshot/restore repository.
When a new cluster is registered in the remote cluster settings, a new
CcrRepository is registered for that cluster.

This is implemented using a new concept of "internal repositories".
RepositoryPlugin now allows implementations to return factories for
"internal repositories". The "internal repositories" are different from
normal repositories in that they cannot be registered through the
external repository api. Additionally, "internal repositories" are local
to a node and are not stored in the cluster state.

The repository will be unregistered if the remote cluster is removed.

@Tim-Brooks Tim-Brooks added >non-issue v7.0.0 :Distributed Indexing/CCR Issues around the Cross Cluster State Replication features v6.6.0 labels Nov 29, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@Tim-Brooks
Copy link
Contributor Author

Tim-Brooks commented Nov 30, 2018

@bleskes - In response to your comment here:

#35801 (comment)

Since we are using the client for the propagation to RepositoriesService here, an integration test seemed most straightforward. But I added a TODO to roll that integration test in more encompassing IT as the bootstrap work expands.

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @tbrooks8. I've left some initial comments. I think we should also do more unit-level testing. Start e.g. with the newly added methods to RepositoriesService. They can easily be unit-tested. The repo manager might also allow some unit-tests.

import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.common.io.stream.Writeable;

public class DeleteInternalRepositoryAction extends Action<AcknowledgedResponse> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you move these actions to the CCR plugin?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Although I will note that the conception on "internal repositories" still exist in open source. even if the actions to manipulate them do not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want me to change the names from what they are such as:

"cluster:admin/internal_repository/put"

to something ccr oriented?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, same for the action names and requests. For now, these can be DeleteInternalCCRRepositoryAction, DeleteInternalCCRRepositoryRequest, ...

}

@Override
public AcknowledgedResponse newResponse() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why AcknowledgedResponse? You're not interested in checking the acknowledged flag, so maybe just an ActionResponse.


public class DeleteInternalRepositoryRequest extends ActionRequest {

private String name;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final

public ActionRequestValidationException validate() {
ActionRequestValidationException validationException = null;
if (name == null) {
validationException = addValidationError("name is missing", validationException);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe easier to check this right away on object creation, i.e., this.name = Objects.requireNonNull(name);


private String name;
private String type;
private Settings settings;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final^3

closeRepository(repository);
repository.close();
} else {
logger.warn(() -> new ParameterizedMessage("Attempted to unregistered internal repository [{}][{}]. " +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unregister

Repository existingRepository = internalRepositories.putIfAbsent(name, repository);

if (existingRepository != null) {
logger.error(new ParameterizedMessage("Error registering internal repository [{}][{}]. " +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

start logging messages with lower case (in style with the rest of the class / code)?

@@ -104,6 +110,8 @@
private final boolean enabled;
private final Settings settings;
private final CcrLicenseChecker ccrLicenseChecker;
private SetOnce<ClusterService> clusterService = new SetOnce<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not used anywhere?

@@ -104,6 +110,8 @@
private final boolean enabled;
private final Settings settings;
private final CcrLicenseChecker ccrLicenseChecker;
private SetOnce<ClusterService> clusterService = new SetOnce<>();
private SetOnce<CcrRepositoryManager> repositoryManager = new SetOnce<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final

class CcrRepositoryManager extends RemoteClusterAware {

private final NodeClient client;
private final Set<String> clusters = ConcurrentCollections.newConcurrentSet();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of caching this list here, can we directly ask RepositoriesService whether it already has a repo for this thing, and add / remove based on what RepositoriesService has?
Having these extra caches are always tricky, in particular when there are failure scenarios and the list of internal repositories in RepositoriesService might go out of sync with the cached clusters here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since put and delete should be idempotent (and they other for other normal repositories requests) I removed the cache and just call the action each time.

The other option we to create a get action, but I thought that was unnecessary as we need to handle potential concurrency in the put and delete methods anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the cache and just call the action each time.

sounds good

@Tim-Brooks Tim-Brooks requested a review from ywelsch November 30, 2018 23:06
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

undo

Map<String, Repository.Factory> newRepoTypes = repoPlugin.getInternalRepositories(env, namedXContentRegistry);
for (Map.Entry<String, Repository.Factory> entry : newRepoTypes.entrySet()) {
if (internalFactories.put(entry.getKey(), entry.getValue()) != null) {
throw new IllegalArgumentException("Internal repository type [" + entry.getKey() + "] is already registered");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we enforce that these types are distinct to the non-internal ones?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a test for this?

Repository newRepository = createRepository(metaData, internalTypesRegistry);
Repository repositoryToClose = null;
boolean updated = false;
synchronized (internalRepositories) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the concurrency logic looks complicated here. Maybe just add synchronized on the registerInternalRepository and unregisterInternalRepository methods? We will not be calling those concurrently anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the changes where we do not support updates, we can rely on the ConcurrentMap to provide concurrency control.

@@ -278,7 +278,7 @@ boolean isRemoteNodeConnected(final String remoteCluster, final DiscoveryNode no
public Map<String, OriginalIndices> groupIndices(IndicesOptions indicesOptions, String[] indices, Predicate<String> indexExists) {
Map<String, OriginalIndices> originalIndicesMap = new HashMap<>();
if (isCrossClusterSearchEnabled()) {
final Map<String, List<String>> groupedIndices = groupClusterIndices(indices, indexExists);
final Map<String, List<String>> groupedIndices = groupClusterIndices(remoteClusters.keySet(), indices, indexExists);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use getRemoteClusterNames() here?


@Override
public Writeable.Reader<ActionResponse> getResponseReader() {
return in -> new ActionResponse() {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

safer to use the ActionResponse(StreamInput in) constructor here. Maybe use add a dummy response sub-class here.

client.executeLocally(DeleteInternalRepositoryAction.INSTANCE, request, future);
assert future.isDone() : "Should be completed as it is executed synchronously";
} else {
ActionRequest request = new PutInternalRepositoryRequest(clusterAlias, CcrRepository.TYPE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of using the clusterAlias name verbatim here, let's prepend something like "ccr" to avoid name conflicts with standard repositories.

import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.common.io.stream.Writeable;

public class DeleteInternalRepositoryAction extends Action<AcknowledgedResponse> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, same for the action names and requests. For now, these can be DeleteInternalCCRRepositoryAction, DeleteInternalCCRRepositoryRequest, ...

this(name, type, Settings.EMPTY);
}

public PutInternalRepositoryRequest(String name, String type, Settings settings) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we rename this to PutInternalCCRRepositoryRequest, we can get rid of the settings parameter here. We don't use it for CCR.


// TODO: Normally we would do validation when we update a repository to ensure that it is not in use.
// Are we okay with not including that validation under the assumption that internal operations
// will do the right thing.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine not having this validation here. If we remove the settings parameter from this method, a repo will never be updated with different settings.

class CcrRepositoryManager extends RemoteClusterAware {

private final NodeClient client;
private final Set<String> clusters = ConcurrentCollections.newConcurrentSet();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the cache and just call the action each time.

sounds good

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left a few more comments. Looking great already.

@Tim-Brooks Tim-Brooks requested a review from ywelsch December 4, 2018 00:54
Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 smaller asks, looks good o.w.

Map<String, Repository.Factory> newRepoTypes = repoPlugin.getInternalRepositories(env, namedXContentRegistry);
for (Map.Entry<String, Repository.Factory> entry : newRepoTypes.entrySet()) {
if (internalFactories.put(entry.getKey(), entry.getValue()) != null) {
throw new IllegalArgumentException("Internal repository type [" + entry.getKey() + "] is already registered");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a test for this?

assertAcked(followerClient().admin().cluster().updateSettings(putFollowerRequest).actionGet());

String followerCopyRepoName = CcrRepository.NAME_PREFIX + "follower_cluster_copy";
assertBusy(() -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the assertBusy should not be needed here. The update settings call will only return if the corresponding cluster state has been updated on all nodes, and the repositories are created as part of that CS update.

@Tim-Brooks
Copy link
Contributor Author

run the docbldesx

@Tim-Brooks
Copy link
Contributor Author

run default distro tests

@Tim-Brooks
Copy link
Contributor Author

run the docbldesx

@Tim-Brooks Tim-Brooks merged commit 8bde608 into elastic:master Dec 4, 2018
Tim-Brooks added a commit to Tim-Brooks/elasticsearch that referenced this pull request Dec 5, 2018
This is a follow-up to elastic#36086. It renames the internal repository
actions to be prefixed by "internal". This allows the system user to
execute the actions.

Additionally, this PR stops casting Client to NodeClient. The client we
have is a NodeClient so executing the actions will be local.
Tim-Brooks added a commit that referenced this pull request Dec 5, 2018
This is a follow-up to #36086. It renames the internal repository
actions to be prefixed by "internal". This allows the system user to
execute the actions.

Additionally, this PR stops casting Client to NodeClient. The client we
have is a NodeClient so executing the actions will be local.
Tim-Brooks added a commit to Tim-Brooks/elasticsearch that referenced this pull request Dec 5, 2018
This commit adds an empty CcrRepository snapshot/restore repository.
When a new cluster is registered in the remote cluster settings, a new
CcrRepository is registered for that cluster.

This is implemented using a new concept of "internal repositories".
RepositoryPlugin now allows implementations to return factories for
"internal repositories". The "internal repositories" are different from
normal repositories in that they cannot be registered through the
external repository api. Additionally, "internal repositories" are local
to a node and are not stored in the cluster state.

The repository will be unregistered if the remote cluster is removed.
Tim-Brooks added a commit that referenced this pull request Dec 7, 2018
This commit adds an empty CcrRepository snapshot/restore repository.
When a new cluster is registered in the remote cluster settings, a new
CcrRepository is registered for that cluster.

This is implemented using a new concept of "internal repositories".
RepositoryPlugin now allows implementations to return factories for
"internal repositories". The "internal repositories" are different from
normal repositories in that they cannot be registered through the
external repository api. Additionally, "internal repositories" are local
to a node and are not stored in the cluster state.

The repository will be unregistered if the remote cluster is removed.
Tim-Brooks added a commit to Tim-Brooks/elasticsearch that referenced this pull request Dec 7, 2018
This is a follow-up to elastic#36086. It renames the internal repository
actions to be prefixed by "internal". This allows the system user to
execute the actions.

Additionally, this PR stops casting Client to NodeClient. The client we
have is a NodeClient so executing the actions will be local.
Tim-Brooks added a commit that referenced this pull request Dec 7, 2018
This is a follow-up to #36086. It renames the internal repository
actions to be prefixed by "internal". This allows the system user to
execute the actions.
@Tim-Brooks Tim-Brooks deleted the add_ccr_repo_internal branch December 18, 2019 14:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/CCR Issues around the Cross Cluster State Replication features >non-issue v6.6.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants