Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow excluding a cluster from a CCS search using multi-target syntax #97865

Merged
merged 5 commits into from
Jul 31, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions docs/reference/api-conventions.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -195,6 +195,13 @@ request that targets an excluded alias. For example, if `test3` is an index
alias, the pattern `test*,-test3` still targets the indices for `test3`. To
avoid this, exclude the concrete indices for the alias instead.

You can also exclude clusters from a list of clusters to search using the `-` character:
`remote*:*,-remote1:*,-remote4:*` will search all clusters with an alias that starts
with "remote" except for "remote1" and "remote4". Note that to exclude a cluster
with this notation you must exclude all of its indexes. Excluding a subset of indexes
on a remote cluster is currently not supported. For example, this will throw an exception:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excluding a subset of indexes is supported but not quite this way? You can do remote1:-index1?

Do we reject remote*:abc,-remote1:abc? I am wondering what makes more sense, and also whether this would be perceived as a limitation from Kibana. I guess more of a convention: "if you want to exclude the cluster, use wildcard for indices".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can do remote1:-index1?

Hmm, I haven't tried that. Good question. I'll look into it.

Do we reject remote*:abc,-remote1:abc? I am wondering what makes more sense

Yes, the current code rejects that.

I decided not to try to tackle that because it is a slippery slope. At this point in the code where it is handled, we don't have any idea what indexes exist on remote clusters. My goal here was to just exclude entire clusters. For your example, it would be fairly straightforward (though more work) to determine that it is a full cluster exclusion. But then a user might expect they can do remote*:abc*,-remote1:abc too. But to truly handle that, we'd have to change the sub-search section of TransportSearchAction to send a different request to remote1 then we send to all the other clusters. And the CCS code pathways are already complicated enough that I didn't think that was worth it.

I guess more of a convention: "if you want to exclude the cluster, use wildcard for indices".

Right, my view is just to have a rule "you can exclude the entire cluster" here by using * in the index position.
So we could allow remote*:abc,-remote1:abc which is a negation of concrete indexes, but then we have to explain to users why any wildcard other than just remote1:* is not allowed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your thinking is pragmatic, let's keep it this way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can do remote1:-index1?

Thanks for pointing that out. I tested it and it does work in the same way the it does for local indexes. Which has an annoying quirk that you have to use a wildcard in the "inclusion" list to exclude a concrete index.

Example:

This works to exclude blogs:

POST web_traffic,b*,-blogs/_async_search

but this does not:

POST web_traffic,blogs,-blogs/_async_search

// fails with
  "error": {
    "type": "status_exception",
    "reason": "error while executing search",
    "caused_by": {
      "type": "index_not_found_exception",
      "reason": "no such index [-blogs] and if you intended to exclude this index, ensure that you use wildcards that include it before explicitly excluding it",
      "resource.type": "index_or_alias",
      "resource.id": "-blogs",
      "index_uuid": "_na_",
      "index": "-blogs"
    }
  }

It used to just say "no such index [-blogs]" which is very misleading. I recently added the extra note "if you intended to exclude this index, ensure that you use wildcards that include it before explicitly excluding it".

So similarly, for remote clusters, this works:

POST remote*:blog*,remote1:-blogs/_async_search

but this fails with the original misleading error message:

POST remote*:blogs,remote1:-blogs/_async_search

  "error": {
    "type": "status_exception",
    "reason": "error while executing search",
    "caused_by": {
      "type": "index_not_found_exception",
      "reason": "no such index [-blogs]",
      "index_uuid": "_na_",
      "resource.type": "index_or_alias",
      "resource.id": "-blogs",
      "index": "-blogs"
    }
  }

so I'll file a bug to improve that error message.

`remote*:*,-remote1:logs*`.

Multi-target APIs that can target indices support the following query
string parameters:

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -13,10 +13,12 @@
import org.elasticsearch.common.settings.ClusterSettings;
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.core.Strings;
import org.elasticsearch.node.Node;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
Expand All @@ -28,7 +30,6 @@
* Base class for all services and components that need up-to-date information about the registered remote clusters
*/
public abstract class RemoteClusterAware {

public static final char REMOTE_CLUSTER_INDEX_SEPARATOR = ':';
public static final String LOCAL_CLUSTER_GROUP_KEY = "";

Expand Down Expand Up @@ -58,30 +59,77 @@ protected static Set<String> getEnabledRemoteClusters(final Settings settings) {
* indices per cluster are collected as a list in the returned map keyed by the cluster alias. Local indices are grouped under
* {@link #LOCAL_CLUSTER_GROUP_KEY}. The returned map is mutable.
*
* @param remoteClusterNames the remote cluster names
* This method supports excluding clusters by using the {@code -cluster:*} index expression.
* For example, if requestIndices is [blogs, *:blogs, -remote1:*] and *:blogs resolves to "remote1:blogs, remote2:blogs"
* the map returned by the function will not have the remote1 entry. It will have only {"":blogs, remote2:blogs}.
* The index for the excluded cluster must be '*' to clarify that the entire cluster should be removed.
* A wildcard in the "-" excludes notation is also allowed. For example, suppose there are three remote clusters,
* remote1, remote2, remote3, and this index expression is provided: blogs,rem*:blogs,-rem*1:*. That would successfully
* remove remote1 from the list of clusters to be included.
*
* @param remoteClusterNames the remote cluster names. If a clusterAlias is preceded by a minus sign that cluster will be excluded.
* @param requestIndices the indices in the search request to filter
*
* @return a map of grouped remote and local indices
*/
protected Map<String, List<String>> groupClusterIndices(Set<String> remoteClusterNames, String[] requestIndices) {
Map<String, List<String>> perClusterIndices = new HashMap<>();
Set<String> clustersToRemove = new HashSet<>();
for (String index : requestIndices) {
int i = index.indexOf(RemoteClusterService.REMOTE_CLUSTER_INDEX_SEPARATOR);
if (i >= 0) {
if (isRemoteClusterClientEnabled == false) {
assert remoteClusterNames.isEmpty() : remoteClusterNames;
throw new IllegalArgumentException("node [" + nodeName + "] does not have the remote cluster client role enabled");
}
String remoteClusterName = index.substring(0, i);
int startIdx = index.charAt(0) == '-' ? 1 : 0;
String remoteClusterName = index.substring(startIdx, i);
List<String> clusters = ClusterNameExpressionResolver.resolveClusterNames(remoteClusterNames, remoteClusterName);
String indexName = index.substring(i + 1);
for (String clusterName : clusters) {
perClusterIndices.computeIfAbsent(clusterName, k -> new ArrayList<>()).add(indexName);
if (startIdx == 1) {
if (indexName.equals("*") == false) {
throw new IllegalArgumentException(
Strings.format(
"To exclude a cluster you must specify the '*' wildcard for " + "the index expression, but found: [%s]",
indexName
)
);
}
clustersToRemove.addAll(clusters);
} else {
for (String clusterName : clusters) {
perClusterIndices.computeIfAbsent(clusterName, k -> new ArrayList<>()).add(indexName);
}
}
} else {
perClusterIndices.computeIfAbsent(RemoteClusterAware.LOCAL_CLUSTER_GROUP_KEY, k -> new ArrayList<>()).add(index);
}
}
List<String> excludeFailed = new ArrayList<>();
for (String exclude : clustersToRemove) {
List<String> removed = perClusterIndices.remove(exclude);
if (removed == null) {
excludeFailed.add(exclude);
}
}
if (excludeFailed.size() > 0) {
String warning = Strings.format(
"Attempt to exclude cluster%s %s failed as %s not included in the list of clusters to be included: %s. Input: [%s]",
excludeFailed.size() == 1 ? "" : "s",
excludeFailed,
excludeFailed.size() == 1 ? "it is" : "they are",
perClusterIndices.keySet().stream().map(s -> s.equals("") ? "(local)" : s).collect(Collectors.toList()),
String.join(",", requestIndices)
);
throw new IllegalArgumentException(warning);
}
if (clustersToRemove.size() > 0 && perClusterIndices.size() == 0) {
throw new IllegalArgumentException(
"The '-' exclusions in the index expression list excludes all indexes. Nothing to search. Input: ["
+ String.join(",", requestIndices)
+ "]"
);
}
return perClusterIndices;
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@
import static org.elasticsearch.test.NodeRoles.nonMasterNode;
import static org.elasticsearch.test.NodeRoles.onlyRoles;
import static org.elasticsearch.test.NodeRoles.removeRoles;
import static org.hamcrest.Matchers.containsString;
import static org.hamcrest.Matchers.either;
import static org.hamcrest.Matchers.equalTo;
import static org.hamcrest.Matchers.instanceOf;
Expand Down Expand Up @@ -165,23 +166,25 @@ public void testGroupClusterIndices() throws IOException {
assertTrue(service.isRemoteClusterRegistered("cluster_1"));
assertTrue(service.isRemoteClusterRegistered("cluster_2"));
assertFalse(service.isRemoteClusterRegistered("foo"));
Map<String, List<String>> perClusterIndices = service.groupClusterIndices(
service.getRemoteClusterNames(),
new String[] {
"cluster_1:bar",
"cluster_2:foo:bar",
"cluster_1:test",
"cluster_2:foo*",
"foo",
"cluster*:baz",
"*:boo" }
);
List<String> localIndices = perClusterIndices.remove(RemoteClusterAware.LOCAL_CLUSTER_GROUP_KEY);
assertNotNull(localIndices);
assertEquals("foo", localIndices.get(0));
assertEquals(2, perClusterIndices.size());
assertEquals(Arrays.asList("bar", "test", "baz", "boo"), perClusterIndices.get("cluster_1"));
assertEquals(Arrays.asList("foo:bar", "foo*", "baz", "boo"), perClusterIndices.get("cluster_2"));
{
Map<String, List<String>> perClusterIndices = service.groupClusterIndices(
service.getRemoteClusterNames(),
new String[] {
"cluster_1:bar",
"cluster_2:foo:bar",
"cluster_1:test",
"cluster_2:foo*",
"foo",
"cluster*:baz",
"*:boo" }
);
List<String> localIndices = perClusterIndices.remove(RemoteClusterAware.LOCAL_CLUSTER_GROUP_KEY);
assertNotNull(localIndices);
assertEquals("foo", localIndices.get(0));
assertEquals(2, perClusterIndices.size());
assertEquals(Arrays.asList("bar", "test", "baz", "boo"), perClusterIndices.get("cluster_1"));
assertEquals(Arrays.asList("foo:bar", "foo*", "baz", "boo"), perClusterIndices.get("cluster_2"));
}

expectThrows(
NoSuchRemoteClusterException.class,
Expand All @@ -198,6 +201,125 @@ public void testGroupClusterIndices() throws IOException {
new String[] { "cluster_1:bar", "cluster_2:foo:bar", "cluster_1:test", "cluster_2:foo*", "does_not_exist:*" }
)
);

// test cluster exclusions
{
String[] indices = shuffledList(List.of("cluster*:foo*", "foo", "-cluster_1:*", "*:boo")).toArray(new String[0]);

Map<String, List<String>> perClusterIndices = service.groupClusterIndices(service.getRemoteClusterNames(), indices);
assertEquals(2, perClusterIndices.size());
List<String> localIndexes = perClusterIndices.get(RemoteClusterAware.LOCAL_CLUSTER_GROUP_KEY);
assertNotNull(localIndexes);
assertEquals(1, localIndexes.size());
assertEquals("foo", localIndexes.get(0));

List<String> cluster2 = perClusterIndices.get("cluster_2");
assertNotNull(cluster2);
assertEquals(2, cluster2.size());
assertEquals(List.of("boo", "foo*"), cluster2.stream().sorted().toList());
}
{
String[] indices = shuffledList(List.of("*:*", "-clu*_1:*", "*:boo")).toArray(new String[0]);

Map<String, List<String>> perClusterIndices = service.groupClusterIndices(service.getRemoteClusterNames(), indices);
assertEquals(1, perClusterIndices.size());

List<String> cluster2 = perClusterIndices.get("cluster_2");
assertNotNull(cluster2);
assertEquals(2, cluster2.size());
assertEquals(List.of("*", "boo"), cluster2.stream().sorted().toList());
}
{
String[] indices = shuffledList(List.of("cluster*:foo*", "cluster_2:*", "foo", "-cluster_1:*", "-c*:*")).toArray(
new String[0]
);

Map<String, List<String>> perClusterIndices = service.groupClusterIndices(service.getRemoteClusterNames(), indices);
assertEquals(1, perClusterIndices.size());
List<String> localIndexes = perClusterIndices.get(RemoteClusterAware.LOCAL_CLUSTER_GROUP_KEY);
assertNotNull(localIndexes);
assertEquals(1, localIndexes.size());
assertEquals("foo", localIndexes.get(0));
}
{
String[] indices = shuffledList(List.of("cluster*:*", "foo", "-*:*")).toArray(new String[0]);

Map<String, List<String>> perClusterIndices = service.groupClusterIndices(service.getRemoteClusterNames(), indices);
assertEquals(1, perClusterIndices.size());
List<String> localIndexes = perClusterIndices.get(RemoteClusterAware.LOCAL_CLUSTER_GROUP_KEY);
assertNotNull(localIndexes);
assertEquals(1, localIndexes.size());
assertEquals("foo", localIndexes.get(0));
}
{
IllegalArgumentException e = expectThrows(
IllegalArgumentException.class,
() -> service.groupClusterIndices(
service.getRemoteClusterNames(),
// -cluster_1:foo* is not allowed, only -cluster_1:*
new String[] { "cluster_1:bar", "-cluster_2:foo*", "cluster_1:test", "cluster_2:foo*", "foo" }
)
);
assertThat(
e.getMessage(),
equalTo("To exclude a cluster you must specify the '*' wildcard for the index expression, but found: [foo*]")
);
}
{
IllegalArgumentException e = expectThrows(
IllegalArgumentException.class,
() -> service.groupClusterIndices(
service.getRemoteClusterNames(),
// -cluster_1:* will fail since cluster_1 was never included in order to qualify to be excluded
new String[] { "-cluster_1:*", "cluster_2:foo*", "foo" }
)
);
assertThat(
e.getMessage(),
equalTo(
"Attempt to exclude cluster [cluster_1] failed as it is not included in the list of clusters to "
+ "be included: [(local), cluster_2]. Input: [-cluster_1:*,cluster_2:foo*,foo]"
)
);
}
{
IllegalArgumentException e = expectThrows(
IllegalArgumentException.class,
() -> service.groupClusterIndices(service.getRemoteClusterNames(), new String[] { "-cluster_1:*" })
);
assertThat(
e.getMessage(),
equalTo(
"Attempt to exclude cluster [cluster_1] failed as it is not included in the list of clusters to "
+ "be included: []. Input: [-cluster_1:*]"
)
);
}
{
IllegalArgumentException e = expectThrows(
IllegalArgumentException.class,
() -> service.groupClusterIndices(service.getRemoteClusterNames(), new String[] { "-*:*" })
);
assertThat(
e.getMessage(),
equalTo(
"Attempt to exclude clusters [cluster_1, cluster_2] failed as they are not included in the list of "
+ "clusters to be included: []. Input: [-*:*]"
)
);
}
{
String[] indices = shuffledList(List.of("cluster*:*", "*:foo", "-*:*")).toArray(new String[0]);

IllegalArgumentException e = expectThrows(
IllegalArgumentException.class,
() -> service.groupClusterIndices(service.getRemoteClusterNames(), indices)
);
assertThat(
e.getMessage(),
containsString("The '-' exclusions in the index expression list excludes all indexes. Nothing to search.")
);
}
}
}
}
Expand Down