
Implement repair functionality for aliases colliding with indices bug #91887

Conversation

original-brownbear
Member

Follow up to #91456 implementing an automated fix for indices corrupted in 8.5.

Not sure this should get an automated test. We could write one that corrupts the index in 8.5 and then does a rolling upgrade, but maybe manually testing this is good enough in this case?

@original-brownbear original-brownbear added >non-issue :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. v8.6.0 v8.7.0 v8.5.3 labels Nov 24, 2022
@elasticsearchmachine elasticsearchmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Nov 24, 2022
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@original-brownbear
Member Author

Jenkins run elasticsearch-ci/part-1

Contributor

@henningandersen left a comment


could write one that corrupts the index in 8.5 and then does a rolling upgrade

I think the level of testing here is adequate for ensuring it is fixed on upgrade.

var updatedBuilder = ImmutableOpenMap.builder(aliasesMap);
final var brokenAlias = updatedBuilder.remove(index);
final var fixedAlias = AliasMetadata.newAliasMetadata(brokenAlias, index + "-alias-corrupted-by-8-5");
aliasesMap = updatedBuilder.fPut(fixedAlias.getAlias(), fixedAlias).build();
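The hunk above renames the colliding alias rather than dropping it, so whatever the alias carried survives under a clearly marked name. A minimal sketch of the same idea, using a plain java.util.Map in place of Elasticsearch's ImmutableOpenMap and AliasMetadata (the class, method, and value type here are illustrative, not the actual production code):

import java.util.HashMap;
import java.util.Map;

class AliasCollisionRepairSketch {
    // Mirrors the "-alias-corrupted-by-8-5" marker used in the hunk above.
    private static final String CORRUPTED_SUFFIX = "-alias-corrupted-by-8-5";

    // If an alias shares its name with the index, re-key it under a marker name
    // so the alias metadata is preserved instead of being silently dropped.
    static Map<String, String> repairAliases(String index, Map<String, String> aliases) {
        if (aliases.containsKey(index) == false) {
            return aliases;
        }
        Map<String, String> repaired = new HashMap<>(aliases);
        String brokenAlias = repaired.remove(index);
        repaired.put(index + CORRUPTED_SUFFIX, brokenAlias);
        return repaired;
    }
}

In the real code the values are AliasMetadata instances and the renamed entry is rebuilt via AliasMetadata.newAliasMetadata, as shown in the hunk.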
Contributor


Is there a risk that adding an alias with the same name as the index for an index created in 8.5 will now succeed and just result in the "corrupted" alias name instead?

Can we perhaps find or add an integration test that verifies that when adding aliases we have validation in place that catches this?
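
The validation being referred to is the check added to IndexMetadata's build path in #91456 (visible in the diff further down this thread as the `alias.alias().equals(index)` branch). A heavily simplified, self-contained sketch of that kind of build-time check follows; the class and method names, exception type, and message are assumptions for illustration:

import java.util.Collection;

class AliasValidationSketch {
    // Hypothetical stand-in for the build-time check; the real code iterates AliasMetadata values.
    static void ensureNoAliasShadowsIndex(String index, Collection<String> aliasNames) {
        for (String alias : aliasNames) {
            if (alias.equals(index)) {
                // Assumption: the real validation surfaces this as an IllegalArgumentException.
                throw new IllegalArgumentException(
                    "alias name [" + alias + "] must not be the same as the index name"
                );
            }
        }
    }
}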

Member Author


Oh right ... this is quite annoying. I could obviously add the validation for this at another level as well (not sure where; it seems not entirely trivial to add this check elsewhere in a way that would only catch mutations), but it does add some extra complexity to the whole thing. Is it worth doing that for this corner case? The name collision checks are quite complicated already, and checking for this specifically would add yet more complexity at some other layer ...

Contributor


Could we add a buildAndRepair method that is used from IndexMetadata.fromXContent, passing down a boolean to a private build method in order to only do the repair from that path?

Contributor


I reckon my comment was slightly convoluted, so here is a diff of what I am thinking of:

diff --git a/server/src/main/java/org/elasticsearch/cluster/metadata/IndexMetadata.java b/server/src/main/java/org/elasticsearch/cluster/metadata/IndexMetadata.java
--- a/server/src/main/java/org/elasticsearch/cluster/metadata/IndexMetadata.java	(revision c6565a39e26f6c91f8e6f71f8021e07f8805639d)
+++ b/server/src/main/java/org/elasticsearch/cluster/metadata/IndexMetadata.java	(date 1669302455554)
@@ -2016,6 +2016,10 @@
         }
 
         public IndexMetadata build() {
+            return build(false);
+        }
+
+        private IndexMetadata build(boolean repair) {
             /*
              * We expect that the metadata has been properly built to set the number of shards and the number of replicas, and do not rely
              * on the default values here. Those must have been set upstream.
@@ -2144,7 +2148,7 @@
             var aliasesMap = aliases.build();
             for (AliasMetadata alias : aliasesMap.values()) {
                 if (alias.alias().equals(index)) {
-                    if (indexCreatedVersion.equals(Version.V_8_5_0)) {
+                    if (repair && indexCreatedVersion.equals(Version.V_8_5_0)) {
                         var updatedBuilder = ImmutableOpenMap.builder(aliasesMap);
                         final var brokenAlias = updatedBuilder.remove(index);
                         final var fixedAlias = AliasMetadata.newAliasMetadata(brokenAlias, index + "-alias-corrupted-by-8-5");
@@ -2208,6 +2212,11 @@
             );
         }
 
+        // visible for testing.
+        IndexMetadata buildAndRepair() {
+            return build(true);
+        }
+
         @SuppressWarnings("unchecked")
         public static void toXContent(IndexMetadata indexMetadata, XContentBuilder builder, ToXContent.Params params) throws IOException {
             Metadata.XContentContext context = Metadata.XContentContext.valueOf(
@@ -2476,7 +2485,7 @@
             assert settingsVersion : "settings version should be present for indices created on or after 6.5.0";
             assert indexCreatedVersion(builder.settings).before(Version.V_7_2_0) || aliasesVersion
                 : "aliases version should be present for indices created on or after 7.2.0";
-            return builder.build();
+            return builder.buildAndRepair();
         }
 
         /**

Member Author


I was thinking about that, but that would mean the fix would only apply after a full cluster restart and not after a rolling restart? Wouldn't that be painful for Cloud?

Contributor


We could also build and repair when receiving the index metadata over the wire maybe?

Member Author


We could, I guess :) Let's do it.

Member Author


Added to fromXContent, reading from the wire, and (not sure this is necessary, but I found it weird for it to behave differently from a normal transport read) diff application now :)

Contributor

@henningandersen left a comment


LGTM.

@@ -504,11 +505,21 @@ public void testIndexAndAliasWithSameName() {
 .numberOfShards(1)
 .numberOfReplicas(0)
 .putAlias(AliasMetadata.builder("index").build())
-.build()
+.build(randomBoolean())
Contributor


Could we expand this test (or add one) to also verify that for indices created in v8.5, we reject an alias of the same name as the index?
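
A rough sketch of the kind of test being asked for, reusing the builder chain from the hunk above inside an ESTestCase-style test; the settings key and version constant follow the usual Elasticsearch conventions, but the test name and the exception type are assumptions rather than the test that was eventually merged:

public void testAliasWithSameNameAsIndexIsRejectedFor85CreatedIndex() {
    // Mark the index as created in 8.5.0, the version affected by the corruption bug.
    final Settings settings = Settings.builder()
        .put(IndexMetadata.SETTING_VERSION_CREATED, Version.V_8_5_0)
        .build();
    // Assumption: a plain build() (i.e. no repair path) rejects the colliding alias.
    expectThrows(
        IllegalArgumentException.class,
        () -> IndexMetadata.builder("index")
            .settings(settings)
            .numberOfShards(1)
            .numberOfReplicas(0)
            .putAlias(AliasMetadata.builder("index").build())
            .build()
    );
}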

@original-brownbear
Member Author

Thanks Henning!

@original-brownbear original-brownbear merged commit 046592e into elastic:main Nov 29, 2022
@original-brownbear original-brownbear deleted the repair-index-alias-collision branch November 29, 2022 09:53
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Nov 29, 2022
…elastic#91887)

Follow up to elastic#91456 implementing an automated fix for indices corrupted in 8.5.
@elasticsearchmachine
Collaborator

💔 Backport failed

Status | Branch | Result
       | 8.6    |
       | 8.5    | Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 91887

elasticsearchmachine pushed a commit that referenced this pull request Nov 29, 2022
…#91887) (#91988)

Follow up to #91456 implementing an automated fix for indices corrupted in 8.5.
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Nov 29, 2022
…elastic#91887)

Follow up to elastic#91456 implementing an automated fix for indices corrupted in 8.5.
original-brownbear added a commit that referenced this pull request Nov 29, 2022
…es bug (#91993)

* Fix corrupted Metadata from index and alias having the same name (#91456)

This fixes a bug introduced in #87863
which added a Metadata copy constructor with separate name collision checks that
assumed index name and alias names were already validated in `IndexMetadata`.
=> fixed corrupted states by actually adding the validation to `IndexMetadata`
to make it impossible to instantiate a broken `IndexMetadata` in the first place.

* Implement repair functionality for aliases colliding with indices bug (#91887)

Follow up to #91456 implementing an automated fix for indices corrupted in 8.5.
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Dec 2, 2022
* upstream/main: (209 commits)
  Remove unused methods and classes from HLRC (elastic#92030)
  Clean up on exception while chunking XContent (elastic#92024)
  Add profiling plugin (elastic#91640)
  Remove unused methods and classes from HLRC (elastic#92012)
  Remove IndexerState from HLRC (elastic#92023)
  Ensure cached time elapses in ClusterServiceIT (elastic#91986)
  Chunked encoding for RestGetIndicesAction (elastic#92016)
  Simplify shardsWithState (elastic#91991)
  [DOCS] Updates ML decider docs by mentioning CPU as scaling criterion (elastic#92018)
  Add chunking to ClusterState.Custom impls (elastic#91963)
  Speedup time_series agg by caching current tsid ordinal, parent bucket ordinal and buck ordinal (elastic#91784)
  Drop the ingest listener call count tracking (elastic#92003)
  [DOCS] fixes issue number 91889 - missing [discrete] header (elastic#91976)
  Fix PersistentTasksClusterServiceTests (elastic#92002)
  [docs] Update search-settings documentation to reflect the fact that the indices.query.bool.max_clause_count setting has been deprecated (elastic#91811)
  Clarify writability in Netty4HttpPipeliningHandler (elastic#91982)
  Load stable plugins as synthetic modules (elastic#91869)
  Handle any exception thrown while generating source for an IngestDocument (elastic#91981)
  fixing Apache HttpHost url on java-rest doc (elastic#91945)
  Implement repair functionality for aliases colliding with indices bug (elastic#91887)
  ...
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Dec 12, 2022
DaveCTurner added a commit that referenced this pull request Dec 12, 2022
DaveCTurner added a commit that referenced this pull request Dec 12, 2022
DaveCTurner added a commit that referenced this pull request Dec 12, 2022
@original-brownbear original-brownbear restored the repair-index-alias-collision branch April 18, 2023 20:54