Replicate closed indices #33888

tlrx · 2018-09-20T11:13:29Z

Closed indices are currently not replicated, so an index can be closed and its data subsequently lost if the node holding it is terminated without the data being saved elsewhere.

We'd like to be able to close indices to remove their memory overhead while still keeping them available for replication.

Steps to accomplish this (not necessarily in order):

Routing Table

Keep closed indices in the routing table and reinitialize shards when closing an index (Allow shards of closed indices to be replicated as regular shards #38024)
Add the no op engine (from Replicate closed indices #31141) and plug it as the default engine for closed index shards ([RCI] Add NoOpEngine for closed indices #33903)
Change the Gateway service to create routing table also for closed indices (Recover closed indices after a full cluster restart #39249)
Adapt the Synced Flush Service to work when a routing table exists for closed indices (SyncedFlushService.getShardRoutingTable() should use metadata to check for index existence #37691)
Ignore shard started requests when primary term does not match (Ignore shard started requests when primary term does not match #37899)
Change IndexService so that it does not instantiate a mapper service/index analyzers for closed indices (No mapper service and index caches for replicated closed indices #40423)
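
As a rough illustration of the "ignore shard started requests when primary term does not match" item above, here is a toy Python model, not Elasticsearch code. The names `ShardStarted`, `filter_started_requests` and the `current_terms` mapping are hypothetical. The sketch only assumes that each shard has a current primary term and that a request carrying a different term is stale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardStarted:
    """Hypothetical 'shard started' request sent to the master node."""
    index: str
    shard_id: int
    primary_term: int


def filter_started_requests(current_terms, requests):
    """Split requests into (applied, ignored) by comparing primary terms.

    current_terms maps (index, shard_id) to the primary term that the
    cluster state currently holds for that shard (an assumption of this sketch).
    """
    applied, ignored = [], []
    for request in requests:
        expected = current_terms.get((request.index, request.shard_id))
        if expected == request.primary_term:
            applied.append(request)
        else:
            # Stale (or unknown) term: drop the request instead of failing.
            ignored.append(request)
    return applied, ignored


terms = {("logs", 0): 3}
fresh = ShardStarted("logs", 0, primary_term=3)
stale = ShardStarted("logs", 0, primary_term=2)
applied, ignored = filter_started_requests(terms, [fresh, stale])
```

Here `fresh` ends up in `applied` and `stale` in `ignored`, which is the behaviour the task title describes.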

Open/Close Index APIs

Engine/Translog

Add NoOpEngine as engine implementation to use for closed shards ([RCI] Add NoOpEngine for closed indices #33903)
Ensure that max seq # is equal to the global checkpoint when creating ReadOnlyEngines (Ensure that max seq # is equal to the global checkpoint when creating ReadOnlyEngines #37426)
Relax NoOpEngine constraints on translog (Relax NoOpEngine constraints on translog #37413)
ReadOnlyEngine should update translog recovery state information (ReadOnlyEngine should update translog recovery state information #39238)
Should we clean/trim the translog when an index is closed? (Compact index when closing #42445)
Should we enforce a single index commit when an index is closed? (see comment) (Compact index when closing #42445)

Tests

Others

elasticmachine · 2018-09-20T11:13:31Z

Pinging @elastic/es-distributed

This commit adds a new NoOpEngine implementation based on the current ReadOnlyEngine. This new implementation uses an empty DirectoryReader with no segment readers and will always return 0 docs. The NoOpEngine is the default Engine created for IndexShards of closed indices. It expects an empty translog when it is instantiated. Relates to #33888
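
The behaviour described in this commit message can be pictured with a small Python toy. It is not the Java implementation and does not use Lucene. The class below only borrows the name `NoOpEngine`, and `translog_operations`, `doc_count` and `search` are hypothetical names chosen for the sketch.

```python
class NoOpEngine:
    """Toy engine for a shard of a closed index: it holds no documents."""

    def __init__(self, translog_operations):
        # Mirrors the description above: an empty translog is expected.
        if len(translog_operations) > 0:
            raise ValueError("NoOpEngine expects an empty translog")

    def doc_count(self):
        # An empty reader always reports zero documents.
        return 0

    def search(self, query):
        # No segment readers, so nothing can ever match.
        return []


engine = NoOpEngine(translog_operations=[])
```

Constructing the toy engine with a non-empty `translog_operations` list raises `ValueError`, loosely mirroring the "expects an empty translog" constraint.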

…ner<Releasable>) The current implementation of asyncBlockOperations() can be used to execute some code once all indexing operation permits have been acquired, and then releases all permits immediately after the code has run. This immediate release is not suitable for tasks that need to hold all permits across multiple execution steps. This commit adds a new asyncBlockOperations() that exposes a Releasable, making it possible to acquire all permits and release them only when needed by closing the Releasable. This method is intended for use in a TransportReplicationAction that will acquire all permits on the primary shard. The existing blockOperations() and asyncBlockOperations() methods have been modified to delegate permit acquisition and release to this new method. Relates to elastic#33888

…ner<Releasable>) (#34902) The current implementation of asyncBlockOperations() can be used to execute some code once all indexing operation permits have been acquired, and then releases all permits immediately after the code has run. This immediate release is not suitable for tasks that need to hold all permits across multiple execution steps. This commit adds a new asyncBlockOperations() that exposes a Releasable, making it possible to acquire all permits and release them only when needed by closing the Releasable. The existing blockOperations() method has been modified to delegate permit acquisition and release to this new method. Relates to #33888
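
The idea of holding all permits across several steps and releasing them through a handle can be sketched in Python. This is a simplified, synchronous toy built on `threading.Semaphore`, whereas the real method is asynchronous and Java-based. `OperationPermits`, `try_acquire` and `block_operations` are hypothetical names, and the permit count of 2 is arbitrary.

```python
import threading


class OperationPermits:
    """Toy permit pool: one permit per in-flight indexing operation."""

    def __init__(self, total):
        self._total = total
        self._semaphore = threading.Semaphore(total)

    def try_acquire(self):
        """Try to take one permit without blocking; return True on success."""
        return self._semaphore.acquire(blocking=False)

    def release(self):
        """Give back one permit taken with try_acquire()."""
        self._semaphore.release()

    def block_operations(self):
        """Take every permit and return a callable that releases them all.

        The caller decides when to invoke the returned callable, so the
        permits can be held across several steps.
        """
        for _ in range(self._total):
            self._semaphore.acquire()
        released = False

        def release_all():
            nonlocal released
            if not released:  # releasing twice must not over-release
                released = True
                for _ in range(self._total):
                    self._semaphore.release()

        return release_all


permits = OperationPermits(total=2)
release_all = permits.block_operations()
blocked_during_steps = permits.try_acquire()  # no permit is available here
release_all()
available_after_release = permits.try_acquire()
```

While the returned handle is still open, `try_acquire()` fails, so no indexing operation can start in between the caller's steps.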

…ationAction (#35332) Today, the TransportReplicationAction checks the global-level and index-level blocks before routing the operation to the primary, in the ReroutePhase, at the very beginning of the transport replication action's execution. For the upcoming rework of the Close Index API, and in order to deal with primary relocation, we'll also need to check for blocks before executing the operation on the primary (while holding a permit) but before routing to the new primary. This pull request changes the AsyncPrimaryAction so that it checks for the replication action's blocks before executing the operation locally or before routing the primary action to the new primary shard. The check is done while holding a PrimaryShardReference. Related to #33888

After two recent changes (elastic#38824 and elastic#33888), the _cat/indices API no longer reports information for active recovering indices and non-replicated closed indices. It also misreports replicated closed indices that are potentially not authorized for the user. This commit changes how the cat action works by first using the Get Settings API in order to resolve authorized indices. It then uses the Cluster State, Cluster Health and Indices Stats APIs to retrieve information about the indices. Closes elastic#39933

Today the `InternalClusterInfoService` collects information on the sizes of shards of open indices, but does not consider closed indices. This means that shards of closed indices are treated as having zero size when they are being allocated. This commit fixes this, obtaining the sizes of all shards. Relates elastic#33888

Prior to elastic#33888 it is dangerous to keep closed indices in your cluster long-term: Elasticsearch does not maintain their shard copies so they tend to get lost as the cluster migrates to new nodes. This risk isn't documented today. This commit addresses that gap.

Today you cannot explicitly indicate that an operation should use the usual behaviour of waiting for active shards according to the underlying index setting. This is a problem for the close index API which has a default of `none` in 7.x for BWC reasons (see elastic#33888), but the usual behaviour in 8.0: you cannot today opt-in to the 8.0 behaviour with this parameter. This commit adds support for the literal value `default` for the `wait_for_active_shards` query parameter. Relates elastic#66419
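
To make the parameter semantics in this commit message concrete, here is a hedged Python sketch of how such a value might be resolved to a number of shard copies. It is a toy model only: `resolve_wait_for_active_shards` and its arguments are hypothetical, the literals `none`, `all` and `default` are taken from the message above, and the values a real cluster accepts may differ.

```python
def resolve_wait_for_active_shards(param, index_setting, number_of_replicas,
                                   api_default="none"):
    """Return how many shard copies must be active before proceeding.

    param              -- value of the query parameter, or None if omitted
    index_setting      -- the index-level setting, e.g. "1" or "all"
    number_of_replicas -- replicas configured for the index
    api_default        -- what the API uses when the parameter is omitted
                          ("none" models the 7.x close index behaviour)
    """
    value = api_default if param is None else param
    if value == "default":
        # Explicit opt-in: defer to the index setting.
        value = index_setting
    if value == "none":
        return 0
    if value == "all":
        return number_of_replicas + 1
    count = int(value)
    if not 0 <= count <= number_of_replicas + 1:
        raise ValueError(f"invalid active shard count: {count}")
    return count


omitted = resolve_wait_for_active_shards(None, "1", number_of_replicas=2)
opted_in = resolve_wait_for_active_shards("default", "1", number_of_replicas=2)
```

With the parameter omitted the toy waits for no shard copies, while the literal `default` falls back to the index setting, which is the opt-in the commit message describes.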

This assertion fails in the presence of pre-7.2.0 closed indices because such indices don't even have routing table entries. Relates elastic#33888 Closes elastic#91470

tlrx added the >feature and :Distributed Indexing/Distributed labels Sep 20, 2018

tlrx self-assigned this Sep 20, 2018

tlrx mentioned this issue Sep 20, 2018

Replicate closed indices #31141

Closed

8 tasks

tlrx mentioned this issue Sep 27, 2018

[RCI] Keep index routing table for closed indices #34108

Closed

tlrx mentioned this issue Oct 26, 2018

[RCI] Add IndexShardOperationPermits.asyncBlockOperations(ActionListener<Releasable>) #34902

Merged

tlrx mentioned this issue Nov 7, 2018

[RCI] Check blocks while having index shard permit in TransportReplicationAction #35332

Merged

s1monw mentioned this issue Nov 9, 2018

Allow marking an index as cold and moving its in-memory items to disk #23546

Closed

ywelsch added the Meta label Nov 12, 2018

tlrx mentioned this issue Jun 25, 2019

Fix indices shown in _cat/indices (#43286) #43589

Merged

tlrx mentioned this issue Jun 26, 2019

Fix indices shown in _cat/indices (#43286) #43620

Merged

DaveCTurner mentioned this issue Jul 26, 2019

index.data_path is mutable on closed replicated indices #44899

Closed

codebrain mentioned this issue Aug 5, 2019

[meta] 7.2 Release elastic/elasticsearch-net#3980

Closed

37 tasks

Mpdreamz mentioned this issue Aug 7, 2019

[meta] 7.3 Release elastic/elasticsearch-net#4001

Closed

16 tasks

DaveCTurner mentioned this issue Jan 6, 2020

Collect shard sizes for closed indices #50645

Merged

This was referenced Feb 3, 2020

[meta] 7.6 release elastic/elasticsearch-net#4340

Closed

[meta] 7.6 release elastic/elasticsearch-net#4341

Closed

DaveCTurner mentioned this issue Aug 19, 2020

Add note on the danger of closed indices #61332

Merged

mark54g mentioned this issue Sep 7, 2020

Request that Elasticsearch cat shards and cat indices show closed indices as CLOSED instead of STARTED #62071

Open

DaveCTurner mentioned this issue Dec 18, 2020

Add ?wait_for_active_shards=default #66575

Closed

DaveCTurner mentioned this issue Jan 17, 2022

Remove unused CSC#transportCloseIndexAction #82666

Merged

DaveCTurner mentioned this issue Nov 21, 2022

[CI] :qa:full-cluster-restart:v7.0.1#upgradedClusterTest failing #91470

Closed

DaveCTurner mentioned this issue Nov 21, 2022

Skip ancient closed indices in desired balance #91765

Merged
