[BUG] [Remote Store] Shard routing table has wrong number of replicas while restoring an index with >= 1 replicas from remote store #8479
Labels: bug, Storage:Durability, Storage
Describe the bug
An IllegalStateException is thrown due to a mismatch between the replica count in the index metadata and the shards being restored from remote store. While this does not block the restore at runtime, it fails new integration tests for scenarios where the remote store-enabled index has >= 1 replicas. This behaviour is due to the following assert:
OpenSearch/server/src/main/java/org/opensearch/cluster/routing/allocation/AllocationService.java
Lines 174 to 179 in eca5a6c
To Reproduce
Steps to reproduce the behavior:
Logs similar to the following would be seen:
Expected behavior
The restore flow should gracefully handle indices with replication enabled.
Additional context
This behaviour is failing new integration tests being added for restore flow from remote store. One way to resolve this is to explicitly set the replica count to 0 in the index metadata before the restore:
i.e. changing
OpenSearch/server/src/main/java/org/opensearch/snapshots/RestoreService.java
Lines 247 to 253 in c9974a4
to
However, this would require manual intervention after the restore to recover the replication configuration. We also need to analyze whether this could have any cascading effects.
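To make the mismatch and the proposed workaround concrete, here is a minimal, self-contained sketch. It is not OpenSearch code: the class, method, and constant names are hypothetical, and it assumes that the restore from remote store routes only the primary copy of each shard and that the check compares routed copies minus the primary against the replica count in the index metadata.

```java
// Hypothetical, simplified model of the consistency check; not OpenSearch classes.
final class ReplicaCountSketch {

    // Assumption: a restore from remote store routes only the primary copy of each shard.
    static final int RESTORED_COPIES_PER_SHARD = 1;

    // Stand-in for the validation behind the assert: the routed copies of a shard,
    // minus the primary, must equal the replica count recorded in index metadata.
    static void validate(int metadataReplicas, int routedShardCopies) {
        int routingReplicas = routedShardCopies - 1;
        if (routingReplicas != metadataReplicas) {
            throw new IllegalStateException(
                "wrong number of replicas, expected [" + metadataReplicas
                    + "], got [" + routingReplicas + "]");
        }
    }

    // Returns true when the restored routing agrees with the metadata replica count.
    static boolean restoreIsConsistent(int metadataReplicas) {
        try {
            validate(metadataReplicas, RESTORED_COPIES_PER_SHARD);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Metadata still says 1 replica: the check fails.
        System.out.println(restoreIsConsistent(1)); // prints "false"
        // Workaround: metadata replica count set to 0 before the restore.
        System.out.println(restoreIsConsistent(0)); // prints "true"
    }
}
```

Under these assumptions the check passes only when the metadata replica count is 0, which is why the workaround loses the original replication configuration until it is restored manually.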