Propagate last node to reinitialized routing tables #91549
Conversation
When closing or opening an index, or restoring a snapshot over a closed index, we reinitialize its routing table from scratch and expect the gateway allocators to select the appropriate node for each shard copy. With this commit we also keep track of the last-allocated node ID for each copy which makes it more likely that the desired balance of these shards remains unchanged too. Closes elastic#91472
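As a rough illustration of the mechanism (a minimal, self-contained sketch with made-up types, not the actual Elasticsearch routing-table code): before discarding the old table, remember which node held each shard copy and attach that node ID to the freshly created unassigned copies, so the gateway allocator can prefer the previous location.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

record ShardCopy(int shardId, boolean primary, String currentNodeId) {}

record UnassignedCopy(int shardId, boolean primary, String lastAllocatedNodeId) {}

final class RoutingTableReinitSketch {
    // rebuild the routing table from scratch, carrying each copy's previous node forward as a hint
    static List<UnassignedCopy> reinitialize(Map<Integer, List<ShardCopy>> previousTable) {
        final var reinitialized = new ArrayList<UnassignedCopy>();
        for (final var entry : previousTable.entrySet()) {
            for (final var copy : entry.getValue()) {
                // the gateway allocator still picks the node; the recorded ID is only a preference
                reinitialized.add(new UnassignedCopy(entry.getKey(), copy.primary(), copy.currentNodeId()));
            }
        }
        return reinitialized;
    }

    public static void main(String[] args) {
        final var previous = Map.of(
            0, List.of(new ShardCopy(0, true, "node-1"), new ShardCopy(0, false, "node-2")),
            1, List.of(new ShardCopy(1, true, "node-3"))
        );
        reinitialize(previous).forEach(System.out::println);
    }
}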
Pinging @elastic/es-distributed (Team:Distributed)
…ount, expect all the node IDs to be filled in
One smaller concern, otherwise this looks good.
}
final var previousNodes = new ArrayList<String>(previousShardRoutingTable.size());
previousNodes.add(primaryNode);
for (final var assignedShard : previousShardRoutingTable.assignedShards()) {
This also includes the target of relocations. I wonder if we should only look at active shards, since anything less will anyway not be considered good enough by the gateway allocator?
The problem I see with this is that if a relocation is ongoing, we risk a copy having a last allocated node id that is much worse than it could be (i.e., a node that only has just started the recovery)?
Good point, thanks - see bd12ab9.
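For illustration only, one way to avoid recording an in-flight relocation target as the last-allocated node is to fall back to the relocation source. This is a sketch with stand-in types, not necessarily what bd12ab9 does.

import java.util.ArrayList;
import java.util.List;

record AssignedShard(String currentNodeId, String relocationSourceNodeId, boolean relocationTarget) {}

final class LastAllocatedNodesSketch {
    // collect one "last allocated" node per assigned copy, preferring the relocation source
    // over a target node that has only just started its recovery
    static List<String> lastAllocatedNodes(List<AssignedShard> assignedShards) {
        final var nodes = new ArrayList<String>(assignedShards.size());
        for (final var shard : assignedShards) {
            nodes.add(shard.relocationTarget() ? shard.relocationSourceNodeId() : shard.currentNodeId());
        }
        return nodes;
    }

    public static void main(String[] args) {
        final var shards = List.of(
            new AssignedShard("node-1", null, false),   // ordinary started copy
            new AssignedShard("node-3", "node-2", true) // copy still relocating from node-2
        );
        System.out.println(lastAllocatedNodes(shards)); // prints [node-1, node-2]
    }
}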
assertThat(shard.unassignedInfo().getReason(), equalTo(expectedUnassignedReason));
final var lastAllocatedNodeId = shard.unassignedInfo().getLastAllocatedNodeId();
if (lastAllocatedNodeId == null) {
// restoring an index may change the number of shards/replicas so no guarantee that lastAllocatedNodeId is populated
I think only the number of replicas, not the number of shards can be changed? Probably what you meant with shards/replicas, but removing "shards/" would be better I think.
Suggested change:
- // restoring an index may change the number of shards/replicas so no guarantee that lastAllocatedNodeId is populated
+ // restoring an index may change the number of replicas so no guarantee that lastAllocatedNodeId is populated
On the contrary, I didn't think there's anything to require that the snapshot has the same number of shards as the index on top of which it's being restored.
Ahh, right, thanks.
// both original and restored index must have at least one shard tho
assertTrue(foundAnyNodeIds);
Can this not go one line up, i.e., we can check this for every shard id?
Not if the shard count can change in a restore (which AFAIK it can)
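A tiny self-contained sketch of the pattern under discussion (illustrative names, not the real test): a restore may change the shard count, so any individual shard may have no carried-over node ID, and only the whole-index check is guaranteed to hold.

import java.util.Arrays;
import java.util.List;

final class FoundAnyNodeIdsSketch {
    public static void main(String[] args) {
        // simulated last-allocated node IDs per shard copy; null means nothing was carried over
        final List<String> lastAllocatedNodeIds = Arrays.asList("node-1", null, "node-2");
        boolean foundAnyNodeIds = false;
        for (final String nodeId : lastAllocatedNodeIds) {
            if (nodeId != null) {
                foundAnyNodeIds = true; // asserting this per shard could fail after a restore
            }
        }
        // both the original and the restored index have at least one shard, so this holds
        assert foundAnyNodeIds;
    }
}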
LGTM.
* main: (163 commits)
  [DOCS] Edits frequent items aggregation (elastic#91564)
  Handle providers of optional services in ubermodule classloader (elastic#91217)
  Add `exportDockerImages` lifecycle task for exporting docker tarballs (elastic#91571)
  Fix CSV dependency report output file location in DRA CI job
  Fix variable placeholder for Strings.format calls (elastic#91531)
  Fix output dir creation in ConcatFileTask (elastic#91568)
  Fix declaration of dependencies in DRA snapshots CI job (elastic#91569)
  Upgrade Gradle Enterprise plugin to 3.11.4 (elastic#91435)
  Ingest DateProcessor (small) speedup, optimize collections code in DateFormatter.forPattern (elastic#91521)
  Fix inter project handling of generateDependenciesReport (elastic#91555)
  [Synthetics] Add synthetics-* read to fleet-server (elastic#91391)
  [ML] Copy more settings when creating DF analytics destination index (elastic#91546)
  Reduce CartesianCentroidIT flakiness (elastic#91553)
  Propagate last node to reinitialized routing tables (elastic#91549)
  Forecast write load during rollovers (elastic#91425)
  [DOCS] Warn about potential overhead of named queries (elastic#91512)
  Datastream unavailable exception metadata (elastic#91461)
  Generate docker images and dependency report in DRA ci job (elastic#91545)
  Support cartesian_bounds aggregation on point and shape (elastic#91298)
  Add support for EQL samples queries (elastic#91312)
  ...

# Conflicts:
#	x-pack/plugin/rollup/src/main/java/org/elasticsearch/xpack/downsample/RollupShardIndexer.java