-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow _shrink
to N shards if source shards is a multiple of N
#18699
Conversation
if (docsStats != null) { | ||
count += docsStats.getCount(); | ||
} | ||
if (count >= IndexWriter.MAX_DOCS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should it be count > IndexWriter.MAX_DOCS
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah true :)
This is not the part of the codebase I'm most familiar with, but it looks good. |
this is ready for review, @clintongormley I messed with docs you might wanna make sure the art is not damaged |
every shard in the index must be present on the same node. | ||
with fewer primary shards. The number of primary shards in the target index | ||
must be a factor of the shards in the source index. For example an index with | ||
`8` primary shards can be shrunk into `4`, `2` or `1` primary shard or an index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or 1
primary shardS
@s1monw couple of small comments, otherwise LGTM |
@@ -299,15 +297,20 @@ public ClusterState execute(ClusterState currentState) throws Exception { | |||
|
|||
indexSettingsBuilder.put(SETTING_INDEX_UUID, UUIDs.randomBase64UUID()); | |||
final Index shrinkFromIndex = request.shrinkFrom(); | |||
int routingFactor = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dead code
Can you also update the estimate shard size for shrinked indices (#18659)? I was also wondering if we can relax the condition to have all shards active on one node... i.e. reduce it to the ones that are needed for local recovery (selectShrinkShards for the shard that is being recovered). |
all of those should be done in a follow up. I am currently not very in favor to reduce that limitation just now |
ok, Iet's do a follow-up for estimated shard size then. LGTM |
@ywelsch I misunderstood, I agree we should fix the estimated shard size here - the node restriction should be a followup. I will push a new commit later or next week |
@ywelsch I pushed a new commit |
@@ -298,17 +298,22 @@ public void testSizeShrinkIndex() { | |||
ImmutableOpenMap.Builder<String, Long> shardSizes = ImmutableOpenMap.builder(); | |||
shardSizes.put("[test][0][p]", 10L); | |||
shardSizes.put("[test][1][p]", 100L); | |||
shardSizes.put("[test][2][p]", 1000L); | |||
shardSizes.put("[test][2][p]", 500L); | |||
shardSizes.put("[test][3][p]", 500l); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think checkstyle will fail with the lowercase L.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
picky bastard that checkstyle
LGTM. Thanks @s1monw |
Today we allow to shrink to 1 shard but that might not be possible due to too many documnent or a single shard doesn't meet the requirements for the index. The logic can be expanded to N shards if the source index shards is a mutiple of N. This guarantees that there are not hotspots created due to different number of shards being shrunk into one.
The `_routing` metadata field docs currently include formulas for how Elasticsearch routes documents to shards. However, these formulas were not updated for elastic#18699. This updates the routing formulas and adds xrefs for related settings. Closes elastic#76072
The `_routing` metadata field docs currently include formulas for how Elasticsearch routes documents to shards. However, these formulas were not updated for elastic#18699. This updates the routing formulas and adds xrefs for related settings. Closes elastic#76072
The `_routing` metadata field docs currently include formulas for how Elasticsearch routes documents to shards. However, these formulas were not updated for #18699. This updates the routing formulas and adds xrefs for related settings. Closes #76072 Co-authored-by: James Rodewig <[email protected]>
The `_routing` metadata field docs currently include formulas for how Elasticsearch routes documents to shards. However, these formulas were not updated for #18699. This updates the routing formulas and adds xrefs for related settings. Closes #76072 Co-authored-by: James Rodewig <[email protected]>
Today we allow to shrink to 1 shard but that might not be possible due to
too many documents or a single shard doesn't meet the requirements for the index.
The logic can be expanded to N shards if the source index shards is a multiple of N.
This guarantees that there are not hotspots created due to different number of shards
being shrunk into one.
There is still some work to do on the documentation end etc. but I wanted to get it out there to get initial feedback if we should do it at all.