Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow _shrink to N shards if source shards is a multiple of N #18699

Merged
merged 9 commits into from
Jun 7, 2016

Conversation

s1monw
Copy link
Contributor

@s1monw s1monw commented Jun 2, 2016

Today we allow to shrink to 1 shard but that might not be possible due to
too many documents or a single shard doesn't meet the requirements for the index.
The logic can be expanded to N shards if the source index shards is a multiple of N.
This guarantees that there are not hotspots created due to different number of shards
being shrunk into one.

There is still some work to do on the documentation end etc. but I wanted to get it out there to get initial feedback if we should do it at all.

if (docsStats != null) {
count += docsStats.getCount();
}
if (count >= IndexWriter.MAX_DOCS) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it be count > IndexWriter.MAX_DOCS?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah true :)

@jpountz
Copy link
Contributor

jpountz commented Jun 2, 2016

This is not the part of the codebase I'm most familiar with, but it looks good.

@s1monw s1monw removed the WIP label Jun 3, 2016
@s1monw
Copy link
Contributor Author

s1monw commented Jun 3, 2016

this is ready for review, @clintongormley I messed with docs you might wanna make sure the art is not damaged

every shard in the index must be present on the same node.
with fewer primary shards. The number of primary shards in the target index
must be a factor of the shards in the source index. For example an index with
`8` primary shards can be shrunk into `4`, `2` or `1` primary shard or an index
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or 1 primary shardS

@clintongormley
Copy link
Contributor

@s1monw couple of small comments, otherwise LGTM

@@ -299,15 +297,20 @@ public ClusterState execute(ClusterState currentState) throws Exception {

indexSettingsBuilder.put(SETTING_INDEX_UUID, UUIDs.randomBase64UUID());
final Index shrinkFromIndex = request.shrinkFrom();
int routingFactor = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dead code

@ywelsch
Copy link
Contributor

ywelsch commented Jun 3, 2016

Can you also update the estimate shard size for shrinked indices (#18659)? I was also wondering if we can relax the condition to have all shards active on one node... i.e. reduce it to the ones that are needed for local recovery (selectShrinkShards for the shard that is being recovered).

@s1monw
Copy link
Contributor Author

s1monw commented Jun 3, 2016

Can you also update the estimate shard size for shrinked indices (#18659)? I was also wondering if we can relax the condition to have all shards active on one node... i.e. reduce it to the ones that are needed for local recovery (selectShrinkShards for the shard that is being recovered).

all of those should be done in a follow up. I am currently not very in favor to reduce that limitation just now

@ywelsch
Copy link
Contributor

ywelsch commented Jun 3, 2016

ok, Iet's do a follow-up for estimated shard size then. LGTM

@s1monw
Copy link
Contributor Author

s1monw commented Jun 3, 2016

ok, Iet's do a follow-up for estimated shard size then. LGTM

@ywelsch I misunderstood, I agree we should fix the estimated shard size here - the node restriction should be a followup. I will push a new commit later or next week

@s1monw
Copy link
Contributor Author

s1monw commented Jun 6, 2016

@ywelsch I pushed a new commit

@@ -298,17 +298,22 @@ public void testSizeShrinkIndex() {
ImmutableOpenMap.Builder<String, Long> shardSizes = ImmutableOpenMap.builder();
shardSizes.put("[test][0][p]", 10L);
shardSizes.put("[test][1][p]", 100L);
shardSizes.put("[test][2][p]", 1000L);
shardSizes.put("[test][2][p]", 500L);
shardSizes.put("[test][3][p]", 500l);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think checkstyle will fail with the lowercase L.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

picky bastard that checkstyle

@ywelsch
Copy link
Contributor

ywelsch commented Jun 6, 2016

LGTM. Thanks @s1monw

s1monw added 5 commits June 7, 2016 10:02
Today we allow to shrink to 1 shard but that might not be possible due to
too many documnent or a single shard doesn't meet the requirements for the index.
The logic can be expanded to N shards if the source index shards is a mutiple of N.
This guarantees that there are not hotspots created due to different number of shards
being shrunk into one.
@s1monw s1monw merged commit b2c4c32 into elastic:master Jun 7, 2016
@lcawl lcawl added :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. and removed :Allocation labels Feb 13, 2018
jrodewig added a commit that referenced this pull request Aug 9, 2021
The `_routing` metadata field docs currently include formulas for how
Elasticsearch routes documents to shards. However, these formulas were not
updated for #18699.  This updates the routing formulas and adds xrefs for
related settings.

Closes #76072
elasticsearchmachine pushed a commit to elasticsearchmachine/elasticsearch that referenced this pull request Aug 9, 2021
The `_routing` metadata field docs currently include formulas for how
Elasticsearch routes documents to shards. However, these formulas were not
updated for elastic#18699.  This updates the routing formulas and adds xrefs for
related settings.

Closes elastic#76072
elasticsearchmachine pushed a commit to elasticsearchmachine/elasticsearch that referenced this pull request Aug 9, 2021
The `_routing` metadata field docs currently include formulas for how
Elasticsearch routes documents to shards. However, these formulas were not
updated for elastic#18699.  This updates the routing formulas and adds xrefs for
related settings.

Closes elastic#76072
elasticsearchmachine added a commit that referenced this pull request Aug 9, 2021
The `_routing` metadata field docs currently include formulas for how
Elasticsearch routes documents to shards. However, these formulas were not
updated for #18699.  This updates the routing formulas and adds xrefs for
related settings.

Closes #76072

Co-authored-by: James Rodewig <[email protected]>
elasticsearchmachine added a commit that referenced this pull request Aug 9, 2021
The `_routing` metadata field docs currently include formulas for how
Elasticsearch routes documents to shards. However, these formulas were not
updated for #18699.  This updates the routing formulas and adds xrefs for
related settings.

Closes #76072

Co-authored-by: James Rodewig <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. >enhancement v5.0.0-alpha4
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants