Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add expectedShardSize to ShardRouting and use it in path.data allocation #12947

Merged
merged 2 commits into from
Aug 21, 2015

Conversation

s1monw
Copy link
Contributor

@s1monw s1monw commented Aug 17, 2015

Today we only guess how big the shard will be that we are allocating on a node.
Yet, we have this information on the master but it's not available on the data nodes
when we pick a data path for the shard. We use some rather simple heuristic based on
existing shard sizes on this node which might be complete bogus. This change adds
the expected shard size to the ShardRouting for RELOCATING and INITIALIZING shards
to be used on the actual node to find the best data path for the shard.

Closes #11271

@s1monw s1monw added >enhancement blocker :Allocation :Distributed Indexing/Store Issues around managing unopened Lucene indices. If it touches Store.java, this is a likely label. v2.0.0 labels Aug 17, 2015
@@ -54,6 +54,11 @@ public Long getShardSize(ShardRouting shardRouting) {
return shardSizes.get(shardIdentifierFromRouting(shardRouting));
}

public long getShardSize(ShardRouting shardRouting, long defaultValue) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of passing defaultValue here, can we just return ShardRouting.UNAVAILABLE_EXPECTED_SHARD_SIZE? And maybe move that constant here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

others use thsi as well and they need 0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't they check the constant and do their own "use 0 instead"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, thinking about it more ... I think I like the required defaultValue arg: it avoids sneaky bugs when the caller fails to check for the -1.

@mikemccand
Copy link
Contributor

I left a couple comments, and I don't understand all the places where you had to punch expectedShardSize through, but big +1 to sending expectedShardSize to the node so it can make a more informed decision when it has multiple path.data.

@clintongormley
Copy link
Contributor

@s1monw what happens in this scenario: A node has three data paths and one shard, located on the data path with least capacity. The shard grows to fill 99% of its data path...

@s1monw
Copy link
Contributor Author

s1monw commented Aug 18, 2015

@s1monw what happens in this scenario: A node has three data paths and one shard, located on the data path with least capacity. The shard grows to fill 99% of its data path...

it blows up - this is out of scope in this PR that's a problem of the disk-allocation decider and needs to be fixed in a different PR

@s1monw s1monw added the review label Aug 19, 2015
@s1monw
Copy link
Contributor Author

s1monw commented Aug 19, 2015

if nobody objects I will push this in the next 72 hours

…ation

Today we only guess how big the shard will be that we are allocating on a node.
Yet, we have this information on the master but it's not available on the data nodes
when we pick a data path for the shard. We use some rather simple heuristic based on
existing shard sizes on this node which might be complete bogus. This change adds
the expected shard size to the ShardRouting for RELOCATING and INITIALIZING shards
to be used on the actual node to find the best data path for the shard.

Closes elastic#11271
s1monw added a commit that referenced this pull request Aug 21, 2015
Add `expectedShardSize` to ShardRouting and use it in path.data allocation
@s1monw s1monw merged commit 3fb2d8e into elastic:master Aug 21, 2015
@s1monw s1monw deleted the expected_shard_size branch August 21, 2015 06:34
@jpountz
Copy link
Contributor

jpountz commented Aug 21, 2015

@s1monw I can only find this commit in master, not in the 2.0 branch, is it expected?

s1monw added a commit to s1monw/elasticsearch that referenced this pull request Aug 21, 2015
Add `expectedShardSize` to ShardRouting and use it in path.data allocation
@s1monw
Copy link
Contributor Author

s1monw commented Aug 21, 2015

@jpountz I ported it just now... wanted to give it some CI time first

@jpountz jpountz removed the v2.1.0 label Aug 21, 2015
@s1monw s1monw added v2.1.0 and removed review labels Aug 21, 2015
@clintongormley clintongormley removed the :Distributed Indexing/Store Issues around managing unopened Lucene indices. If it touches Store.java, this is a likely label. label Aug 24, 2015
@lcawl lcawl added :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. and removed :Allocation labels Feb 13, 2018
@clintongormley clintongormley added :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) and removed :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. labels Feb 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blocker :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) >enhancement v2.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants