Speedup BalanceUnbalancedClusterTests #88794
This commit speeds up the above tests (and probably many others) by changing how we assert the invariant. Previously, the invariant was checked twice after every single modification, each time by rebuilding the internal collections from scratch and comparing them against the ones already present in the object. This commit verifies the invariant once after all bulk changes.
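As a rough illustration of the change described above, here is a minimal sketch of "add without validation, then check once". `NodeSketch`, its fields, and `bulkLoad` are hypothetical stand-ins, not the Elasticsearch `RoutingNode` code; only the names `invariant` and `addWithoutValidation` echo this PR.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a routing node; not the Elasticsearch RoutingNode class.
final class NodeSketch {
    // Source of truth: shard id -> whether that shard is initializing.
    private final Map<String, Boolean> shards = new HashMap<>();
    // Derived collection that must stay consistent with `shards`.
    private final Set<String> initializingShards = new HashSet<>();
    // Counts invariant runs so the cost difference is observable.
    int invariantRuns = 0;

    // Validating add: pays for a full rebuild on every call when assertions are enabled.
    void add(String shardId, boolean initializing) {
        addWithoutValidation(shardId, initializing);
        assert invariant();
    }

    // Bulk-friendly add: the caller is expected to check invariant() afterwards.
    void addWithoutValidation(String shardId, boolean initializing) {
        shards.put(shardId, initializing);
        if (initializing) {
            initializingShards.add(shardId);
        } else {
            initializingShards.remove(shardId);
        }
    }

    // Rebuilds the derived set from scratch (linear in the number of shards)
    // and compares it with the incrementally maintained one.
    boolean invariant() {
        invariantRuns++;
        Set<String> rebuilt = new HashSet<>();
        for (Map.Entry<String, Boolean> entry : shards.entrySet()) {
            if (entry.getValue()) {
                rebuilt.add(entry.getKey());
            }
        }
        return rebuilt.equals(initializingShards);
    }

    // Bulk construction: many cheap adds followed by a single linear check.
    static NodeSketch bulkLoad(int shardCount) {
        NodeSketch node = new NodeSketch();
        for (int i = 0; i < shardCount; i++) {
            node.addWithoutValidation("shard-" + i, i % 2 == 0);
        }
        assert node.invariant();
        return node;
    }
}
```

With a per-add check, loading n shards costs on the order of n² work in total because every add rebuilds the whole set; with one check at the end it is linear, which is where a test with thousands of shards would see the difference.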
Pinging @elastic/es-distributed (Team:Distributed)
}

void update(ShardRouting oldShard, ShardRouting newShard) {
    assert invariant();
The invariant is also asserted after the operation, so I do not think it needs to be checked twice.
Nice!
@@ -145,6 +145,9 @@ private RoutingNodes(RoutingTable routingTable, DiscoveryNodes discoveryNodes, b
}
}
}
for (var node : nodesToShards.values()) {
    assert node.invariant();
}
Invariant checking is expensive for big collections, so it is verified once at the end instead of after every operation.
This is constructing RoutingNodes from RoutingTable, so no concurrent additions are expected.
That makes sense. One suggestion: I think we can do the invariant check in one line with a method reference:
nodesToShards.values().forEach(RoutingNode::invariant)
Link #88951 :)
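One caveat with the one-line `forEach(RoutingNode::invariant)` form is worth spelling out: it calls the method unconditionally and discards the returned boolean, so the check runs even when assertions are disabled and never fails. The sketch below uses a hypothetical `AssertOnlySketch` class (not code from this PR) to contrast that with moving the loop into a separate boolean method that is only evaluated as the operand of `assert`.

```java
import java.util.List;

// Hypothetical illustration; none of these names come from Elasticsearch.
final class AssertOnlySketch {
    // Counts how often the "expensive" check actually runs.
    static int calls = 0;

    static boolean expensiveInvariant() {
        calls++;
        return true;
    }

    // Runs the check for every element whether or not -ea is set,
    // and ignores the result, so a violated invariant goes unnoticed.
    static void checkEagerly(List<?> nodes) {
        nodes.forEach(node -> expensiveInvariant());
    }

    // The whole loop lives in a boolean method ...
    static boolean allInvariantsHold(List<?> nodes) {
        for (Object node : nodes) {
            if (!expensiveInvariant()) {
                return false;
            }
        }
        return true;
    }

    // ... that is only evaluated when assertions are enabled.
    static void checkUnderAssert(List<?> nodes) {
        assert allInvariantsHold(nodes);
    }
}
```

When the JVM runs without `-ea`, `checkUnderAssert` does no work at all, while `checkEagerly` still pays the full cost.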
    shardRoutingsRelocating.add(shard);
}
shardRoutingsByIndex.computeIfAbsent(shard.index(), k -> new HashSet<>(10)).add(shard);
}
This was rebuilt for the assertion twice after every shard addition to the routing node.
This was especially slow for a test with 4k shards.
I wonder if we could do something more stupid instead and just skip the expensive bit of the invariant check most of the time (perhaps 99% of the time if the cluster is "large" in some sense)?
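A minimal sketch of what such sampling could look like, purely as an illustration of the idea above. The class name, the threshold of 1,000, and the 1-in-100 rate are all assumptions chosen for the example; nothing like this is part of the PR.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of a sampled invariant check; the constants are arbitrary.
final class SampledInvariantSketch {
    // Assumed cut-off above which a collection counts as "large".
    static final int LARGE_SIZE = 1_000;
    // Assumed sampling rate for large collections: roughly 1 check in 100.
    static final int SAMPLE_ONE_IN = 100;

    // Pure decision function so the policy can be tested deterministically.
    static boolean shouldRunExpensiveCheck(int size, int roll) {
        return size < LARGE_SIZE || roll == 0;
    }

    // Small collections are always checked; large ones only when the roll hits 0.
    static boolean invariant(int size, BooleanSupplier expensiveCheck) {
        int roll = ThreadLocalRandom.current().nextInt(SAMPLE_ONE_IN);
        if (!shouldRunExpensiveCheck(size, roll)) {
            return true; // skipped this time
        }
        return expensiveCheck.getAsBoolean();
    }
}
```

The trade-off is that a violation in a large cluster may only be caught on a later call, in exchange for leaving the per-operation assertion in place.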
Looking good! I've left a few comments, thank you for working on this Ievgen!
if (shard.relocating()) {
    shardRoutingsRelocating.add(shard);
}
shardRoutingsByIndex.computeIfAbsent(shard.index(), k -> new HashSet<>(10)).add(shard);
Does it make sense to use Sets.newHashSetWithExpectedSize here to correctly pre-size the HashSet?
Unfortunately, we do not know the expected size here. I used an arbitrary number to avoid the initial resize(s).
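For context on the pre-sizing question: the argument to `new HashSet<>(n)` is an initial capacity, not an expected element count. With the default load factor of 0.75, holding n elements without a resize needs a capacity of at least n / 0.75. The helper below is a hypothetical sketch of that arithmetic for the case where the size is known; it is not the Elasticsearch `Sets` utility.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper showing the capacity arithmetic; not the Elasticsearch Sets utility.
final class PresizeSketch {
    // Smallest initial capacity that holds expectedSize elements
    // at the default load factor of 0.75 without triggering a resize.
    static int capacityFor(int expectedSize) {
        return (int) Math.ceil(expectedSize / 0.75);
    }

    // Creates an empty HashSet sized for the given number of elements.
    static <T> Set<T> presizedHashSet(int expectedSize) {
        return new HashSet<>(capacityFor(expectedSize));
    }
}
```

When the size is not known in advance, as in the reply above, a small fixed capacity is a reasonable compromise.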
assert initializingShards.size() == shardRoutingsInitializing.size();
assert initializingShards.containsAll(shardRoutingsInitializing);
boolean invariant() {
    var shardRoutingsInitializing = new ArrayList<ShardRouting>(shards.size());
WDYT about declaring shardRoutingsInitializing and shardRoutingsRelocating as HashSet? Then we can do the assertions as simple equals calls:
assert initializingShards.equals(shardRoutingsInitializing);
assert relocatingShards.equals(shardRoutingsRelocating);
I was thinking about it, but then decided to keep the Lists to avoid calculating a hash for each item.
hashCode is cached for ShardRouting, but I agree that it does seem redundant in this case, because these new collections are never queried.
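To make the two options in this thread concrete, here is a small sketch comparing the size-plus-`containsAll` check against a rebuilt `List` with a plain `equals` against a rebuilt `HashSet`. `CompareSketch` and its method names are hypothetical. One difference to keep in mind: the list-based form is only equivalent if the rebuilt list cannot contain duplicates, which this sketch assumes holds for shards rebuilt from a node.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical comparison helpers; not code from this PR.
final class CompareSketch {
    // List-based form: no extra set is built for the rebuilt side,
    // but a duplicate in `rebuilt` can mask a missing element.
    static <T> boolean sameBySizeAndContainsAll(Set<T> maintained, List<T> rebuilt) {
        return maintained.size() == rebuilt.size() && maintained.containsAll(rebuilt);
    }

    // Set-based form: strict set equality, at the cost of building a second HashSet.
    static <T> boolean sameByEquals(Set<T> maintained, List<T> rebuilt) {
        return maintained.equals(new HashSet<>(rebuilt));
    }
}
```

For example, with a maintained set of {a, b} and a rebuilt list of [a, a], the list-based check passes while the set-based one fails.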
    addInternal(shard, true);
}

void addNoValidate(ShardRouting shard) {
How about addWithoutValidation? It sounds more natural to me.
LGTM! Thanks Ievgen!
* upstream/main:
  Add 8.5 migration docs (elastic#88923)
  Script: Reindex & UpdateByQuery Metadata (elastic#88665)
  Remove unused plugins dir var from server CLI (elastic#88917)
  Use tracing API in TaskManager (elastic#88885)
  Add source fallback for keyword fields using operation (elastic#88735)
  Prune changelogs after 8.3.3 release
  Bump versions after 8.3.3 release
  Add a test for checking for misspelled "dry_run" parameters for Desired Nodes API (elastic#88898)
  Speedup BalanceUnbalancedClusterTests (elastic#88794)
  Preventing exceptions on node shutdown in integration tests (elastic#88827)
  Do not trigger check part3 for test mute and docs PRs (elastic#88895)
  Add troubleshooting docs about data corruption (elastic#88760)
  Mute RollupActionSingleNodeTests#testRollupDatastream (elastic#88891)
  [DOCS] Domain splitting impacts API keys (elastic#88677)
  Fix SqlSearchIT testAllTypesWithRequestToOldNodes (elastic#88866) (elastic#88883)
  Update synthetic-source.asciidoc (elastic#88880)
  Log more details in TaskAssertions (elastic#88864)
  Make Tuple a record (elastic#88280)
This needs to be in a separate method; it is currently running in production and uses significant CPU time. Broken in elastic#88794
This commit speeds up the above tests (and probably many others) by changing how
we assert the invariant. Previously, the invariant was checked twice after every
single modification, each time by rebuilding the internal collections from
scratch and comparing them against the ones already present in the object. This
commit verifies the invariant once after all bulk changes.
Closes: #12629