-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add validation of the number_of_shards parameter in Shrink Action of ILM #74219
Conversation
Pinging @elastic/es-core-features (Team:Core/Features) |
if (numberOfShards != null) { | ||
int sourceNumberOfShards = indexMetadata.getNumberOfShards(); | ||
int factor = sourceNumberOfShards / numberOfShards; | ||
if (factor * numberOfShards < sourceNumberOfShards) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, is there any particular reason you didn't go with if (sourceNumberOfShards % numberOfShards != 0) {
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a habit, %
is better, I think.
@Override | ||
public void performAction(IndexMetadata indexMetadata, ClusterState clusterState, | ||
ClusterStateObserver observer, ActionListener<Boolean> listener) { | ||
String indexName = indexMetadata.getIndex().getName(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about refactoring all of this into some new Step
? I like the logic overall, and I agree that we should do the validation before the single allocation starts, I'm just not sure it belongs within this step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's a good idea, I will refactor the code soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi @joegallo , I've refactor the code yet, can you take a look at the new commit?
@elasticmachine ok to test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for iterating on this @gaobinlong
I've left a few suggestions (and I suspect there'll be a few ITs failures) but this is looking good
...k/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/CheckTargetShardsCountStep.java
Outdated
Show resolved
Hide resolved
...k/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/CheckTargetShardsCountStep.java
Show resolved
Hide resolved
...k/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/CheckTargetShardsCountStep.java
Outdated
Show resolved
Hide resolved
@andreidan, the issues you mentioned above have been resolved now, I noticed the test failures and I'm still working on fixing them. |
@andreidan , I have no idea about how to fix the test failure in ClusterStateWaitThresholdBreachTests#testWaitInShrunkShardsAllocatedExceedsThreshold: Line 109 in 2c5e406
, as we add a elasticsearch/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/ShrinkAction.java Line 193 in 2c5e406
, maybe we can fail the Line 177 in 2c5e406
, but I don't know how to mock the DocsStats .Can you give me some cue to fix the failure? |
@gaobinlong Great point. I've opened #75695 to make this particular test rely on the shrunk index not being able to allocate (as opposed to the managed index not being able to shrink successfully as it was before). If you merge the main branch into your branch after #75695 is merged things should start working for this particular test (there might be others that use the same strategy for failing the shrink action - ie. higher configured number of shards - so they'll need a similar change to |
@andreidan thanks for your help, I've fixed other thest failures except the ShrinkActionIT#testShrinkStepMovesForwardIfShrunkIndexIsCreatedBetweenRetries: Line 306 in 2dcc337
, as it's hard to make the ShrinkStep failed and ShrinkAction will stuck in the new added CheckTargetShardsCountStep if the target shards count is higher than the source index's shards count, so I delete the test method directly. Can you help to take a look at the new commits?
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for iterating on this @gaobinlong
I think this is nearly complete.
I agree testShrinkStepMovesForwardIfShrunkIndexIsCreatedBetweenRetries
is not feasible anymore, however can you please create a unit test in ShrinkStepTests
instead? (basically have a clusterState
that already contains the shruken index and assert the success
listener method is called once by performAction
)
@andreidan,I have done that, can you help to take a look again? |
@joegallo as @andreidan is on vacation this week, could you take another peek at this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for iterating on this @gaobinlong
…n of ILM (elastic#74219) Add validation of the number_of_shards parameter in Shrink Action of ILM (cherry picked from commit 58feb4e) Signed-off-by: Andrei Dan <[email protected]>
…n of ILM (#74219) (#76565) Add validation of the number_of_shards parameter in Shrink Action of ILM (cherry picked from commit 58feb4e) Signed-off-by: Andrei Dan <[email protected]> Co-authored-by: bellengao <[email protected]>
* master: (868 commits) Query API key - Rest spec and yaml tests (elastic#76238) Delay shard reassignment from nodes which are known to be restarting (elastic#75606) Reenable bwc tests for elastic#76475 (elastic#76576) Set version to 7.15 in BWC code (elastic#76577) Don't remove warning headers on all failure (elastic#76434) Disable bwc tests for elastic#76475 (elastic#76541) Re-enable bwc tests (elastic#76567) Keep track of data recovered from snapshots in RecoveryState (elastic#76499) [Transform] Align transform checkpoint range with date_histogram interval for better performance (elastic#74004) EQL: Remove "wildcard" function (elastic#76099) Fix 'accept' and 'content_type' fields for search_mvt API Add persistent licensed feature tracking (elastic#76476) Add system data streams to feature state snapshots (elastic#75902) fix the error message for instance methods that don't exist (elastic#76512) ILM: Add validation of the number_of_shards parameter in Shrink Action of ILM (elastic#74219) remove dashboard only reserved role (elastic#76507) Fix Stack Overflow in UnassignedInfo in Corner Case (elastic#76480) Add (Extended)KeyUsage KeyUsage, CipherSuite & Protocol to SSL diagnostics (elastic#65634) Add recovery from snapshot to tests (elastic#76535) Reenable BwC Tests after elastic#76532 (elastic#76534) ...
Relates to #72724.
The main changes of the PR are:
number_of_shards
parameter in theSetSingleNodeAllocateStep
, fail the step ifnumber_of_shards
is not a factor of the source index's shards count.