Control concurrency and add retry action in decommission flow #4684
Conversation
Signed-off-by: Rishab Nahata <[email protected]>
Adding comments from the local PR here (imRishN#66):
gbbafna (8 days ago): Do we need the remainingTimeoutMS check only here? Why do we need it?
Reply (8 days ago): Let me know if this answers your question, @gbbafna.
@shwetathareja @Bukhtawar need your help reviewing this.
Gradle Check (Jenkins) Run Completed with:
Codecov Report
    @@             Coverage Diff              @@
    ##               main    #4684      +/-   ##
    ============================================
    + Coverage     70.52%   70.61%   +0.08%
    - Complexity    57525    57573      +48
    ============================================
      Files          4654     4654
      Lines        276978   277036      +58
      Branches      40525    40529       +4
    ============================================
    + Hits         195348   195637     +289
    + Misses        65229    64941     -288
    - Partials      16401    16458      +57
        return;
    }
    decommissionRequest.setRetryOnClusterManagerChange(true);
    decommissionRequest.setRetryTimeout(TimeValue.timeValueMillis(remainingTimeoutMS));
This will not be used in the new cluster manager. The retry timeout is checked only at the time of cluster manager abdication and nowhere else. Hence, more often than not (66% of the time in a 3 cluster manager setup), this parameter is never used within the scope of this PR. What are the cons of removing it altogether?
For an unstable cluster with multiple leader switches, if no timeout is given, we might get stuck in an endless loop of retrying and exhaust a transport thread. This timeout helps to control the retry action. I agree that we might not hit this for a stable cluster, but I feel it will help to reject work that the unstable cluster might not be able to execute later on.
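The argument for keeping the timeout can be illustrated with a minimal sketch in plain Java, with no OpenSearch dependencies. It shows one way to carry a single time budget across retries, so that each retry gets only the time left and an exhausted budget fails fast instead of looping. All names here (`RetryBudgetSketch`, `remainingTimeoutMillis`, `retryIfBudgetRemains`) are hypothetical and not taken from this PR, and the use of nanosecond timestamps is an assumption; the PR's actual `retryDecommissionAction` may work differently.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.LongConsumer;

/** Hypothetical sketch of a bounded retry budget; not the PR's implementation. */
final class RetryBudgetSketch {

    private RetryBudgetSketch() {}

    /**
     * Milliseconds left in the overall budget, measured from the start of the
     * first attempt. Zero or negative means the budget is exhausted.
     */
    static long remainingTimeoutMillis(long startTimeNanos, long nowNanos, long totalTimeoutMillis) {
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(nowNanos - startTimeNanos);
        return totalTimeoutMillis - elapsedMillis;
    }

    /**
     * Dispatches a retry with the remaining budget, or reports a timeout
     * without retrying. Returns true if a retry was dispatched.
     */
    static boolean retryIfBudgetRemains(
        long startTimeNanos,
        long nowNanos,
        long totalTimeoutMillis,
        LongConsumer retryWithTimeoutMillis, // assumed hook that re-sends the request
        Consumer<Exception> onTimeout        // assumed hook that fails the caller's listener
    ) {
        long remaining = remainingTimeoutMillis(startTimeNanos, nowNanos, totalTimeoutMillis);
        if (remaining <= 0) {
            // Fail fast: repeated leader switches cannot keep the request alive forever.
            onTimeout.accept(new IllegalStateException(
                "retry budget of " + totalTimeoutMillis + " ms exhausted"));
            return false;
        }
        // The retry inherits only what is left, so the total wait stays bounded.
        retryWithTimeoutMillis.accept(remaining);
        return true;
    }
}
```

In this sketch a caller would pass `System.nanoTime()` as `nowNanos` each time a leader switch triggers a retry, keeping `startTimeNanos` fixed at the first attempt. Because the budget only shrinks, the number of retries on an unstable cluster is bounded by elapsed time rather than by the number of leader switches.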
     * @param startTime start time of previous request
     * @param listener callback for the retry action
     */
    public void retryDecommissionAction(
Should we rename this to tryDecommissionOnNewClusterManager, as it is used only there for now?
The controller holds more generic methods that can be used in the service layer. This method just attempts a retry, hence the name. The place where we are using it is during a leader switch (one use case of it).
Let me know if you think it makes sense. I can rename it if required.
Description
#4084 (comment)
These are the changes implemented as part of this PR -
Issues Resolved
#4543
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.