-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added uniformAcrossAzUpdate strategy #27
Conversation
- Added Unit tests (More to come)
@narayanan: This is awesome! Thanks a lot for implementing this feature. Certainly makes upgrades more flexible. |
- `strategy.type`: This field is optional and currently only random update strategy is supported. Refer to [random_update_strategy.yaml](examples/random_update_strategy.yaml) for sample custom resource definition. | ||
- `strategy.type`: This field is optional and currently two strategies are supported | ||
- `randomUpdate` - Default is type is not specified. Picks nodes randomly for updating. Refer to [random_update_strategy.yaml](examples/random_update_strategy.yaml) for sample custom resource definition. | ||
- `uniformAcrossAzUpdate` - Picks same number of nodes or same percentage of nodes from each AZ for update. Refer to [uniform_across_az_update_strategy.yaml](examples/uniform_across_az_update_strategy.yaml) for sample custom resource definition. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the behavior when the strategy.type is uniformAcrossAzUpdate
and maxUnavailable is say 3? Does it mean that there will be 3 nodes per az that will be taken down at a time or will it be 1 node per az (assuming there are 3 zones).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3 nodes per AZ will be taken down. So if there are 3 AZ's, total a max of 9 nodes will be restacked on each iteration
controllers/helpers.go
Outdated
func getMaxUnavailable(strategy upgrademgrv1alpha1.UpdateStrategy, totalNodes int) int { | ||
// Below are the constants set in intstr package | ||
// const ( | ||
// Int Type = iota // The IntOrString holds an int. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these comments necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These were there already, I moved the method from rolling_upgrade_controller,go
to helpers.go
as rolling_upgrade_controller,go
grew to 1000+ lines.
I can remove them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
// maxUnavailable has to be atleast 1 when there are nodes in the ASG | ||
if totalNodes > 0 && maxUnavailable < 1 { | ||
maxUnavailable = 1 | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth logging what the computed value of maxUnavailable is. This would help debugging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done... added log messages from where this method is getting called, decided add the logging at the calling place as callers have more relevant context and can be logged with more relevant details.
@@ -0,0 +1,174 @@ | |||
package controllers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job on writing these tests! Fabulous!!
Thanks for your contributions @narayanan! |
* check error message instead of code for instance not found * fix test * Fixes for autoscaling API changes (keikoproj#26) * Add semaphore.yml * build * build * squashed commits * remove badge * readme * test * pr * fix kafka issue * update to fix potential throttling issue * Mod fix (#7) * revert go sum * fix mod build * add go mod files for logger * Fix terminate error (#8) * check error message instead of code for instance not found * fix test * add build badge * fix image * Release v0.2 (keikoproj#28) * Bump version to 0.3-dev. (keikoproj#29) * Fix keikoproj#30 and release v0.3 (keikoproj#31) Testing Done: - Verified that the new docker image can be built with aws-sdk v1.25.0 - Verified that rolling upgrade actually completed. * Bump version to 0.4-dev. (keikoproj#32) * Added uniformAcrossAzUpdate strategy (keikoproj#27) * Modified ClusterState to capture AZ details of a node * - Added UniformAcrossAzUpdate strategy - Added Unit tests (More to come) * Added Unit tests * Updated validation * Added uniformAcrossAzUpdate update strategy sample * Updated README * Externalize node selectors behind interface * Removed unnecessary comments * Log maxUnavailable value
* Add semaphore.yml * build * build * squashed commits * remove badge * readme * test * pr * fix kafka issue * update to fix potential throttling issue * Mod fix (#7) * revert go sum * fix mod build * add go mod files for logger * Fix terminate error (#8) * check error message instead of code for instance not found * fix test * Add badge (#9) * check error message instead of code for instance not found * fix test * Fixes for autoscaling API changes (#26) * Add semaphore.yml * build * build * squashed commits * remove badge * readme * test * pr * fix kafka issue * update to fix potential throttling issue * Mod fix (#7) * revert go sum * fix mod build * add go mod files for logger * Fix terminate error (#8) * check error message instead of code for instance not found * fix test * add build badge * fix image * Release v0.2 (#28) * Bump version to 0.3-dev. (#29) * Fix #30 and release v0.3 (#31) Testing Done: - Verified that the new docker image can be built with aws-sdk v1.25.0 - Verified that rolling upgrade actually completed. * Bump version to 0.4-dev. (#32) * Added uniformAcrossAzUpdate strategy (#27) * Modified ClusterState to capture AZ details of a node * - Added UniformAcrossAzUpdate strategy - Added Unit tests (More to come) * Added Unit tests * Updated validation * Added uniformAcrossAzUpdate update strategy sample * Updated README * Externalize node selectors behind interface * Removed unnecessary comments * Log maxUnavailable value * Add badge (#10) * check error message instead of code for instance not found * fix test * add build badge * fix image * update readme * readme * Add badge (#11) * check error message instead of code for instance not found * fix test * add build badge * fix image * update readme * fix mod
Externalized the node selectors behind an interface
Refactored the restack logic to leverage node selectors to get a list of instances to restack on each iteration
Added a strategy for
RandomUpdate
Added a new strategy to perform updates across AZ in same proportion. This new strategy picks either same number of nodes or same percentage of nodes from each AZ for updating in each iteration.
Added unit tests - 80% coverage for controllers
Tested the strategy by performing Random & rolling update in a k8s cluster
Sample
RollingUpgrade
object