-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement cluster autoscaler as bootstrap addon #9787
Conversation
/cc @hakman |
That's a great idea, thanks for adding that @olemarkus !! |
@olemarkus: you cannot LGTM your own PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/retest |
BTW, I guess it would be also worth adding some documentation for it? Or will that be done through a separate PR? |
Docs will come as part of this PR. Just want to ensure there are no objections before starting on that. |
upup/models/cloudup/resources/addons/cluster-autoscaler.addons.k8s.io/k8s-1.15.yaml.template
Outdated
Show resolved
Hide resolved
b1b88e1
to
d598472
Compare
/retest |
1 similar comment
/retest |
/retest |
/test pull-kops-e2e-cni-cilium |
/retest |
e00a2d4
to
feacaf1
Compare
38f5fdf
to
ce5a2cf
Compare
Use provider-agnostic node definition for cas instead of aws auto-discovery Validate clusterAutoscalerSpec Add spec documentation Add cas docs Make CRDs Apply suggestions from code review Co-authored-by: John Gardiner Myers <[email protected]> Add enabled flag to cas config Apply suggestions from code review Co-authored-by: Guy Templeton <[email protected]> Add support for custom cas image Support more k8s versions Use full image names
@johngmyers leaving the approval to you, in case there is anything left from your point of view. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neither of the review comments is blocking. Holding to give opportunity to address.
/approve
/hold
|
||
if cas.Image == nil { | ||
|
||
image := "k8s.gcr.io/autoscaling/cluster-autoscaler:latest" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a big fan of mutable tags such as "latest"; I'd prefer using "v1.19.0" (or whatever the highest known version is) for newer Kubernetes versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have any preference here.
@@ -521,6 +521,27 @@ func (b *BootstrapChannelBuilder) buildAddons() *channelsapi.Addons { | |||
} | |||
} | |||
|
|||
if b.Cluster.Spec.ClusterAutoscaler != nil && fi.BoolValue(b.Cluster.Spec.ClusterAutoscaler.Enabled) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should default Enabled
to true if ClusterAutoscaler
is non-nil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We agreed to keep it disabled by default at last office hours.
Maybe in 1.20 we materialise it into to the cluster spec so that people know it's enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But default is to have a nil
ClusterAutoscaler
field. Why would someone specify a ClusterAutoscaler
field yet want Enabled
to default to false
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that it may be a good idea in general, just don't know how used this pattern is in kops.
Sometimes when I test or just want to disable something I just comment the enabled: true
line, without actually but keep all other setting to make it easy to reenable.
One of the reasons for starting the discussion on #9661 was to establish some guidelines for enabled/disabled and managed/unmanaged, but seems not very popular topic. Maybe should open a more generic one.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: johngmyers, olemarkus The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
I am a bit back and forth on what I prefer on the above. So I think I'll just let this one go in and then do a follow-up. Especially if cluster-autoscaler should prove to be slower with their support for newer k8s in the future. |
Thanks for contributing with this addition, that makes deploying the CAS easier. We have a use case that uses a mix of different cluster AutoScaling methods (external and CAS) for various node pools. I think it would be vice to add a keyword to the instance group spec to either include or exclude the group with CAS. |
Thanks. I'd add this as a separate issue. I may not be able to implement this in the near future, but it should be a pretty easy "good first issue". |
Might late to join the party. Thanks @olemarkus for this feature. It will remove extra step to manage |
We see that when managing a larger number of clusters, getting the cluster autoscaler version correct is a bit of a hassle. We also need to take care to properly update CAS after cluster upgrades etc. So adding CAS as a configurable bootstrap addon seems like a good idea to make this more manageable.
If no one objects to this, I will finish this and add documentation and validation.