This repository has been archived by the owner on Apr 17, 2019. It is now read-only.

Cluster-autoscaler: AWS EC2 Spot Fleets support #2066

Closed
mumoshu opened this issue Nov 21, 2016 · 15 comments


@mumoshu
Contributor

mumoshu commented Nov 21, 2016

Hi, thanks for developing cluster-autoscaler!

Is this something you'd like to add to cluster-autoscaler?

With spot fleets, we can easily mix various types of spot instances into a single logical group managed by AWS. That brings reduced infrastructure cost, reduced operational burden (bidding, and choosing which instance types to bid on), and increased availability through mixing multiple spot instance types (assuming it will be rare to lose all the bids across all the different spot instance types at once).

A spot fleet has the notion of "target capacity", where "capacity" means the number of units (a.k.a. InstanceWeight) required to handle your workload, e.g. 1 for an r3.2xlarge and 4 for an r3.8xlarge.
I guess we can basically treat 1 unit of target capacity (= 1 instance weight) as the unit of autoscaling in cluster-autoscaler, so that we can reuse much of the current codebase to support spot fleets.
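
For illustration, here's a minimal sketch of such a weighted, diversified fleet request via aws-sdk-go (untested; the AMI IDs and IAM fleet role ARN are placeholders):

```go
package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	svc := ec2.New(session.Must(session.NewSession()))

	// "Capacity" is counted in units (InstanceWeight): one r3.2xlarge
	// contributes 1 unit, one r3.8xlarge contributes 4.
	out, err := svc.RequestSpotFleet(&ec2.RequestSpotFleetInput{
		SpotFleetRequestConfig: &ec2.SpotFleetRequestConfigData{
			TargetCapacity:     aws.Int64(8), // e.g. 8x r3.2xlarge, 2x r3.8xlarge, or any mix
			AllocationStrategy: aws.String("diversified"),
			IamFleetRole:       aws.String("arn:aws:iam::123456789012:role/my-fleet-role"), // placeholder
			LaunchSpecifications: []*ec2.SpotFleetLaunchSpecification{
				{
					InstanceType:     aws.String("r3.2xlarge"),
					WeightedCapacity: aws.Float64(1),
					ImageId:          aws.String("ami-xxxxxxxx"), // placeholder
				},
				{
					InstanceType:     aws.String("r3.8xlarge"),
					WeightedCapacity: aws.Float64(4),
					ImageId:          aws.String("ami-xxxxxxxx"), // placeholder
				},
			},
		},
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("spot fleet request id:", aws.StringValue(out.SpotFleetRequestId))
}
```

With that model, a scale-up in cluster-autoscaler would just be a ModifySpotFleetRequest call that raises TargetCapacity by the weight of the node size being added.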

I'm interested both in using it and, if necessary, implementing it.
If you have any comments, advice, questions, etc., please let me know!

@andrewsykim

I would be interested in helping as well, though as mentioned in #1921, the CA needs some sort of logic for choosing a node group, which I assume is a prerequisite for this.

@mumoshu
Contributor Author

mumoshu commented Dec 5, 2016

Hi @andrewsykim, I've been thinking for a few days about how we'd like cluster-autoscaler to select which node pool (ASG or spot fleet) to expand.

I've now realized that the current "random" strategy, plus the "most-pods" and "least-waste" strategies proposed in #2118, would work in most cases, as long as we choose the diversified allocation strategy and launch specifications with enough "diversity" for our spot fleets to keep them up and running.
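
To make the node-group-selection idea concrete, here's a toy sketch of a "least-waste"-style chooser (names and types invented for illustration; the real strategies live behind the expander interface proposed in #2118):

```go
package expander

// Option is an invented stand-in for an expandable node group (ASG or
// spot fleet), scored by how much CPU would sit idle after scheduling
// the pending pods onto it.
type Option struct {
	GroupName string
	WastedCPU float64
}

// leastWaste mimics the "least-waste" strategy from #2118: pick the
// candidate that leaves the least CPU unused. Returns nil if there
// are no candidates.
func leastWaste(options []Option) *Option {
	var best *Option
	for i := range options {
		if best == nil || options[i].WastedCPU < best.WastedCPU {
			best = &options[i]
		}
	}
	return best
}
```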

I'd rather have cluster-autoscaler not select an autoscaling group that is suspended, or a spot fleet that is losing the bids for all of its launch specifications.
However, that's another issue IMHO.
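
As a rough, hypothetical sketch of that check (fleetLooksHealthy is an invented helper, not existing cluster-autoscaler code), the ActivityStatus returned by DescribeSpotFleetRequests could be used, assuming an "error" status roughly corresponds to a fleet that can't currently win any bids:

```go
package spotfleet

import (
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/ec2"
)

// fleetLooksHealthy reports whether a spot fleet appears able to fulfill
// capacity. An ActivityStatus of "error" is assumed here to mean all of
// the fleet's bids are currently losing.
func fleetLooksHealthy(svc *ec2.EC2, fleetID string) (bool, error) {
	out, err := svc.DescribeSpotFleetRequests(&ec2.DescribeSpotFleetRequestsInput{
		SpotFleetRequestIds: []*string{aws.String(fleetID)},
	})
	if err != nil {
		return false, err
	}
	for _, cfg := range out.SpotFleetRequestConfigs {
		if aws.StringValue(cfg.ActivityStatus) == "error" {
			return false, nil
		}
	}
	return true, nil
}
```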

@andrewsykim

andrewsykim commented Dec 6, 2016

@mumoshu my org and I have built a new image, wattpad/cluster-autoscaler:v1.2, with the changes from #2118. We're trying to get in as much testing as possible, since it would help move this issue forward. When you have time, could you test it in any clusters you and your org are running?

@mumoshu
Contributor Author

mumoshu commented Dec 9, 2016

@andrewsykim Thanks for sharing the image. Sure, I'd definitely like to test it!

@motymichaely

@mumoshu How can we move this forward?

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 21, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 23, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@zytek

zytek commented Nov 5, 2018

/reopen
/remove-lifecycle rotten

@k8s-ci-robot
Contributor

@zytek: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen
/remove-lifecycle rotten

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Nov 5, 2018
@mumoshu
Contributor Author

mumoshu commented Nov 5, 2018

I believe this is still relevant to every Kubernetes-on-AWS user who doesn't use a third-party solution like Spotinst.

@mumoshu
Contributor Author

mumoshu commented Nov 5, 2018

/reopen
/remove-lifecycle rotten

@k8s-ci-robot
Contributor

@mumoshu: Reopening this issue.

In response to this:

/reopen
/remove-lifecycle rotten

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Nov 5, 2018
@mumoshu
Contributor Author

mumoshu commented Nov 5, 2018

The CA project has moved to https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler. Could anyone open another issue there that replicates this one?

@zytek

zytek commented Nov 5, 2018

@mumoshu It's already opened: kubernetes/autoscaler#838
Please close this issue, as I asked for the reopen by mistake. Sorry :)
/close

@mumoshu mumoshu closed this as completed Nov 5, 2018