Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add autoscaling_enabled to example files #578

Closed
1 of 4 tasks
wperron opened this issue Oct 31, 2019 · 9 comments
Closed
1 of 4 tasks

Add autoscaling_enabled to example files #578

wperron opened this issue Oct 31, 2019 · 9 comments

Comments

@wperron
Copy link

wperron commented Oct 31, 2019

I have issues

The examples configurations do not currently include the autoscaling enabled parameter even though they are including other autoscaling-related parameters. This is misleading for first-time users as they are likely to use those examples as-is, and then will encounter an issue where their cluster won't be scaling as the number of pods increases.

I'm submitting a...

  • bug report
  • feature request
  • support request - read the FAQ first!
  • kudos, thank you, warm fuzzy

What's the proposed solution?

I suggest adding autoscaling_enabled = "true" to the example configurations to make them behave as one would expect to. I also suggest maybe linking the local.tf file in the README so that the default values are easier to find.

Environment details

  • Affected module version: 7.0.0
  • OS: Ubuntu 18.04
  • Terraform version: 0.12.12
@max-rocket-internet
Copy link
Contributor

even though they are including other autoscaling-related parameters

Where are these related parameters included?

FYI there's a link in the main README: Autoscaling: How to enable worker node autoscaling.

@wperron
Copy link
Author

wperron commented Oct 31, 2019

@max-rocket-internet I'm talking specifically about the example configurations in examples/

for instance, in the basic/main.tf there's this worker group configuration which includes asg_desired_capacity :

  worker_groups = [
    {
      name                          = "worker-group-1"
      instance_type                 = "t2.small"
      additional_userdata           = "echo foo bar"
      asg_desired_capacity          = 2
      additional_security_group_ids = [aws_security_group.worker_group_mgmt_one.id]
    },
    {
      name                          = "worker-group-2"
      instance_type                 = "t2.medium"
      additional_userdata           = "echo foo bar"
      additional_security_group_ids = [aws_security_group.worker_group_mgmt_two.id]
      asg_desired_capacity          = 1
    },
  ]

That will create the Autoscaling Group with the desired configuration in AWS, but it will initialize add the k8s.io/cluster-autoscaler/disabled = true annotation instead of k8s.io/cluster-autoscaler/enabled = true which means the cluster won't be able to autoscale even if the autoscaler pod is installed. Adding autoscaling_enabled to the example configurations would just make it clearer that there's an extra parameter needed for the autoscaling to work.

Concerning the link in the README, I was suggesting adding a link to the local.tf. When I was trying to figure out why my cluster wasn't scaling, that's where I ended up and saw the different default values, so I think it could be useful for other people.

The link you mentioned is also super useful but its not what I was referring to 😉

wperron added a commit to wperron/terraform-aws-eks that referenced this issue Nov 2, 2019
* added `autoscaling_enabled = true` to example configs
@max-rocket-internet
Copy link
Contributor

Sorry but I still don't follow 😅

there's this worker group configuration which includes asg_desired_capacity

That's a setting of the AWS ASG but it is not related to k8s autoscaling. Autoscaling from an ASG and normal k8s autoscaling are separate things. There's no implication that the examples allow autoscaling in k8s.

@max-rocket-internet
Copy link
Contributor

When I was trying to figure out why my cluster wasn't scaling, that's where I ended up and saw the different default values, so I think it could be useful for other people.

I'd rather not set autoscaling_enabled in the examples as it might imply to people that there's autoscaling out of the box when there isn't. I think this is what you figured? You must install the cluster-autoscaler and this is covered in the doc.

So rather than setting autoscaling_enabled in the examples, is there some other way that we could make it clearer?

@wperron
Copy link
Author

wperron commented Nov 4, 2019

Sorry my mistake; yes I meant the ASG autoscaling. For me seeing an autoscaling_enabled = true would do it, as it implies that it can also be set to false, though I can also see somebody not making that connexion. I guess we could leave the existing examples, and one or two this parameter is explicitly set to true and/or false.

What do you think?

@max-rocket-internet
Copy link
Contributor

Sorry my mistake; yes I meant the ASG autoscaling

Ah all good. If you're not super familiar with how autoscaling works in k8s on AWS then I can see it might be confusing.

To be clear: it is not possible or advisable to scale using the ASG itself. ASG autoscaling needs scaling policies (that are not part of this module) and also the ASG doesn't properly drain nodes when terminating them. So it's not really possible to use ASG based autoscaling in a reliable way. Using the cluster-autoscaler is the best way.

I think the doc and the note here is clear about what this parameter does. It adds and tag to the ASG and a policy to the instance role. Nothing more.

If you want to create a PR to add a note in the doc that emphasises the distinction, feel free 🙂

@stale
Copy link

stale bot commented Feb 4, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Feb 4, 2020
@barryib
Copy link
Member

barryib commented Feb 4, 2020

Resolved by #716 and #710

@barryib barryib closed this as completed Feb 4, 2020
@github-actions
Copy link

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 28, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants