[EKS] [request]: Nodegroup should support tagging ASGs #608

Open
bhops opened this issue Nov 27, 2019 · 121 comments
Labels
EKS Managed Nodes · EKS (Amazon Elastic Kubernetes Service)

Comments

@bhops

bhops commented Nov 27, 2019

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Tell us about your request
It would be great if we could pass tags to the underlying ASGs (and tell the ASGs to propagate tags) that are created from the managed node groups for EKS so that the underlying instances/volumes are tagged appropriately.

Which service(s) is this request for?
EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Currently, managed node groups for EKS do not support propagating tags to the ASGs (and therefore the instances) created by the node group. This leads to EC2 instances that are not tagged according to our requirements for tracking cost and resource ownership.
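For Terraform users, the gap looks roughly like this (a minimal sketch; the resource names and tag values are illustrative, not from the original report):

resource "aws_eks_node_group" "example" {
  cluster_name    = aws_eks_cluster.main.name
  node_group_name = "workers"
  node_role_arn   = aws_iam_role.node.arn
  subnet_ids      = var.subnet_ids

  scaling_config {
    desired_size = 2
    max_size     = 4
    min_size     = 1
  }

  # These tags land on the EKS node group object itself; they are NOT
  # propagated to the ASG created behind the scenes, nor to its instances.
  tags = {
    "cost-center" = "1234"
  }
}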

@bhops bhops added the Proposed Community submitted issue label Nov 27, 2019
@mikestef9 mikestef9 added the EKS Amazon Elastic Kubernetes Service label Nov 27, 2019
@sriramgk

sriramgk commented Dec 27, 2019

Passing the managed node group tags to the launch template's "Instance tags" would automatically apply them to both the EC2 instances and their volumes. If there are challenges in doing that, a separate "Custom Tags" section in the EKS managed node group configuration page would also be helpful.
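For illustration, a minimal launch template sketch along those lines (Terraform; the names and tag values are made up). Instance-level tag specifications can cover both the instances and their volumes:

resource "aws_launch_template" "workers" {
  name_prefix = "eks-workers-"

  # Applied to the EC2 instances at launch.
  tag_specifications {
    resource_type = "instance"
    tags          = { "cost-center" = "1234" }
  }

  # Applied to the attached EBS volumes at launch.
  tag_specifications {
    resource_type = "volume"
    tags          = { "cost-center" = "1234" }
  }
}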

@mailjunze

mailjunze commented Jan 15, 2020

Workaround to add custom tags to worker nodes using an EKS managed node group:

  • Create a managed worker node group in the EKS console. (Set the minimum & desired count to 1.)

  • EKS creates an ASG in the background; you will find the ASG information in the node group details in the EKS console. Select the ASG associated with the managed worker node group > Tags > add your custom tags for EC2.
    Note: Make sure to check "Tag New Instances" while creating new tags.

  • Terminate the newly launched EC2 instance that has no tags.

  • Scale up the managed node group as required.

  • After completing the above steps, the managed node group will tag new EC2 instances with the custom tags.
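The same workaround can be scripted rather than clicked; for example, a minimal Terraform sketch using the aws_autoscaling_group_tag resource (the ASG name below is a placeholder for the one the node group created), where propagate_at_launch corresponds to the "Tag New Instances" checkbox:

resource "aws_autoscaling_group_tag" "cost_center" {
  # Placeholder: the ASG created by the managed node group.
  autoscaling_group_name = "eks-my-nodegroup-asg"

  tag {
    key   = "cost-center"
    value = "1234"
    # Equivalent of the "Tag New Instances" checkbox in the console.
    propagate_at_launch = true
  }
}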

@yardensachs

This is a crucial feature that is missing, and it is the only reason our department is not moving from manual ASGs to node groups.

@evgmoskalenko

Yes, this is a very important change. We also cannot use this because of the need for tags. It's bad practice to end up with only semi-automated infrastructure as code.

@ozhankaraman

Any update?

@amitsehgal

Any updates? Can you open source node groups, so the community can contribute?

@jerry153fish

Any updates?

@atrepca

atrepca commented May 22, 2020

Is this a duplicate of #374?

@TBBle

TBBle commented May 27, 2020

I don't think it's a duplicate. This one is for an API feature to add tags to the ASG created by the API, and also to be able to set the flag on the ASG that propagates tags outwards: it's only an API change to implement the same thing done manually in the workaround above.

#374 is for the EKS Cluster object itself to support propagating tags down, in the way ASGs already do. I imagine #374 would partially work by propagating tags to ASGs, and then turning on ASG tag propagation, rather than duplicating the behaviour.

@otterley

otterley commented Jun 5, 2020

Team: Having this functionality available will enable customers to use Cluster Autoscaler's capacity autodiscovery feature instead of forcing them to maintain manual capacity mappings on the command line.

The documentation there isn't super clear (see kubernetes/autoscaler#3198 for documentation updates), but advertising capacity resources to Cluster Autoscaler via ASG tags will make the use of multiple heterogeneous Auto Scaling Groups much easier for customers.

@rtripat

rtripat commented Jun 5, 2020

> Team: Having this functionality available will enable customers to use Cluster Autoscaler's capacity autodiscovery feature instead of forcing them to maintain manual capacity mappings on the command line.
>
> The documentation there isn't super clear (see kubernetes/autoscaler#3198 for documentation updates), but advertising capacity resources to Cluster Autoscaler via ASG tags will make the use of multiple heterogeneous Auto Scaling Groups much easier for customers.

@otterley While Managed Nodegroups don't support customer-provided tags for ASGs today, we do add the necessary tags for CAS auto-discovery to the ASG, i.e. k8s.io/cluster-autoscaler/enabled and k8s.io/cluster-autoscaler/<CLUSTER NAME>.

@otterley

otterley commented Jun 5, 2020

@rtripat Understood. Perhaps I wasn't clear, but I was specifically referring to the ability to autodiscover specific capacity dimensions of an ASG such as cpu, memory, ephemeral storage, GPU, etc.
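For context, these are the kinds of ASG tag keys Cluster Autoscaler reads for that (a sketch only; the keys come from the Cluster Autoscaler AWS provider docs, and the values are illustrative):

locals {
  # Tag keys Cluster Autoscaler understands for scale-from-zero discovery;
  # the values here are examples, not derived from a real node group.
  cas_node_template_tags = {
    "k8s.io/cluster-autoscaler/node-template/resources/ephemeral-storage" = "100Gi"
    "k8s.io/cluster-autoscaler/node-template/label/workload"              = "batch"
    "k8s.io/cluster-autoscaler/node-template/taint/dedicated"             = "batch:NoSchedule"
  }
}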

@privomark

Until this feature is ready, I've had success with creating a CloudWatch rule based on the EC2 "pending" state that invokes a Lambda, which takes the instance_id passed in through the event, checks whether the instance belongs to a managed node group, and then adds the appropriate tags. I'm doing this all through Terraform as part of the spin-up of the EKS cluster.

Obviously would be much easier with a tags option! 😛
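A rough sketch of that event wiring in Terraform, assuming an aws_lambda_function.tagger defined elsewhere that does the instance lookup and tagging (illustrative, not a complete implementation):

resource "aws_cloudwatch_event_rule" "instance_pending" {
  name = "tag-eks-managed-nodes"
  # Fire whenever an EC2 instance enters the "pending" state.
  event_pattern = jsonencode({
    source        = ["aws.ec2"]
    "detail-type" = ["EC2 Instance State-change Notification"]
    detail        = { state = ["pending"] }
  })
}

resource "aws_cloudwatch_event_target" "invoke_tagger" {
  rule = aws_cloudwatch_event_rule.instance_pending.name
  arn  = aws_lambda_function.tagger.arn # assumed to be defined elsewhere
}

resource "aws_lambda_permission" "allow_eventbridge" {
  statement_id  = "AllowEventBridgeInvoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.tagger.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.instance_pending.arn
}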

@yann-soubeyrand

It would be great to be able to tag the launch templates too, with the option to propagate those tags to instances and volumes or not.

Is there some kind of best practice on tagging the ASG vs tagging the LT? It seems to me that tagging the LT offers more flexibility (like the ability to tag the volumes).

@TBBle

TBBle commented Jun 9, 2020

https://docs.aws.amazon.com/autoscaling/ec2/userguide/autoscaling-tagging.html touches upon the overlap in tag propagation between ASGs and Launch Templates.

@yann-soubeyrand

> https://docs.aws.amazon.com/autoscaling/ec2/userguide/autoscaling-tagging.html touches upon the overlap in tag propagation between ASGs and Launch Templates.

That's precisely the documentation page I had in mind when asking about best practices ;-) This page explains the overlap but there are no clear pros and cons of the two tagging approaches. But it seems to me that LT offers more flexibility and that ASG tags should be used only when necessary (like for the cluster autoscaler discovery tags).

@TBBle

TBBle commented Jun 13, 2020

There's a related discussion about tagging ASGs and LTs for non-managed node groups at eksctl-io/eksctl#1603. My understanding from there is that tagging LTs and enabling propagation would be sufficient, but there might be use cases where the ASG needs to have the tag too, though it wouldn't then need tag propagation enabled as well.

The difference observed in that ticket is that the ASG propagation applies the tags after launch, while LT propagation applies the tags as part of the launch.

@yann-soubeyrand

Yes, I create my non-managed node groups using Terraform and put the tags on the LT with propagation to instances and volumes. The only tags I needed to put on the ASG are the cluster-autoscaler-related tags, and propagation is not needed for those.

@Missshao

We need this feature too; it will impact cost calculation if I have to add the tags manually to the ASG later.

@gunzy83

gunzy83 commented Jul 21, 2020

We have EKS deployed as a new part of our stacks in prod through preprod, stage and dev (alongside a very large ECS deployment in each environment). It is very annoying that the instances are not tagged for cost allocation.

@RogerWatkins-Anaplan

Also having a problem with this - I wouldn't expect, in a service called Managed Node Groups, that I would have to work around tag propagation issues. It seems vendors such as Weaveworks have implemented their own workarounds; sadly there is no such workaround in Terraform.

This issue is pretty fundamental - would like it fixed please :)

@stevehipwell

> It seems vendors such as Weaveworks have implemented their own workarounds; sadly there is no such workaround in Terraform.

@RogerWatkins-Anaplan it's easy enough to do this with Terraform, and I think there are links in some of the comments above on how to do it. That said, you wouldn't expect to need to do this for a first-party vendor solution.

@vikramaditya234

Is anyone working on this? This is a must-have for us to implement the project.

@bitva77

bitva77 commented Mar 16, 2023

I think they're on to something here...I mean, managing node groups is a pain. Why spend time on something that does sort of suck.

This forces me to use Fargate + Karpenter which, when I think about it, is where I kind of wanted to go anyway and now I have the motivation to do so.

It's kind of like when I was in high school and I was lazy as all hell. Slow, out of shape (fat), unhealthy, etc......but then there was this girl who was cute but wouldn't give me the time of day because of who I was. So then I got in shape just to impress her.

Yeah, she still ignored me BUT it was the motivation I really needed to live a healthier life. She has no idea that she changed my life, but she did.

Maybe the same is true here with Crossplane. Like, maybe I shouldn't be using nodegroups...just like I shouldn't have been eating a big mac every day. Maybe Fargate + Karpenter REALLY is the healthier way forward.

Will report back!

@Obirah

Obirah commented Mar 16, 2023

> I think they're on to something here...I mean, managing node groups is a pain. Why spend time on something that does sort of suck.
>
> This forces me to use Fargate + Karpenter which, when I think about it, is where I kind of wanted to go anyway and now I have the motivation to do so.
>
> It's kind of like when I was in high school and I was lazy as all hell. Slow, out of shape (fat), unhealthy, etc......but then there was this girl who was cute but wouldn't give me the time of day because of who I was. So then I got in shape just to impress her.
>
> Yeah, she still ignored me BUT it was the motivation I really needed to live a healthier life. She has no idea that she changed my life, but she did.
>
> Maybe the same is true here with Crossplane. Like, maybe I shouldn't be using nodegroups...just like I shouldn't have been eating a big mac every day. Maybe Fargate + Karpenter REALLY is the healthier way forward.
>
> Will report back!

"Ratio" is gonna be a thing on GitHub as well after that comment. Well played, sir!

@MartinEmrich

@bitva77 Man so much wisdom here on Github. Mind blown.

Yes, AWS apparently is that cliché beautiful blonde that gets all the sports club guys with big wallets... And yes, they treat her to a nice ice cream and soda every day, so why shall she ever bother with us mere mortals?

Back to topic: as Karpenter now apparently has drift detection/resolution [1] and workload balancing [2], I too shall give it a try again to relieve the node group/cluster-autoscaler.

[1] aws/karpenter-provider-aws#1738
[2] aws/karpenter-provider-aws#1091

But even then, AWS has again shifted some functionality to our side of the shared responsibility model :/

@robertd

robertd commented Mar 18, 2023

@ellistarn I laughed so hard on your rebuttal comment. It made my day. 😂

@bitva77

bitva77 commented Mar 18, 2023

@robertd they removed his comment. Which is actually more work than this request.

@JDavis10213

You can set the tags on the launch template. We use managed nodes with a custom AMI, and things work fine with adding tags to the instances. The trick is to use tag_specifications. If you are using Terraform, here is some configuration from the launch template resource that adds tags to the nodes.

  tag_specifications {
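    # This block sits inside the node group's aws_launch_template resource.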
    resource_type = "instance"

    tags = merge(
      {
        "Name" = "${each.value.node_group_name}-${aws_eks_cluster.main.name}-eks-managed-node"
      },
      var.tags,
    )
  }

@MartinEmrich

@JDavis10213 (and many others). Yes, but

a) it does (most probably) not tag the Auto Scaling Group itself.
b) it obviously works only if one uses Terraform in the first place, not if one "clicks" a Managed Nodegroup from the AWS Console or uses another IaC/management tool.

To be sure, I read the OP again: it is about tagging the ASG itself, not propagating tags to resources below. And the reason is cost tracking and compliance (e.g. enforced AWS Config rules), which do not allow for tagging "afterwards".

On behalf of probably most subscribers here who want the original issue fixed: We appreciate the sentiment to try to be helpful, but a workaround or a solution to a different challenge just is not really helpful here. I have a vague gut feeling that the more "tips and tricks" are shared here, and the more the discussion shifts to workarounds on the client side, the less likely the AWS engineers reading this are to fix the original issue on the AWS side.

@JDavis10213

JDavis10213 commented Apr 21, 2023

> @JDavis10213 (and many others). Yes, but
>
> a) it does (most probably) not tag the Auto Scaling Group itself. b) it obviously works only if one uses Terraform in the first place, not if one "clicks" a Managed Nodegroup from the AWS Console or uses another IaC/management tool.
>
> To be sure, I read the OP again: it is about tagging the ASG itself, not propagating tags to resources below. And the reason is cost tracking and compliance (e.g. enforced AWS Config rules), which do not allow for tagging "afterwards".
>
> On behalf of probably most subscribers here who want the original issue fixed: We appreciate the sentiment to try to be helpful, but a workaround or a solution to a different challenge just is not really helpful here. I have a vague gut feeling that the more "tips and tricks" are shared here, and the more the discussion shifts to workarounds on the client side, the less likely the AWS engineers reading this are to fix the original issue on the AWS side.

@MartinEmrich Yes, that is correct. It doesn't populate the ASG with the tags, so we for sure need this original request. Just providing a bit of a workaround to at least get the nodes tagged.

@Faustinekitten

In combination with the AWS MAP program, this feature is heavily needed.

@ronberna

ronberna commented Aug 4, 2023

Any updates or traction on this issue?

@rishabhcldcvr

I badly need this feature to autoscale one of my worker node groups up from 0. I don't understand why this feature is missing from the Nodegroup resource. CA fails most of the time to scale from 0 to 1, and it just adds more pain for me and my team.

@TBBle

TBBle commented Aug 10, 2023

@rishabhcldcvr

Scale to/from zero is a different issue; see #724 (comment), which details Cluster Autoscaler's support for scale-to-zero that does not rely on ASG tags, and the entire rest of that issue for why. (And also workarounds if you can't use the supported solution.)

@fvpinheiro
Copy link

This is really needed to let the cluster autoscaler scale up an EKS node group with labels & taints from 0 after the CA pod restarts. I had to manually tag my ASG to get it to work.

@TBBle

TBBle commented Aug 25, 2023

@fvpinheiro: See #608 (comment)

@fvpinheiro

> @fvpinheiro: See #608 (comment)

Well, I needed to manually tag the ASG with the EKS node group's labels & taints in the following formats, k8s.io/cluster-autoscaler/node-template/label/ and k8s.io/cluster-autoscaler/node-template/taint/, so that a restarted CA pod could scale up from 0.

This is related in that I would want to be able, through my CloudFormation template, to propagate tags from the AWS::EKS::Nodegroup resource to its underlying ASG, which isn't possible, as far as I know...

@TBBle

TBBle commented Aug 25, 2023

Yes, I understand your need. And there's a ticket specifically for that need (scaling up a node group from zero in Cluster Autoscaler), which I linked to in that comment, with an AWS-supported solution you can deploy now, without waiting for a different new feature from AWS.

nerahou pushed a commit to quortex/terraform-aws-eks-cluster that referenced this issue Apr 11, 2024

autoscaler)

The tags specified on the resource type "aws_eks_node_group" are not propagated to the ASG that represents this node group (issue aws/containers-roadmap#608).

As a workaround, we add tags to the ASG after the node group creation/updates using the AWS command line.

This will fix scaling up from 0, in EKS-managed node groups, when pods have affinities/nodeSelectors defined on custom tags.
nerahou pushed a commit to quortex/terraform-aws-eks-cluster that referenced this issue Apr 15, 2024
@the-gigi

It's 6 months later. Any updates? AWS, are you listening? 👂

@morsik

morsik commented Apr 29, 2024

@the-gigi you wrote "It's 6 months later", but it's been 5 years... quite weird ;)

This is such a basic feature that I can't even...

This is why I recommend using GCP if someone wants to go full Kubernetes. EKS was a joke 3 years ago; it's better now, but even so, it's still a joke. It doesn't even have a built-in cluster-autoscaler, and AWS tells you that you can just install it on your own. Yeah, I know, AWS; as we say: "Thanks for nothing"!

Really, sometimes this gets really frustrating.

@maxsxu

maxsxu commented May 4, 2024

+1 for this feature request.

This is a critical missing feature when using EKS managed node groups. There are lots of scenarios where we need to add extra tags to the ASGs, e.g. when using the cluster-autoscaler, when using restrictive IAM policies, etc.

@the-gigi

> @the-gigi you wrote "It's 6 months later", but it's been 5 years... quite weird ;)

Yeah, I was referring to the previous comment before me.

> This is such a basic feature that I can't even...

Same

> This is why I recommend using GCP if someone wants to go full Kubernetes. EKS was a joke 3 years ago; it's better now, but even so, it's still a joke. It doesn't even have a built-in cluster-autoscaler

To be fair, they have Karpenter, which I plan to evaluate and consider as an alternative. I hear good things about it.

But, overall, I agree it's ridiculous. It forces tons of developers to first discover that cluster-autoscaler on AWS doesn't even look at node group labels but looks for ASG tags, then realize their node group labels don't propagate automatically to the ASG, and then have to take care of it themselves. It puts stress on other projects too. For example, Pulumi had to add a kludge in the form of https://www.pulumi.com/registry/packages/aws/api-docs/autoscaling/tag/ where they explicitly say:

> Manages an individual Autoscaling Group (ASG) tag. This resource should only be used in cases where ASGs are created outside the provider (e.g., ASGs implicitly created by EKS Node Groups).
>
> NOTE: This tagging resource should not be combined with the resource for managing the parent resource. For example, using aws.autoscaling.Group and aws.autoscaling.Tag to manage tags of the same ASG will cause a perpetual difference where the aws.autoscaling.Group resource will try to remove the tag being added by the aws.autoscaling.Tag resource.

@suankan

suankan commented Nov 18, 2024

Resource aws_eks_node_group is still unable to set tags on the ASG which it creates under the hood.

Workaround:

Resource aws_eks_node_group has the attribute resources[0].autoscaling_groups[0].name, which you can use to configure the ASG further after aws_eks_node_group is created.

Example 1: Set tags on the EKS node group ASG.

Suppose you have a var as a simple map(string):

my_tags = {
  key1 = "value1",
  key2 = "value2",
  key3 = "value3",
}

And you have created a resource "aws_eks_node_group" "eks_workers" {}.
Then you can get the ASG name via aws_eks_node_group.eks_workers.resources[0].autoscaling_groups[0].name
and set the ASG tags using a loop:

locals {
  my_tags_list = [
    for key, value in var.my_tags :
    {
      key   = key
      value = value
    }
  ]

  # The name of the ASG created by the node group, read back from the resource.
  eks_node_group_asg_name = aws_eks_node_group.eks_workers.resources[0].autoscaling_groups[0].name
}

resource "aws_autoscaling_group_tag" "eks_node_group_asg_tag" {
  count                  = length(local.my_tags_list)
  autoscaling_group_name = local.eks_node_group_asg_name
  tag {
    key                 = local.my_tags_list[count.index].key
    value               = local.my_tags_list[count.index].value
    propagate_at_launch = true # set to false if new instances should not inherit the tag
  }
}

A proper, simpler solution in resource "aws_eks_node_group" is still needed.

Simply tagging your AWS resources should not be such a pain; it should be simple and straightforward.
Once implemented, the behaviour of tag propagation to EC2 instances/EBS volumes should also be clearly documented for the cases where:

  1. EC2 instances/EBS volumes are tagged via resource "aws_launch_template"
  2. EC2 instances/EBS volumes are tagged via resource "aws_eks_node_group"

Example 2: Set an autoscaling schedule on the EKS node group ASG.

This is out of scope for this issue, but maybe someone needs it.

resource "aws_autoscaling_schedule" "turn_off" {
  autoscaling_group_name = aws_eks_node_group.eks_workers.resources[0].autoscaling_groups[0].name
  scheduled_action_name  = "turn-off"
  desired_capacity       = 0
  max_size               = var.eks_node_group_config.max_size
  min_size               = var.eks_node_group_config.min_size
  recurrence             = "0 7 * * *" # UTC time! Aim is 6pm Sydney time
}

resource "aws_autoscaling_schedule" "turn_on" {
  autoscaling_group_name = aws_eks_node_group.eks_workers.resources[0].autoscaling_groups[0].name
  scheduled_action_name  = "turn-on"
  desired_capacity       = var.eks_node_group_config.desired_size
  max_size               = var.eks_node_group_config.max_size
  min_size               = var.eks_node_group_config.min_size
  recurrence             = "0 21 * * MON-FRI" # UTC time! Aim is 8am Sydney time
}
