[EKS] [request]: Managed Node Groups support for node taints #864
Comments
When this was raised in #585, #507 was tagged as an existing request for this feature, but I think that was confusion... #507 seems to be about Container Insights correctly monitoring tainted nodes, while what we want here (and in #585) is to support setting the taints on Managed Nodegroups as part of a rollout, e.g. with eksctl. The comment in #585 had nine thumbs-up, on top of the three currently here.
@TBBle correct, I wanted to open a separate issue to explicitly track tainting node groups through the EKS API.
@mikestef9 we would like to see "tainting node groups through the EKS API" progressing; it has bumped from 12 👍 to 37 as of now.
It looks like the bootstrap script used by EKS nodes already supports taints. My understanding is that this would be a small feature to implement, because it would only require modifying the user data in the launch template to add extra args, just like it's done for labels currently.
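For reference, a rough sketch of what that user data change could look like with the EKS optimized AMI's bootstrap script; the cluster name, label, and taint below are placeholder values, not something tested in this thread:

```sh
#!/bin/bash
# Sketch: pass extra kubelet args through the bootstrap script so the node
# registers with a taint, mirroring how labels are already passed.
# "my-cluster" and the workload label/taint values are placeholders.
/etc/eks/bootstrap.sh my-cluster \
  --kubelet-extra-args '--node-labels=workload/type=sometaint --register-with-taints=workload=sometaint:NoSchedule'
```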
We would love to have this!
"When nodes are created dynamically by the Kubernetes autoscaler, they need to be created with the proper taint and label. https://docs.cloudbees.com/docs/cloudbees-ci/latest/cloud-admin-guide/eks-auto-scaling-nodes |
@jhcook-ag You can't specify the UserData for Managed Node Groups when you create them. You can modify the UserData in the Launch Configuration in the AWS console after creation, but then the Managed Node Groups feature will refuse to touch your Launch Configuration again, and you're effectively now using unmanaged Node Groups, although eksctl will still try to use the Managed Node Groups API and fail.
@mhausenblas we really need this 👍
Absolutely would love the idea.
It is a must-have feature!
👍
This is a must-have feature for us as well. We can't use managed node groups because of this. When would you expect this to be released? (just roughly) 👍
Hi @martinoravsky, I believe this feature is available now. We did it by customizing the userdata on the custom launch template and specifying the taints for the kubelet (using the register-with-taints argument).
Hi @Dudssource, are you using custom AMIs? I'm using launch templates with EKS optimized AMIs, which include UserData that bootstraps the node to the cluster automatically (with --kubelet-extra-args empty). That userdata is not editable for us; we can only add our own UserData as a MIME multipart file, which has no effect on bootstrapping the cluster. I'm curious if you were able to get this to work without custom AMIs.
@martinoravsky, yes, unfortunately we had to use a custom AMI for this to work.
The approach that @Dudssource used here is certainly an option, but we do plan to add taints directly to the EKS API (similar to labels), so that a custom AMI is not required.
I've found a solution (admittedly quite hackish) to allow setting taints with the official AMIs: Set the userdata for the Launch Template similar to this:
This script is run before the bootstrap script, which is managed by EKS, patching the …
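The original script isn't preserved in this thread, but as a rough illustration of the general approach (a user data part that rewrites the EKS-managed bootstrap before it runs), something along these lines could work; the taint and the wrapping style here are assumptions, not the author's exact script:

```
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==BOUNDARY=="

--==BOUNDARY==
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash
# Hypothetical sketch: wrap /etc/eks/bootstrap.sh (from the EKS optimized AMI)
# so that when the EKS-merged user data invokes it later, the kubelet also
# registers with a placeholder taint.
mv /etc/eks/bootstrap.sh /etc/eks/bootstrap-original.sh
cat <<'EOF' > /etc/eks/bootstrap.sh
#!/bin/bash
# Forward all original arguments, then append the taint via kubelet extra args.
# Note: if EKS already passes --kubelet-extra-args, the two may need merging.
exec /etc/eks/bootstrap-original.sh "$@" \
  --kubelet-extra-args '--register-with-taints=dedicated=example:NoSchedule'
EOF
chmod +x /etc/eks/bootstrap.sh
--==BOUNDARY==--
```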
@lwimmer That is superbly hacky! Good work. I'm really surprised this feature is missing, and overall I'm shocked at how feature-incomplete node groups are.
+1
+1
Guys, I found the same implementation in the AWS Terraform workshop. Having said that, I am hoping to simply pass a flag in a Terraform parameter rather than go the twist-and-turn way, get it done, and then worry that the feature stops working in the next release, just like the eksctl pre-bootstrap did. I don't know how difficult it would be to make this a simple conditional value you can plug in, just like a Helm chart value, implemented in Terraform or eksctl. We, the community, can twist and turn to provide a solution, but overall I think all these reasonable production-ready features should be built in, so that all of us can extend the functionality properly. This has to be answered and committed to by AWS; if it is wiped out again in the next release, why should anyone waste time doing it? Cheers.
Hi, this has been merged and it seems to still work with the official AMI. I have just tested with the following configuration:
And it is working as expected. This PR is based on the fix found here.
Still not working for me, even with …
Thanks for the PR, I will try it out. Your solution is very elegant. Unfortunately, some of our already-running environments are provisioned using raw Terraform resources, and I have no clue how much effort it would take to migrate to the terraform-aws-eks module. I might give it a shot on our development environments in the next few weeks. Although I strongly support your development, I still think taints should be accepted here: https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/eks_node_group. What happens if I open a ticket with Mr. Bezos and he replies that they won't go further with the incident because I'm using a community module (that modifies the kubelet's default behavior) instead of the official product API? I really don't know what the implications of using community modules are, but according to the documentation, only the Business and Enterprise Support plans have "Third-party software support". So #AWS, if I encourage my team to fully migrate to this module, will I have issues with your definition of "Third-party software support"? Does "Third-party software support" include kubelet default-behavior modifications? We eagerly await a response.
I could not find a definition for "Third-party software support" other than: Third-party software support – Help with Amazon Elastic Compute Cloud (Amazon EC2) instance operating systems and configuration. Also, help with the performance of the most popular third-party software components on AWS. Third-party software support isn't available for customers on Basic or Developer Support plans.
Hello guys, I have been working on this for a couple of months (with or without Terraform). It will not work no matter how hard you try. The problem is that EKS managed node groups append their own user data after yours: AWS creates a secondary launch template on your behalf, and the user data on the running instance comes from that new launch template. You can verify this on your EC2 node, then you will know what I am talking about, and you can compare the launch template in the AWS console with the launch template of the running instance:

```sh
# ssh [your_eks_node]
$ curl http://169.254.169.254/latest/user-data
```

You can also manually view the launch template (sorted by most recent date). You can still do it, but the status of the node group creation is "NodeCreationFailure" after waiting 20 minutes for each try. Cheers,
I honestly do not know about the support, but using a custom launch template is supposed to be supported on AWS, so if you have support and are using the official AMI, I do not see why you would lose the support. I guess the same thing could apply to people using a custom AMI that AWS has no way to verify; do they also lose support?
@teochenglim Not sure what you are referring to, but providing user data in a managed node group launch template works fine and is merged into the EKS-created launch template.
Hi ArchiFleKs, my 2 cents: if everything needs to be custom, why EKS? We might as well run on-prem Kubernetes. Yes, custom launch templates are supported on AWS now, but it has bugs. And to be fair, people are mixing everything together now: some are talking about the Terraform module, some are talking about eksctl, some are talking about custom or managed node groups, and you are talking about the official AMI. But based on my simple troubleshooting, an extra launch template is created and your managed node group points to that. This behaviour is the same whether you use Terraform or do it manually in the AWS console. I have yet to try eksctl, but why should I try it since I am no longer using it?
I tried it today and it doesn't work for me. Can you show me your working version?
Given this has gone from "We're Working On It" to "Coming Soon", presumably it's mostly done and is being tested/validated/integrated, so "AWS sucks, everyone else has had this forever" isn't really a useful contribution. Workarounds in the meantime are a useful contribution, I think, but support questions about them do generate a bit of noise in this ticket. Is there a terraform-specific place to debug the terraform-based workaround instead, so this ticket can remain focused on the Managed Node Groups API for this, and maybe just catalog the workarounds (all using custom launch templates now?). If custom launch templates aren't working correctly, that's not really a "here" thing either. #585 would be closer, but this isn't really a support forum anyway, so you may not have much luck there.
You still need tools to orchestrate your infrastructure, whether it is managed or not, even if you do it by hand with the AWS console or the AWS CLI, CloudFormation, Terraform or eksctl. I agree that the AWS EKS managed node group API should expose a native taint option like it does for labels. Exposing the kubelet args allows people to customize the kubelet as they wish; this allows power users to do custom configuration even with managed node groups. Even when using a managed service, you still need to use an AMI (by official I mean this one: https://docs.aws.amazon.com/eks/latest/userguide/eks-optimized-ami.html) or you can build your own. The behavior when building your own is different from the official one when using user data, as explained here: https://docs.aws.amazon.com/eks/latest/userguide/launch-templates.html#launch-template-user-data. There is a merge involved with the official AMI that you do not have when using a custom AMI (which prevents the pre-bootstrap user data from being used). If you can explain your bug in more detail, maybe someone here can help. We are trying to build tools (eksctl or terraform-aws-eks) that abstract this part for the user (just like a managed service does). Personally I'm using the terraform-aws-eks module; this feature has just been released and is working at least with the official AMI. I have not tested with a custom AMI. Let me know if I can help you with this.
Are you using the master version of the module? The latest release with this PR is only out today: https://github.com/terraform-aws-modules/terraform-aws-eks/releases/tag/v15.2.0
Oh, I thought it was included in version 15.1.0, I'll try with version 15.2.0 then, thanks! :)
Yes, I had my ticket dropped a few years ago.
Some features took 15 days to change from Coming Soon to Shipped. Other features took months. How long should I wait? Does it make sense to use community Terraform workarounds if we are now on Coming Soon? @TBBle "so "AWS sucks, everyone else has had this forever" isn't really a useful contribution." I totally disagree. As a product owner, I think this is a REALLY useful contribution to my product.
That depends on your needs and priorities. If you need a terraform deployment today, then you can't wait, so don't wait. If you are just tracking this as a blocker for migrating to Managed Node Groups, and are happy with self/un-managed Node Groups in the meantime, then waiting is fine. (I'm in the latter boat, but it's not the only "migration-blocking" feature I'm tracking, and it really only applies to the "next cluster" I build, since existing clusters work now.) As for the other part, since you stripped the context of my quote, including the important part, I'll requote it:
Leaving aside the toxic phrasing of this feedback, "AWS sucks, everyone else has had this forever" tells a Product Owner nothing about a feature which is already in the delivery pipeline. That sort of information is more useful when deciding if and where to prioritise a feature, or if the PO has (for whatever reason) never looked at their competition's offerings. Once it's at the stage of the pipeline I presumed it to be at, it's very unlikely that someone is going to slap their forehead and say "Oh! We should just ship that, instead of sitting on the ready-to-go feature in order to feast on the tears of our users" (or whatever reaction one expects from such comments). This is by far the most 👍'd feature request in the Coming Soon bucket (by a multiple of 5 from its next-closest), and I certainly assume that the person/people managing this backlog can count.
@ArchiFleKs
@EvertonSA If the Product Owner is serious about his product and takes into consideration that multiple users have different needs based on what they already have, he/she should make things flexible. We (the community) spend time and effort to use his product. If he/she decides that, out of the long list of kubelet flags (https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/), exposing only one flag "is good enough" and solves every problem in the world, should the community be happy about it? Besides, "KUBELET_EXTRA_ARGS" existed a long time ago, and he/she decided to remove it and create this problem? eksctl's overrideBootstrapCommand also got behavioural changes. My point is that a few months back we had the freedom to choose what to do, and now everything is buggy and we worry that the next version will totally change it again. So for each release (roughly every 3 months), we have to revisit this again? And pray hard it works this time? Most other users have dropped the case already (GitHub issues get closed without knowing why, and people give up); they will just claim EKS doesn't work for them. But I am still here.
Thanks for the input. I did not mean to be toxic or rude. I totally understand your reactions. Our opinions might not get along, but that's fine. Regarding terraform development, I will wait until I get feedback from my team.
Thanks for the input. Let's say she doesn't really care what I do with the kubelet, as long as AWS business support doesn't turn their back on us if we need them. I totally understand your reactions.
I think you can use this input
https://registry.terraform.io/modules/terraform-aws-modules/eks/aws/latest#input_worker_create_cluster_primary_security_group_rules
To avoid doing this trick
--
Kevin Lefevre
On Friday, May 07, 2021 at 7:36 PM, Dru Goradia wrote:
For anyone experiencing the same with the terraform eks module (15.2.0), I was able to resolve it using worker_additional_security_group_ids:

```hcl
locals {
  cluster_primary_security_group_id = module.mycluster.cluster_primary_security_group_id
}

module "mycluster" {
  source  = "terraform-aws-modules/eks/aws"
  version = "15.2.0"
  ...
  worker_additional_security_group_ids = [
    local.cluster_primary_security_group_id,
  ]
  ...
  node_groups_defaults = {
    ami_type                  = "AL2_x86_64"
    disk_size                 = 40
    subnets                   = data.aws_subnet_ids.private.ids
    key_name                  = var.key_name
    source_security_group_ids = [data.aws_security_group.bastion_only.id]
  }

  node_groups = {
    notaint = {
      desired_capacity = 2
      max_capacity     = 20
      min_capacity     = 2
      instance_types   = ["t3.medium"]
      k8s_labels = {
        "workload/type" = "notaint"
      }
      additional_tags = {
        Name = "eks-ng-notaint"
      }
    }
    sometaint = {
      desired_capacity       = 3
      max_capacity           = 6
      min_capacity           = 3
      instance_types         = ["m5a.large"]
      k8s_labels = {
        "workload/type" = "sometaint"
      }
      create_launch_template = true
      kubelet_extra_args     = "--register-with-taints=workload=sometaint:NoSchedule"
      additional_tags = {
        Name = "eks-ng-sometaint"
      }
    }
  }
}
```
Hey folks, native support for Kubernetes taints is now available in managed node groups!
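For anyone landing here later, the native support looks roughly like this through the AWS CLI; the cluster and node group names, subnets, role ARN, and taint values below are placeholders, so check the EKS docs for the exact syntax:

```sh
# Create a managed node group with a taint applied natively by EKS
# (no custom launch template or custom AMI needed).
aws eks create-nodegroup \
  --cluster-name my-cluster \
  --nodegroup-name tainted-ng \
  --subnets subnet-aaaa subnet-bbbb \
  --node-role arn:aws:iam::111122223333:role/eksNodeRole \
  --labels workload/type=sometaint \
  --taints key=workload,value=sometaint,effect=NO_SCHEDULE

# Taints can also be added to or removed from an existing node group:
aws eks update-nodegroup-config \
  --cluster-name my-cluster \
  --nodegroup-name tainted-ng \
  --taints 'addOrUpdateTaints=[{key=workload,value=sometaint,effect=NO_SCHEDULE}]'
```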
@mikestef9 thank you for that, I saw it in my console today.
Community Note
Tell us about your request
Add support for tainting nodes through managed node groups API
Which service(s) is this request for?
EKS
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Managed node groups support adding Kubernetes labels as part of node group creation. This makes it easy for all nodes in a node group to have consistent labels. However, taints are not supported through the API.
Are you currently working around this issue?
Manual kubectl commands after new nodes in the node group come up.
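A minimal sketch of that workaround; the node name and taint here are placeholders:

```sh
# Taint the new node by hand once it has joined the cluster
kubectl taint nodes ip-10-0-1-23.ec2.internal workload=sometaint:NoSchedule

# The trailing "-" removes the taint again if needed
kubectl taint nodes ip-10-0-1-23.ec2.internal workload=sometaint:NoSchedule-
```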