Support CNI Custom Networking #1096
@lilley2412 Great writeup! Do you think there's any scenario in which someone might want to use public subnets for the pods? I always used private, but when I previously started coding it I wondered whether it should offer the option of both.
Agreed
@tiffanyfay thanks for the feedback
I can't think of a good reason to use custom CNI with public pod subnets; all the use cases for custom CNI lend themselves to security or to saving public or private "routable" IP space, so public pod subnets just don't seem to make sense. That doesn't mean someone won't want it one day, so I also see no strong reason to prohibit it. Not sure of the cleanest way, maybe simply pod-public and pod-private for subnets? ex:
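A hypothetical sketch of what such a naming scheme might look like in a ClusterConfig. This schema is illustrative only; the `pod-private`/`pod-public` keys are the naming idea from this comment, not an implemented eksctl field, and the CIDRs are placeholders:

```yaml
# Illustrative only: a possible shape for pod subnets alongside cluster subnets.
# None of the pod-* keys below exist in eksctl today.
vpc:
  cidr: 192.168.0.0/19
  extraCIDRs:
    - 100.64.0.0/16        # non-routable range commonly used for pods (not required)
  subnets:
    private:
      us-east-1a: { cidr: 192.168.0.0/21 }
    pod-private:
      us-east-1a: { cidr: 100.64.0.0/18 }
    pod-public: {}          # allowed but with no obvious use case (per the discussion)
```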
I think this makes sense; it's more flexible and possibly even less work than I was thinking. If one wanted to use larger than a /10 from the 100.64 range, they could just add multiple CIDRs as /16's or smaller (not that it would be restricted to 100.64, that's just a common use case for non-routable pods). I will edit the request to reflect this.
@lilley2412 Yeah, I can't think of a case, but who knows... we can always handle public later if it's asked for. What do you think in terms of how to specify which set is for which nodegroup? What we had so far, in the same naming scheme as yours, is something like this... but I need to look at the code again to see if it grabs the main CIDR by default throughout, since the way I originally had it puts pod subnets in line with cluster subnets, and I need to change that to nest them under it.
This way if
@lilley2412 For the case when it creates all the subnets and the cluster, there's also the case where we will need to be able to set a worker node CIDR, for when they want it to be non-RFC1918. ATM I am having trouble getting the autogenerate to work for the deepcopy file.
At first I thought this wasn't required because ENIConfig just has to match the AZ of the node. Poking around I found #806 and #475 (comment) (@errordeveloper indicated this suggestion would be implemented). If it's implemented this way, I think it makes sense to extend that config by nesting the pod subnets under the respective node subnets, because a pod subnet is meaningless without a corresponding node subnet. Something like this .. it would also take care of private/public and any supported combo of IP ranges, and doesn't require re-specifying the AZ to match the node:
This would change how ENIConfig is resolved, since nodes don't have a subnet label already, something like:
This may need to also allow the same pod subnets to be used for two different node subnets; not sure if that is a useful use case though.
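For reference, the ENIConfig objects being discussed are the VPC CNI's custom resources; a minimal example is sketched below (the subnet and security group IDs are placeholders, and naming the object after the AZ assumes the AZ-based selection being debated here):

```yaml
# One ENIConfig per AZ (or per node subnet, depending on how resolution ends up);
# pods on matching nodes get IPs from spec.subnet instead of the node's subnet.
apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: us-east-1a                   # must match the selection label/annotation value
spec:
  subnet: subnet-0123456789abcdef0   # placeholder pod subnet ID
  securityGroups:
    - sg-0123456789abcdef0           # placeholder security group for pod ENIs
```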
@lilley2412 Hmm... this is what I had shared with @errordeveloper, and he said something like that and to discuss naming later.
And I had figured the nodegroup would have a field for subnet groups. But then again, this doesn't cover the case of using these for worker node subnets... It sounds like we need to check with Ilya as to whether he wants it specifically like #475 (comment), but he is pretty swamped with IAM for pods. The way you have it seems different: his way seems to imply each named group is a single subnet, but that has the issue you covered of wanting to make a group and have subnets created from a CIDR block. So would it create one for every AZ if only the CIDR is listed? Seems really nested though. Also, are you in any of the Slack groups?
For current-state eksctl I only see the ability to specify an AZ list on a nodeGroup, not a subnet, so my first reaction is that it doesn't matter: if you can only specify AZ on the nodeGroup, then the node -> AZ -> ENIConfig -> subnet association is enough, because there is no way to control nodeGroup -> subnet (without doing it in the ASG outside of eksctl). So I was thinking this is only a concern if some of the other feature requests to add a subnet property to nodeGroup get implemented. However it ends up, I personally don't care too much so long as it's clear, so the approach you show is fine I think; but like I said, unless I'm missing something I don't see the ng -> subnet association yet.
I am not, but will take a look to catch up on the discussions.
@tiffanyfay thinking about this more, I guess the point of what you and @errordeveloper were proposing is: even if nodeGroup -> worker subnet is not implemented yet, let's support it for pod subnets with this feature? It probably makes sense to keep them decoupled even if it means two subnet properties on nodeGroup (or refactoring when worker subnets are implemented?), so I think it makes sense.
And some way to specify ng -> subnet. So I guess this also means not using the failure zone label for ENIConfig; it would just have to be a node label based on the specified subnet on the ng.
@lilley2412 Sounds good to me.
Custom networking will also affect max-pods, aws/amazon-vpc-cni-k8s#331. You'll need to update the map or use the formula in the issue I referenced.
@jicowan at first I was thinking it could be handled manually with --max-pods as a kubelet arg, assuming there would be a solution elsewhere. That doesn't look easily achievable though, since at node creation time it cannot be known what may get configured with ENIConfig, so I agree eksctl should handle this during node group build. I can't see any good alternative to maintaining a file similar to amazon-eks-ami/eni-max-pods.txt to support the formula logic.
Just noticed eksctl already generates this: https://github.com/weaveworks/eksctl/blob/master/pkg/nodebootstrap/maxpods_generate.go, based on https://github.com/awslabs/amazon-eks-ami/blob/master/files/eni-max-pods.txt. Unfortunately the source file does not have the IPs per interface, so something further is still needed to make that available.
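The formula referenced in aws/amazon-vpc-cni-k8s#331 can be sketched as follows. The per-instance ENI and IP limits come from the EC2 instance-type tables (the m5.large numbers below are from those published limits); with custom networking, the primary ENI is reserved for the node's own subnet, so one ENI's worth of pod IPs is lost:

```python
def max_pods(enis: int, ipv4_per_eni: int, custom_networking: bool = False) -> int:
    """Maximum schedulable pods for an instance type.

    enis / ipv4_per_eni are the EC2 instance-type limits
    (e.g. m5.large: 3 ENIs, 10 IPv4 addresses per ENI).
    """
    # With custom networking the primary ENI is not used for pod IPs.
    usable_enis = enis - 1 if custom_networking else enis
    # One IP per ENI is the ENI's own primary address; +2 accounts for the
    # host-networking pods (aws-node, kube-proxy) that don't consume ENI IPs.
    return usable_enis * (ipv4_per_eni - 1) + 2

# m5.large: 3 ENIs x 10 IPv4 addresses per ENI
print(max_pods(3, 10))        # default: 29
print(max_pods(3, 10, True))  # custom networking: 20
```

This is why the static eni-max-pods.txt map is insufficient on its own: the file only records the final default number, not the per-interface IP count needed to recompute the custom-networking value.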
This issue has been open for almost 3 years
@yanivpaz, we're occupied with higher-priority tasks at the moment but we'll review the proposal soon. |
@yanivpaz thanks for showing interest in this issue. We are working towards prioritizing this. We're happy to accept PRs if you have a solution in mind 💡 @lilley2412 I know it's been a long time but are you still keen on this feature for eksctl? |
I can think of one use case where this is desirable: EKS as a service consumes a lot of IPs, and if we're deploying into a VPC that wasn't designed with that in mind, we may be faced with a situation where the subnets don't have enough IPs to handle all the pod IPs.
This is a very important case: providing alternate CIDR support without those manual ENIConfigs and the hassle. eksctl should be able to support that, and it also helps us in overcoming two problems
Not sure if I have missed anything. Please provide additional inputs on this.
Some key differences in
With that in mind, I'm splitting the implementation into two cases: 1. Pod subnets are user defined:
2. Pod subnets are not user defined, hence
The other implementation steps (notes) presented in the opening comment are pretty much still relevant. All ENIConfig-related steps shall be covered as part of a post-cluster-creation task.
@TiberiuGC Any idea when this could be implemented? For example, I want to use Submariner, but pods and nodes cannot be within the same range; of course I can try to customize the AWS CNI, but it would be nice to have that within eksctl.
Why do you want this feature?
Proposed feature request to support AWS CNI custom networking. There is an existing WIP PR (786); I'm opening a feature for discussion because I think it's a big lift and the implementation approach could vary greatly. I'm using custom CNI extensively to increase VPC IP space for pods on corporate-connected VPCs and would like to start using eksctl for cluster provisioning.
CNI custom networking supports 3 main use cases:
Custom CNI can be enabled outside of eksctl today, but requires manual intervention after eksctl builds the cluster.
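The manual intervention in question amounts to flipping the custom-networking switch on the VPC CNI and creating ENIConfig objects. A sketch of the relevant env entries on the aws-node daemonset container is below; the variable names are the VPC CNI's documented settings, while the choice of the zone failure-domain label as the selector is one common convention, not the only option:

```yaml
# Env entries on the aws-node (VPC CNI) daemonset container.
env:
  - name: AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG   # turn on custom networking
    value: "true"
  - name: ENI_CONFIG_LABEL_DEF                 # node label used to select the ENIConfig
    value: failure-domain.beta.kubernetes.io/zone
```

Nodes that register before these settings and the ENIConfig objects are in place will attach secondary ENIs from the node subnets, which is exactly the ordering hazard discussed in the implementation notes.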
Important Concepts:
What feature/behavior/change do you want?
Implementation Notes
Resources should be created in this order; if nodes register prior to custom CNI being fully configured, nodes will immediately get secondary ENIs assigned from non-pod subnets, requiring the nodes (or the kubelet process) to be restarted after CNI is configured (or nodeGroup re-creation).
Proposed Future Enhancements