
Add flavour for using AWS VPC CNI #931

Closed
Sn0rt opened this issue Jul 26, 2019 · 24 comments
Assignees
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/documentation Categorizes issue or PR as related to documentation. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Milestone

Comments

@Sn0rt
Contributor

Sn0rt commented Jul 26, 2019

/kind feature

Given my background, I would like to use AWS-native networking with Kubernetes, as EKS does.
However, cluster-api-provider-aws does not support it yet.
Can we consider adding support for amazon-vpc-cni-k8s,
or would you accept a PR implementing this feature?

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Jul 26, 2019
@detiber
Member

detiber commented Jul 26, 2019

@Sn0rt I don't see any issues with adding support for amazon-vpc-cni-k8s.

At a cursory glance I believe it would require:

  • Swapping out the Calico manifests in addons.yaml with the manifests for amazon-vpc-cni-k8s
  • Updating the nodes.cluster-api-provider-aws.sigs.k8s.io policy as per the docs
  • The user to override the kubelet --max-pods appropriately for each Machine* object they define to avoid overscheduling any individual Node.
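With today's Cluster API bootstrap types, the third point could be sketched roughly as follows (a hypothetical fragment: the resource name and the max-pods value are illustrative, and the value must match the instance type's ENI limits):

```yaml
# Hypothetical sketch: overriding kubelet --max-pods for VPC CNI nodes.
# "17" is the ENI-derived limit for a t2.medium; other instance types
# need the value from the amazon-vpc-cni-k8s eni-max-pods table.
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  name: aws-eni-workers
spec:
  template:
    spec:
      joinConfiguration:
        nodeRegistration:
          kubeletExtraArgs:
            max-pods: "17"
```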

/cc @randomvariable
cc'ing Naadir in case he has thoughts on how we can potentially scope down the IAM permissions needed vs the broad permissions listed in the docs.

@rudoi
Contributor

rudoi commented Jul 26, 2019

👍

We're big VPC CNI users and would be happy to help out on this.

/cc @sethp-nr

@randomvariable
Member

Can do either of the following:

  • Create a new policy and attach it to the node manually
  • Add an option to the CloudFormation generation to do it automatically. Might be better not to modify the existing policies.

@ncdc ncdc added the kind/documentation Categorizes issue or PR as related to documentation. label Jul 29, 2019
@ncdc ncdc added this to the 0.3.x milestone Jul 29, 2019
@ncdc ncdc added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Jul 29, 2019
@sethp-nr
Contributor

FWIW this works today by applying a custom policy to the control plane machines and worker nodes with the Machine's spec.providerSpec.value.iamInstanceProfile. It doesn't look like any of the ENI stuff is scoped to the CAPA tag(s), despite some evidence that we wanted to – @rudoi do you remember if we tried to get the CNI permissions to be scoped to just the CAPA machines?

@Sn0rt
Contributor Author

Sn0rt commented Aug 9, 2019

I have finished a PoC.

1: create a cluster

Create the cluster and the control plane:

apiVersion: "cluster.k8s.io/v1alpha1"
kind: Cluster
metadata:
  name: aws-eni
spec:
  clusterNetwork:
    services:
      cidrBlocks: ["10.96.0.0/12"]
    pods:
      cidrBlocks: ["192.168.0.0/16"]
    serviceDomain: "cluster.local"
  providerSpec:
    value:
      apiVersion: "awsprovider/v1alpha1"
      kind: "AWSClusterProviderSpec"
      region: "us-east-2"
      sshKeyName: "guohao"

2: create machine deployment

apiVersion: "cluster.k8s.io/v1alpha1"
kind: MachineDeployment
metadata:
  name: aws-eni-machinedeployment
  labels:
    cluster.k8s.io/cluster-name: aws-eni
spec:
  replicas: 1
  selector:
    matchLabels:
      cluster.k8s.io/cluster-name: aws-eni
      set: node
  template:
    metadata:
      labels:
        cluster.k8s.io/cluster-name: aws-eni
        set: node
    spec:
      versions:
        kubelet: v1.14.4
      providerSpec:
        value:
          apiVersion: awsprovider/v1alpha1
          kind: AWSMachineProviderSpec
          instanceType: "t2.medium"
          iamInstanceProfile: "nodes.cluster-api-provider-aws.sigs.k8s.io"
          keyName: "guohao"

3: create the IAM policy

Create a policy that grants the IP-assignment permissions to the nodes:

guohao@buffer ~ $ aws iam get-policy --policy-arn arn:aws:iam::179516646050:policy/amazon-vpc-cni-k8s-IAM
{
    "Policy": {
        "PolicyName": "amazon-vpc-cni-k8s-IAM",
        "PolicyId": "ANPASTTAGUKRHOLMEGMU2",
        "Arn": "arn:aws:iam::179516646050:policy/amazon-vpc-cni-k8s-IAM",
        "Path": "/",
        "DefaultVersionId": "v1",
        "AttachmentCount": 1,
        "PermissionsBoundaryUsageCount": 0,
        "IsAttachable": true,
        "Description": "the permission of aws eni",
        "CreateDate": "2019-08-09T02:35:54Z",
        "UpdateDate": "2019-08-09T02:35:54Z"
    }
}
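The policy document itself is not shown above. A sketch modeled on the permissions the amazon-vpc-cni-k8s documentation lists (treat the exact action list as an assumption to verify against the CNI docs for your version) would be:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:AssignPrivateIpAddresses",
                "ec2:AttachNetworkInterface",
                "ec2:CreateNetworkInterface",
                "ec2:DeleteNetworkInterface",
                "ec2:DescribeInstances",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DetachNetworkInterface",
                "ec2:ModifyNetworkInterfaceAttribute",
                "ec2:UnassignPrivateIpAddresses"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": ["ec2:CreateTags"],
            "Resource": ["arn:aws:ec2:*:*:network-interface/*"]
        }
    ]
}
```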

4: attach the AWS ENI CNI permission policy to the nodes.cluster-api-provider-aws.sigs.k8s.io role, which is assigned to worker nodes

guohao@buffer ~ $ aws iam list-attached-role-policies --role-name nodes.cluster-api-provider-aws.sigs.k8s.io

The output is as follows:

{
    "AttachedPolicies": [
        {
            "PolicyName": "amazon-vpc-cni-k8s-IAM",
            "PolicyArn": "arn:aws:iam::179516646050:policy/amazon-vpc-cni-k8s-IAM"
        },
        {
            "PolicyName": "nodes.cluster-api-provider-aws.sigs.k8s.io",
            "PolicyArn": "arn:aws:iam::179516646050:policy/nodes.cluster-api-provider-aws.sigs.k8s.io"
        }
    ]
}
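For reference, the attachment listed above would have been created with a command along these lines (role and policy names as in this PoC):

```shell
aws iam attach-role-policy \
  --role-name nodes.cluster-api-provider-aws.sigs.k8s.io \
  --policy-arn arn:aws:iam::179516646050:policy/amazon-vpc-cni-k8s-IAM
```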

5: check the nodes of the cluster

Get the kubeconfig with clusterctl:

guohao@buffer ~/workspace $ kubectl --kubeconfig kubeconfig get node
NAME                                       STATUS     ROLES    AGE   VERSION
ip-10-0-0-133.us-east-2.compute.internal   NotReady   master   18h   v1.14.4
ip-10-0-0-172.us-east-2.compute.internal   NotReady   node     17h   v1.14.4

6: apply the AWS ENI CNI DaemonSet

guohao@buffer ~/workspace $ kubectl --kubeconfig kubeconfig apply -f https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/master/config/v1.5/aws-k8s-cni.yaml
clusterrole.rbac.authorization.k8s.io/aws-node created
serviceaccount/aws-node created
clusterrolebinding.rbac.authorization.k8s.io/aws-node created
daemonset.apps/aws-node created
customresourcedefinition.apiextensions.k8s.io/eniconfigs.crd.k8s.amazonaws.com created

7: check the pod status

You can see that the pods stay in ContainerCreating status for too long.

kube-system   coredns-584795fc57-lnn5h                                           0/1     ContainerCreating   0          20h    <none>       ip-10-0-0-133.us-east-2.compute.internal   <none>           <none>
kube-system   coredns-584795fc57-nmcsj                                           0/1     ContainerCreating   0          20h    <none>       ip-10-0-0-133.us-east-2.compute.internal   <none>           <none>

Delete them, and Kubernetes will rebuild the pods.

kube-system   coredns-584795fc57-ztmbx                                           1/1     Running             0          22m    10.0.0.237   ip-10-0-0-172.us-east-2.compute.internal   <none>           <none>
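The delete step above amounts to (pod name taken from the earlier listing):

```shell
kubectl --kubeconfig kubeconfig -n kube-system delete pod coredns-584795fc57-lnn5h
```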

8: the IP pool assigned to the EC2 instance

Get the status of the instance:

guohao@buffer ~ $ aws ec2 describe-instances  --instance-id i-053b8794d7f90a110

{
    "Reservations": [
        {
....
                    "NetworkInterfaces": [
                        {
                            "Attachment": {
                                "AttachTime": "2019-08-08T09:06:33.000Z",
                                "AttachmentId": "eni-attach-09285aff116268f94",
                                "DeleteOnTermination": true,
                                "DeviceIndex": 0,
                                "Status": "attached"
                            },
                            "Description": "",
                            "Groups": [
                                {
                                    "GroupName": "aws-eni-lb",
                                    "GroupId": "sg-021eaefb3018d0551"
                                },
                                {
                                    "GroupName": "aws-eni-node",
                                    "GroupId": "sg-04cfe4c2052f87031"
                                }
                            ],
                            "Ipv6Addresses": [],
                            "MacAddress": "02:8e:50:b0:02:8a",
                            "NetworkInterfaceId": "eni-03b66efcc616b8c86",
                            "OwnerId": "179516646050",
                            "PrivateDnsName": "ip-10-0-0-172.us-east-2.compute.internal",
                            "PrivateIpAddress": "10.0.0.172",
                            "PrivateIpAddresses": [
                                {
                                    "Primary": true,
                                    "PrivateDnsName": "ip-10-0-0-172.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.172"
                                },
                                {
                                    "Primary": false,
                                    "PrivateDnsName": "ip-10-0-0-232.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.232"
                                },
                                {
                                    "Primary": false,
                                    "PrivateDnsName": "ip-10-0-0-170.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.170"
                                },
                                {
                                    "Primary": false,
                                    "PrivateDnsName": "ip-10-0-0-237.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.237"
                                },
                                {
                                    "Primary": false,
                                    "PrivateDnsName": "ip-10-0-0-205.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.205"
                                },
                                {
                                    "Primary": false,
                                    "PrivateDnsName": "ip-10-0-0-222.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.222"
                                }
                            ],
                            "SourceDestCheck": true,
                            "Status": "in-use",
                            "SubnetId": "subnet-0892c669597c0a9aa",
                            "VpcId": "vpc-0eadd8ecf99f5b4c6",
                            "InterfaceType": "interface"
                        },
                        {
                            "Attachment": {
                                "AttachTime": "2019-08-09T05:17:24.000Z",
                                "AttachmentId": "eni-attach-0591fb5b94cb67eb8",
                                "DeleteOnTermination": true,
                                "DeviceIndex": 1,
                                "Status": "attached"
                            },
                            "Description": "aws-K8S-i-053b8794d7f90a110",
                            "Groups": [
                                {
                                    "GroupName": "aws-eni-lb",
                                    "GroupId": "sg-021eaefb3018d0551"
                                },
                                {
                                    "GroupName": "aws-eni-node",
                                    "GroupId": "sg-04cfe4c2052f87031"
                                }
                            ],
                            "Ipv6Addresses": [],
                            "MacAddress": "02:58:f9:8c:b5:3c",
                            "NetworkInterfaceId": "eni-0da443a1cf644f334",
                            "OwnerId": "179516646050",
                            "PrivateDnsName": "ip-10-0-0-56.us-east-2.compute.internal",
                            "PrivateIpAddress": "10.0.0.56",
                            "PrivateIpAddresses": [
                                {
                                    "Primary": true,
                                    "PrivateDnsName": "ip-10-0-0-56.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.56"
                                },
                                {
                                    "Primary": false,
                                    "PrivateDnsName": "ip-10-0-0-183.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.183"
                                },
                                {
                                    "Primary": false,
                                    "PrivateDnsName": "ip-10-0-0-74.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.74"
                                },
                                {
                                    "Primary": false,
                                    "PrivateDnsName": "ip-10-0-0-91.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.91"
                                },
                                {
                                    "Primary": false,
                                    "PrivateDnsName": "ip-10-0-0-235.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.235"
                                },
                                {
                                    "Primary": false,
                                    "PrivateDnsName": "ip-10-0-0-236.us-east-2.compute.internal",
                                    "PrivateIpAddress": "10.0.0.236"
                                }
                            ],
                            "SourceDestCheck": true,
                            "Status": "in-use",
                            "SubnetId": "subnet-0892c669597c0a9aa",
                            "VpcId": "vpc-0eadd8ecf99f5b4c6",
                            "InterfaceType": "interface"
                        }
                    ],
                    "RootDeviceName": "/dev/sda1",
                    "RootDeviceType": "ebs",
                    "SecurityGroups": [
                        {
                            "GroupName": "aws-eni-lb",
                            "GroupId": "sg-021eaefb3018d0551"
                        },
                        {
                            "GroupName": "aws-eni-node",
                            "GroupId": "sg-04cfe4c2052f87031"
                        }
                    ],
                    "SourceDestCheck": true,
                    "Tags": [
                        {
                            "Key": "sigs.k8s.io/cluster-api-provider-aws/role",
                            "Value": "node"
                        },
                        {
                            "Key": "sigs.k8s.io/cluster-api-provider-aws/cluster/aws-eni",
                            "Value": "owned"
                        },
                        {
                            "Key": "Name",
                            "Value": "aws-eni-machinedeployment-5745b4948d-tg55f"
                        },
                        {
                            "Key": "kubernetes.io/cluster/aws-eni",
                            "Value": "owned"
                        }
                    ],
 ...
    ]
}

And check the status of the ENI CNI DaemonSet:

guohao@buffer ~/workspace $ kubectl --kubeconfig kubeconfig logs aws-node-lc7ph -n kube-system
====== Starting amazon-k8s-agent ======
Checking if ipamd is serving
Waiting for ipamd health check
Ipamd is up and serving
Copying AWS CNI plugin and config
Node ready, watching ipamd health

@vincepri vincepri changed the title network: support cni amazon-vpc-cni-k8s support network: support cni amazon-vpc-cni-k8s Aug 9, 2019
@Sn0rt
Contributor Author

Sn0rt commented Aug 16, 2019

FWIW this works today by applying a custom policy to the control plane machines and worker nodes with the Machine's spec.providerSpec.value.iamInstanceProfile. It doesn't look like any of the ENI stuff is scoped to the CAPA tag(s), despite some evidence that we wanted to – @rudoi do you remember if we tried to get the CNI permissions to be scoped to just the CAPA machines?

Hi, are you still working on this feature?

@Sn0rt
Contributor Author

Sn0rt commented Aug 16, 2019

/assign

@vincepri
Member

vincepri commented Aug 16, 2019

Folks, just as a reminder, use /lifecycle active if you're actively working on something 😃

@sethp-nr
Contributor

@Sn0rt It's working for us as-is, so we haven't touched it in quite a while. Feel free to pick this ticket up!

@Sn0rt
Contributor Author

Sn0rt commented Aug 18, 2019

/lifecycle active

@k8s-ci-robot k8s-ci-robot added the lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. label Aug 18, 2019
@Sn0rt
Contributor Author

Sn0rt commented Aug 19, 2019

@sethp-nr

We should consider a cluster-level flag to indicate the current cluster's CNI solution.

The max-pod parameter of amazon-vpc-cni-k8s depends on the type of instance which can be found here.
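That per-instance-type limit can be sketched as a small computation (assuming the formula the VPC CNI documentation gives, max pods = ENIs × (IPv4 addresses per ENI − 1) + 2; the t2.medium limits below are the ones from the EC2 ENI table):

```python
# Sketch: derive kubelet --max-pods from EC2 ENI limits for the VPC CNI.
# Limits for other instance types would come from the eni-max-pods table.
ENI_LIMITS = {
    # instance type: (max ENIs, IPv4 addresses per ENI)
    "t2.medium": (3, 6),
}

def max_pods(instance_type: str) -> int:
    enis, ips_per_eni = ENI_LIMITS[instance_type]
    # One address per ENI is the ENI's primary IP; the +2 accounts for
    # host-network pods that do not consume addresses from the pool.
    return enis * (ips_per_eni - 1) + 2
```

For a t2.medium this yields 17, matching the published eni-max-pods value.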

For example, I am considering setting a cluster-level annotation as follows.

Or should it be a label?

In my experience, annotations are used to configure things and labels to select them.

apiVersion: "cluster.k8s.io/v1alpha1"
kind: Cluster
metadata:
  name: test1
  annotations:
    cluster.k8s.io/network-cni: amazon-vpc-cni-k8s # supported values: amazon-vpc-cni-k8s, calico
spec:
  clusterNetwork:
    services:
      cidrBlocks: ["10.96.0.0/12"]
    pods:
      cidrBlocks: ["192.168.0.0/16"]
    serviceDomain: "cluster.local"
  providerSpec:
    value:
      apiVersion: "awsprovider/v1alpha1"
      kind: "AWSClusterProviderSpec"
      region: "us-east-2"
      sshKeyName: "guohao"

Then the CAPA controller can set the kubelet parameters based on this cluster-level annotation.

what do you think?
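One way the controller-side mapping could look (purely illustrative; the annotation key and values are the ones proposed above, and the function name is hypothetical):

```python
# Sketch: map the proposed cluster-level CNI annotation to the kubelet
# args CAPA would inject. All names here are hypothetical.
CNI_ANNOTATION = "cluster.k8s.io/network-cni"

def kubelet_args_for(annotations: dict, instance_max_pods: int) -> dict:
    if annotations.get(CNI_ANNOTATION) == "amazon-vpc-cni-k8s":
        # VPC CNI: cap pods at the ENI-derived limit for the instance type.
        return {"max-pods": str(instance_max_pods)}
    return {}  # calico and others keep the kubelet default
```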

@ncdc
Contributor

ncdc commented Oct 10, 2019

/milestone v0.5.0

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. label Mar 5, 2020
@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 5, 2020
@detiber
Member

detiber commented Mar 5, 2020

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 5, 2020
@randomvariable randomvariable changed the title network: support cni amazon-vpc-cni-k8s Add flavour for using AWS VPC CNI Aug 14, 2020
@randomvariable
Member

Given #1747 allows customisation of CNI rules, and clusterawsadm now allows customisation of policies, it should be easier to add a template flavour that uses the AWS VPC CNI.

/help

@k8s-ci-robot
Contributor

@randomvariable:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

Given #1747 allows customisation of CNI rules, and clusterawsadm now allows customisation of policies, it should be easier to add a template flavour that uses the AWS VPC CNI.

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Aug 14, 2020
@richardcase
Member

I think we can close this now, as the VPC CNI is automatically installed in EKS and is available as an EKS addon for pinning a specific version.

@sedefsavas
Contributor

This issue is not only related to the EKS side, so reopening it to track adding a template for the AWS native CNI with unmanaged clusters.

@sedefsavas sedefsavas reopened this Mar 7, 2022
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Mar 7, 2022
@richardcase
Member

/remove-lifecycle frozen

@k8s-ci-robot k8s-ci-robot removed the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Jul 8, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 6, 2022
@richardcase
Member

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 10, 2022
@Skarlso
Contributor

Skarlso commented Oct 31, 2022

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 31, 2022
@Skarlso
Contributor

Skarlso commented Oct 31, 2022

This is ultimately a documentation task that should define how to install Calico using ClusterResourceSet or AddonProviders (which will eventually deprecate ClusterResourceSet).
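A ClusterResourceSet for the VPC CNI could be sketched as follows (the resource name, label, and ConfigMap name are illustrative; the ConfigMap would hold the aws-k8s-cni.yaml manifests):

```yaml
# Hypothetical sketch: apply the VPC CNI manifests to matching clusters.
apiVersion: addons.cluster.x-k8s.io/v1beta1
kind: ClusterResourceSet
metadata:
  name: aws-vpc-cni
  namespace: default
spec:
  clusterSelector:
    matchLabels:
      cni: aws-vpc-cni
  resources:
    - name: aws-vpc-cni-manifests  # ConfigMap containing aws-k8s-cni.yaml
      kind: ConfigMap
```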

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 8, 2023
@Sn0rt Sn0rt closed this as completed May 11, 2023