CoreDNS version breaks networking in pods #328
I am creating two node groups and passing --register-with-taints to the kubelet of one of them. The group with the extra kubelet argument registers its taints properly, but then networking inside the containers on those nodes stops working. There is nothing in the logs for aws-node, the kubelet, or the CNI. Everything is fine when the same node group starts up without the extra --register-with-taints argument.

The taint is passed to bootstrap.sh as such:
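The original command block is not reproduced here, so the following is only a sketch of a representative invocation; the cluster name, taint key/value, and label are hypothetical placeholders (bootstrap.sh on the EKS-optimized AMI passes --kubelet-extra-args through to the kubelet verbatim):

```
# Hypothetical cluster name, taint, and label; adjust to your environment.
# The inner quoting has to survive CloudFormation user-data escaping intact.
/etc/eks/bootstrap.sh my-cluster \
  --kubelet-extra-args '--register-with-taints=dedicated=infra:NoSchedule --node-labels=nodegroup=infra'
```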
Using the latest AMI for us-east-1 and the amazon-k8s-cni:v1.3.2 image for the aws-node DaemonSet.
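As a sanity check (a standard kubectl command, not part of the original report), the CNI image actually running can be read off the aws-node DaemonSet:

```
# Print the image used by the aws-node DaemonSet; v1.3.2 is expected here.
kubectl -n kube-system get daemonset aws-node \
  -o jsonpath='{.spec.template.spec.containers[0].image}'
```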
Comments

I've tried numerous ways to solve this today, but it remains unsolved. I set the bootstrap.sh kubelet-extra-args through the CloudFormation templates, escaping single quotes and double quotes. I reset the Docker network bridge and tested networking inside the nodes. All networking from the host network interface is fine: attaching a Docker container to the host network works, and networking inside it works as well. The Docker info and settings are identical to those on nodes with working pod networking. All of this leads me to believe it is not a Docker networking issue but specifically CNI. Whenever the taints and labels are on the nodes, CNI networking from inside the pods fails, and DNS resolution fails too. /etc/resolv.conf is identical across all the nodes. The kubelet starts up fine and shows no errors, which is really confusing. Eventually, I decided to bypass the bootstrap.sh script altogether and use …

This works and the taints and labels are there, but the CNI networking from inside the pods is still broken. This looks like an amazon-vpc-cni-k8s issue. We are looking for a resolution, please. I would like to make the case for using EKS in our infrastructure migration from kops, but with issues like these we may have to look at GKE instead.
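A quick way to reproduce the in-pod DNS failure described above (a generic debugging recipe, not taken from the report; the toleration's key and value are hypothetical and must match the taint on the affected node group):

```
# One-off busybox pod that tolerates the taint, testing cluster DNS from inside.
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.28 \
  --overrides='{"spec":{"tolerations":[{"key":"dedicated","operator":"Equal","value":"infra","effect":"NoSchedule"}]}}' \
  -- nslookup kubernetes.default
```

If this resolves on untainted nodes but fails on the tainted ones, the problem is in pod networking rather than on the hosts.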
Accidental close! I have confirmed this is an issue. We'll get someone to take a look at it.

This issue was eventually resolved; it turned out to be a problem with the version of CoreDNS that ships on a new 1.11 EKS cluster.
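For anyone hitting the same symptom, the deployed CoreDNS version can be inspected and bumped with standard kubectl commands; the image tag below is a placeholder for whatever version is recommended for your cluster:

```
# Show the CoreDNS image currently deployed.
kubectl -n kube-system get deployment coredns \
  -o jsonpath='{.spec.template.spec.containers[0].image}'

# Roll out a different version (vX.Y.Z and the us-east-1 EKS registry are placeholders).
kubectl -n kube-system set image deployment/coredns \
  coredns=602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/coredns:vX.Y.Z
```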