
CoreDNS version breaks networking in pods #328

Closed
0verc1ocker opened this issue Feb 21, 2019 · 3 comments

Comments

@0verc1ocker

I am creating two node groups and passing --register-with-taints to the kubelet of one of them. The group with the extra kubelet argument registers the taints properly, but then networking inside the containers on those nodes stops working. There is nothing in the logs for aws-node, the kubelet, or the CNI. Everything is fine when this same node group starts up without the extra --register-with-taints kubelet argument.

The taint is passed to bootstrap.sh as follows:

--kubelet-extra-args '--register-with-taints=\"dedicated=jobs:NoSchedule\" --node-labels=testing/role=shared'

I am using the latest AMI for us-east-1 and the amazon-k8s-cni:v1.3.2 image for the aws-node DaemonSet.
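For context, a minimal sketch of how those flags could be assembled in UserData (the bootstrap.sh invocation is commented out and the cluster name is a placeholder; the flag values are the ones from the report above):

```shell
#!/bin/sh
set -eu
# Keep both kubelet flags inside one quoted string so they reach
# --kubelet-extra-args as a single value.
KUBELET_EXTRA_ARGS='--register-with-taints=dedicated=jobs:NoSchedule --node-labels=testing/role=shared'
# On an EKS-optimized AMI this would be (cluster name is a placeholder):
# /etc/eks/bootstrap.sh my-cluster --kubelet-extra-args "$KUBELET_EXTRA_ARGS"
echo "$KUBELET_EXTRA_ARGS"
```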

@0verc1ocker 0verc1ocker changed the title Adding --register-with-taints breaks networking on nodes Adding --register-with-taints breaks networking in pods Feb 21, 2019
@0verc1ocker
Author

I've tried numerous ways to solve this today, with no luck. I set the bootstrap.sh kubelet-extra-args through the CloudFormation templates, escaping single and double quotes. I reset the Docker network bridge and tested networking on the nodes themselves. All networking from the host network interface is fine: attaching a Docker container to the host network works, and networking inside it works as well. Docker info and settings are identical to those on nodes with working pod networking. All of this leads me to believe it is not a Docker networking issue but specifically the CNI. Whenever the taints and labels are on the nodes, CNI networking from inside the pods fails, and DNS resolution fails too. /etc/resolv.conf is identical across all the nodes. Kubelet starts up fine and shows no errors, which is really confusing.

Eventually, I decided to bypass the bootstrap.sh script altogether and use a sed command in the UserData section of the CloudFormation template to set the taints and labels directly in the kubelet systemd unit file, in case bootstrap.sh itself was causing something to fail:

sed -i 's#/usr/bin/kubelet#/usr/bin/kubelet --register-with-taints=dedicated=jobs:NoSchedule --node-labels=testing/role=shared,testin/tools-any=true,testing/tenants-any=true#g' /etc/systemd/system/kubelet.service

This works: the taints and labels are applied, but CNI networking from inside the pods is still broken...
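That sed edit can be reproduced against a stand-in unit file to confirm the substitution does what is intended (the file here is a temporary mock, not a real kubelet.service, and I've kept a subset of the labels from the command above):

```shell
set -eu
# Stand-in for /etc/systemd/system/kubelet.service with a minimal ExecStart line.
tmp=$(mktemp)
printf 'ExecStart=/usr/bin/kubelet --cloud-provider aws\n' > "$tmp"
# Same substitution pattern as in the comment above: append the taint and
# label flags right after the kubelet binary path.
sed -i 's#/usr/bin/kubelet#/usr/bin/kubelet --register-with-taints=dedicated=jobs:NoSchedule --node-labels=testing/role=shared#g' "$tmp"
result=$(cat "$tmp")
echo "$result"
rm -f "$tmp"
```

After editing the real unit file you would still need a systemctl daemon-reload and a kubelet restart for the flags to take effect.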

This looks like an amazon-vpc-cni-k8s issue, and we are looking for a resolution. I would like to make the case to use EKS for our infrastructure migration from kops, but with issues like these we might have to look into GKE.

@micahhausler
Member

Accidental close! I have confirmed this is an issue. We'll get someone to take a look at it.

@0verc1ocker
Copy link
Author

This issue was eventually resolved and turned out to be a problem with the CoreDNS version that ships on a new EKS 1.11 cluster.

See awslabs/amazon-eks-ami#200
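For anyone hitting this, one way to see which CoreDNS image a cluster is running is to read it off the coredns deployment and take the tag; the image string below is a local placeholder for illustration (the linked issue, not this thread, states the actual fixed version):

```shell
set -eu
# In a live cluster the image string would come from:
#   kubectl -n kube-system get deployment coredns \
#     -o jsonpath='{.spec.template.spec.containers[0].image}'
# Placeholder value standing in for that output:
image='602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/coredns:v1.1.3'
# Strip everything up to the last ':' to get the tag.
tag=${image##*:}
echo "$tag"
```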

@0verc1ocker 0verc1ocker changed the title Adding --register-with-taints breaks networking in pods CoreDNS version breaks networking in pods Feb 25, 2019