
Existing pod network not cleaned up when using security groups with stateful sets #1374

Closed
hintofbasil opened this issue Feb 5, 2021 · 13 comments

hintofbasil commented Feb 5, 2021

What happened:
When using security groups with stateful sets we noticed that pods often lost connectivity when restarted.
The security group they were bound to allowed all connections inbound and outbound on 0.0.0.0/0.

After some investigation we discovered that the bug seems to affect pods re-created on the same node with the same name.

Attach logs
eks_i-08aff468a2d6ce527_2021-02-05_1421-UTC_0.6.2.tar.gz

What you expected to happen:

The pod should launch normally

How to reproduce it (as minimally and precisely as possible):

Create a securityGroupPolicy

apiVersion: vpcresources.k8s.aws/v1beta1
kind: SecurityGroupPolicy
metadata:
  name: cni-test
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: cni-test
  securityGroups:
    groupIds:
      - <id>

Create a pod which uses the security group policy

apiVersion: v1
kind: Pod
metadata:
  name: cni-test
  namespace: default
  labels:
    app: cni-test
spec:
  containers:
    - name: alpine
      image: alpine
      command:
      - sleep
      - "1000000000"

Kill the pod, then recreate it, ensuring it is scheduled to the same node. Then attempt to make an outbound connection from the pod:

apk add curl
curl 1.1.1.1
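For a more deterministic reproduction (not part of the original report), one way to make sure the recreated pod lands on the same node is to record the node name before deleting and pin the new pod to it. The commands below are a sketch that assumes the pod manifest above is saved as pod.yaml:

# Sketch only: pod.yaml is assumed to hold the pod manifest above.
# Record the node the pod is currently scheduled on, then delete and recreate the pod.
NODE=$(kubectl get pod cni-test -o jsonpath='{.spec.nodeName}')
kubectl delete pod cni-test
# To guarantee the recreated pod lands on the same node, add "nodeName: $NODE" (or a
# kubernetes.io/hostname nodeSelector) under spec in pod.yaml before re-applying it.
kubectl apply -f pod.yaml
# Verify outbound connectivity from the recreated pod.
kubectl exec -it cni-test -- sh -c 'apk add curl && curl 1.1.1.1'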

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.4", GitCommit:"d360454c9bcd1634cf4cc52d1867af5491dc9c5f", GitTreeState:"clean", BuildDate:"2020-11-12T01:09:16Z", GoVersion:"go1.15.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"18+", GitVersion:"v1.18.9-eks-d1db3c", GitCommit:"d1db3c46e55f95d6a7d3e5578689371318f95ff9", GitTreeState:"clean", BuildDate:"2020-10-20T22:18:07Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
  • CNI Version
v1.7.5-eksbuild.1
  • OS (e.g: cat /etc/os-release):
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
  • Kernel (e.g. uname -a):
Linux ip-172-16-214-206.eu-central-1.compute.internal 4.14.209-160.339.amzn2.x86_64 #1 SMP Wed Dec 16 22:44:04 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
SaranBalaji90 commented Feb 5, 2021

I'm trying to repro the issue but so far no luck. Will retry a few more times. Can you let me know the name of the affected pod in the attached logs (I couldn't find a cni-test pod in the logs)? I can dig through them. If it's happening consistently on your cluster, we can schedule a call to dig further into this issue (you can reach me at [email protected]).

srajakum@147dda5e4851 yaml-files % kubectl get pods -owide          
NAME                                READY   STATUS              RESTARTS   AGE     IP                NODE                                           NOMINATED NODE   READINESS GATES
cni-test                            1/1     Running             0          6s      192.168.160.222   ip-192-168-65-167.us-west-2.compute.internal   <none>           <none>

srajakum@147dda5e4851 yaml-files % kubectl describe pod cni-test    
Annotations:  kubernetes.io/psp: eks.privileged
              vpc.amazonaws.com/pod-eni:
                [{"eniId":"eni-0529ea213a202db76","ifAddress":"06:03:de:64:a3:05","privateIp":"192.168.160.222","vlanId":1,"subnetCidr":"192.168.160.0/19"...

srajakum@147dda5e4851 yaml-files % kubectl exec -it cni-test /bin/sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/ # apk add curl
...
/ # curl 1.1.1.1
<html>

<!-- recreated the pod -->

srajakum@147dda5e4851 yaml-files % kubectl get pods -owide          
NAME                                READY   STATUS              RESTARTS   AGE     IP                NODE                                           NOMINATED NODE   READINESS GATES
cni-test                            1/1     Running             0          6s      192.168.176.242   ip-192-168-65-167.us-west-2.compute.internal   <none>           <none>

Annotation on the pod:
    vpc.amazonaws.com/pod-eni: '[{"eniId":"eni-0cce00f448a476759","ifAddress":"06:17:57:b6:5b:79","privateIp":"192.168.176.242","vlanId":2,"subnetCidr":"192.168.160.0/19"}]'

able to curl again

SaranBalaji90 commented Feb 5, 2021

We should probably add a unique ID to the annotation as well and have ipamd return the ENI details based on that unique ID. This will ensure that even when kubelet invokes a delete after the network for the old pod has already been removed, we won't delete the new pod's network (AddNetwork and DelNetwork). Created aws/amazon-vpc-resource-controller-k8s#19 to track this enhancement.
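Purely as an illustration of that idea (the "uid" field below is hypothetical, not an implemented format), the existing pod-eni annotation shown above could carry the pod's unique ID so ipamd can match AddNetwork/DelNetwork calls against the right pod instance:

vpc.amazonaws.com/pod-eni: '[{"eniId":"eni-0cce00f448a476759","ifAddress":"06:17:57:b6:5b:79","privateIp":"192.168.176.242","vlanId":2,"subnetCidr":"192.168.160.0/19","uid":"<pod-uid>"}]'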

hintofbasil (Author) commented:

The pod was called prometheus-scaling-0, I believe. It was definitely a pod starting with prometheus-.

It seems that updating the CNI to v1.7.8 fixes the issue.
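For reference (not from the original comment), the CNI version running on a cluster can be checked from the aws-node daemonset image tag:

kubectl describe daemonset aws-node -n kube-system | grep Image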

SaranBalaji90 commented Feb 5, 2021

Thanks for the info @hintofbasil. Maybe it was this pod - prometheus-scaling-1 - the sequence looks right too, but let me know if you notice the issue again and I will be happy to jump on a call and assist. 1.7.8 has a fix for the pod deletion path, so I'm not sure whether that is what's helping here.

hintofbasil (Author) commented:

Yes. That would be the one. Should have written it down earlier.
We will keep an eye on it and see if it crops up again. For now we are going to update our clusters to use 1.7.8.

Thanks Sri

SaranBalaji90 commented Feb 5, 2021

@hintofbasil can you ensure you have terminationGracePeriodSeconds set in your YAML? For pods using security groups we describe the pod during deletion, and if terminationGracePeriodSeconds is not set the pod's data can be removed from the Kubernetes datastore (etcd) before that happens, leaving the CNI plugin with dangling records in ip rule that affect pod networking.
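For reference, a minimal sketch of the reproduction pod above with terminationGracePeriodSeconds set explicitly (the value 30 is only an example; it is also the Kubernetes default):

apiVersion: v1
kind: Pod
metadata:
  name: cni-test
  namespace: default
  labels:
    app: cni-test
spec:
  terminationGracePeriodSeconds: 30  # example value; gives the CNI time to read pod data before removal from etcd
  containers:
    - name: alpine
      image: alpine
      command:
      - sleep
      - "1000000000"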

hintofbasil (Author) commented:

@sri, we do not. We only set terminationGracePeriodSeconds.
We can try setting that value and using the 1.7.5 CNI. An experiment for Monday.

SaranBalaji90 (Contributor) commented:

@hintofbasil sorry, I meant terminationGracePeriodSeconds (the one you mentioned); I've fixed my comment.

SaranBalaji90 (Contributor) commented:

@hintofbasil I have created a PR to clean up the pod network even if pods are force-deleted by the controllers. This will help with the network issues you noticed with new pods.

hintofbasil (Author) commented:

Hi Sri,

It seems we were a bit early in announcing that 1.7.8 fixed the issue. Unfortunately we are still seeing it.

I've even installed a version built from master (99ecb4c).
It now always occurs after a stateful set pod is rescheduled onto the same node. However, it seems to have fixed the issue with pods of the same name.

I've attached further logs from the build off master. This time the failing pod is prometheus-prometheus-operator-prometheus-0.

eks_i-06368ccddf952df32_2021-02-08_1603-UTC_0.6.2.tar.gz

SaranBalaji90 commented Feb 8, 2021

@hintofbasil our next release, which will be out this week, will clean up the dangling rules that were blocking pod traffic. This occurs when the pod is deleted from the K8s datastore (etcd) before the CNI is able to read the pod information (the annotation) during deletion. The fix I mentioned above (which is merged to master and the 1.7 branch) will take care of cleaning up all dangling rules. This will be prevented once we have kubernetes/kubernetes#69882. kubernetes/kubernetes#88543 might also help to some level to avoid the race condition.

Regarding prometheus-prometheus-operator-prometheus-0, I see that the pod network is set up properly. Can you send your cluster ARN to [email protected] so I can investigate further?
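As a rough check for such dangling records (a sketch, not from the original thread), the policy routing rules on the affected node can be listed and filtered by the old pod's IP; 192.168.176.242 below is a placeholder taken from the example output earlier in this issue:

# Run on the node that hosted the deleted pod; filter rules and routes by the stale pod IP.
ip rule list | grep 192.168.176.242
ip route show table all | grep 192.168.176.242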

SaranBalaji90 changed the title from "Incorrect ENI when using security groups with stateful sets" to "Existing pod network not cleaned up when using security groups with stateful sets" on Feb 8, 2021
SaranBalaji90 commented Feb 9, 2021

Local store support for pods using security groups - #1313. This avoids invoking the API server on the deletion path; instead a local file is used to read the associated VLAN.

SaranBalaji90 (Contributor) commented:

Closing this as we are tracking the issue with #1313, and https://docs.aws.amazon.com/eks/latest/userguide/sec-group-reqs.html has been updated to recommend setting terminationGracePeriodSeconds on the pod spec to avoid deleting pod objects from etcd before the network is cleaned up.
