
podCIDR allocation is not working as expected #5231

Closed
sohnaeo opened this issue Oct 2, 2019 · 12 comments · Fixed by #6580
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments


sohnaeo commented Oct 2, 2019

Problem:

Pods are not getting IPs from the podCIDR assigned to their nodes.

1- Checkout master branch

2- Create the inventory, changing only 3 variables:

a) Change the etcd deployment to host

b) Change the pod subnets and service addresses

kube_service_addresses: 10.242.0.0/21
kube_pods_subnet: 10.242.64.0/21
kube_network_node_prefix: 24
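For reference, with kube_network_node_prefix: 24 this /21 pod subnet should split into 2^(24-21) = 8 per-node /24 blocks; a quick shell sanity check of the expected ranges:

    # Expected per-node podCIDR blocks carved out of 10.242.64.0/21 with /24 node prefixes
    for i in $(seq 64 71); do echo "10.242.${i}.0/24"; done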

3- Once the cluster is up, check the podCIDR assigned to each node:
/usr/local/bin/kubectl get nodes node1 -ojsonpath='{.spec.podCIDR}'
node1-->10.242.64.0/24
node3-->10.242.65.0/24
node4-->10.242.66.0/24
node5-->10.242.67.0/24

4- kubectl apply -f nginx.yml with 6 replicas:

nginx-5754944d6c-8kzhj 1/1 Running 0 66m 10.242.70.2 node5
nginx-5754944d6c-b2tvh 1/1 Running 0 66m 10.242.67.3 node4
nginx-5754944d6c-dj4qq 1/1 Running 0 66m 10.242.66.1 node3
nginx-5754944d6c-wbhdb 1/1 Running 0 66m 10.242.70.3 node5
nginx-5754944d6c-x7gdq 1/1 Running 0 66m 10.242.66.2 node3
nginx-5754944d6c-z9vcv 1/1 Running 0 66m 10.242.67.2 node4

As shown above, the pods are getting IPs from ranges that are not assigned to their hosts. This worked fine with Kubernetes 1.9.
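One way to see the mismatch in one place, using plain kubectl (nothing Kubespray-specific assumed):

    # Each node's Kubernetes-allocated podCIDR ...
    kubectl get nodes -o custom-columns=NAME:.metadata.name,PODCIDR:.spec.podCIDR
    # ... versus the IPs actually handed out to the pods on each node
    kubectl get pods -o wide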

Environment: master branch

  • Cloud provider or hardware configuration: AWS

  • OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):
    NAME="CentOS Linux"
    VERSION="7 (Core)"
    ID="centos"
    ID_LIKE="rhel fedora"
    VERSION_ID="7"
    PRETTY_NAME="CentOS Linux 7 (Core)"
    ANSI_COLOR="0;31"
    CPE_NAME="cpe:/o:centos:centos:7"
    HOME_URL="https://www.centos.org/"
    BUG_REPORT_URL="https://bugs.centos.org/"

    CENTOS_MANTISBT_PROJECT="CentOS-7"
    CENTOS_MANTISBT_PROJECT_VERSION="7"
    REDHAT_SUPPORT_PRODUCT="centos"
    REDHAT_SUPPORT_PRODUCT_VERSION="7"

  • Version of Ansible (ansible --version):
    ansible 2.7.12
    config file = /home/farhan/workspaces/kubespray-orignal/ansible.cfg
    configured module search path = ['/home/farhan/workspaces/kubespray-orignal/library']
    ansible python module location = /usr/lib/python3.7/site-packages/ansible
    executable location = /usr/bin/ansible
    python version = 3.7.4 (default, Jul 16 2019, 07:12:58) [GCC 9.1.0]

Kubespray version (commit) (git rev-parse --short HEAD): 86cc703

Network plugin used: default

Copy of your inventory file:
[all]
node1 ansible_host=13.x.x.x ip=13.211.170.14 # ip=10.3.0.1 etcd_member_name=etcd1
node2 ansible_host=3.x.x.x ip=3.104.120.158 # ip=10.3.0.2 etcd_member_name=etcd2
node3 ansible_host=13.x.x.x ip=13.210.80.241 # ip=10.3.0.3 etcd_member_name=etcd3

[kube-master]
node1

[etcd]
node5

[kube-node]
node2
node3
node4

[calico-rr]

[k8s-cluster:children]
kube-master
kube-node
calico-rr

@sohnaeo sohnaeo added the kind/bug Categorizes issue or PR as related to a bug. label Oct 2, 2019

mattymo commented Oct 2, 2019

@sohnaeo I'm not sure this is a bug. It looks like you're probably using calico here. Calico assigns blocks of IPs to a node and then if it fills up, it assigns another block from the 10.242.64.0/21 pool. All the IPs here are from that range, so I don't see what the problem is.
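For what it's worth, Calico's own per-node block allocations (as opposed to Node.Spec.PodCIDR) can be inspected with calicoctl; a sketch, assuming calicoctl is installed, configured against the cluster datastore, and recent enough to support --show-blocks:

    # List the configured IP pool(s)
    calicoctl get ippool -o wide
    # Show the per-node address blocks Calico's IPAM has assigned (/26 blocks by default)
    calicoctl ipam show --show-blocks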


sohnaeo commented Oct 2, 2019

@mattymo

Thanks for the quick reply. I dug into this some more, and it seems the IP addresses given to pods are managed by the chosen CNI IPAM plugin. Calico's IPAM plugin doesn't respect the values given to Node.Spec.PodCIDR and instead manages its own per-node allocations.

In our private network we can't use BIRD (BGP) and have to rely on static routes, so we need to know exactly which routes to add on the control planes and nodes. But with Calico's newer block-allocation behaviour, each node can end up with pods from anywhere in the larger 10.242.64.0/21 range. We would like podCIDR to be honoured for each node, so that every node only runs pods from the CIDR assigned to it. For example:

/usr/local/bin/kubectl get nodes node1 -ojsonpath='{.spec.podCIDR}'
node1-->10.242.64.0/24
So we would like node1's pods to get IPs from the 10.242.64.0/24 subnet, so that we can add a route for that subnet on the other nodes. I hope that makes it clear.
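For example, on every other node (and on the control plane) we would add a static route along these lines; the addresses below are placeholders, not values taken from this issue:

    # Route node1's /24 pod block via node1's internal address (placeholder IP)
    ip route add 10.242.64.0/24 via 10.3.0.1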


sohnaeo commented Oct 2, 2019

@mattymo

I fixed this issue by hacking the template below:

network_plugin/calico/templates/cni-calico.conflist.j2

FROM:

{% else %}
      "ipam": {
        "type": "calico-ipam",
        "assign_ipv4": "true",
        "ipv4_pools": ["{{ calico_pool_cidr | default(kube_pods_subnet) }}"]
      },

TO:

{% else %}
      "ipam": {
        "type": "host-local",
        "subnet": "usePodCidr"
      },

Is it possible to provide an option to use "host-local" when etcd is the datastore as well? Could I raise a PR for this?
In our case it makes sense to use etcd, and the IPAM type should be host-local so that usePodCidr is respected.


mattymo commented Oct 2, 2019

I'm not sure this is a supported way for Calico to operate here. Maybe you should switch to Flannel, which respects the node podCIDR allocations.


sohnaeo commented Oct 2, 2019

@mattymo

We can't use Flannel for security reasons, since it is an overlay network; we have to use Calico as a layer 3 solution. We also can't run BIRD/BGP, which is why we need to add static routes so that pods are reachable via the podCIDR allocated to each node.


radut commented Dec 20, 2019

I encountered a similar issue. This is the behaviour since calico > 3.6: projectcalico/calico#2592

Thanks for the hack @sohnaeo
I am still looking for a clear config though...

Edit: For me the hack you provided didn't work.
Instead, it worked like this (kubespray v2.12.0 ships calico v3.7.3; see https://docs.projectcalico.org/v3.7/reference/cni-plugin/configuration#using-host-local-ipam):

      "ipam": {
         "type": "host-local",
         "ranges": [
                    [
                      { "subnet": "usePodCidr" }
                    ]
                   ]
       },

+1
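A rough way to verify the change once the updated conflist is in place: the CNI config is only consulted when a pod is created, so the test pods have to be recreated first (kubectl rollout restart assumes kubectl 1.15+):

    # Recreate the nginx pods so they go through the updated IPAM config,
    # then check that each pod IP now falls inside its node's .spec.podCIDR
    kubectl rollout restart deployment nginx
    kubectl get pods -o wide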

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 20, 2020

radut commented Mar 20, 2020

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 20, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 18, 2020

radut commented Jun 18, 2020

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 18, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 16, 2020

radut commented Sep 16, 2020

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 16, 2020