
Pods on different nodes cannot communicate (flannel/vxlan) #1719

Closed
rytis opened this issue Apr 30, 2020 · 8 comments

rytis commented Apr 30, 2020

Version:

k3s version v1.17.4+k3s1 (3eee8ac)

K3s arguments:

/usr/local/bin/k3s server --no-deploy=traefik

Describe the bug

Pods on different nodes cannot communicate. Pods on the same node can.

To Reproduce

  • Two VMs running Fedora 32 Server
      ◦ Default install
      ◦ SELinux disabled
      ◦ Grub options: cgroup_memory=1 cgroup_enable=memory cgroup_enable=cpuset systemd.unified_cgroup_hierarchy=0
      ◦ Firewall rules added:
        firewall-cmd --permanent --add-port=6443/tcp # kubernetes api
        firewall-cmd --permanent --add-port=10250/tcp # kubelet
        firewall-cmd --permanent --add-port=8472/udp # flannel
        firewall-cmd --permanent --zone=trusted --add-source=10.42.0.0/16 # pods
        firewall-cmd --permanent --zone=trusted --add-source=10.43.0.0/16 # services
        firewall-cmd --reload
  • k3s installed on master and worker nodes (192.168.1.72 and .73)
    [root@k3s-master ~]# k3s kubectl get nodes
    NAME                     STATUS   ROLES    AGE   VERSION
    k3s-worker.localdomain   Ready    <none>   61m   v1.17.4+k3s1
    k3s-master.localdomain   Ready    master   66m   v1.17.4+k3s1
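
A minimal way to reproduce the cross-node failure, sketched here for reference (pod names and image are illustrative, not part of the original report), is to start one test pod per node and ping between them:

# start two test pods; names/image are hypothetical
k3s kubectl run test-a --image=busybox --restart=Never --command -- sleep 3600
k3s kubectl run test-b --image=busybox --restart=Never --command -- sleep 3600
# confirm they landed on different nodes and note their 10.42.x.y IPs
k3s kubectl get pods -o wide
# ping from one pod to the other; this works on the same node but fails across nodes
k3s kubectl exec test-a -- ping -c 3 <IP of test-b>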

Expected behavior

Pods on different nodes should be able to communicate, and pings to the flannel.1 interface IPs on other nodes should work.

Actual behavior

Deployed pods cannot communicate (master -> worker).

Pings to the flannel.1 interface IPs don't work either.

Whenever I ping from the master to the worker node, I can see the ICMP requests arriving at the worker, but there's no echo reply sent back.

Additional context / logs

Master:

[root@k3s-master ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.1.1     0.0.0.0         UG    100    0        0 enp0s3
10.42.0.0       0.0.0.0         255.255.255.0   U     0      0        0 cni0
10.42.1.0       10.42.1.0       255.255.255.0   UG    0      0        0 flannel.1
192.168.1.0     0.0.0.0         255.255.255.0   U     100    0        0 enp0s3
[root@k3s-master ~]# ping 10.42.1.0
PING 10.42.1.0 (10.42.1.0) 56(84) bytes of data.
^C
--- 10.42.1.0 ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 4090ms

Worker:

[root@k3s-worker ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.1.1     0.0.0.0         UG    100    0        0 enp0s3
10.42.0.0       10.42.0.0       255.255.255.0   UG    0      0        0 flannel.1
10.42.1.0       0.0.0.0         255.255.255.0   U     0      0        0 cni0
192.168.1.0     0.0.0.0         255.255.255.0   U     100    0        0 enp0s3


[root@k3s-worker ~]# tcpdump -i any -nn port 8472
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked v1), capture size 262144 bytes
09:18:49.249265 IP 192.168.1.72.59205 > 192.168.1.73.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.0 > 10.42.1.0: ICMP echo request, id 4, seq 1, length 64
09:18:50.267200 IP 192.168.1.72.59205 > 192.168.1.73.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.0 > 10.42.1.0: ICMP echo request, id 4, seq 2, length 64
09:18:51.293035 IP 192.168.1.72.59205 > 192.168.1.73.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.0 > 10.42.1.0: ICMP echo request, id 4, seq 3, length 64
09:18:52.315312 IP 192.168.1.72.59205 > 192.168.1.73.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.0 > 10.42.1.0: ICMP echo request, id 4, seq 4, length 64
09:18:53.338992 IP 192.168.1.72.59205 > 192.168.1.73.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.0 > 10.42.1.0: ICMP echo request, id 4, seq 5, length 64

rytis commented Apr 30, 2020

BTW, as per flannel-io/flannel#1243 (comment) I did try ip route add 10.42.0.0/16 dev cni0, but that made no difference.

Same behaviour on Fedora 31.

On the same hosts, I tried a plain VXLAN tunnel (without k3s installed, to avoid a VXLAN ID clash):

ip link add vxlan1 type vxlan id 1 remote 192.168.1.73 dstport 8472 dev enp0s3
ip link set vxlan1 up
ip addr add 10.0.0.1/24 dev vxlan1

And it worked fine; I could ping the vxlan device IPs between the hosts.
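
(For completeness, the mirror of those commands on the worker would presumably have been something like the following; the exact address is an assumption, not quoted from the comment.)

# on 192.168.1.73, assumed mirror of the commands above
ip link add vxlan1 type vxlan id 1 remote 192.168.1.72 dstport 8472 dev enp0s3
ip link set vxlan1 up
ip addr add 10.0.0.2/24 dev vxlan1
ping -c 3 10.0.0.1   # reported to work, unlike the flannel.1 ping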

niusmallnan (Contributor) commented:

@rytis Did you try ethtool -K flannel.1 tx-checksum-ip-generic off?


rytis commented Apr 30, 2020

I've tried it now (on both sides, master and worker); same effect, no ICMP replies (requests are appearing on the other node just as before):

[root@k3s-master ~]# ethtool -K flannel.1 tx-checksum-ip-generic off
Actual changes:
tx-checksumming: off
        tx-checksum-ip-generic: off
tcp-segmentation-offload: off
        tx-tcp-segmentation: off [requested on]
        tx-tcp-ecn-segmentation: off [requested on]
        tx-tcp-mangleid-segmentation: off [requested on]
        tx-tcp6-segmentation: off [requested on]
[root@k3s-master ~]# 
[root@k3s-master ~]# ping 10.42.1.0
PING 10.42.1.0 (10.42.1.0) 56(84) bytes of data.
^C
--- 10.42.1.0 ping statistics ---
6 packets transmitted, 0 received, 100% packet loss, time 5111ms

[root@k3s-master ~]# ping 10.42.1.1
PING 10.42.1.1 (10.42.1.1) 56(84) bytes of data.
^C
--- 10.42.1.1 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1058ms

niusmallnan (Contributor) commented:

Please check the FDB info; the dst IP should be the node IP:

bridge fdb show dev flannel.1


rytis commented Apr 30, 2020

It's the peer's IP:

Master:

[root@k3s-master ~]# ip addr show enp0s3
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 1000
    link/ether 08:00:27:15:3a:2a brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.72/24 brd 192.168.1.255 scope global dynamic noprefixroute enp0s3
       valid_lft 63120sec preferred_lft 63120sec
    inet6 fe80::62ef:58e2:63b0:4b7c/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
[root@k3s-master ~]# bridge fdb show dev flannel.1
d6:ad:6b:77:37:eb dst 192.168.1.73 self permanent
[root@k3s-master ~]# 

Worker:

[root@k3s-worker ~]# ip addr show enp0s3
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 1000
    link/ether 08:00:27:8a:89:53 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.73/24 brd 192.168.1.255 scope global dynamic noprefixroute enp0s3
       valid_lft 63053sec preferred_lft 63053sec
    inet6 fe80::fc16:8c40:f56d:827a/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
[root@k3s-worker ~]# bridge fdb show dev flannel.1
ca:a3:3a:8f:57:36 dst 192.168.1.72 self permanent


rytis commented May 4, 2020

@niusmallnan a bit more info: I just realised that the MAC addresses in the FDB don't correspond to anything on those two machines:

Master:

[root@k3s-master ~]# bridge fdb show dev flannel.1
d6:ad:6b:77:37:eb dst 192.168.1.73 self permanent
[root@k3s-master ~]# ip addr|grep ether
    link/ether 08:00:27:15:3a:2a brd ff:ff:ff:ff:ff:ff
    link/ether 6a:c0:83:56:f6:88 brd ff:ff:ff:ff:ff:ff
    link/ether da:ce:08:e7:19:e4 brd ff:ff:ff:ff:ff:ff
    link/ether 2e:44:8c:0e:e1:c6 brd ff:ff:ff:ff:ff:ff link-netns cni-9ab7f860-f12f-cb12-0252-1428ff7fa8e9
    link/ether 3e:4e:8e:21:6a:c9 brd ff:ff:ff:ff:ff:ff link-netns cni-082af5cd-cbdf-4307-0986-85464bfeebdb
    link/ether 8a:42:06:6c:38:3a brd ff:ff:ff:ff:ff:ff link-netns cni-e35503a9-917a-3cfc-ac70-0044221c6ec9
    link/ether 4e:c1:b8:a4:7f:8c brd ff:ff:ff:ff:ff:ff link-netns cni-8554fc99-11f7-b82a-0185-aef6fa240da0

Worker:

[root@k3s-worker ~]# bridge fdb show dev flannel.1
ca:a3:3a:8f:57:36 dst 192.168.1.72 self permanent
[root@k3s-worker ~]# ip addr|grep ether
    link/ether 08:00:27:8a:89:53 brd ff:ff:ff:ff:ff:ff
    link/ether 26:2f:80:14:19:61 brd ff:ff:ff:ff:ff:ff
    link/ether f6:06:ee:66:1a:cc brd ff:ff:ff:ff:ff:ff
    link/ether 5e:7a:08:82:83:d6 brd ff:ff:ff:ff:ff:ff link-netns cni-7b70e9eb-c375-aae2-dff4-10900fdcdbbb
    link/ether 42:76:3c:e8:07:ed brd ff:ff:ff:ff:ff:ff link-netns cni-e915e880-0f0d-c8b3-be9b-9d0fb83a53c6
    link/ether 1e:10:c4:a6:72:50 brd ff:ff:ff:ff:ff:ff link-netns cni-bedcfc25-37db-b656-f5ea-44d527581c6d

What are they?
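
(Editorial note, not from the thread: with the vxlan backend, each FDB entry is expected to hold the MAC of the peer node's flannel.1 interface, i.e. its VTEP MAC, so one way to tell whether these entries are stale is to compare them against the peer directly.)

# on the worker: this flannel.1 MAC should match the FDB entry seen on the master
ip -d link show flannel.1
# flannel should also record the same VTEP MAC in the node annotation
k3s kubectl get node k3s-worker.localdomain -o yaml | grep backend-data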


vbohinc commented May 7, 2020

Check if you enabled masquerading and iptables-legacy:
sudo firewall-cmd --add-masquerade --permanent
sudo update-alternatives --set iptables /usr/sbin/iptables-legacy

For a more complete list of firewall rules and required open ports, see https://rancher.com/docs/rancher/v2.x/en/installation/options/firewall/.

Update 1:
I've done a bit of testing on Fedora 32 VMs and can confirm that inter-node communication over vxlan breaks; the worker nodes also lose internet connectivity. I still have to look for the cause.

Update 2:
firewall-cmd --permanent --direct --add-rule ipv4 filter INPUT 1 -i cni0 -s 10.42.0.0/16 -j ACCEPT
fixes inter-node communication temporarily, though I'm still not sure why it doesn't stick.
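
(Editorial aside, not from the thread: a quick way to check whether that direct rule actually survives a firewalld reload is to compare the permanent and runtime rule sets.)

firewall-cmd --permanent --direct --get-all-rules   # is the rule saved permanently?
firewall-cmd --reload
firewall-cmd --direct --get-all-rules               # is it back in the runtime set after the reload?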

Update 3:
On my main cluster (bare-metal setup without HA) the master node runs Fedora 31, and I added a worker node running Fedora 32 (a KVM VM on a host in a different subnet). Networking works fine with Calico; inter-node communication and internet access both work.

Update 4:
The Fedora 32 node does behave strangely: connectivity inside pods is sometimes lost completely, and then Calico synchronization restores it. Most of the time the network issues inside pods appear to be caused by the loss of proper DNS resolution (pinging the IP works), but again only on this node. The cause is the firewall dropping packets; this happens with firewalld using either the iptables or nftables backend.
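
(Editorial aside, not from the thread: if firewalld is suspected of silently dropping pod traffic, logging denied packets is a quick way to confirm it before loosening any rules.)

firewall-cmd --set-log-denied=all   # log every rejected/dropped packet
journalctl -k -f                    # watch the kernel log for drops while reproducing the issue
firewall-cmd --set-log-denied=off   # revert once done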


stale bot commented Jul 31, 2021

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

stale bot added the status/stale label Jul 31, 2021
stale bot closed this as completed Aug 14, 2021