
Missing routes with many nodes on vxlan #958

Closed
svenwltr opened this issue Mar 6, 2018 · 10 comments

svenwltr commented Mar 6, 2018

Expected Behavior

When adding a new instance, flannel should always create routes to all existing instances.

Current Behavior

When there are many nodes (> 30), there are occasionally missing routes between some instances.

For each missing route we get this error:

vxlan_network.go:145] AddFDB failed: no buffer space available

A missing route means that an entry like this one is missing from ip route:

10.10.27.0/24 via 10.10.27.0 dev flannel.1 onlink 

Possible Solution

Besides fixing the underlying problem in the OS or network settings, it might be a good idea to retry such operations or even to let flannel fail completely (see Context).
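
A minimal sketch of what such a retry could look like, assuming the FDB entry is added through github.com/vishvananda/netlink roughly the way flannel's AddFDB does it (the retryAddFDB helper, the backoff values and the example MAC/IP are made up for illustration):

package main

import (
	"errors"
	"log"
	"net"
	"syscall"
	"time"

	"github.com/vishvananda/netlink"
	"golang.org/x/sys/unix"
)

// retryAddFDB retries the FDB update a few times when the kernel answers
// with ENOBUFS ("no buffer space available") instead of giving up after
// the first attempt.
func retryAddFDB(link netlink.Link, mac net.HardwareAddr, vtep net.IP) error {
	neigh := &netlink.Neigh{
		LinkIndex:    link.Attrs().Index,
		State:        netlink.NUD_PERMANENT,
		Family:       syscall.AF_BRIDGE,
		Flags:        netlink.NTF_SELF,
		IP:           vtep,
		HardwareAddr: mac,
	}
	var err error
	for attempt := 1; attempt <= 5; attempt++ {
		if err = netlink.NeighSet(neigh); err == nil {
			return nil
		}
		if !errors.Is(err, unix.ENOBUFS) {
			return err // unrelated failure, don't retry
		}
		log.Printf("AddFDB hit ENOBUFS, retrying (attempt %d)", attempt)
		time.Sleep(time.Duration(attempt) * 100 * time.Millisecond)
	}
	return err
}

func main() {
	link, err := netlink.LinkByName("flannel.1")
	if err != nil {
		log.Fatalf("lookup flannel.1: %v", err)
	}
	mac, _ := net.ParseMAC("aa:bb:cc:dd:ee:ff") // example remote VTEP MAC
	if err := retryAddFDB(link, mac, net.ParseIP("172.20.202.33")); err != nil {
		log.Fatalf("AddFDB still failing after retries: %v", err)
	}
}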

Steps to Reproduce

  1. Set up an autoscaling group with instances that use flannel.
  2. Scale up to 50 nodes without ramping up. I am not sure whether the parallel booting is a problem here.
  3. Run journalctl -u flanneld | grep AddFDB on each instance and see some errors. There are around 4 missing routes at that scale.

systemd unit

$ systemctl cat flanneld
# /usr/lib/systemd/system/flanneld.service
[Unit]
Description=flannel - Network fabric for containers (System Application Container)
Documentation=https://github.com/coreos/flannel
After=etcd.service etcd2.service etcd-member.service
Requires=flannel-docker-opts.service

[Service]
Type=notify
Restart=always
RestartSec=10s
TimeoutStartSec=300
LimitNOFILE=40000
LimitNPROC=1048576

Environment="FLANNEL_IMAGE_TAG=v0.9.0"
Environment="FLANNEL_OPTS=--ip-masq=true"
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/lib/coreos/flannel-wrapper.uuid"
EnvironmentFile=-/run/flannel/options.env

ExecStartPre=/sbin/modprobe ip_tables
ExecStartPre=/usr/bin/mkdir --parents /var/lib/coreos /run/flannel
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/lib/coreos/flannel-wrapper.uuid
ExecStart=/usr/lib/coreos/flannel-wrapper $FLANNEL_OPTS
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/lib/coreos/flannel-wrapper.uuid

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/flanneld.service.d/40-pod-network.conf
[Service]
Environment="FLANNELD_ETCD_ENDPOINTS=http://0.etcd.k8s.rebuy.loc:2379,http://1.etcd.k8s.rebuy.loc:2379,http://2.etcd.k8s.rebuy.loc:2379"
ExecStartPre=/usr/bin/etcdctl --endpoints=http://0.etcd.k8s.rebuy.loc:2379,http://1.etcd.k8s.rebuy.loc:2379,http://2.etcd.k8s.rebuy.loc:2379 set /coreos.com/network/config \
    '{"Network":"10.10.0.0/16", "Backend": {"Type": "vxlan"}}'

logs

$ journalctl -u flanneld | cat
-- Logs begin at Tue 2018-03-06 09:18:56 CET, end at Tue 2018-03-06 10:46:55 CET. --
Mar 06 09:19:26 localhost systemd[1]: Starting flannel - Network fabric for containers (System Application Container)...
Mar 06 09:19:28 ip-172-20-202-32.eu-west-1.compute.internal rkt[859]: rm: unable to resolve UUID from file: open /var/lib/coreos/flannel-wrapper.uuid: no such file or directory
Mar 06 09:19:28 ip-172-20-202-32.eu-west-1.compute.internal rkt[859]: rm: failed to remove one or more pods
Mar 06 09:19:29 ip-172-20-202-32.eu-west-1.compute.internal etcdctl[932]: {"Network":"10.10.0.0/16", "Backend": {"Type": "vxlan"}}
Mar 06 09:19:29 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: + exec /usr/bin/rkt run --uuid-file-save=/var/lib/coreos/flannel-wrapper.uuid --trust-keys-from-https --mount volume=coreos-notify,target=/run/systemd/notify --volume coreos-notify,kind=host,source=/run/systemd/notify --set-env=NOTIFY_SOCKET=/run/systemd/notify --net=host --volume coreos-run-flannel,kind=host,source=/run/flannel,readOnly=false --volume coreos-etc-ssl-certs,kind=host,source=/etc/ssl/certs,readOnly=true --volume coreos-usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true --volume coreos-etc-hosts,kind=host,source=/etc/hosts,readOnly=true --volume coreos-etc-resolv,kind=host,source=/etc/resolv.conf,readOnly=true --mount volume=coreos-run-flannel,target=/run/flannel --mount volume=coreos-etc-ssl-certs,target=/etc/ssl/certs --mount volume=coreos-usr-share-certs,target=/usr/share/ca-certificates --mount volume=coreos-etc-hosts,target=/etc/hosts --mount volume=coreos-etc-resolv,target=/etc/resolv.conf --inherit-env --stage1-from-dir=stage1-fly.aci quay.io/coreos/flannel:v0.9.0 -- --ip-masq=true
Mar 06 09:19:33 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: pubkey: prefix: "quay.io/coreos/flannel"
Mar 06 09:19:33 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: key: "https://quay.io/aci-signing-key"
Mar 06 09:19:33 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: gpg key fingerprint is: BFF3 13CD AA56 0B16 A898  7B8F 72AB F5F6 799D 33BC
Mar 06 09:19:33 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]:         Quay.io ACI Converter (ACI conversion signing key) <[email protected]>
Mar 06 09:19:33 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: Trusting "https://quay.io/aci-signing-key" for prefix "quay.io/coreos/flannel" without fingerprint review.
Mar 06 09:19:33 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: Added key for prefix "quay.io/coreos/flannel" at "/etc/rkt/trustedkeys/prefix.d/quay.io/coreos/flannel/bff313cdaa560b16a8987b8f72abf5f6799d33bc"
Mar 06 09:19:33 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: Downloading signature:  0 B/473 B
Mar 06 09:19:33 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: Downloading signature:  473 B/473 B
Mar 06 09:19:33 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: Downloading signature:  473 B/473 B
Mar 06 09:19:34 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: Downloading ACI:  0 B/18.4 MB
Mar 06 09:19:34 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: Downloading ACI:  8.19 KB/18.4 MB
Mar 06 09:19:34 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: Downloading ACI:  18.4 MB/18.4 MB
Mar 06 09:19:35 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: image: signature verified:
Mar 06 09:19:35 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]:   Quay.io ACI Converter (ACI conversion signing key) <[email protected]>
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.559691     950 main.go:470] Determining IP address of default interface
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.559900     950 main.go:483] Using interface with name eth0 and address 172.20.202.32
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.559912     950 main.go:500] Defaulting external address to interface address (172.20.202.32)
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.559977     950 main.go:235] Created subnet manager: Etcd Local Manager with Previous Subnet: 0.0.0.0/0
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.559984     950 main.go:238] Installing signal handlers
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.567211     950 main.go:348] Found network config - Backend type: vxlan
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.567251     950 vxlan.go:119] VXLAN config: VNI=1 Port=0 GBP=false DirectRouting=false
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.919205     950 local_manager.go:234] Picking subnet in range 10.10.1.0 ... 10.10.255.0
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.923001     950 local_manager.go:220] Allocated lease (10.10.122.0/24) to current node (172.20.202.32)
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.939560     950 main.go:295] Wrote subnet file to /run/flannel/subnet.env
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.939578     950 main.go:299] Running backend.
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.939878     950 vxlan_network.go:56] watching for new subnet leases
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal systemd[1]: Started flannel - Network fabric for containers (System Application Container).
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.943885     950 main.go:391] Waiting for 23h0m0.085901025s to renew lease
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: E0306 08:19:37.952090     950 vxlan_network.go:145] AddFDB failed: no buffer space available
Mar 06 09:19:38 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:38.665323     950 ipmasq.go:75] Some iptables rules are missing; deleting and recreating rules
Mar 06 09:19:38 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:38.665356     950 ipmasq.go:97] Deleting iptables rule: -s 10.10.0.0/16 -d 10.10.0.0/16 -j RETURN
Mar 06 09:19:38 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:38.666565     950 ipmasq.go:97] Deleting iptables rule: -s 10.10.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE
Mar 06 09:19:38 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:38.667897     950 ipmasq.go:97] Deleting iptables rule: ! -s 10.10.0.0/16 -d 10.10.122.0/24 -j RETURN
Mar 06 09:19:38 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:38.668949     950 ipmasq.go:97] Deleting iptables rule: ! -s 10.10.0.0/16 -d 10.10.0.0/16 -j MASQUERADE
Mar 06 09:19:38 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:38.670394     950 ipmasq.go:85] Adding iptables rule: -s 10.10.0.0/16 -d 10.10.0.0/16 -j RETURN
Mar 06 09:19:38 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:38.672491     950 ipmasq.go:85] Adding iptables rule: -s 10.10.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE
Mar 06 09:19:38 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:38.758989     950 ipmasq.go:85] Adding iptables rule: ! -s 10.10.0.0/16 -d 10.10.122.0/24 -j RETURN
Mar 06 09:19:38 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:38.761328     950 ipmasq.go:85] Adding iptables rule: ! -s 10.10.0.0/16 -d 10.10.0.0/16 -j MASQUERADE

We tried to adjust some sysctl settings, but none of them worked:

net.ipv4.tcp_rmem = 10240 87380 12582912
net.ipv4.tcp_wmem = 10240 87380 12582912

net.ipv4.tcp_rmem = 102400 873800 125829120
net.ipv4.tcp_wmem = 102400 873800 125829120

net.core.wmem_max = 125829120
net.core.rmem_max = 125829120

net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_timestamps = 1

net.ipv4.tcp_sack = 1
net.core.netdev_max_backlog = 5000

net.ipv4.udp_mem = 102400 873800 125829120

net.ipv4.udp_rmem_min = 10240
net.ipv4.udp_wmem_min = 10240

Context

We are scaling our Kubernetes cluster inside an AWS ASG. When adding new nodes, we rely on a working network. Even a completely broken network would be better than a network with a few missing routes, because with missing routes the cluster behaves flaky in rare cases and it is not evident where this comes from. For example, we had DNS problems: a very small subset of our applications had a high error rate when resolving domain names, and for a long time we did not know where this came from. Now we know that it was caused by a missing route between the instance the faulty application ran on and the instance the DNS server ran on.

Currently we need to manually grep the journal logs and replace broken instances, because it is hard to figure out automatically whether a route is missing.
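
Something like the following sketch could at least detect the problem automatically. It just counts the IPv4 routes on flannel.1 and compares them with the number of remote nodes we expect (the "checkroutes" name and the expected count are only illustrative and site-specific):

// checkroutes: warn when flannel.1 has fewer routes than expected.
package main

import (
	"flag"
	"fmt"
	"os"

	"github.com/vishvananda/netlink"
)

func main() {
	expected := flag.Int("expected", 50, "number of remote nodes we expect a route for")
	flag.Parse()

	link, err := netlink.LinkByName("flannel.1")
	if err != nil {
		fmt.Fprintf(os.Stderr, "lookup flannel.1: %v\n", err)
		os.Exit(1)
	}

	// List all IPv4 routes that point at the flannel.1 device;
	// with the vxlan backend there should be one per remote node.
	routes, err := netlink.RouteList(link, netlink.FAMILY_V4)
	if err != nil {
		fmt.Fprintf(os.Stderr, "list routes: %v\n", err)
		os.Exit(1)
	}

	fmt.Printf("found %d routes on flannel.1\n", len(routes))
	if len(routes) < *expected {
		fmt.Printf("WARNING: expected at least %d, some routes are probably missing\n", *expected)
		os.Exit(2)
	}
}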

Your Environment

  • Flannel version: v0.9.0 and v0.10.0
  • Backend used (e.g. vxlan or udp): vxlan (with and without DirectRouting)
  • Etcd version: 3.2.11
  • Kubernetes version (if used): v1.8.5+coreos.0
  • Operating System and version: Container Linux by CoreOS 1576.5.0 (Ladybug)

jpiper commented Mar 10, 2018

Looks like this might be related to #779

svenwltr (Author)

We already tried the proposed solution, but it didn't work.

If I understand correctly, we have to change net.core.rmem_max and net.core.wmem_max. These are the values on a failed node:

# sysctl -a | grep [wr]mem_max
net.core.rmem_max = 125829120
net.core.wmem_max = 125829120

chestack

related to "get/set receive buffer size" in netlink: vishvananda/netlink@ef84ebb
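
If I read that right, the kernel reports ENOBUFS when the netlink socket's receive buffer overflows, and the linked commit adds a way to enlarge that buffer. A minimal sketch of the underlying idea with a plain NETLINK_ROUTE socket (not the netlink library's own API; the 8 MiB value is just an example):

package main

import (
	"fmt"
	"log"

	"golang.org/x/sys/unix"
)

func main() {
	// Open a plain NETLINK_ROUTE socket, the same kind used for
	// route and FDB operations.
	fd, err := unix.Socket(unix.AF_NETLINK, unix.SOCK_RAW, unix.NETLINK_ROUTE)
	if err != nil {
		log.Fatalf("netlink socket: %v", err)
	}
	defer unix.Close(fd)

	// Raise the receive buffer. SO_RCVBUF is capped by net.core.rmem_max,
	// while SO_RCVBUFFORCE (CAP_NET_ADMIN required) ignores that cap.
	const size = 8 << 20 // 8 MiB, an arbitrary example value
	if err := unix.SetsockoptInt(fd, unix.SOL_SOCKET, unix.SO_RCVBUFFORCE, size); err != nil {
		log.Fatalf("set receive buffer: %v", err)
	}

	got, err := unix.GetsockoptInt(fd, unix.SOL_SOCKET, unix.SO_RCVBUF)
	if err != nil {
		log.Fatalf("read receive buffer: %v", err)
	}
	// The kernel reports back twice the requested size (bookkeeping overhead).
	fmt.Printf("netlink receive buffer is now %d bytes\n", got)
}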


stale bot commented Jan 25, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label Jan 25, 2023
rbrtbnfgl (Contributor)

This bug is still happening. I'll leave this open. There is a workaround to avoid it; we'll update the docs with it until it's fixed.


stale bot commented Jul 25, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label Jul 25, 2023
rbrtbnfgl removed the wontfix label Jul 28, 2023

aslafy-z commented Dec 1, 2023

We ran into a similar issue: some routes were missing after a network outage of a few hours. I would expect flannel to reconcile these routes. Am I right to expect this?


rbrtbnfgl commented Dec 1, 2023

Which version of flannel are you using? Maybe your issue is not directly related to this one. This issue was about missing routes when flannel starts with many nodes; in your case it seems the routes were somehow removed and are not being recreated.


aslafy-z commented Dec 1, 2023

I'm using v0.17.0, shipped with RKE1. I understand this is a rather old version and my issue may have been fixed in the meantime. It looks like Rancher is still shipping this version with the latest releases of RKE1.


stale bot commented May 30, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label May 30, 2024
stale bot closed this as completed Jun 21, 2024