
Flannel stops working on EC2 #87

Closed
dennybritz opened this issue Nov 24, 2014 · 19 comments

@dennybritz

First of all, sorry for the high-level error report, but I'm not sure what is going wrong or how best to debug it. I am running an ~8-node CoreOS cluster (stable channel) on EC2. My etcd is running on a separate node, and I'm using flannel from the current master branch, built within a container.

Everything works fine for about 12 hours, but then routing stops working and packets are dropped. This is reproducible and happens every time I re-provision my cluster (restart flannel and docker). I was initially running the UDP backend and didn't find any error messages in the journal logs. I then switched to the vxlan backend and am seeing the following:

Nov 24 11:07:33 kalm flanneld[17341]: I1124 11:07:33.506152 17341 vxlan.go:264] L3 miss: 10.10.109.8
Nov 24 11:07:33 kalm flanneld[17341]: I1124 11:07:33.506189 17341 device.go:228] calling NeighSet: 10.10.109.8, 12:7d:bc:75:19:1f
Nov 24 11:07:33 kalm flanneld[17341]: I1124 11:07:33.506235 17341 vxlan.go:275] AddL3 succeeded
Nov 24 11:07:33 kalm flanneld[17341]: I1124 11:07:33.507063 17341 vxlan.go:248] L2 miss: 12:7d:bc:75:19:1f
Nov 24 11:07:33 kalm flanneld[17341]: I1124 11:07:33.507078 17341 device.go:202] calling NeighAdd: 10.0.0.129, 12:7d:bc:75:19:1f
Nov 24 11:07:33 kalm flanneld[17341]: I1124 11:07:33.507115 17341 vxlan.go:259] AddL2 succeeded
Nov 24 11:08:01 kalm flanneld[17341]: I1124 11:08:01.843236 17341 vxlan.go:243] Ignoring not a miss: 12:7d:bc:75:19:1f, 10.10.109.8
Nov 24 11:08:12 kalm flanneld[17341]: I1124 11:08:12.576140 17341 vxlan.go:243] Ignoring not a miss: 12:7d:bc:75:19:1f, 10.10.109.8
Nov 24 11:08:13 kalm flanneld[17341]: I1124 11:08:13.578175 17341 vxlan.go:243] Ignoring not a miss: 12:7d:bc:75:19:1f, 10.10.109.8
Nov 24 11:08:14 kalm flanneld[17341]: I1124 11:08:14.580082 17341 vxlan.go:243] Ignoring not a miss: 12:7d:bc:75:19:1f, 10.10.109.8
Nov 24 11:08:36 kalm flanneld[17341]: I1124 11:08:36.658179 17341 vxlan.go:264] L3 miss: 10.10.109.8
Nov 24 11:08:36 kalm flanneld[17341]: I1124 11:08:36.658219 17341 device.go:228] calling NeighSet: 10.10.109.8, 12:7d:bc:75:19:1f
Nov 24 11:08:36 kalm flanneld[17341]: I1124 11:08:36.658292 17341 vxlan.go:275] AddL3 succeeded
Nov 24 11:09:04 kalm flanneld[17341]: I1124 11:09:04.946135 17341 vxlan.go:243] Ignoring not a miss: 12:7d:bc:75:19:1f, 10.10.109.8

Any ideas what could be causing this? Is it perhaps something specific to EC2?

@eyakubovich
Contributor

@dennybritz "Ignoring not a miss" should be benign and can be normal. It's logged because vxlan is new and I want the extra logging. I think the problem is somewhere else. You're running in a VPC, correct? When it stops working, are you still able to reach other machines via eth0, not flannel? Next, can you ping the flannel interface (flannel0 or flannel.1) on a local machine? Then try flannel0/flannel.1 of a remote host. If the remote flannel interface is not pingable, could you run tcpdump to capture traffic on UDP 8285 (if running the udp backend) or UDP 8472 (if running vxlan)? You can run tcpdump via toolbox (https://github.com/coreos/toolbox), which should be installed on your CoreOS machine. The tcpdump should show where the traffic is getting dropped.

@dennybritz
Author

@eyakubovich Yes, I'm running in a VPC.

It's logged because vxlan is new and I want the extra logging. I think the problem is somewhere else.

I see. However, the same log statements appear nowhere else in the logs and only start appearing around the same time flannel stops working.

Pinging eth0 on the remote machine works:

PING 10.0.0.126 (10.0.0.126) 56(84) bytes of data.
64 bytes from 10.0.0.126: icmp_seq=1 ttl=64 time=0.638 ms
64 bytes from 10.0.0.126: icmp_seq=2 ttl=64 time=0.370 ms

Pinging flannel0 (this is docker0, right?) on the local machine works for both machines:

PING 10.10.109.1 (10.10.109.1) 56(84) bytes of data.
64 bytes from 10.10.109.1: icmp_seq=1 ttl=64 time=0.046 ms
64 bytes from 10.10.109.1: icmp_seq=2 ttl=64 time=0.035 ms

Pinging flannel.1 on the local machine works for both machines:

PING 10.10.109.0 (10.10.109.0) 56(84) bytes of data.
64 bytes from 10.10.109.0: icmp_seq=1 ttl=64 time=0.054 ms
64 bytes from 10.10.109.0: icmp_seq=2 ttl=64 time=0.040 ms

Pinging either docker0 or flannel.1 on the remote machine does not work.

This happens in tcpdump:

Sending machine:

sh-4.2# tcpdump -v 'port 8472'
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
02:56:02.327762 IP (tos 0x0, ttl 64, id 11321, offset 0, flags [none], proto UDP (17), length 134)
    ip-10-0-0-126.ec2.internal.43647 > ip-10-0-0-129.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP (tos 0x0, ttl 64, id 16107, offset 0, flags [DF], proto ICMP (1), length 84)
    ip-10-10-1-0.ec2.internal > ip-10-10-109-1.ec2.internal: ICMP echo request, id 22316, seq 1, length 64
02:56:03.327960 IP (tos 0x0, ttl 64, id 11922, offset 0, flags [none], proto UDP (17), length 134)
    ip-10-0-0-126.ec2.internal.43647 > ip-10-0-0-129.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP (tos 0x0, ttl 64, id 16494, offset 0, flags [DF], proto ICMP (1), length 84)
    ip-10-10-1-0.ec2.internal > ip-10-10-109-1.ec2.internal: ICMP echo request, id 22316, seq 2, length 64
02:56:04.327951 IP (tos 0x0, ttl 64, id 12315, offset 0, flags [none], proto UDP (17), length 134)
    ip-10-0-0-126.ec2.internal.43647 > ip-10-0-0-129.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP (tos 0x0, ttl 64, id 17120, offset 0, flags [DF], proto ICMP (1), length 84)
    ip-10-10-1-0.ec2.internal > ip-10-10-109-1.ec2.internal: ICMP echo request, id 22316, seq 3, length 64
02:56:05.327949 IP (tos 0x0, ttl 64, id 12358, offset 0, flags [none], proto UDP (17), length 134)
    ip-10-0-0-126.ec2.internal.43647 > ip-10-0-0-129.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP (tos 0x0, ttl 64, id 17866, offset 0, flags [DF], proto ICMP (1), length 84)
    ip-10-10-1-0.ec2.internal > ip-10-10-109-1.ec2.internal: ICMP echo request, id 22316, seq 4, length 64
02:56:06.327948 IP (tos 0x0, ttl 64, id 12663, offset 0, flags [none], proto UDP (17), length 134)
    ip-10-0-0-126.ec2.internal.43647 > ip-10-0-0-129.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP (tos 0x0, ttl 64, id 18023, offset 0, flags [DF], proto ICMP (1), length 84)
    ip-10-10-1-0.ec2.internal > ip-10-10-109-1.ec2.internal: ICMP echo request, id 22316, seq 5, length 64
02:56:07.327952 IP (tos 0x0, ttl 64, id 12918, offset 0, flags [none], proto UDP (17), length 134)
    ip-10-0-0-126.ec2.internal.43647 > ip-10-0-0-129.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP (tos 0x0, ttl 64, id 18240, offset 0, flags [DF], proto ICMP (1), length 84)
    ip-10-10-1-0.ec2.internal > ip-10-10-109-1.ec2.internal: ICMP echo request, id 22316, seq 6, length 64
^C
6 packets captured
6 packets received by filter
0 packets dropped by kernel

Receiving machine (packets from machines other than the sending machine are received as well):

sh-4.2# tcpdump 'port 8472'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
02:56:00.372779 IP ip-10-0-0-130.ec2.internal.36461 > ip-10-0-0-129.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP ip-10-10-85-0.ec2.internal.55896 > ip-10-10-109-8.ec2.internal.amqp: Flags [P.], seq 439869726:439869734, ack 4278089106, win 218, options [nop,nop,TS val 684964124 ecr 684939401], length 8
02:56:00.372860 IP ip-10-0-0-129.ec2.internal.44026 > ip-10-0-0-130.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP ip-10-10-109-8.ec2.internal.amqp > ip-10-10-85-0.ec2.internal.55896: Flags [.], ack 8, win 217, options [nop,nop,TS val 684954401 ecr 684964124], length 0
02:56:02.326245 IP ip-10-0-0-126.ec2.internal.43647 > ip-10-0-0-129.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP ip-10-10-1-0.ec2.internal > ip-10-10-109-1.ec2.internal: ICMP echo request, id 22316, seq 1, length 64
02:56:03.326735 IP ip-10-0-0-126.ec2.internal.43647 > ip-10-0-0-129.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP ip-10-10-1-0.ec2.internal > ip-10-10-109-1.ec2.internal: ICMP echo request, id 22316, seq 2, length 64
02:56:04.214092 IP ip-10-0-0-129.ec2.internal.44026 > ip-10-0-0-130.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP ip-10-10-109-8.ec2.internal.amqp > ip-10-10-85-0.ec2.internal.55896: Flags [P.], seq 1:9, ack 8, win 217, options [nop,nop,TS val 684958242 ecr 684964124], length 8
02:56:04.214546 IP ip-10-0-0-130.ec2.internal.36461 > ip-10-0-0-129.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP ip-10-10-85-0.ec2.internal.55896 > ip-10-10-109-8.ec2.internal.amqp: Flags [.], ack 9, win 218, options [nop,nop,TS val 684967966 ecr 684958242], length 0
02:56:04.326424 IP ip-10-0-0-126.ec2.internal.43647 > ip-10-0-0-129.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP ip-10-10-1-0.ec2.internal > ip-10-10-109-1.ec2.internal: ICMP echo request, id 22316, seq 3, length 64
02:56:05.326529 IP ip-10-0-0-126.ec2.internal.43647 > ip-10-0-0-129.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP ip-10-10-1-0.ec2.internal > ip-10-10-109-1.ec2.internal: ICMP echo request, id 22316, seq 4, length 64
02:56:06.326458 IP ip-10-0-0-126.ec2.internal.43647 > ip-10-0-0-129.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP ip-10-10-1-0.ec2.internal > ip-10-10-109-1.ec2.internal: ICMP echo request, id 22316, seq 5, length 64
02:56:07.326805 IP ip-10-0-0-126.ec2.internal.43647 > ip-10-0-0-129.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP ip-10-10-1-0.ec2.internal > ip-10-10-109-1.ec2.internal: ICMP echo request, id 22316, seq 6, length 64
02:56:07.761686 IP ip-10-0-0-128.ec2.internal.60215 > ip-10-0-0-129.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP ip-10-10-78-0.ec2.internal.57891 > ip-10-10-109-8.ec2.internal.amqp: Flags [P.], seq 1085651941:1085651949, ack 4057287077, win 218, options [nop,nop,TS val 684972546 ecr 684946790], length 8
02:56:07.761783 IP ip-10-0-0-129.ec2.internal.34077 > ip-10-0-0-128.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP ip-10-10-109-8.ec2.internal.amqp > ip-10-10-78-0.ec2.internal.57891: Flags [.], ack 8, win 217, options [nop,nop,TS val 684961790 ecr 684972546], length 0
02:56:08.326477 IP ip-10-0-0-126.ec2.internal.43647 > ip-10-0-0-129.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP ip-10-10-1-0.ec2.internal > ip-10-10-109-1.ec2.internal: ICMP echo request, id 22316, seq 7, length 64
02:56:09.326445 IP ip-10-0-0-126.ec2.internal.43647 > ip-10-0-0-129.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP ip-10-10-1-0.ec2.internal > ip-10-10-109-1.ec2.internal: ICMP echo request, id 22316, seq 8, length 64
02:56:10.326695 IP ip-10-0-0-126.ec2.internal.43647 > ip-10-0-0-129.ec2.internal.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP ip-10-10-1-0.ec2.internal > ip-10-10-109-1.ec2.internal: ICMP echo request, id 22316, seq 9, length 64
^C
15 packets captured
15 packets received by filter
0 packets dropped by kernel

So it seems like packets are arriving on the overlay network...

@dennybritz
Author

Based on the above, it seems like this is a one-way issue. If I swap the receiving and sending machines, I get:

 ping 10.10.1.1
PING 10.10.1.1 (10.10.1.1) 56(84) bytes of data.
From 10.10.109.0 icmp_seq=1 Destination Host Unreachable
From 10.10.109.0 icmp_seq=2 Destination Host Unreachable
From 10.10.109.0 icmp_seq=3 Destination Host Unreachable
From 10.10.109.0 icmp_seq=4 Destination Host Unreachable

And no ICMP packets are being captured on port 8472.

@eyakubovich
Contributor

@dennybritz and the tcpdump confirms that as well: we see the Echo Request make it from one host to the other, but no sign of an Echo Reply. Let's do a few tests on the box where nothing was captured. As a next step, let's see if the routes are still there: netstat -rn (we want to see a line that routes 10.10.0.0/16 via flannel.1).

Now, keep running ping 10.10.1.0, and we'll check whether there's an ARP entry that maps 10.10.1.0 to some MAC: run arp. Next, we want to see if vxlan's forwarding database has the route (VXLAN is an L2 technology, so it has a bridge-like forwarding database that maps MAC addresses to remote hosts' IPs). Run bridge fdb; we want to see an entry mapping the MAC from the arp output to 10.0.0.126.
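
For illustration, an FDB entry like the ones bridge fdb prints can be programmed from Go via netlink. This is a minimal sketch assuming the github.com/vishvananda/netlink package (which flannel builds on); the function name and values are illustrative, not flannel's actual device.go:

package main

import (
	"log"
	"net"
	"syscall"

	"github.com/vishvananda/netlink"
)

// addFDBEntry maps a remote vxlan interface's MAC to the remote host's
// underlay (eth0) IP, so encapsulated frames know which host to go to.
func addFDBEntry(devName string, mac net.HardwareAddr, remoteHost net.IP) error {
	link, err := netlink.LinkByName(devName) // e.g. "flannel.1"
	if err != nil {
		return err
	}
	return netlink.NeighSet(&netlink.Neigh{
		LinkIndex:    link.Attrs().Index,
		Family:       syscall.AF_BRIDGE,     // target the FDB, not the ARP table
		Flags:        netlink.NTF_SELF,      // entry belongs to the device itself
		State:        netlink.NUD_PERMANENT, // static entry
		IP:           remoteHost,            // underlay IP, e.g. 10.0.0.130
		HardwareAddr: mac,
	})
}

func main() {
	mac, _ := net.ParseMAC("06:c6:77:33:ac:46")
	if err := addFDBEntry("flannel.1", mac, net.ParseIP("10.0.0.130")); err != nil {
		log.Fatal(err)
	}
}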

Can you also paste the output of ifconfig and ip addr, just to make sure there's nothing odd there.

I'm also going to set up a cluster on EC2 with a VPC and run it over the next few days.

@eyakubovich
Contributor

Please also do a quick dmesg to make sure there's nothing odd showing up there.

@dennybritz
Author

@eyakubovich

ifconfig
docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 8951
        inet 10.10.109.1  netmask 255.255.255.0  broadcast 0.0.0.0
        inet6 fe80::5484:7aff:fefe:9799  prefixlen 64  scopeid 0x20<link>
        ether 56:84:7a:fe:97:99  txqueuelen 0  (Ethernet)
        RX packets 81863  bytes 34079868 (32.5 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 69889  bytes 28946691 (27.6 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9001
        inet 10.0.0.129  netmask 255.255.255.0  broadcast 10.0.0.255
        inet6 fe80::80e:65ff:fe61:cea4  prefixlen 64  scopeid 0x20<link>
        ether 0a:0e:65:61:ce:a4  txqueuelen 1000  (Ethernet)
        RX packets 24867028  bytes 17914452744 (16.6 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 20723760  bytes 3348891261 (3.1 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 8951
        inet 10.10.109.0  netmask 255.255.0.0  broadcast 0.0.0.0
        inet6 fe80::107d:bcff:fe75:191f  prefixlen 64  scopeid 0x20<link>
        ether 12:7d:bc:75:19:1f  txqueuelen 0  (Ethernet)
        RX packets 48029  bytes 26666114 (25.4 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 45038  bytes 28195557 (26.8 MiB)
        TX errors 0  dropped 2051 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 0  (Local Loopback)
        RX packets 27187  bytes 5091098 (4.8 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 27187  bytes 5091098 (4.8 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

veth3834: flags=67<UP,BROADCAST,RUNNING>  mtu 8951
        inet6 fe80::7c84:40ff:fecc:1f16  prefixlen 64  scopeid 0x20<link>
        ether 7e:84:40:cc:1f:16  txqueuelen 1000  (Ethernet)
        RX packets 17864  bytes 5507830 (5.2 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 17913  bytes 1527268 (1.4 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

veth473d: flags=67<UP,BROADCAST,RUNNING>  mtu 8951
        inet6 fe80::d426:3dff:feca:d64d  prefixlen 64  scopeid 0x20<link>
        ether d6:26:3d:ca:d6:4d  txqueuelen 1000  (Ethernet)
        RX packets 14456  bytes 1480514 (1.4 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 8313  bytes 548832 (535.9 KiB)
        TX errors 0  dropped 1 overruns 0  carrier 0  collisions 0

veth7d7f: flags=67<UP,BROADCAST,RUNNING>  mtu 8951
        inet6 fe80::7009:a4ff:fef9:efc2  prefixlen 64  scopeid 0x20<link>
        ether 72:09:a4:f9:ef:c2  txqueuelen 1000  (Ethernet)
        RX packets 9  bytes 738 (738.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 78  bytes 4404 (4.3 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc pfifo_fast state UP qlen 1000
    link/ether 0a:0e:65:61:ce:a4 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.129/24 brd 10.0.0.255 scope global dynamic eth0
       valid_lft 3083sec preferred_lft 3083sec
    inet6 fe80::80e:65ff:fe61:cea4/64 scope link 
       valid_lft forever preferred_lft forever
105: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8951 qdisc noqueue state UNKNOWN 
    link/ether 12:7d:bc:75:19:1f brd ff:ff:ff:ff:ff:ff
    inet 10.10.109.0/16 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::107d:bcff:fe75:191f/64 scope link 
       valid_lft forever preferred_lft forever
110: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8951 qdisc noqueue state UP 
    link/ether 56:84:7a:fe:97:99 brd ff:ff:ff:ff:ff:ff
    inet 10.10.109.1/24 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::5484:7aff:fefe:9799/64 scope link 
       valid_lft forever preferred_lft forever
120: veth7d7f: <BROADCAST,UP,LOWER_UP> mtu 8951 qdisc pfifo_fast master docker0 state UP qlen 1000
    link/ether 72:09:a4:f9:ef:c2 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::7009:a4ff:fef9:efc2/64 scope link 
       valid_lft forever preferred_lft forever
122: veth473d: <BROADCAST,UP,LOWER_UP> mtu 8951 qdisc pfifo_fast master docker0 state UP qlen 1000
    link/ether d6:26:3d:ca:d6:4d brd ff:ff:ff:ff:ff:ff
    inet6 fe80::d426:3dff:feca:d64d/64 scope link 
       valid_lft forever preferred_lft forever
124: veth3834: <BROADCAST,UP,LOWER_UP> mtu 8951 qdisc pfifo_fast master docker0 state UP qlen 1000
    link/ether 7e:84:40:cc:1f:16 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::7c84:40ff:fecc:1f16/64 scope link 
       valid_lft forever preferred_lft forever
netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         10.0.0.1        0.0.0.0         UG        0 0          0 eth0
10.0.0.0        0.0.0.0         255.255.255.0   U         0 0          0 eth0
10.0.0.1        0.0.0.0         255.255.255.255 UH        0 0          0 eth0
10.10.0.0       0.0.0.0         255.255.0.0     U         0 0          0 flannel.1
10.10.109.0     0.0.0.0         255.255.255.0   U         0 0          0 docker0

Here is the ARP table while running ping; it seems like there is no mapping?

ip-10-10-109-4.ec2.inte          (incomplete)                              docker0
ip-10-10-85-0.ec2.inter  ether   06:c6:77:33:ac:46   C                     flannel.1
ip-10-0-0-128.ec2.inter  ether   0a:0a:6b:b6:37:2a   C                     eth0
ip-10-10-109-5.ec2.inte  ether   d2:9f:58:80:9c:a3   C                     docker0
ip-10-10-109-8.ec2.inte  ether   16:eb:b7:b4:78:9a   C                     docker0
ip-10-0-0-130.ec2.inter  ether   0a:79:35:13:84:b0   C                     eth0
ip-10-10-109-3.ec2.inte  ether   22:31:58:f0:29:55   C                     docker0
ip-10-0-0-126.ec2.inter  ether   0a:d1:53:10:59:72   C                     eth0
ip-10-10-92-0.ec2.inter  ether   b6:15:9f:ed:a9:99   C                     flannel.1
ip-10-10-1-1.ec2.intern          (incomplete)                              flannel.1
ip-10-10-1-0.ec2.intern          (incomplete)                              flannel.1
ip-10-10-17-3.ec2.inter          (incomplete)                              flannel.1
ip-10-0-0-1.ec2.interna  ether   0a:09:1c:0d:72:c1   C                     eth0
ip-10-0-0-2.ec2.interna  ether   0a:09:1c:0d:72:c1   C                     eth0
ip-10-10-78-0.ec2.inter  ether   86:cb:0b:78:50:88   C                     flannel.1
ip-10-10-109-7.ec2.inte  ether   96:df:6b:c6:22:ba   C                     docker0
ip-10-0-0-167.ec2.inter  ether   0a:a1:99:3d:39:b7   C                     eth0
bridge fdb
33:33:00:00:00:01 dev eth0 self permanent
01:00:5e:00:00:01 dev eth0 self permanent
33:33:ff:61:ce:a4 dev eth0 self permanent
06:c6:77:33:ac:46 dev flannel.1 dst 10.0.0.130 self 
86:cb:0b:78:50:88 dev flannel.1 dst 10.0.0.128 self 
56:84:7a:fe:97:99 dev docker0 vlan 0 permanent
96:df:6b:c6:22:ba dev veth473d vlan 0 
72:09:a4:f9:ef:c2 dev veth7d7f vlan 0 permanent
d6:26:3d:ca:d6:4d dev veth473d vlan 0 permanent
7e:84:40:cc:1f:16 dev veth3834 vlan 0 permanent
16:eb:b7:b4:78:9a dev veth3834 vlan 0 
33:33:00:00:00:01 dev veth7d7f self permanent
01:00:5e:00:00:01 dev veth7d7f self permanent
33:33:ff:f9:ef:c2 dev veth7d7f self permanent
33:33:00:00:00:01 dev veth473d self permanent
01:00:5e:00:00:01 dev veth473d self permanent
33:33:ff:ca:d6:4d dev veth473d self permanent
33:33:00:00:00:01 dev veth3834 self permanent
01:00:5e:00:00:01 dev veth3834 self permanent
33:33:ff:cc:1f:16 dev veth3834 self permanent

This is the tail of dmesg:

[627839.752731] docker0: port 1(veth7d7f) entered forwarding state
[627839.758218] docker0: port 1(veth7d7f) entered forwarding state
[627854.816087] docker0: port 1(veth7d7f) entered forwarding state
[627867.690771] docker0: port 2(vetha248) entered disabled state
[627867.694547] device vetha248 left promiscuous mode
[627867.696624] docker0: port 2(vetha248) entered disabled state
[627930.186824] device veth473d entered promiscuous mode
[627930.192813] IPv6: ADDRCONF(NETDEV_UP): veth473d: link is not ready
[627930.201281] IPv6: ADDRCONF(NETDEV_CHANGE): veth473d: link becomes ready
[627930.206117] docker0: port 2(veth473d) entered forwarding state
[627930.210761] docker0: port 2(veth473d) entered forwarding state
[627945.248064] docker0: port 2(veth473d) entered forwarding state
[628006.978642] docker0: port 3(vethf5ca) entered disabled state
[628006.982581] device vethf5ca left promiscuous mode
[628006.984739] docker0: port 3(vethf5ca) entered disabled state
[628037.723530] device veth3834 entered promiscuous mode
[628037.729605] IPv6: ADDRCONF(NETDEV_UP): veth3834: link is not ready
[628037.737120] IPv6: ADDRCONF(NETDEV_CHANGE): veth3834: link becomes ready
[628037.741944] docker0: port 3(veth3834) entered forwarding state
[628037.746370] docker0: port 3(veth3834) entered forwarding state
[628052.768068] docker0: port 3(veth3834) entered forwarding state
[684719.043883] device veth21bd entered promiscuous mode
[684719.049382] IPv6: ADDRCONF(NETDEV_UP): veth21bd: link is not ready
[684719.058117] IPv6: ADDRCONF(NETDEV_CHANGE): veth21bd: link becomes ready
[684719.062250] docker0: port 4(veth21bd) entered forwarding state
[684719.066120] docker0: port 4(veth21bd) entered forwarding state
[684719.100427] docker0: port 4(veth21bd) entered disabled state
[684719.107517] device veth21bd left promiscuous mode
[684719.109867] docker0: port 4(veth21bd) entered disabled state
[684858.707589] Netfilter messages via NETLINK v0.30.
[684858.710216] device eth0 entered promiscuous mode
[684906.146435] device eth0 left promiscuous mode
[685098.401399] device eth0 entered promiscuous mode
[685114.527857] device eth0 left promiscuous mode
[685252.427875] device eth0 entered promiscuous mode
[685264.962632] device eth0 left promiscuous mode
[686772.549967] device eth0 entered promiscuous mode
[686784.037602] device eth0 left promiscuous mode
[687274.354654] device eth0 entered promiscuous mode
[687277.540947] device eth0 left promiscuous mode
[687446.522131] device eth0 entered promiscuous mode
[687537.721096] device eth0 left promiscuous mode
[691412.256376] device eth0 entered promiscuous mode

@eyakubovich
Contributor

@dennybritz exactly. I'm assuming you were pinging 10.10.1.0 or 10.10.1.1, and we see the ARP resolution in progress but never resolving:

ip-10-10-1-1.ec2.intern          (incomplete)                              flannel.1
ip-10-10-1-0.ec2.intern          (incomplete)                              flannel.1

That would be causing the problem. Now, can you look into the flannel logs? At that time there should be something like:

vxlan.go:264] L3 miss: 10.10.1.1
device.go:228] calling NeighSet: 10.10.1.1, xx:xx:xx:xx:xx:xx
vxlan.go:275] AddL3 succeeded (or failed)

ARP resolutions are forwarded up to flanneld, which inserts entries into the ARP table based on its knowledge of the subnet-to-host mappings in etcd.
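
That mechanism can be sketched in Go: on an L3 miss the kernel notifies flanneld, which answers by installing a neighbor (ARP) entry on the vxlan device. A minimal sketch assuming github.com/vishvananda/netlink; names and addresses are illustrative, not flannel's actual device.go:

package main

import (
	"log"
	"net"
	"syscall"

	"github.com/vishvananda/netlink"
)

// answerL3Miss installs an ARP entry on the vxlan device, resolving an
// overlay IP to the remote vxlan MAC -- the "calling NeighSet" step in
// the logs above.
func answerL3Miss(devName string, overlayIP net.IP, mac net.HardwareAddr) error {
	link, err := netlink.LinkByName(devName) // e.g. "flannel.1"
	if err != nil {
		return err
	}
	return netlink.NeighSet(&netlink.Neigh{
		LinkIndex:    link.Attrs().Index,
		State:        netlink.NUD_REACHABLE,
		Type:         syscall.RTN_UNICAST,
		IP:           overlayIP, // e.g. 10.10.1.1
		HardwareAddr: mac,
	})
}

func main() {
	mac, _ := net.ParseMAC("12:7d:bc:75:19:1f")
	if err := answerL3Miss("flannel.1", net.ParseIP("10.10.1.1"), mac); err != nil {
		log.Fatal(err)
	}
}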

@dennybritz
Author

@eyakubovich Yep, that's right, though I don't see any calling NeighSet: 10.10.1.1, xx:xx:xx:xx:xx:xx lines.

Nov 25 05:11:47 rabi flanneld[10068]: I1125 05:11:47.892862 10068 vxlan.go:264] L3 miss: 10.10.1.1
Nov 25 05:11:47 rabi flanneld[10068]: I1125 05:11:47.892912 10068 vxlan.go:268] Route for 10.10.1.1 not found
Nov 25 05:11:48 rabi flanneld[10068]: I1125 05:11:48.895367 10068 vxlan.go:264] L3 miss: 10.10.1.1
Nov 25 05:11:48 rabi flanneld[10068]: I1125 05:11:48.895418 10068 vxlan.go:268] Route for 10.10.1.1 not found
Nov 25 05:11:49 rabi flanneld[10068]: I1125 05:11:49.897387 10068 vxlan.go:264] L3 miss: 10.10.1.1
Nov 25 05:11:49 rabi flanneld[10068]: I1125 05:11:49.897435 10068 vxlan.go:268] Route for 10.10.1.1 not found

This is what my etcd data looks like, if that helps: https://gist.github.com/dennybritz/5ba9d14fd52fad51a700. Is PublicIP supposed to be 0.0.0.0?

@eyakubovich
Contributor

@dennybritz no, PublicIP needs to be the IP of your eth0, yet here it's 0.0.0.0 for all entries. But since flannel worked for a while, it had to be valid for some period of time. This is really odd.

@eyakubovich
Contributor

flannel acquires a lease for 24 hours (via etcd) and renews it with 1 hour to go. I wonder if something broke in the renewal logic so that it renews after 12 hours but sets PublicIP to 0.0.0.0. But then your TTL still shows over 10 hours remaining. I'm going to look into this.
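
To make the timing concrete, here is a minimal sketch of such a renew-with-margin loop (hypothetical names; not flannel's subnet.go):

package main

import (
	"log"
	"time"
)

const (
	leaseTTL    = 24 * time.Hour // lease acquired for 24h
	renewMargin = time.Hour      // renew when ~1h remains
)

// renewLease stands in for the etcd write that refreshes the lease TTL;
// crucially, it must re-send the original attributes (PublicIP etc.),
// not a fresh zero-valued copy.
func renewLease() (time.Time, error) {
	return time.Now().Add(leaseTTL), nil
}

// leaseLoop sleeps until renewMargin before expiration, then renews.
func leaseLoop(expiration time.Time) {
	for {
		time.Sleep(time.Until(expiration) - renewMargin)
		exp, err := renewLease()
		if err != nil {
			log.Printf("lease renewal failed, retrying: %v", err)
			time.Sleep(time.Minute)
			continue
		}
		expiration = exp
		log.Printf("Lease renewed, new expiration: %v", expiration)
	}
}

func main() {
	leaseLoop(time.Now().Add(leaseTTL))
}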

@dennybritz
Author

Thanks. I'm not sure if this is related, but my log is full of these messages (on all machines):

Nov 25 05:03:47 rabi flanneld[10068]: W1125 05:03:47.174537 10068 subnet.go:405] Watch of subnet leases failed because etcd index outside history window

However, these messages were appearing while flannel was working as well.

@eyakubovich
Contributor

@dennybritz If it does renew, it should print "Lease renewed, new expiration: ..." in the log. The renewal also propagates to all other nodes, which should print "Subnet added: <ip-address>".

@eyakubovich
Contributor

This happens if the etcd index advances too fast, but flannel has logic to recover (although I wonder if there could be a bug there). Is your cluster busy? In general, it shouldn't constantly fall behind. But this gives me some clues to investigate tomorrow.
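
That recovery path can be sketched generically: when the watch index falls outside etcd's history window, re-list the full state and resume watching from the fresh index. The interfaces below are hypothetical stand-ins, not flannel's actual subnet.go:

package main

import (
	"errors"
	"log"
)

// leaseWatcher is a hypothetical stand-in for flannel's etcd access.
type leaseWatcher interface {
	// List returns the current leases plus the etcd index to watch from.
	List() (leases []string, index uint64, err error)
	// Watch blocks until a change at or after index, returning the next index.
	Watch(index uint64) (event string, next uint64, err error)
}

var errIndexOutdated = errors.New("etcd index outside history window")

func watchLeases(w leaseWatcher) error {
	_, index, err := w.List()
	if err != nil {
		return err
	}
	for {
		ev, next, err := w.Watch(index)
		if errors.Is(err, errIndexOutdated) {
			// Fell behind: resync from a full listing instead of giving up.
			log.Print("watch index outside history window; resyncing")
			if _, index, err = w.List(); err != nil {
				return err
			}
			continue
		}
		if err != nil {
			return err
		}
		log.Printf("event: %s", ev)
		index = next
	}
}

func main() {
	// Wire a real etcd client into watchLeases to use this sketch.
	_ = watchLeases
}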

@dennybritz
Author

@eyakubovich Looking at the logs:

journalctl -u flannel.service | grep 'renewed'
Nov 19 03:42:17 rabi flanneld[2132]: I1119 03:42:17.692125 02132 subnet.go:415] Lease renewed, new expiration: 2014-11-20 03:42:17.690585721 +0000 UTC
Nov 20 02:42:17 rabi flanneld[2132]: I1120 02:42:17.696573 02132 subnet.go:415] Lease renewed, new expiration: 2014-11-21 02:42:17.694406103 +0000 UTC
Nov 21 06:46:59 rabi flanneld[22435]: I1121 06:46:59.382488 22435 subnet.go:442] Lease renewed, new expiration: 2014-11-22 06:46:59.379720337 +0000 UTC
Nov 22 09:39:30 rabi flanneld[28096]: I1122 09:39:30.420100 28096 subnet.go:442] Lease renewed, new expiration: 2014-11-23 09:39:30.419292114 +0000 UTC
Nov 23 09:51:50 rabi flanneld[22225]: I1123 09:51:50.133352 22225 subnet.go:442] Lease renewed, new expiration: 2014-11-24 09:51:50.133835176 +0000 UTC
Nov 24 15:18:17 rabi flanneld[10068]: I1124 15:18:17.699239 10068 subnet.go:442] Lease renewed, new expiration: 2014-11-25 15:18:17.694001804 +0000 UTC
journalctl -u flannel.service | grep 'Subnet added: 10.10.1.0'
Nov 23 16:18:43 rabi flanneld[10068]: I1123 16:18:43.897713 10068 vxlan.go:190] Subnet added: 10.10.1.0/24
Nov 24 15:18:43 rabi flanneld[10068]: I1124 15:18:43.894272 10068 vxlan.go:190] Subnet added: 10.10.1.0/24

I don't think my cluster is busy; I've mainly been testing things. These are t2.small instances that can't handle much, but looking at CPU/memory utilization, everything should be fine.

@eyakubovich
Contributor

@dennybritz The renewals seem to be happening no less than 23 hours apart. I'm not sure what could be happening after 12 hours, but something definitely messes up the etcd state. Will investigate and let you know.

@eyakubovich
Contributor

@dennybritz I found and fixed the bug (introduced during a refactor) that was zeroing out PublicIP during the lease renewal. Can you check if it stays up now?
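
For anyone hitting the same symptom: one way a renewal can zero out PublicIP is if the renew path marshals a freshly constructed attrs struct instead of the one saved at acquisition, so unset fields serialize as their zero values. A hedged reconstruction of that failure mode (illustrative types, not the actual flannel diff):

package main

import (
	"encoding/json"
	"fmt"
)

// ip4 mimics an IPv4 address stored as an integer that renders as a
// dotted quad in JSON, the format seen in the etcd gist above.
type ip4 uint32

func (i ip4) MarshalJSON() ([]byte, error) {
	s := fmt.Sprintf("%d.%d.%d.%d", byte(i>>24), byte(i>>16), byte(i>>8), byte(i))
	return json.Marshal(s)
}

type leaseAttrs struct {
	PublicIP    ip4
	BackendType string `json:",omitempty"`
}

func main() {
	// At acquisition, attrs carries the real eth0 address.
	acquired := leaseAttrs{PublicIP: 10<<24 | 129, BackendType: "vxlan"}

	// Buggy renewal: a zero-valued struct is marshaled instead, so
	// PublicIP is written back to etcd as "0.0.0.0".
	var renewed leaseAttrs

	a, _ := json.Marshal(acquired)
	r, _ := json.Marshal(renewed)
	fmt.Println(string(a)) // {"PublicIP":"10.0.0.129","BackendType":"vxlan"}
	fmt.Println(string(r)) // {"PublicIP":"0.0.0.0"}
}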

@dennybritz
Author

That's great news, thanks! I am trying it.

@dennybritz
Author

@eyakubovich It has been up for more than 24 hours now, so I think this is resolved!

@eyakubovich
Contributor

Great. Will consider this resolved.
