How to debug flannel issues across nodes? #370

Closed
stevef1uk opened this issue Nov 16, 2015 · 4 comments

Comments

@stevef1uk

Hi,

I have followed the documentation and have been trying to get flannel to work using etcd and docker (1.9) on physical linux machines (RPi2s) so that I can get containers created on each node to communicate. I have got each node's set-up working fine, but the containers on each can't ping each other. I can't see anything obvious in ipchains that would stop this but I could do with some advice on how to diagnose my issue(s) as I am not a networking guru.

I have managed to get vanilla docker swarm to work between the nodes, but by default containers created on different nodes can't ping each other. I then tried the docker overlay network, and that worked fine within each node, with containers on the same node able to ping each other, but again it didn't give me visibility across nodes.

I would really like to be able to use swarm with flannel as the overlay network as my end-state, but I am stuck at step one so any advice welcome.

Regards

Steve

@eyakubovich
Contributor

@stevef1uk Are you using default (udp) backend or did you specify another one such as vxlan? Can you ping the flannel0 (udp) or flannel.1 (vxlan) interface from the other host (on the host, not in the container)?
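The host-level check suggested above can be sketched as a small shell helper. This is only a sketch: the function names `flannel_iface` and `check_peer` are hypothetical, the interface names come from the comment above, and the peer address is a placeholder you would replace with whatever `ip addr show` reports for the flannel interface on the other host.

```shell
#!/bin/sh
# Map a flannel backend name to the host interface it creates
# (per the comment above: udp -> flannel0, vxlan -> flannel.1).
flannel_iface() {
  case "$1" in
    udp)   echo "flannel0" ;;
    vxlan) echo "flannel.1" ;;
    *)     echo "unknown backend: $1" >&2; return 1 ;;
  esac
}

# Run on the host, not inside a container.
# $1 = backend name, $2 = address of the flannel interface on the OTHER host.
check_peer() {
  iface=$(flannel_iface "$1") || return 1
  ip addr show "$iface" || return 1   # does the local flannel interface exist?
  ping -c 3 -W 2 "$2"                 # can we reach the peer's flannel interface?
}
```

For example, `check_peer udp <peer-flannel0-address>` fails early if the local interface is missing, which separates "flannel is not running here" from "the two hosts cannot reach each other".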

@stevef1uk
Author

I have only tried the default udp backend. I haven't tried that ping test yet.

@stevef1uk
Author

OK, I have recreated my environment.
Node 1:
pi@rpi2fr ~ $ cat /run/flannel/subnet.env
FLANNEL_NETWORK=10.20.0.0/16
FLANNEL_SUBNET=10.20.38.1/24
FLANNEL_MTU=1472
FLANNEL_IPMASQ=false

Node 2:
steve@mark-mint ~ $ cat /run/flannel/subnet.env
FLANNEL_NETWORK=10.20.0.0/16
FLANNEL_SUBNET=10.20.41.1/24
FLANNEL_MTU=1472
FLANNEL_IPMASQ=false
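A quick sanity check on the two `subnet.env` files above is that each node's `FLANNEL_SUBNET` sits inside `FLANNEL_NETWORK` and that the two nodes were given different /24s. A minimal sketch in POSIX shell follows; the helper names `ip_to_int` and `in_network` are made up for illustration and only handle IPv4.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to an integer.
# The function body runs in a subshell so the IFS change does not leak.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# in_network ADDR[/LEN] NET/LEN
# Succeeds if ADDR lies inside NET/LEN (any prefix length on ADDR is ignored).
in_network() (
  addr=${1%/*}
  net=${2%/*}
  bits=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$addr") & $mask )) -eq $(( $(ip_to_int "$net") & $mask )) ]
)

# Values copied from the two subnet.env files above.
in_network 10.20.38.1/24 10.20.0.0/16 && echo "node 1 subnet is inside FLANNEL_NETWORK"
in_network 10.20.41.1/24 10.20.0.0/16 && echo "node 2 subnet is inside FLANNEL_NETWORK"
in_network 10.20.41.1 10.20.38.1/24 || echo "node subnets do not overlap"
```

With the values shown in this thread, all three checks pass, so the address allocation itself looks consistent.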

From Node 1 ping test (hopefully to the right thing?):
pi@rpi2fr ~ $ ping 10.20.41.1
PING 10.20.41.1 (10.20.41.1) 56(84) bytes of data.
From 10.20.41.0: icmp_seq=2 Redirect Host(New nexthop: 10.20.41.1)

From Node 2 ping test:
steve@mark-mint ~ $ ping 10.20.38.1
PING 10.20.38.1 (10.20.38.1) 56(84) bytes of data.
From 10.20.38.0: icmp_seq=2 Redirect Host(New nexthop: 10.20.38.0)

Is this so far so good?

@stevef1uk
Author

Then on Node 1:

root@rpi2fr:~# source /run/flannel/subnet.env
root@rpi2fr:~# nohup docker daemon -H tcp://127.0.0.1:4243 -H unix:///var/run/docker.sock --bip=${FLANNEL_SUBNET} --mtu=${FLANNEL_MTU} &
root@rpi2fr:~# ps ax | grep docker
2455 ? Ss 0:00 runsv docker
2464 ? S 0:00 svlogd -t /var/log/docker
20404 pts/0 Sl 0:01 docker daemon -H tcp://127.0.0.1:4243 -H unix:///var/run/docker.sock --bip=10.20.38.1/24 --mtu=1472
20448 pts/0 S+ 0:00 grep docker

root@rpi2fr:~# route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default 192.168.0.1 0.0.0.0 UG 202 0 0 eth0
default 192.168.0.1 0.0.0.0 UG 303 0 0 wlan0
10.20.0.0 * 255.255.0.0 U 0 0 0 flannel0
10.20.38.0 * 255.255.255.0 U 0 0 0 docker0
172.18.0.0 * 255.255.0.0 U 0 0 0 br-24597bc23fd3
192.168.0.0 * 255.255.255.0 U 202 0 0 eth0
192.168.0.0 * 255.255.255.0 U 303 0 0 wlan0

On node 2:
The same steps to start docker

steve@mark-mint ~ $ ps ax | grep docker
25036 pts/3 Sl 0:00 docker daemon -H tcp://127.0.0.1:4243 -H unix:///var/rundocker.sock --bip=10.20.41.1/24 --mtu=1472
27341 pts/3 S+ 0:00 grep --colour=auto docker

mark-mint ~ # route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default 192.168.0.1 0.0.0.0 UG 0 0 0 eth0
10.20.0.0 * 255.255.0.0 U 0 0 0 flannel0
10.20.41.0 * 255.255.255.0 U 0 0 0 docker0
172.17.0.0 * 255.255.0.0 U 0 0 0 br-52b68ce74035
172.19.0.0 * 255.255.0.0 U 0 0 0 docker_gwbridge
192.168.0.0 * 255.255.255.0 U 1 0 0 eth0

On node 1:
root@c1c5e685c11e:/# ping 10.20.41.2
PING 10.20.41.2 (10.20.41.2) 56(84) bytes of data.
64 bytes from 10.20.41.2: icmp_req=1 ttl=60 time=1.42 ms

On Node 2:
ping 10.20.38.2
PING 10.20.38.2 (10.20.38.2) 56(84) bytes of data.
64 bytes from 10.20.38.2: icmp_seq=1 ttl=60 time=3.59 ms

So it now works! I must have messed up one of the manual steps previously, probably on starting docker!
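Since the likely culprit was a slip when starting docker by hand, one way to make that step repeatable is to derive the daemon flags from `subnet.env` in a small helper. This is a sketch only: the function name `flannel_docker_opts` is hypothetical, and it assumes the file defines `FLANNEL_SUBNET` and `FLANNEL_MTU` as shown earlier in this thread.

```shell
#!/bin/sh
# Print the docker daemon flags derived from a flannel subnet.env file.
# Runs in a subshell so the sourced variables do not leak into the caller.
flannel_docker_opts() (
  env_file=${1:-/run/flannel/subnet.env}
  [ -r "$env_file" ] || { echo "cannot read $env_file" >&2; exit 1; }
  . "$env_file"
  # Abort with an error if either variable is missing or empty.
  : "${FLANNEL_SUBNET:?not set in $env_file}" "${FLANNEL_MTU:?not set in $env_file}"
  echo "--bip=${FLANNEL_SUBNET} --mtu=${FLANNEL_MTU}"
)
```

It could then be used in the same start command as above, for example `docker daemon -H unix:///var/run/docker.sock $(flannel_docker_opts)`. If flanneld has not written the file yet, the helper fails instead of starting docker with an empty `--bip`.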
