Site-to-site networking #4

Closed
lmagyar opened this issue Jun 20, 2023 · 6 comments

Comments

@lmagyar
Owner

lmagyar commented Jun 20, 2023

--snat-subnet-routes=false (8f836eb) does not seem to be enough to make IP forwarding work.

I've set up routing on a local non-tailscale device toward tailscaled, and that part works, but no routing happens between eth0 and tailscale0. I tried to ping another device's tailnet IP rather than a device on another LAN, because this should work even without subnet routing, so I'm testing with a minimal case.

Enabling IP routing on HA needs more investigation, but I have run out of free time for now...

Possible search keywords to find something relevant: hassio supervisor docker tailscale nmcli "network manager" "ip forwarding"

@Gyosa3

Gyosa3 commented Jun 20, 2023

Hi, thanks for creating this issue. Let's restart from your initial observation:

With site-to-site networking, not only the device that has eth0 and tailscale0 but also the other devices on its local LAN can access the LANs of other devices. As I understand it, site-to-site networking requires:

  1. enable subnet routes in tailscale admin console (I think you did this)
  2. enable IP forwarding (as far as I know, it is enabled)
  3. --tun=userspace-networking option is not used (the last released official add-on doesn't do this, it uses this option)
  4. --snat-subnet-routes=false (the last released official add-on doesn't do this)
  5. configure local LAN non-tailscale devices' routing toward local tailscaled's (inside HA add-on) eth0 (you did this)
  1. Yes, local routes are announced and approved on each of the 2 subnet router nodes. The complete route is expected to be:
local device  <--> homeAssistant host <--> tailscale addon  <--|--> Tailscale addon  <----> HomeAssistant host <---> Local device
192.168.5.10          192.168.5.3            100.100.1.1       |      100.100.1.2              192.168.2.6          192.168.3.50
                         eth0                 tailscale0

  2. I don't know. IP forwarding seems to be set up in /etc/sysctl.conf, and mine is empty, while it should probably contain something like net.ipv4.ip_forward=1. Or did I miss something? This may be the actual issue: I can't see any traffic forwarded from eth0 to tailscale0 or vice versa.
  3. As discussed in the issue "Using Tailscale addon for site-to-site networking with 2 HA instances do not work" (hassio-addons/addon-tailscale#216), the behaviour of this option is rather unexpected. Subnet routing does not work when this option is disabled, therefore interface tailscale0 does not exist, therefore no IP forwarding, therefore no site-to-site routing...
  4. I've tried with "true" and "false", but it doesn't change anything at this stage until points 2 and 3 are resolved. As an example, I use the AdGuard addon on the same host, and when I make DNS requests from a Tailscale client, the visible IP is that of the host (192.168.5.3) regardless of this option, while non-Tailscale clients on the LAN are seen with their real IP.
  5. Yes, and I may also have missed something here. For each distant LAN I have created a route on my router with 192.168.5.3 as next hop. That should work. However, on my laptop, which has a Tailscale client, the route table shows the laptop's own Tailscale IP as next hop for the distant LAN, meaning it uses the tailscale interface rather than the physical LAN interface. So I tried the following:
  • I created 2 routes, one for the distant LAN and one for the Tailscale IP range, both with next hop to 192.168.5.3
  • when I traceroute to the Tailscale IP of the addon, I have:
    laptop (192.168.5.10) -> router (192.168.5.1) -> HA host (100.100.1.1)
    and it works, demonstrating that the Tailscale IP was routed by the router to the HA host, which "realised" that its Tailscale IP is actually the target of the traceroute.
  • when I traceroute an IP in the Tailscale network or in the distant LAN I have:
    laptop (192.168.5.10) -> router (192.168.5.1) -> HA host (192.168.5.3) -> lost forever
    so for me there is an issue with IP forwarding from eth0 to tailscale0, but this is not surprising considering the issue with point 3...
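
To make the router-side reasoning above easier to check, here is a minimal sketch of a longest-prefix-match lookup over the two static routes described in this comment. The table contents are assumptions for illustration: 100.64.0.0/10 is Tailscale's address range (the comment only says "the Tailscale IP range"), and the "ISP gateway" default route and the `next_hop` helper are hypothetical.

```python
import ipaddress

# Hypothetical routing table for the router at 192.168.5.1.
# The two routes via 192.168.5.3 mirror the static routes described above;
# the default route is an assumption added so every lookup has an answer.
ROUTES = [
    ("192.168.5.0/24", "direct (LAN)"),
    ("192.168.3.0/24", "192.168.5.3"),
    ("100.64.0.0/10", "192.168.5.3"),
    ("0.0.0.0/0", "ISP gateway"),
]


def next_hop(destination, routes=ROUTES):
    """Return the next hop of the most specific route containing destination."""
    addr = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    # The longest prefix wins, as in a real routing table.
    _, hop = max(matches, key=lambda item: item[0].prefixlen)
    return hop
```

With this table, both `next_hop("192.168.3.50")` and `next_hop("100.100.1.2")` resolve to 192.168.5.3, which matches the traceroutes above: the router hands the packet to the HA host, and whatever happens next depends on forwarding on that host.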

Hope this helps...

@Gyosa3

Gyosa3 commented Jun 21, 2023

ok, I'm getting somewhere here... I've done the following:

  1. deleted all tailscale addons from my setup, rebooted
  2. Installed this latest beta version
  3. added the node with default parameters
  4. verify the setup with debug logs: route 192.168.5.0/24 is correctly announced automatically, userspace-networking is activated
  5. changed the configuration to
advertise_exit_node: true
log_level: debug
snat_subnet_routes: true
accept_dns: false
userspace_networking: false
  6. verify the routes: 192.168.5.0/24 is linked to eth0, 192.168.3.0/24 is linked to tailscale2
  7. verify routes on router (192.168.5.1): 192.168.3.0/24 is linked to next hop 192.168.5.3
  8. verify access to distant LAN from a distant Tailscale client: my mobile phone can access http server at 192.168.3.50
  9. verify access to local LAN from a distant Tailscale client: my mobile phone can access an http server at 192.168.5.2 - yes!
  10. verify access to distant LAN from local LAN via Tailscale: my PC without Tailscale in network 192.168.5.0 can access an http server at 192.168.3.50 - yes yes!

Please note that I am still using the Tsujamin Tailscale addon at the target destination, with userspace-networking enabled, because it's the only addon I know of that can announce routes other than the VLAN where HA is located. I think this capability will be much needed in the official addon.

So the distant node announces 192.168.2.0/28, 192.168.3.0/24, and a few more. They are correctly reported on the local node:
2023/06/21 10:13:42 monitor: RTM_NEWROUTE: src=, dst=192.168.2.0/28, gw=, outif=25, table=52
2023/06/21 10:13:42 monitor: RTM_NEWROUTE: src=, dst=192.168.3.0/24, gw=, outif=25, table=52
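
For anyone who wants to pull the announced routes out of the debug log automatically, here is a small sketch that parses lines like the two above. The field layout is assumed from these two lines only; other `monitor:` messages may look different, and `parse_route_event` is a hypothetical helper, not part of tailscaled or the addon.

```python
import re

# Pattern inferred from the two log lines quoted above (an assumption, not a
# documented log format).
ROUTE_RE = re.compile(
    r"monitor: (?P<event>RTM_\w+): src=(?P<src>[^,]*), dst=(?P<dst>[^,]*), "
    r"gw=(?P<gw>[^,]*), outif=(?P<outif>\d+), table=(?P<table>\d+)"
)


def parse_route_event(line):
    """Return the route fields of a monitor log line as a dict, or None."""
    match = ROUTE_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Interface index and routing table number are numeric in the log.
    fields["outif"] = int(fields["outif"])
    fields["table"] = int(fields["table"])
    return fields
```

Applied to the first line above, this gives `dst` equal to `192.168.2.0/28`, `outif` 25 and `table` 52, with empty `src` and `gw`.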

Detailed setup is the following.

Site1:

PC <-->router<-->HA host<-->beta tailscale addon<-->HA host<-->router <--> internet 
.5.x   .5.1      .5.3             100.x.x.x          .5.3       .5.1       to Tailscale target node
                 eth0             tailscale2         eth0

Site 2:

internet<-->router<-->HA host<-->Tsujamin Tailscale addon<-->HA host<-->router<-->web server
public IP    .2.1      .2.6        userspace-networking       .2.6     .2.1/.3.1    .3.50

I've relaunched the beta addon with snat_subnet_routes: false and I'll have to see if it makes any difference and where. At least I can testify that it works with both settings.

So now the last point for me is to be able to use your addon to declare extra routes to announce on the distant node. Any device on any of these VLANs will then have to be able to pass through the addon to reach the "other side". That'll be the last piece to make my setup complete.

thanks for your help !! So happy that it's progressing :)

@lmagyar
Owner Author

lmagyar commented Jun 22, 2023

I made a config error: I forgot to enable subnet routing for the source side as well. After enabling subnet routing on both sides, everything works. I've set up a similar network with 2 different LAN domains:

  • add-on config on both sides: userspace_networking: false, snat_subnet_routes: true
  • subnet routing is enabled on both sides in the tailscale admin console
  • devices on the source LAN have routing settings for the 100.x.x.x tailnet and for the other LAN domain toward the source side tailscaled eth0
  • I can ping and curl devices on the destination LAN from devices on the source LAN
  • snat_subnet_routes: false doesn't work: the device on the destination LAN receives the ICMP packets, but there is no reply, even when the firewall is turned off. I didn't investigate further; this temporary snat_subnet_routes option is not currently needed to make site-to-site work, and it will now be removed from the beta

So the problem is with your network setup. Please try to debug it step by step on your own:

  • first pinging from source tailscaled command line to destination tailscaled 100.x.x.x IP (testing internal tailscale routing)
  • then pinging from source tailscaled command line to destination lan device IP (testing incoming routing on dest side)
  • then pinging from source lan to destination tailscaled 100.x.x.x IP (testing outgoing routing on source side)
  • and only last, from source lan to dest lan (testing end-to-end)
  • and only after the above is working, try extra subnets
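
The step-by-step order above can be scripted. This is only a sketch under assumptions: it uses Linux-style `ping -c`, it has to be run separately on each machine the steps start from (source tailscaled, then a source LAN device), and `ping` and `first_failing_step` are hypothetical helper names.

```python
import subprocess


def ping(target, count=3):
    """Return True if `ping -c <count> <target>` exits with status 0."""
    result = subprocess.run(
        ["ping", "-c", str(count), target],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def first_failing_step(steps, probe=ping):
    """Probe (label, target) pairs in order; return the first failing label.

    Stops at the first failure, because later steps depend on earlier ones.
    Returns None if every step succeeds.
    """
    for label, target in steps:
        if not probe(target):
            return label
    return None
```

For example, on the source tailscaled command line the steps would be `[("internal tailscale routing", <destination 100.x.x.x IP>), ("incoming routing on dest side", <destination LAN device IP>)]`, with the placeholders replaced by real addresses. The label that comes back tells you which layer to look at first.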

And just to clarify some things, if you disagree, modify your mental model until you agree, these are tested facts:

  • subnet routing on the destination side is independent of userspace networking: this is about the incoming traffic (tailnet->lan) on the destination side. I've tested it again: even when the destination tailscale add-on is in userspace networking mode (no tailscale0), I can access the local LAN as long as subnet routing is enabled
  • subnet routing on the source side requires non-userspace networking, i.e. having tailscale0, to be able to route outgoing traffic (lan->tailnet) on the source side
  • if you see packets going out in the source tailscaled log but not arriving in the destination tailscaled log, you forgot to enable subnet routing for the source side in the tailscale admin console (like me). Without this, tailscale won't route packets originating from the source lan, only packets originating from the source tailscale0
  • tailscale0 is dangerous: it can soft brick your device so that only a physical keyboard and monitor can bring it back to life. If somebody uses the default 192.168.1.0/24 network on all of their sites, enables subnet routing, and turns off userspace networking (yeah, great, new feature, try it out), tailscaled will redirect the local 192.168.1.0/24, and even HA's local IP won't work. The device is unreachable and soft bricked; even power cycling won't bring it back
  • run sysctl net.ipv4.ip_forward; the result will be net.ipv4.ip_forward = 1, so IP forwarding is enabled in HA
  • "Userspace networking enabled" = "--tun=userspace-networking is used by the add-on" = "you don't have tailscale0"
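
Two of the facts above can be checked from a script before turning off userspace networking. A minimal sketch, assuming a Linux host: it reads the same kernel setting that `sysctl net.ipv4.ip_forward` reports, and it flags remote routes that overlap the local LAN (the soft-brick case). Both function names are hypothetical helpers, not part of tailscale or the add-on.

```python
import ipaddress


def ip_forwarding_enabled(path="/proc/sys/net/ipv4/ip_forward"):
    """Return True if the kernel's IPv4 forwarding flag is set to 1."""
    with open(path) as handle:
        return handle.read().strip() == "1"


def overlapping_routes(local_subnet, remote_routes):
    """Return the remote routes whose address range overlaps local_subnet.

    A non-empty result means accepting those routes on tailscale0 could
    redirect traffic meant for the local LAN.
    """
    local = ipaddress.ip_network(local_subnet)
    return [
        route
        for route in remote_routes
        if ipaddress.ip_network(route).overlaps(local)
    ]
```

With the networks from this thread, `overlapping_routes("192.168.5.0/24", ["192.168.2.0/28", "192.168.3.0/24"])` returns an empty list, so that setup is safe in this respect. The same call with 192.168.1.0/24 on both sides returns the conflicting route.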

  1. verify access to distant LAN from local LAN via Tailscale: my PC without Tailscale in network 192.168.5.0 can access an http server at 192.168.3.50 - yes yes!

OK, it means eth0<->tailscale0 routing works, it means tailscaled<->tailscaled routing works, what is the issue from now on?

So, I will remove the snat_subnet_routes, add some comment to the DOCS, and close this issue, because everything is working fine.

@lmagyar lmagyar closed this as completed Jun 22, 2023
@lmagyar
Owner Author

lmagyar commented Jun 22, 2023

Oh, and please remove your comment under PR 199, that doesn't belong to the PR, and please close issue 216, and tailscale issue 8370 also.

@Gyosa3

Gyosa3 commented Jun 26, 2023

OK, it means eth0<->tailscale0 routing works, it means tailscaled<->tailscaled routing works, what is the issue from now on?

Hi, that was probably not clear enough, I meant exactly that: everything was working for me, I was super happy :)

The last point that I wanted to mention was to be able to announce extra routes but this is for another issue. Site-to-site does work thanks to your changes.

And also thanks for the explanation on routes and interfaces. I'll have to digest that; I'm not a network guy, so I'm starting from a long way back...

Oh, and please remove your comment under PR 199, that doesn't belong to the PR,

done

and please close issue 216,

done

and tailscale issue 8370 also.

You still have a question open there, and people may answer it. They can also close the issue whenever they wish.

@lmagyar
Owner Author

lmagyar commented Jun 27, 2023

Yes, thank you! :) On 8370 I've asked about the MSS clamping later, because I plan to make a PR about MSS if it proves useful/necessary, and also snat-subnet-routes, because that can be useful for some people and has more than 0% chance to get merged.

And I'm not a networking guru at all. I'm coming from the Windows world, and I'm using these projects around HA to learn how things work on Linux.
