iroh does not work with relays without STUN on networks without UPnP #2876

Closed
link2xt opened this issue Oct 31, 2024 · 5 comments · Fixed by #2877
Labels
bug Something isn't working

@link2xt
Contributor

link2xt commented Oct 31, 2024

To reproduce, create an iroh.toml file in the root of a checkout of this repo:

$ cat iroh.toml
[[relay_nodes]]
url = "https://iroh.testrun.org.:4443"
stun_only = false
stun_port = 3478

https://iroh.testrun.org.:4443 is running iroh-relay 0.25.0.

Then run:

$ cargo run -p iroh-cli -- --config ./iroh.toml doctor accept
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.27s
     Running `target/debug/iroh --config ./iroh.toml doctor accept`
Error: wait for relay connection

Caused by:
    deadline has elapsed

On a network with UPnP it works:

$ cargo run -p iroh-cli -- --config ./iroh.toml doctor accept
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.29s
     Running `target/debug/iroh --config ./iroh.toml doctor accept`
Connect to this node using one of the following commands:

	Using the relay url and direct connections:
iroh doctor connect ...
...

The problem has been present since iroh 0.27.0; more precisely, since commit 5e40fe1 (PR #2804).

That PR pulled in commit dariusc93/rust-igd@56934bb of rust-igd (the igd-next crate), which was merged from PR dariusc93/rust-igd#8.

@link2xt link2xt changed the title from "iroh does not work with 0.25.0 relay on networks without UPNP" to "iroh does not work with 0.25.0 relay on networks without UPnP" Oct 31, 2024
@link2xt
Contributor Author

link2xt commented Oct 31, 2024

If you don't have access to a network without UPnP, a server such as a Hetzner VPS is sufficient.
Here is a log from such a server, produced by running NO_COLOR=1 RUST_LOG=trace,iroh_net=trace,rust_igd=trace cargo run -p iroh-cli -- --config ./iroh.toml doctor accept &> log.txt:
iroh-log.txt

@Arqu
Collaborator

Arqu commented Oct 31, 2024

Well, that mostly depends on your firewall setup, as most VPSs get a public IP and won't need additional port-forwarding shenanigans.
Alternatively, as you suggested elsewhere, running over a mobile hotspot will, for most users, produce a NAT-without-UPnP topology.

@link2xt
Contributor Author

link2xt commented Oct 31, 2024

This VPS has a public IP, 65.109.140.51, as seen in the log. There is no firewall configured on the host. Yet doctor accept times out.

@link2xt
Contributor Author

link2xt commented Nov 1, 2024

One more detail: the STUN server on port 3478 is accidentally firewalled off.
https://devina.io/stun-tester, when pointed to stun:iroh.testrun.org:3478, reports "STUN server is not operational".
I am not fixing it, to keep the bug reproducible. We will eventually shut down iroh.testrun.org after migrating all users away from it, so it is kept at version 0.25.0 and with UDP firewalled.
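
For readers who want to check STUN reachability without a web tester, here is a minimal sketch using only the Rust standard library. It is not iroh's probe code. It builds a plain STUN Binding Request (20-byte header, message type 0x0001, magic cookie 0x2112A442, 12-byte transaction ID) and waits for a Binding Success Response. The names `binding_request`, `is_binding_success` and `stun_answers`, the 2 second timeout, and the clock-derived transaction ID are assumptions made for illustration.

```rust
use std::io::ErrorKind;
use std::net::UdpSocket;
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Fixed magic cookie carried in every STUN message header.
const MAGIC_COOKIE: u32 = 0x2112_A442;

/// Builds a 20-byte Binding Request with no attributes.
fn binding_request(transaction_id: [u8; 12]) -> [u8; 20] {
    let mut msg = [0u8; 20];
    msg[0..2].copy_from_slice(&0x0001u16.to_be_bytes()); // type: Binding Request
    msg[2..4].copy_from_slice(&0u16.to_be_bytes()); // length: no attributes
    msg[4..8].copy_from_slice(&MAGIC_COOKIE.to_be_bytes());
    msg[8..20].copy_from_slice(&transaction_id);
    msg
}

/// Checks that a reply is a Binding Success Response for our transaction.
fn is_binding_success(reply: &[u8], transaction_id: [u8; 12]) -> bool {
    reply.len() >= 20
        && reply[0..2] == 0x0101u16.to_be_bytes()
        && reply[4..8] == MAGIC_COOKIE.to_be_bytes()
        && reply[8..20] == transaction_id
}

/// Sends one Binding Request and reports whether a valid reply arrived in time.
/// Hypothetical helper: binds an IPv4 socket, so the server must resolve to IPv4.
fn stun_answers(server: &str, timeout: Duration) -> std::io::Result<bool> {
    let socket = UdpSocket::bind("0.0.0.0:0")?;
    socket.set_read_timeout(Some(timeout))?;

    // Assumption: a clock-derived ID is good enough for a one-off manual check.
    // It is not random and should not be used where unpredictability matters.
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0);
    let mut id = [0u8; 12];
    id.copy_from_slice(&nanos.to_be_bytes()[4..16]);

    socket.send_to(&binding_request(id), server)?;
    let mut buf = [0u8; 576];
    match socket.recv_from(&mut buf) {
        Ok((n, _)) => Ok(is_binding_success(&buf[..n], id)),
        // A read timeout surfaces as WouldBlock or TimedOut depending on the platform.
        Err(e) if matches!(e.kind(), ErrorKind::WouldBlock | ErrorKind::TimedOut) => Ok(false),
        Err(e) => Err(e),
    }
}

fn main() -> std::io::Result<()> {
    // Only touches the network when a host:port argument is given.
    if let Some(server) = std::env::args().nth(1) {
        let ok = stun_answers(&server, Duration::from_secs(2))?;
        let verdict = if ok { "answered" } else { "no valid reply before the timeout" };
        println!("{server}: {verdict}");
    } else {
        println!("usage: stun-check <host:port>");
    }
    Ok(())
}
```

Run against iroh.testrun.org:3478, a sketch like this would be expected to report no valid reply while UDP stays firewalled, which is the condition needed to reproduce the bug.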

@dignifiedquire
Contributor

dignifiedquire commented Nov 1, 2024

So, we figured out roughly what went wrong:

  • For netcheck, OVERALL_REPORT_TIMEOUT is set to 5 seconds. If this timer expires, the report is considered faulty and is thrown away.
  • Usually enough probes return before this, so the timeout is never hit.
  • The igd-next version pulled in since 0.27.0 has a bug: it no longer respects the timeout passed to search_gateway and instead hardcodes a 5 second timeout, rather than using the 1 second we set.
  • When using iroh.testrun.org with STUN disabled, all STUN probes hang until they time out. When this is combined with no UPnP gateway, we end up timing out the whole report (because UPnP waits for 5 seconds instead of the configured 3). Netcheck then considers the whole report failed, which ends up generating the above error.

A prototype fix is in 2eb3c8f. It will need cleanup, but in all my testing it fixes the issue.
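
The interaction between the two timeouts can be illustrated with a small model. This is a sketch under stated assumptions, not netcheck's actual logic. It assumes the report is only complete once the slowest probe has finished or given up, and that a probe finishing exactly at the deadline counts as too late. `Probe`, `run_report` and the 2 second STUN give-up time are hypothetical. The 5 second overall timeout, the 1 second intended UPnP search and the 5 second hardcoded search come from the comment above.

```rust
use std::time::Duration;

/// Hypothetical model of a probe: how long until it finishes or gives up.
#[derive(Debug, Clone, Copy)]
struct Probe {
    name: &'static str,
    finishes_after: Duration,
}

/// Returns the time at which the report completes, or `None` if the overall
/// deadline expires first and the report is thrown away.
/// Assumption: the report has to wait for the slowest probe.
fn run_report(probes: &[Probe], overall_timeout: Duration) -> Option<Duration> {
    let slowest = probes
        .iter()
        .map(|p| p.finishes_after)
        .max()
        .unwrap_or(Duration::ZERO);
    // A probe that needs the full deadline leaves no time to assemble the report.
    if slowest < overall_timeout {
        Some(slowest)
    } else {
        None
    }
}

fn main() {
    let overall = Duration::from_secs(5); // OVERALL_REPORT_TIMEOUT per the comment
    let stun = Probe { name: "stun (firewalled)", finishes_after: Duration::from_secs(2) }; // assumed value
    let upnp_intended = Probe { name: "upnp search", finishes_after: Duration::from_secs(1) };
    let upnp_buggy = Probe { name: "upnp search", finishes_after: Duration::from_secs(5) };

    for upnp in [upnp_intended, upnp_buggy] {
        let outcome = run_report(&[stun, upnp], overall);
        println!("{} taking {:?}: report = {:?}", upnp.name, upnp.finishes_after, outcome);
    }
}
```

In this model the report survives when the gateway search respects the short timeout, and is discarded once the search takes as long as the overall deadline, even though no single probe result was strictly required.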

@dignifiedquire dignifiedquire changed the title from "iroh does not work with 0.25.0 relay on networks without UPnP" to "iroh does not work with relays without STUN on networks without UPnP" Nov 1, 2024
@dignifiedquire dignifiedquire added this to the v0.28.0 milestone Nov 1, 2024
@dignifiedquire dignifiedquire self-assigned this Nov 1, 2024
github-merge-queue bot pushed a commit that referenced this issue Nov 1, 2024
Closes #2876

---------

Co-authored-by: Philipp Krüger
Co-authored-by: Divma
@github-project-automation github-project-automation bot moved this to ✅ Done in iroh Nov 1, 2024