Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Huge packet loss [>80%] pinging server #1643

Open
ShantanuNair opened this issue Nov 24, 2019 · 21 comments
Open

Huge packet loss [>80%] pinging server #1643

ShantanuNair opened this issue Nov 24, 2019 · 21 comments

Comments

@ShantanuNair
Copy link

My setup:

Server is running on EC2 t3.micro instance in the Bahrain AWS region.
Client is MacOS running wireguard official gui.

Issue:

  • Wg has in general been inconsistently very sluggish when deployed via Algo.
  • I say it's inconsistent as it works ok-ish sometimes�—poor initial webpage load times, but when things start loading, it chugs along. For example if I test speed on fast.com, it will stay at 0kbps and not load anything for maybe 30s and then when it finally does start loading, it pulls ~40-50mbps.
  • I have tested a commercial VPN provider who provides wireguard support and they work fine, so I don't believe it's ISP level DPI/throttling.
  • Pinging sites resulted in inconsistent packet-loss. Seemingly at random it would work fine, but most times I would get a timeout.

While digging deeper, I noticed that when pinging 10.19.49.1 from the wg client, I get huge packet loss. This may be the weak link where I'm losing packets. My internet connection is definitely not the issue. Why would I be having packet loss when pinging 10.19.49.1? Is it an issue with my EC2 networking? It surely can't be my wireguard client as then it wouldn't work with the commercial wg VPN provider.

Here is the ping results:

ping 10.19.49.1
PING 10.19.49.1 (10.19.49.1): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
64 bytes from 10.19.49.1: icmp_seq=2 ttl=64 time=15.560 ms
Request timeout for icmp_seq 3
Request timeout for icmp_seq 4
Request timeout for icmp_seq 5
Request timeout for icmp_seq 6
Request timeout for icmp_seq 7
Request timeout for icmp_seq 8
Request timeout for icmp_seq 9
Request timeout for icmp_seq 10
Request timeout for icmp_seq 11
Request timeout for icmp_seq 12
Request timeout for icmp_seq 13
64 bytes from 10.19.49.1: icmp_seq=14 ttl=64 time=19.007 ms
Request timeout for icmp_seq 15
64 bytes from 10.19.49.1: icmp_seq=16 ttl=64 time=16.106 ms
^C
--- 10.19.49.1 ping statistics ---
17 packets transmitted, 3 packets received, 82.4% packet loss
round-trip min/avg/max/stddev = 15.560/16.891/19.007/1.513 ms
@ShantanuNair
Copy link
Author

A few minutes later, I repeated the experiment and got this as a result:

ping 10.19.49.1
PING 10.19.49.1 (10.19.49.1): 56 data bytes
Request timeout for icmp_seq 0
64 bytes from 10.19.49.1: icmp_seq=1 ttl=64 time=18.920 ms
64 bytes from 10.19.49.1: icmp_seq=2 ttl=64 time=18.239 ms
64 bytes from 10.19.49.1: icmp_seq=3 ttl=64 time=18.819 ms
64 bytes from 10.19.49.1: icmp_seq=4 ttl=64 time=18.541 ms
64 bytes from 10.19.49.1: icmp_seq=5 ttl=64 time=17.512 ms
64 bytes from 10.19.49.1: icmp_seq=6 ttl=64 time=18.581 ms
64 bytes from 10.19.49.1: icmp_seq=7 ttl=64 time=18.902 ms
64 bytes from 10.19.49.1: icmp_seq=8 ttl=64 time=17.337 ms
64 bytes from 10.19.49.1: icmp_seq=9 ttl=64 time=18.055 ms
64 bytes from 10.19.49.1: icmp_seq=10 ttl=64 time=18.713 ms
64 bytes from 10.19.49.1: icmp_seq=11 ttl=64 time=20.228 ms
64 bytes from 10.19.49.1: icmp_seq=12 ttl=64 time=17.781 ms
64 bytes from 10.19.49.1: icmp_seq=13 ttl=64 time=16.004 ms
64 bytes from 10.19.49.1: icmp_seq=14 ttl=64 time=16.590 ms
64 bytes from 10.19.49.1: icmp_seq=15 ttl=64 time=17.788 ms
64 bytes from 10.19.49.1: icmp_seq=16 ttl=64 time=16.421 ms
64 bytes from 10.19.49.1: icmp_seq=17 ttl=64 time=18.228 ms
64 bytes from 10.19.49.1: icmp_seq=18 ttl=64 time=17.454 ms
64 bytes from 10.19.49.1: icmp_seq=19 ttl=64 time=16.652 ms
^C
--- 10.19.49.1 ping statistics ---
20 packets transmitted, 19 packets received, 5.0% packet loss
round-trip min/avg/max/stddev = 16.004/17.935/20.228/1.021 ms

I would really appreciate if someone with more networking/VPN troubleshooting skills could point me in the right direction.

@TC1977
Copy link
Contributor

TC1977 commented Nov 24, 2019

So my first question, given recent reports of Wireguard problems with macOS 10.15, is what version of macOS you're running, and what version of Wireguard client you're running.

@ShantanuNair
Copy link
Author

ShantanuNair commented Nov 24, 2019

@TC1977 Hey so here's the versions, but I'm not sure if that's the issue as at the same time I was having packet loss on my laptop, I experienced a complete slowdown on my phone as well.
Macos Version: Catalina 10.15.
Wireguard (macos) client version 1.0.8.

@ShantanuNair
Copy link
Author

Here's what I received running sudo tcpdump -i wg0 on the server:

5390002 packets captured
6419604 packets received by filter
1029597 packets dropped by kernel
15 packets dropped by interface

Not sure of the significance really, but does that 'dropped by kernel' seem alright?

@jackivanov
Copy link
Collaborator

Try to mtr/traceroute/ping to the server‘s public ip, not to the local one, to find out where the problem is

@davidemyers
Copy link
Contributor

I believe Algo configures the EC2 firewall to block ICMP, so you'll need to change that before you can testing pinging the public IP address outside of the VPN tunnel.

@ShantanuNair
Copy link
Author

No issues as of now. I thought at first that the issue was more noticeable when switching wireguard profiles/right after handshakes but it seems like that's not reproducing it right now making diagnostics really difficult.

@ShantanuNair
Copy link
Author

ShantanuNair commented Nov 24, 2019

@jackivanov Here's the mtr output during a period when I was having connection issues:
sudo mtr <EC2-External-IP>

Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                                                                                                                                              Packets               Pings
 Host                                                                                                                                       Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. 10.19.49.1                                                                                                                              67.9%   218   15.3  57.9  14.5 1175. 201.5
 2. (waiting for reply)

and for reddit: sudo mtr reddit.com

Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                                                                                                                                              Packets               Pings
 Host                                                                                                                                       Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. 10.19.49.1                                                                                                                              64.0%   431   14.9  37.5  13.9 1203. 135.5
 2. (waiting for reply)
 3. (waiting for reply)
 4. (waiting for reply)
 5. 100.65.0.161                                                                                                                            40.0%   430   15.7  36.6  14.7 933.4 113.1
 6. 52.93.227.51                                                                                                                            38.6%   430   15.8  41.6  15.5 1378. 145.0
 7. 52.93.224.218                                                                                                                           40.7%   430   17.0  41.0  14.9 1315. 137.2
 8. 52.93.224.215                                                                                                                           40.5%   430   17.5  40.1  15.1 1252. 130.1
 9. 100.91.194.242                                                                                                                          40.2%   430   52.7  81.6  51.3 1619. 162.8
10. 52.93.130.193                                                                                                                           39.8%   430   52.3  82.3  50.8 1553. 154.1
11. 52.93.68.154                                                                                                                            40.7%   430   52.6  76.9  51.0 1485. 142.0
12. 54.239.46.155                                                                                                                           40.0%   430   54.6  75.4  50.8 1421. 131.4
13. 52.95.67.142                                                                                                                            39.5%   430   52.9  82.4  51.4 1355. 150.3
14. 52.95.65.252                                                                                                                            39.3%   430   52.7  76.8  51.1 1288. 138.3
15. 199.27.73.224                                                                                                                           38.8%   430   52.2  72.4  50.6 1221. 126.8
16. 151.101.193.140                                                                                                                         40.2%   430   54.2  73.6  50.9 1269. 137.4

Towards the end of me writing this, it got a lot better and so the % dropped in the reddit mtr went down a lot.

@davidemyers Yes, looks like ICMP inbound is blocked, but I was still able to traceroute and mtr, just not a direct ping to the EC2 ip.

@ShantanuNair
Copy link
Author

Doesn't seem to be a MacOS specific issue. I'm 90% confident that I was experiencing the same issue on my phone while my mac was having network issues.

@davidemyers
Copy link
Contributor

Make sure you're not using the same WireGuard config file on both devices.

Also, those traces look like the WireGuard tunnel was up. You should test the connection between your clients and the EC2 instance outside of WireGuard to see if there's a network issue unrelated to Algo.

@ShantanuNair
Copy link
Author

ShantanuNair commented Nov 25, 2019

@davidemyers

Make sure you're not using the same WireGuard config file on both devices.

I'm not. I'm using different profiles generated by Algo for different devices.

Also, those traces look like the WireGuard tunnel was up. You should test the connection between your clients and the EC2 instance outside of WireGuard to see if there's a network issue unrelated to Algo.

I thought the same, so what I did was deploy Algo to various AWS regions, and test. But they would all have the same issue intermittently. When it doesn't work on my laptop, it doesn't work on the phone either. So we can rule out the issue being macos/config related.

Any suggestions on what I should log or how I can troubleshoot when I experience the network issue? running mtr shows a huge packet loss, and pinging the wg ip (10.19.49.1) also results in a lot of timeouts. But I do get few replies in between. This is all until it suddenly starts working fine. Maybe it's just advanced DPI and there's nothing I can do about it :(

@ShantanuNair
Copy link
Author

Would it be helpful for diagnostics that when I do experience the issue, I ssh into the EC2 instance from another machine and collect the tcpdump -i wg output? I'm running out of ideas on how to troubleshoot this.

@TC1977
Copy link
Contributor

TC1977 commented Nov 25, 2019

@ShantanuNair You could try checking the WireGuard client logs. On your Mac, open the WireGuard app, and at the bottom of the windowpane with your listed connections the little gear icon gives you the option to "View Log".

You probably already did this, but I didn't see explicit mention: also make sure this is occurring on all networks, not just your LAN. Try your iPhone on LTE, and also try taking your Mac to a different network (Starbucks, or Wi-Fi hotspot from your phone) and see if the problem occurs there.

The other thing you could do is to troubleshoot if it's actually Wireguard as opposed to some other issue. You could install the IKEv2 mobileconfig on your Mac and see if that works without problems. And if you do get problems, strongSwan at least gives you useful debug messages on the server with tail -f /var/log/syslog|grep charon.

@DavidMyers
Copy link

DavidMyers commented Nov 25, 2019 via email

@davidemyers
Copy link
Contributor

@DavidMyers Hey Dave, it's Dave. 😃

Yes, it looks like the "e" was left out of a mention above.

@ShantanuNair
Copy link
Author

My bad for the wrong user mention.

Anyway, I tried to record the logs when the issue happens (I feel it's more likely to happen on new connections—say when I deactivate and then reactive a tunnel. It would be really slow to pull up a site, and mtrs would have huge packet loss even after the quick handshake, but once it gets connected I get decent throughput upto ~140-150mbps).

I also notice that cryptostorm's (commercial VPN with supports wireguard) free wireguard service connects fine.

Here is the log:

wg-log.txt

@TC1977
Copy link
Contributor

TC1977 commented Nov 25, 2019

So what jumps out at me is about 13 seconds of repeated Failed to send data packet messages, despite attempts to activate and deactivate the tunnel:

2019-11-24 18:44:00.872541: [NET] peer(UpED…wfCw) - Received handshake response
2019-11-24 18:44:00.872882: [NET] peer(UpED…wfCw) - Sending keepalive packet
2019-11-24 18:45:57.026651: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is down
2019-11-24 18:45:57.082005: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is down
2019-11-24 18:45:57.091933: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is down
2019-11-24 18:45:57.092089: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is down
2019-11-24 18:45:57.138757: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is down
2019-11-24 18:45:57.203670: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is down
2019-11-24 18:45:57.269888: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is down
2019-11-24 18:45:57.332078: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is down
2019-11-24 18:45:57.399803: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is down
2019-11-24 18:45:57.465415: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is down
2019-11-24 18:45:57.533183: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is down
2019-11-24 18:45:57.597919: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is down
2019-11-24 18:46:00.883135: [NET] peer(UpED…wfCw) - Sending handshake initiation
2019-11-24 18:46:02.869071: [NET] peer(UpED…wfCw) - Retrying handshake because we stopped hearing back after 15 seconds
2019-11-24 18:46:05.894437: [NET] peer(UpED…wfCw) - Sending handshake initiation
2019-11-24 18:46:08.262460: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: can't assign requested address
2019-11-24 18:46:08.263858: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: can't assign requested address
2019-11-24 18:46:08.264451: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: can't assign requested address
2019-11-24 18:46:08.287636: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.301510: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.301635: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.301777: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.301857: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.301936: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.303893: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.350050: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.373338: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.373455: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.377392: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.377515: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.412722: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.430741: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.435930: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.476672: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.528186: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.528408: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.543036: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.566236: [NET] Network change detected with unsatisfied route and interface order [en0, utun2]
2019-11-24 18:46:08.568455: [NET] Routine: receive incoming IPv4 - stopped
2019-11-24 18:46:08.568983: [NET] Routine: receive incoming IPv6 - stopped
2019-11-24 18:46:08.569269: [NET] Routine: receive incoming IPv4 - started
2019-11-24 18:46:08.569388: [NET] Routine: receive incoming IPv6 - started
2019-11-24 18:46:08.569546: [NET] UDP bind has been updated
2019-11-24 18:46:08.569643: [NET] peer(UpED…wfCw) - Sending keepalive packet
2019-11-24 18:46:08.569779: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.570667: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.570932: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.571150: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.571283: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.573068: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.573264: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.573421: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.573518: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.574256: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.574368: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.579276: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.579623: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.610430: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.676137: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: network is unreachable
2019-11-24 18:46:08.694881: [NET] Network change detected with unsatisfied route and interface order [utun2]
2019-11-24 18:46:08.696636: [NET] Routine: receive incoming IPv4 - stopped
2019-11-24 18:46:08.697084: [NET] Routine: receive incoming IPv6 - stopped
2019-11-24 18:46:08.697295: [NET] Routine: receive incoming IPv6 - started
2019-11-24 18:46:08.697385: [NET] Routine: receive incoming IPv4 - started
2019-11-24 18:46:08.697458: [NET] UDP bind has been updated
2019-11-24 18:46:08.697538: [NET] peer(UpED…wfCw) - Sending keepalive packet
2019-11-24 18:46:08.697608: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.738751: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.772542: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.775761: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.776360: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.805085: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.839347: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.868954: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.879272: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.928226: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.928362: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.929034: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.929128: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.929269: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.929412: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.929500: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.929581: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.929653: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.929725: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.929793: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.929862: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.929958: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.930020: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.930099: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.930175: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.930253: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.930328: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.930401: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.930810: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.931516: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:08.997849: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:09.001674: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:09.042560: [NET] Network change detected with unsatisfied route and interface order [utun2, en0]
2019-11-24 18:46:09.043250: [NET] Routine: receive incoming IPv4 - stopped
2019-11-24 18:46:09.043334: [NET] Routine: receive incoming IPv6 - stopped
2019-11-24 18:46:09.043631: [NET] Routine: receive incoming IPv6 - started
2019-11-24 18:46:09.043686: [NET] Routine: receive incoming IPv4 - started
2019-11-24 18:46:09.043878: [NET] UDP bind has been updated
2019-11-24 18:46:09.043946: [NET] peer(UpED…wfCw) - Sending keepalive packet
2019-11-24 18:46:09.044034: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:09.067688: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:09.077056: [NET] peer(UpED…wfCw) - Failed to send data packet write udp4 0.0.0.0:57616->15.185.65.36:51820: sendto: no route to host
2019-11-24 18:46:09.756173: [NET] Network change detected with satisfied route and interface order [en0, utun2]
2019-11-24 18:46:09.756358: [NET] Routine: receive incoming IPv4 - stopped
2019-11-24 18:46:09.757639: [NET] Routine: receive incoming IPv6 - stopped
2019-11-24 18:46:09.757825: [NET] Routine: receive incoming IPv6 - started
2019-11-24 18:46:09.757857: [NET] Routine: receive incoming IPv4 - started
2019-11-24 18:46:09.757954: [NET] UDP bind has been updated
2019-11-24 18:46:09.758630: [NET] peer(UpED…wfCw) - Sending keepalive packet
2019-11-24 18:46:09.832253: [NET] Interface set up
2019-11-24 18:46:10.224886: [NET] Network change detected with satisfied route and interface order [en0, utun2]
2019-11-24 18:46:10.227137: [NET] Routine: receive incoming IPv4 - stopped
2019-11-24 18:46:10.227614: [NET] Routine: receive incoming IPv6 - stopped
2019-11-24 18:46:10.227860: [NET] Routine: receive incoming IPv6 - started
2019-11-24 18:46:10.227943: [NET] Routine: receive incoming IPv4 - started
2019-11-24 18:46:10.228109: [NET] UDP bind has been updated
2019-11-24 18:46:10.228181: [NET] peer(UpED…wfCw) - Sending keepalive packet
2019-11-24 18:46:10.937350: [NET] peer(UpED…wfCw) - Sending handshake initiation
2019-11-24 18:46:12.542169: [NET] peer(UpED…wfCw) - Received handshake response

And then you were pretty error-free after that. Wonder if you can check your EC2 console logs in Cloudwatch for rejected packets during that time?

Also, these were the same errors reported in #1629, which according to most recent update by @Ezzahhh may not be a pure macOS 10.15 problem after all.

@ShantanuNair
Copy link
Author

ShantanuNair commented Nov 25, 2019

@TC1977 Thank you so much for helping me with this! I did not see those 'Failed to send data packet' messages inside the wireguard log window, only in the saved log-file.
I will figure out monitoring packet drops via Cloudwatch and get back to you guys.

However, I just gave the IPSEC profiles a shot and it seems to be perfectly stable. Only thing is the speeds are absolutely trash (<400kbps). Not sure if that's expected behavior.

What this leads me to believe is that this is not an issue with the connection to the EC2 server, but instead seems like it's a wireguard only isolated issue, and that it could have something to do with wireguard's routing. Using IPSEC I'm able to reach and maintain a connection with my EC2 just fine. It's only wireguard that seems to be the issue.

Regarding #1629 it seems like his connection was working fine on his phone but not on his mac. But for me when wg is unstable on my macbook, it is also problematic on my phone.
EDIT: Looks like I may be wrong about wg not working on my phone, I'm not sure why that was happening before but it looks like turning my macbook's WiFi off and then back on fixes the issue, only for it to pop up again. I turn my wifi off, turn it back on, and it's all good. A few seconds later, I head over to fast.com and during the speedtest it stalls, and the issue is back. So maybe it is indeed a MacOS issue!

@TC1977
Copy link
Contributor

TC1977 commented Nov 26, 2019

IPsec is said to be slower than WireGuard, but shouldn't be that much slower. I wouldn't be surprised at all if your IPsec connection is also screwed up. It might be worth checking out the strongSwan logs by leaving a window open with tail -f /var/log/syslog|grep charon.

Do you get these issues if you install onto a t3.micro instance in a different region?

@shadowsaw
Copy link

shadowsaw commented Jan 15, 2020

There is advanced DPI being employed against Algo in the Middle East. I had raised this as an issue/notification earlier. I've tried Algo (same profiles, same servers, just different ISPs in other countries) and always noticed this issue only in the Middle East. #1588

@ShantanuNair
Copy link
Author

@shadowsaw This seems to be the case. Have you found anything that works well in that environment. Even ShadowSocks doesn't work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants