WSL network crash under constant load #10817

Open
1 of 2 tasks
sec opened this issue Nov 23, 2023 · 17 comments
@sec

sec commented Nov 23, 2023

Windows Version

Microsoft Windows [Version 10.0.19045.3693]

WSL Version

2.0.11.0

Are you using WSL 1 or WSL 2?

  [x] WSL 2
  [ ] WSL 1

Kernel Version

5.15.133.1-1

Distro Version

Ubuntu-20.04

Other Software

Docker version 23.0.2, build 569dd73 (run inside WSL)

Repro Steps

I've created a sample repo with the repro steps and the required software: https://github.com/sec/wsl-network-crash-test
In short:

  1. Run some service under WSL/Docker
  2. Access that service from the Windows host in a loop
  3. Wait. Sometimes it takes 1 minute to fail, sometimes 10, but it will crash sooner or later
  4. Connections from the Windows host to WSL/Docker stop working
  5. Once that happens, even a normal app running under pure WSL can't be accessed from the Windows host
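
Steps 2 to 4 can be sketched as a small probe run on the Windows host. This is only an illustration and not the script from the linked repository: the function name `probe_until_failure`, its parameters, and the default interval and timeout are assumptions chosen for the example.

```python
import socket
import time


def probe_until_failure(host, port, interval=0.1, timeout=3.0, max_attempts=None):
    """Open a TCP connection to host:port in a loop.

    Returns the number of successful connections made before the first
    failure, or max_attempts if every attempt succeeded.
    """
    successes = 0
    while max_attempts is None or successes < max_attempts:
        try:
            # A refused connection or a timeout both raise OSError.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            return successes
        successes += 1
        time.sleep(interval)
    return successes
```

Calling `probe_until_failure("127.0.0.1", 8080)` (with a hypothetical service published on port 8080) loops until the first failed connection and returns how many attempts succeeded before it.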

Running wsl --shutdown and launching WSL/Docker again fixes the issue, but that's not an acceptable solution.

I've had this issue for many versions. I've tried downgrading to almost every available 2.x release and the problem is present in all of them. IIRC this worked fine under 1.x (I work on a big project that runs containers inside Docker which are accessed from the Windows host).

Logs are attached. I started with everything working, ran the repro steps until the network crashed, then collected the logs. I hope there's something in them that will help fix this.

Expected Behavior

Network connection from Windows to WSL should work.

Actual Behavior

WSL network cannot be accessed from Windows host.

Diagnostic Logs

WslLogs-2023-11-23_10-48-17.zip
WslNetworkingLogs-2023-11-23_11-17-11.zip

@pocesar

pocesar commented Nov 23, 2023

Happening here too, but only the networking breaks. You can still run commands, for example wsl --exec ps -ax, and see all processes.

@sec
Author

sec commented Nov 28, 2023

> happening here too, but it's only the networking that breaks, you can still run commands using wsl --exec ps -ax for example and see all processes

Exactly. After the networking breaks, everything else appears to work, but to get the network back I need to shut down WSL and relaunch it.

@keith-horton
Member

How are you accessing the WSL container from the Windows Host? Is it through the Docker bridge?
From the WSL-side, I didn't find any issues.

Docker has a new release you may try.
https://docs.docker.com/desktop/release-notes/#4260

@sec
Author

sec commented Dec 8, 2023

I don't think it's related to Docker. The same thing happens with podman (with its machine running inside WSL). When the WSL network crashes, podman can't connect to its machine either. The same goes for using podman under WSL: the same repro steps crash the WSL network.

WSL is using the default networking settings; I didn't change anything.

Have you tried to reproduce the error using the repo and steps I provided? It shouldn't take more than a few minutes to show the problem.

@pocesar

pocesar commented Dec 8, 2023

just happened again, but this time it was completely unresponsive to wsl --shutdown (using 2.0.14).
vmcompute was using 100% memory and 100% CPU (of 6 cores)

[attached image]

can't even stop the service

@KILLME56k

My network fails when I open Explorer.

@ademyankov

wsl2 constantly crashes for me too when under heavy load. But I am not even sure that it can be called heavy?!

I have an SDK that supports multiple platforms, so I run a script that creates a build directory for each of those (about 10) platforms and runs cmake in the background to configure all of them.

Something like so:

cd build/x86_release && cmake ../.. &
cd build/x86_debug && cmake -DCMAKE_BUILD_TYPE=Debug ../.. &
cd build/rpi4_release && cmake ../.. &
cd build/rpi4_debug && cmake -DCMAKE_BUILD_TYPE=Debug ../.. &
cd build/rpi5_release && cmake ../.. &
cd build/rpi5_debug && cmake -DCMAKE_BUILD_TYPE=Debug ../.. &
cd build/esp32s3_release && cmake ../.. &
cd build/esp32s3_debug && cmake -DCMAKE_BUILD_TYPE=Debug ../.. &

Never got a single successful run, ever! It crashes all the time and closes all open WSL windows.

And sometimes I cannot restart it, I have to do this first:

c:\>wsl --shutdown

It is extremely annoying and disappointing!

WSL version: 2.0.9.0
Kernel version: 5.15.133.1-1
WSLg version: 1.0.59
MSRDC version: 1.2.4677
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22631.3007

@sec
Author

sec commented Mar 7, 2024

Any update on this? This is making WSL useless for any real usage.
I just checked the newest version, 2.0.14.0, and the network still crashes under any real load.
It doesn't matter whether I use Docker or podman as the distro; this is a core WSL issue.

@keith-horton
Member

@OneBlue , it looks like the wslrelay is timing out trying to talk to the container. (lots of WSAETIMEDOUT errors on the relay sockets) + we can see an HvSocketConnectionDisconnected event that precedes it. Are there any known hvsocket issues?

@hyan23

hyan23 commented Jun 3, 2024

I have encountered a similar problem, and I suspect it is related to IPv6 (because localhost resolves to ::1). If I use 127.0.0.1 instead of localhost, access works normally.
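
One way to check what the name resolves to on a given machine is to ask the resolver directly. The sketch below is a minimal illustration; the helper name `loopback_addresses` and the port value are assumptions made for the example, and the result depends on the local hosts file and resolver configuration.

```python
import socket


def loopback_addresses(name="localhost", port=80):
    """Return the distinct IP addresses `name` resolves to, in resolver order."""
    seen = []
    for _family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
        name, port, type=socket.SOCK_STREAM
    ):
        address = sockaddr[0]  # first element is the IP for both IPv4 and IPv6
        if address not in seen:
            seen.append(address)
    return seen
```

If `loopback_addresses()` lists `::1` before `127.0.0.1`, a client that connects to `localhost` may try IPv6 first, which matches the behaviour described in this comment.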

@keith-horton
Member

Right, accessing the container from the host is supported only through 127.0.0.1 in Mirrored Mode (the Linux option we use to enable routing of loopback traffic exists only for IPv4, not IPv6, unfortunately).

Does the crash under load happen only in NAT Mode, or also in Mirrored Mode?
NAT mode uses a relay that moves traffic over an hvSocket (see the HvSocketConnectionDisconnected event reference above).
Mirrored Mode does not need a relay: it's routed through the vswitch connecting the container.

@sec
Author

sec commented Jun 4, 2024

Mirrored mode is reported as not supported when I try to enable it, and WSL switches back to NAT. Where can I find the requirements for this mode, or check why it's not supported?

@keith-horton
Member

Hi there. Mirrored Mode is supported on Windows 11 22H2 or later.
https://learn.microsoft.com/en-us/windows/wsl/wsl-config
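
The requirement can be checked against the version strings posted in this thread. The sketch below is not an official check: it assumes that 22621 is the first Windows 11 22H2 build number, and the names `MIRRORED_MIN_BUILD`, `parse_build` and `supports_mirrored_mode` are made up for the example.

```python
# Assumption: Windows 11 22H2 corresponds to build 22621.
MIRRORED_MIN_BUILD = 22621


def parse_build(version):
    """Extract the build number from a string such as '10.0.19045.3693'."""
    parts = version.strip().split(".")
    if len(parts) < 3:
        raise ValueError(f"unexpected Windows version string: {version!r}")
    return int(parts[2])


def supports_mirrored_mode(version):
    """Return True if the build number meets the assumed 22H2 minimum."""
    return parse_build(version) >= MIRRORED_MIN_BUILD
```

Under that assumption, `supports_mirrored_mode("10.0.19045.3693")` (the version in the original report) is false, while `supports_mirrored_mode("10.0.22631.3007")` is true. On a supported build, the linked page describes enabling the mode with `networkingMode=mirrored` under `[wsl2]` in `.wslconfig`.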

@sec
Author

sec commented Jun 5, 2024

As I wrote, I'm on Windows 10 and can't use that mode. Can't this be fixed? It was working fine in earlier versions of WSL and started breaking in recent ones (for almost a year now).

@smolinari

smolinari commented Nov 2, 2024

> Hi there. Mirrored Mode is supported on Windows 11 22H2 or later. https://learn.microsoft.com/en-us/windows/wsl/wsl-config

I have an application that tunnels via HTTPS to create an SSH-like experience in a shell. It is sort of like a VPN in the end, as it uses Tailscale to make the connection.

At any rate, the issue was that if I ran the application in WSL2 (Ubuntu 24.04), my Windows networking would freeze after about 3-5 minutes, leaving the machine with no Internet connection and basically useless. I also couldn't reset the Wi-Fi adapter; it was locked. I had to reboot every time to get the system working again.

Mirrored mode fixed it! Thanks for the tip!

Scott

@sec
Author

sec commented Nov 6, 2024

So under Windows 10 there's no way to fix it, because mirrored mode is not supported. This renders WSL useless in the end :(
