Memory Leak / Thousands of abandoned wsl.exe instances (WSL Preview) #8703

Closed
1 of 2 tasks
mrgreywater opened this issue Aug 8, 2022 · 15 comments

Comments

@mrgreywater

mrgreywater commented Aug 8, 2022

Version

Microsoft Windows [Version 10.0.22000.832]

WSL Version

  • WSL 2
  • WSL 1

Kernel Version

5.10.102.1

Distro Version

Ubuntu-20.04

Other Software

Docker Desktop 4.11.0 (83626)

Repro Steps

Run WSL with various Docker images for multiple days.

Expected Behavior

Memory usage stays somewhat consistent.

Actual Behavior

Memory usage goes up, but is not associated with any running process in Task Manager. Memory just "vanishes". In RamMap, thousands upon thousands of abandoned wsl.exe processes are listed (their PIDs are no longer actively running), each still consuming a small amount of RAM.

The creation of those abandoned wsl.exe processes continues over time until no memory is left. At that point Windows itself becomes unusable: the mouse cursor begins to lag and audio crackles. Eventually the whole system becomes unresponsive.

Only a full system restart clears the memory back to normal levels.

Neither wsl --shutdown nor restarting the WSL service has any impact on the zombie processes.

Diagnostic Logs

The system has 32 GB of RAM. WSL consumes around 10 GB after the first day and almost all of it after a few days. The usage is not visible as active processes; traces are only visible in RamMap.

screenshot

In RamMap, notice the large amount of unused active memory and large page table
image

In ProcessHacker, you can see a wsl.exe instance being created and closed every few seconds:
https://user-images.githubusercontent.com/2902403/183592815-75add970-ef56-48cf-b76e-559e1c9a86f8.mp4

Here are the currently running filter drivers (doubt they are the culprit though):
image
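
The wsl.exe churn visible in ProcessHacker can also be approximated from a script by polling the process list and counting PIDs that were not there on the previous poll. The following is only a rough sketch: it assumes the standard Windows `tasklist` command with CSV output, the helper names are made up for illustration, and a polling loop will miss any wsl.exe instance that starts and exits between two polls (and can be confused by PID reuse).

```python
import csv
import io
import subprocess


def parse_tasklist_csv(text):
    """Return the set of PIDs found in `tasklist /FO CSV /NH` output."""
    pids = set()
    for row in csv.reader(io.StringIO(text)):
        # Data rows have five fields: image name, PID, session name,
        # session number, memory usage. Anything else (for example the
        # "INFO: No tasks are running ..." message) is skipped.
        if len(row) == 5 and row[1].isdigit():
            pids.add(int(row[1]))
    return pids


def snapshot_wsl_pids():
    """Ask Windows for the PIDs of running wsl.exe processes (Windows only)."""
    out = subprocess.run(
        ["tasklist", "/FI", "IMAGENAME eq wsl.exe", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_tasklist_csv(out)


def count_new(previous, current):
    """Number of PIDs in `current` that were not present in `previous`."""
    return len(current - previous)
```

Calling `snapshot_wsl_pids()` every second or so and summing `count_new(previous, current)` gives a lower bound on how many wsl.exe instances were started. It does not show the terminated zombies themselves, which only appear in RamMap.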

@NickDeBeenSAE

Looks like a server, way beyond me.

But anyway, let's hope your issue gets sorted.

@OneBlue
Collaborator

OneBlue commented Aug 9, 2022

/logs

@ghost

ghost commented Aug 9, 2022

Hello! Could you please provide more logs to help us better diagnose your issue?

To collect WSL logs, download and execute collect-wsl-logs.ps1 in an administrative powershell prompt:

Invoke-WebRequest -UseBasicParsing "https://raw.githubusercontent.com/microsoft/WSL/master/diagnostics/collect-wsl-logs.ps1" -OutFile collect-wsl-logs.ps1
Set-ExecutionPolicy Bypass -Scope Process -Force
.\collect-wsl-logs.ps1

The script will output the path of the log file once done.

Once completed, please upload the output files to this GitHub issue.

Click here for more info on logging

Thank you!

@ghost ghost added the needs-author-feedback label Aug 9, 2022
@mrgreywater
Author

WslLogs-2022-08-11_15-41-06.zip

Here is a log taken after a reboot. The memory is therefore mostly clear, but while the log was being captured, some wsl instances had already crashed and restarted a few times.

I'll add another log file after a few days, when the memory leak has filled up the RAM some more.

@ghost ghost removed the needs-author-feedback label Aug 11, 2022
@mrgreywater
Author

WslLogs-2022-08-12_11-20-05.zip
Here is another log after ~1 day, with about 3 GB of memory being taken by wsl.exe zombie processes.

@kaitallaoua

I am experiencing the exact same issue, in case I can help. The only difference is that my zombie wsl.exe processes do not occupy standby memory, only 32k of page tables each.
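
To get a feel for the numbers reported in this thread, here is a back-of-the-envelope sketch. It assumes that "every few seconds" means one new wsl.exe every 5 seconds and that every instance ends up as a zombie; both are assumptions, not measurements, and the function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def zombies_per_day(spawn_interval_s):
    """Zombies created per day if one appears every `spawn_interval_s` seconds."""
    return SECONDS_PER_DAY // spawn_interval_s


def leak_per_day_bytes(spawn_interval_s, bytes_per_zombie):
    """Memory leaked per day for a given per-zombie footprint."""
    return zombies_per_day(spawn_interval_s) * bytes_per_zombie


def implied_bytes_per_zombie(leak_bytes_per_day, spawn_interval_s):
    """Per-zombie footprint needed to explain an observed daily leak."""
    return leak_bytes_per_day / zombies_per_day(spawn_interval_s)


# Assumed spawn interval of 5 s (the thread only says "every few seconds").
INTERVAL_S = 5

# Page tables only, 32 KiB per zombie, as reported in the comment above.
page_table_leak = leak_per_day_bytes(INTERVAL_S, 32 * 1024)

# Footprint per zombie that would explain roughly 10 GiB lost per day.
per_zombie = implied_bytes_per_zombie(10 * 1024**3, INTERVAL_S)
```

Under these assumptions, 17,280 zombies per day at 32 KiB each add up to about 540 MiB per day, while losing around 10 GB per day would imply roughly 600 KiB per zombie. That gap is consistent with the observation that some systems also keep standby memory per zombie and others only page tables.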

@OneBlue
Collaborator

OneBlue commented Aug 16, 2022

Thank you for the logs @mrgreywater.

This is an interesting issue. I see some relaying errors in the logs, but nothing that explains the zombies.

Can you share a dump of one of those wsl.exe zombies, along with a dump of wslservice.exe?

@kaitallaoua

I've found that Docker Desktop 4.11.x introduces this, and that after reverting to 4.10.1 I no longer have this issue.

Be careful if you do revert: you need to uninstall your current Docker Desktop, which will delete all containers, volumes, images, etc., before you can install the version you want. Be sure to back up!

@mrgreywater
Author

mrgreywater commented Aug 17, 2022

@kaitallaoua I noticed that it has to do with Docker. Reverting to an older version may work for now, but the underlying issue still has to be figured out. The fact that it is even possible to create zombie processes that the OS does not clean up, even though they have terminated, makes me believe the actual bug may be rooted in the OS, a driver, or the hypervisor.
I'll upload a dump soon, when I've got some time.

@ghost ghost removed the needs-author-feedback label Aug 17, 2022
@mrgreywater
Author

@OneBlue So after some fiddling around with gflags.exe and windbg, I've got some dumps:

https://drive.google.com/file/d/1hEjdTG0JMGspXIYouA8PYZiB335lE2wB/view?usp=sharing

  • wsl-dump1.dmp: Dump of a wsl.exe that's terminating and about to be a zombie process taking up resources while terminated.
  • wsl-dump2.dmp: Another dump of a wsl.exe that's terminating and about to be a zombie process
  • docker-service.dmp: Dump of the parent process of the wsl.exe (Docker Desktop)
  • wsl-service.dmp: Dump of wslservice.exe

@luisvalleGH

Same symptoms, with the same environment as the issue creator. In my case, just closing the Windows session and reopening it releases the memory.

RamMap wsl zombies

image

@mrgreywater
Author

mrgreywater commented Aug 24, 2022

Apparently there's a fix that will be published with the next Docker for Windows release. I'll close this issue as soon as it's available and confirmed to be working.

@siktec-lab

@mrgreywater Where did you see that? I have the same problem, which can be reproduced when running an Apache server image of any kind. It's probably a Docker problem: it keeps handles to those zombie processes, so they are never cleaned up and stay there forever, occupying the page table.
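
The handle explanation above can be illustrated with a toy model. On Windows, a process object stays around after the process exits until the last handle to it is closed. The class below is only a simplified stand-in for that rule, not a real Windows API, and all names in it are hypothetical.

```python
class ProcessObject:
    """Toy stand-in for a kernel process object (not a real Windows API)."""

    def __init__(self, pid):
        self.pid = pid
        self.running = True
        self.open_handles = 0

    def open_handle(self):
        # Another process (here: the parent) takes a handle to this one.
        self.open_handles += 1

    def close_handle(self):
        if self.open_handles == 0:
            raise ValueError("no open handle to close")
        self.open_handles -= 1

    def terminate(self):
        # The process exits, but the object itself may live on.
        self.running = False

    @property
    def reclaimable(self):
        # The object can be freed only once the process has exited AND
        # nothing holds a handle to it any more.
        return not self.running and self.open_handles == 0


zombie = ProcessObject(pid=4242)   # arbitrary PID, for illustration only
zombie.open_handle()               # parent keeps a handle
zombie.terminate()                 # wsl.exe exits
print(zombie.reclaimable)          # False: the handle is still open
zombie.close_handle()              # parent finally releases the handle
print(zombie.reclaimable)          # True
```

In this picture, a parent that opens a handle for every short-lived wsl.exe and never closes it would accumulate one non-reclaimable object per spawn, which matches the pattern seen in RamMap.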

@mrgreywater
Author

@siktec-lab docker/for-win#12877 (comment)

@mrgreywater
Author

Appears to be fixed with Docker v4.12


6 participants