WSL complete freeze #9114

Closed
1 of 2 tasks
androiddisk opened this issue Nov 4, 2022 · 26 comments

Comments

@androiddisk

Version

10.0.22621.674

WSL Version

  • WSL 2
  • WSL 1

Kernel Version

5.15.68.1

Distro Version

Ubuntu 20.04

Other Software

No response

Repro Steps

I have the same problem. No new WSL shell window can be created, and the window running wsl --shutdown hangs and cannot even be closed. I did not download version 0.70.4.0 from GitHub; it was installed automatically by a Windows update. It should be the same build as the GitHub release: https://github.com/microsoft/WSL/releases/tag/0.70.4
This problem has troubled me for a long time.

PS C:\Users\zhong> wsl --version
WSL version: 0.70.4.0
Kernel version: 5.15.68.1
WSLg version: 1.0.45
MSRDC version: 1.2.3575
Direct3D version: 1.606.4
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22621.674

I am happy to help if you need anything more from me. Thank you.

Expected Behavior

Open or shut down the wsl shell normally

Actual Behavior

WSL gets stuck.

Diagnostic Logs

WslLogs-2022-11-04_14-42-14.zip

@OneBlue
Collaborator

OneBlue commented Nov 9, 2022

Thank you for reporting this @androiddisk.

Can you please share dumps of the WSL processes once WSL is 'stuck'?

The simplest way would be to call collect-wsl-logs.ps1 with -Dump (as administrator). For instance:

Invoke-WebRequest -UseBasicParsing "https://raw.githubusercontent.com/microsoft/WSL/master/diagnostics/collect-wsl-logs.ps1" -OutFile collect-wsl-logs.ps1
Set-ExecutionPolicy Bypass -Scope Process -Force
.\collect-wsl-logs.ps1 -Dump

@androiddisk
Author

@ghost ghost removed the needs-author-feedback label Nov 10, 2022
@androiddisk
Author

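Most windows keep flashing. Occasionally a window shows this message:

A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
Error code: Wsl/Service/0x8007274c
Press any key to continue. . .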
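
For reference, the numeric part of the error code above can be decoded by hand. The following is a minimal sketch, assuming the standard HRESULT layout (an 11-bit facility field above a 16-bit code field). The function names are hypothetical helpers for illustration, not part of WSL.

```shell
#!/bin/sh
# Hypothetical helpers: split a Windows HRESULT into its facility and code.
# Assumed layout: bits 16-26 hold the facility, bits 0-15 hold the code.

hresult_facility() {
    echo $(( ($1 >> 16) & 0x7FF ))
}

hresult_code() {
    echo $(( $1 & 0xFFFF ))
}

# Print a short description. Only the code seen in this thread is named;
# every other value is reported numerically.
describe_hresult() {
    facility=$(hresult_facility "$1")
    code=$(hresult_code "$1")
    if [ "$facility" -eq 7 ] && [ "$code" -eq 10060 ]; then
        echo "Win32 error $code (WSAETIMEDOUT: connection attempt timed out)"
    elif [ "$facility" -eq 7 ]; then
        echo "Win32 error $code"
    else
        echo "facility $facility, code $code"
    fi
}

describe_hresult 0x8007274c
```

Facility 7 is FACILITY_WIN32, and Win32 error 10060 is WSAETIMEDOUT, which is consistent with the timeout message shown in the console. The decoding only names the error; it does not show what timed out inside the WSL service.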

@androiddisk
Author

@OneBlue thank you

@OneBlue
Collaborator

OneBlue commented Nov 10, 2022

Thank you @androiddisk, these dumps are very interesting. On the Windows side, WSL is stuck trying to create a new session (which looks a lot like #8824).

To investigate, we'd need to look at what's happening inside WSL2. To do this, can you please:

  • First clear everything by running wsl.exe --shutdown (make sure nothing is running). If this is unresponsive, kill wslservice.exe.
  • Open a shell inside the system distro (and leave it open) via: wsl -u root --system
  • Install curl in that shell via: tdnf install -y curl
  • Run the steps to get WSL into that "frozen" state in your regular distro (not inside the system distro)
  • Once WSL is frozen, use the previously opened shell to dump all the 'init' processes via: curl -k https://raw.githubusercontent.com/microsoft/WSL/master/diagnostics/dump-init.sh | bash
  • This will generate a folder with dumps and log files. Please zip it and share it on this issue
  • Also please share a dump of wslservice.exe (you can do this via task manager, in the 'details' tab, right click on wslservice.exe, then 'Create dump file')
  • Also please share the output of dmesg in the system shell

@mwoodpatrick

I am still seeing the same issue. I captured:

core.zip

Output from the curl command:

curl -k https://raw.githubusercontent.com/microsoft/WSL/master/diagnostics/dump-init.sh | bash
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   843  100   843    0     0   2365      0 --:--:-- --:--:-- --:--:--  2367
Loaded plugin: tdnfrepogpgcheck
Package gdb is already installed.
Nothing to do.
[New LWP 5]
0x00000000003bc9cd in ?? ()
Saved corefile core.1
[Inferior 1 (process 1) detached]
[New LWP 30]
warning: Target and debugger are in different PID namespaces; thread lists and other data are likely unreliable. Connect to gdbserver inside the container.
0x00000000003bc9cd in ?? ()
Saved corefile core.2
[Inferior 1 (process 2) detached]
0x00000000003bc9cd in ?? ()
Saved corefile core.6
[Inferior 1 (process 6) detached]
[New LWP 29]
warning: Target and debugger are in different PID namespaces; thread lists and other data are likely unreliable. Connect to gdbserver inside the container.
0x00000000003bc9cd in ?? ()
warning: target file /proc/28/cmdline contained unexpected null characters
Saved corefile core.28
[Inferior 1 (process 28) detached]
0x00000000003bc9cd in ?? ()
Saved corefile core.52
[Inferior 1 (process 52) detached]
0x00000000003bc9cd in ?? ()
Saved corefile core.53
[Inferior 1 (process 53) detached]
Logs and dumps written in /mnt/c/wsl-init-dump-2022-11-11_04-46-29

@androiddisk
Author

tdnf: command not found

@ghost ghost removed the needs-author-feedback label Nov 11, 2022
@androiddisk
Author

@OneBlue

tdnf: command not found

@OneBlue
Collaborator

OneBlue commented Nov 11, 2022

@androiddisk: Did you run this command in a system shell (via wsl.exe --system -u root)? That's where tdnf would be.

@mwoodpatrick

mwoodpatrick commented Nov 11, 2022

I hit the same issue again and, this time, captured all the requested files in
capture.zip
The wslservice.zip is 38 MB and won't upload due to the 25 MB restriction. Please advise.

@OneBlue
Collaborator

OneBlue commented Nov 11, 2022

Thank you for the files @mwoodpatrick. Next time this reproduces, can you also capture a dump of wslservice.exe (along with the other files)? I'd like to see both from the same repro.

If it's too big for GitHub, feel free to use OneDrive or Google Drive.

@androiddisk
Author

> @androiddisk: Did you run this command in a system shell (via wsl.exe --system -u root)? That's where tdnf would be.

D:\kooapk\yuanc\data>wsl --u root --system
Invalid command line argument: --u
Please use "wsl.exe --help" to get a list of supported arguments.

@mwoodpatrick

You should be able to access the files at https://drive.google.com/drive/folders/12xKDVhbyeY8pFNaGsZaumffOxJQPUaCk?usp=sharing. Please let me know if you have any issues.

@mwoodpatrick

@XhmikosR XhmikosR mentioned this issue Nov 15, 2022
2 tasks
@inoue-katsumi

When this symptom occurs, I can only run dump-init.sh in the WSL debug shell. How can I tar up and upload the core.* and .txt files under /mnt/c/wsl- ?
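
One possible way to package the dump folder is sketched below. It rests on assumptions that are not confirmed in this thread: tar with gzip support is available in the shell where dump-init.sh ran, and pack_dump is a hypothetical helper name. Because dump-init.sh reports writing its folder under /mnt/c, the archive should end up on the Windows C: drive, where it can be uploaded from Windows.

```shell
#!/bin/sh
# Hypothetical helper: pack one dump folder into a .tar.gz placed next to it.
# Usage: pack_dump <path to the folder printed by dump-init.sh>
pack_dump() {
    dump_dir=${1%/}    # drop a trailing slash, if any
    if [ ! -d "$dump_dir" ]; then
        echo "not a directory: $dump_dir" >&2
        return 1
    fi
    parent=$(dirname "$dump_dir")
    name=$(basename "$dump_dir")
    # -C makes the paths stored in the archive relative to the parent folder.
    tar -czf "$parent/$name.tar.gz" -C "$parent" "$name" || return 1
    # Print the archive path so the caller knows what to upload.
    echo "$parent/$name.tar.gz"
}
```

Call it with the folder name that dump-init.sh prints on its last line ("Logs and dumps written in ..."). If the archive is larger than the upload limit, a file-sharing link works as well, as was done earlier in this thread.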

@OneBlue
Collaborator

OneBlue commented Nov 22, 2022

> Here is another datapoint https://drive.google.com/file/d/134CzXAJv0gngd50sLDGGv9Ww9N7-_N2A/view?usp=sharing

Thanks a lot @mwoodpatrick. With this information we have identified the issue. The fix is in 1.0.1.

To install it, download the package and install it via: Add-AppxPackage /path/to/package in an elevated PowerShell window.

Let me know if the issue is solved for you!

@mwoodpatrick

I'm running an updated WSL 2 and am still seeing a hang. I will try to capture new data to debug this.

WSL version: 1.0.1.0
Kernel version: 5.15.74.2
WSLg version: 1.0.47
MSRDC version: 1.2.3575
Direct3D version: 1.606.4
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22623.891

@OneBlue OneBlue reopened this Nov 28, 2022
@OneBlue
Collaborator

OneBlue commented Nov 28, 2022

Thank you for the update @mwoodpatrick. Reopened this issue since it seems that the issue is still there for you.

@mwoodpatrick

It did hang once after the update, but I have not seen it recur since and, unfortunately, I did not capture that first failure. It definitely seems more reliable with the latest update. I will update the bug with debug info once I get it to fail again.

@MichealReed

Also experiencing this on a heavy build over WSL.

@jetvp

jetvp commented Dec 16, 2022

I'm experiencing the same issue using the 1.0.3 release.
WSL version: 1.0.3.0
Kernel version: 5.15.79.1
WSLg version: 1.0.47
MSRDC version: 1.2.3575
Direct3D version: 1.606.4
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.19045.2364

The steps to recreate the issue are:

  • WSL with Ubuntu.
  • Docker installed without containerd enabled (haven't seen it with containerd yet).
  • Put the computer into hibernation while WSL is running. On wake-up, CPU usage slowly ramps up to the point where WSL can no longer be stopped via the --shutdown option.

The only solution at that stage is to restart the computer. I've done clean installs of WSL and Docker, and it appears to happen each time.

@Cosss7

Cosss7 commented Dec 16, 2022 via email

@mwoodpatrick

Try capturing the data that OneBlue requested in the Nov 10 comment.

I have not had any recurrence of this issue in the last couple of weeks. I'm running:

WSL version: 1.0.3.0
Kernel version: 5.15.79.1
WSLg version: 1.0.47
MSRDC version: 1.2.3575
Direct3D version: 1.606.4
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22623.1037

@OneBlue
Collaborator

OneBlue commented Dec 16, 2022

Thanks for confirming @mwoodpatrick.

I'll close this for now since the original issue appears to be resolved.

Thanks for reporting this @jetvp. We're tracking this issue on #8696

@OneBlue OneBlue closed this as completed Dec 16, 2022
@MichealReed

FYI, my freeze is not related to hibernation; I do not use that feature.

@androiddisk
Author

@OneBlue

C:\Users\zhong>wsl
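A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.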
Error code: Wsl/Service/0x8007274c

C:\Users\zhong>wsl --version
WSL version: 1.0.3.0
Kernel version: 5.15.79.1
WSLg version: 1.0.47
MSRDC version: 1.2.3575
Direct3D version: 1.606.4
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22621.963

I reproduced the same bug.

But after a few minutes, the WSL terminal can be opened again.
