runtime: make.bat hangs #36492

Open
alexbrainman opened this issue Jan 10, 2020 · 8 comments
Labels: compiler/runtime, NeedsInvestigation, OS-Windows

@alexbrainman
Member

I have a Windows 7 computer (windows/amd64).

I am building Go from source using go1.4 as bootstrap.

I am using commit 56d6b87

I am running make.bat command.

What did you expect to see?

I expected the make.bat command to finish successfully.

What did you see instead?

make.bat never finishes. It hangs, for example, like this:

c:\Users\alexb\dev\go\src>make
Building Go cmd/dist using c:\users\alexb\dev\\go1.4
Building Go toolchain1 using c:\users\alexb\dev\\go1.4.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
Building Go toolchain2 using go_bootstrap and Go toolchain1.
Building Go toolchain3 using go_bootstrap and Go toolchain2.

I used Process Explorer (https://docs.microsoft.com/en-us/sysinternals/downloads/process-explorer) to examine the process tree, and this is what I see:

image

I also used WinDbg to attach to go_bootstrap.exe (pid 9788) and print the stacks of all its threads. This is what I see:

0:019> !uniqstack
Processing 20 threads, please wait

.  0  Id: 263c.23d8 Suspend: 1 Teb: 000007ff`fffde000 Unfrozen
      Start: go_bootstrap+0x648e0 (00000000`004648e0) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`0022fae8 000007fe`fcc710ac ntdll!ZwWaitForSingleObject+0xa
00000000`0022faf0 00000000`0046494e KERNELBASE!WaitForSingleObjectEx+0x9c
00000000`0022fb90 00000000`00b37be0 go_bootstrap+0x6494e
00000000`0022fb98 00000000`004270d3 go_bootstrap+0x737be0
00000000`0022fba0 00000000`00000000 go_bootstrap+0x270d3

.  1  Id: 263c.314c Suspend: 1 Teb: 000007ff`fffdc000 Unfrozen
      Start: tmmon64+0x8adac (00000000`7478adac) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`0302fcb8 000007fe`fcc71430 ntdll!ZwWaitForMultipleObjects+0xa
00000000`0302fcc0 00000000`76df06c0 KERNELBASE!GetCurrentProcess+0x40
00000000`0302fdc0 00000000`747aeed4 kernel32!WaitForMultipleObjects+0xb0
00000000`0302fe50 00000000`7478ad07 tmmon64+0xaeed4
00000000`0302ff00 00000000`7478aeae tmmon64+0x8ad07
00000000`0302ff30 00000000`76df59cd tmmon64+0x8aeae
00000000`0302ff60 00000000`76f2a561 kernel32!BaseThreadInitThunk+0xd
00000000`0302ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x21

.  2  Id: 263c.2d5c Suspend: 1 Teb: 000007ff`fffda000 Unfrozen
      Start: ntdll!RtlDestroyHandleTable+0x270 (00000000`76f1f6f0) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`0322fcc8 00000000`76f1ed15 ntdll!ZwWaitForWorkViaWorkerFactory+0xa
00000000`0322fcd0 00000000`76df59cd ntdll!RtlValidateHeap+0x155
00000000`0322ff60 00000000`76f2a561 kernel32!BaseThreadInitThunk+0xd
00000000`0322ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x21

.  3  Id: 263c.e74 Suspend: 1 Teb: 000007ff`fffd6000 Unfrozen
      Start: ntdll!TpIsTimerSet+0x8b0 (00000000`76f1a280) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`0342fcb8 00000000`76f1a3c7 ntdll!ZwWaitForMultipleObjects+0xa
00000000`0342fcc0 00000000`76df59cd ntdll!TpIsTimerSet+0x9f7
00000000`0342ff60 00000000`76f2a561 kernel32!BaseThreadInitThunk+0xd
00000000`0342ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x21

.  4  Id: 263c.196c Suspend: 1 Teb: 000007ff`fffd4000 Unfrozen
      Start: go_bootstrap+0x64d00 (00000000`00464d00) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`28e5eb68 00000000`76f48f58 ntdll!ZwWaitForSingleObject+0xa
00000000`28e5eb70 00000000`76f48e54 ntdll!RtlDeNormalizeProcessParams+0x5a8
00000000`28e5ec20 000007fe`fa3b7f0b ntdll!RtlDeNormalizeProcessParams+0x4a4
00000000`28e5ec50 000007fe`fa3b8504 TmUmEvt64+0x17f0b
00000000`28e5eeb0 000007fe`fa3b8c96 TmUmEvt64+0x18504
00000000`28e5ef10 000007fe`fa4565ca TmUmEvt64+0x18c96
00000000`28e5efa0 000007fe`fa455f8e TmUmEvt64+0xb65ca
00000000`28e5f000 000007fe`fa410686 TmUmEvt64+0xb5f8e
00000000`28e5f150 000007fe`fa439730 TmUmEvt64+0x70686
00000000`28e5f260 00000000`7472f146 TmUmEvt64+0x99730
00000000`28e5f4a0 00000000`747e2d7d tmmon64+0x2f146
00000000`28e5f580 00000000`747e29f4 tmmon64+0xe2d7d
00000000`28e5f640 00000000`74733748 tmmon64+0xe29f4
00000000`28e5f6b0 000007fe`fcc77c3f tmmon64+0x33748
00000000`28e5f780 00000000`0046494e KERNELBASE!ResumeThread+0xf
00000000`28e5f7b0 ffffffff`ffffffff go_bootstrap+0x6494e
00000000`28e5f7b8 00000000`00000001 0xffffffff`ffffffff
00000000`28e5f7c0 ffffffff`ffffffff 0x1
00000000`28e5f7c8 00000000`28e5f928 0xffffffff`ffffffff
00000000`28e5f7d0 00000000`00000000 0x28e5f928

.  5  Id: 263c.3204 Suspend: 1 Teb: 000007ff`fffae000 Unfrozen
      Start: go_bootstrap+0x64d00 (00000000`00464d00) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2905fb08 000007fe`fcc710ac ntdll!ZwWaitForSingleObject+0xa
00000000`2905fb10 00000000`0046494e KERNELBASE!WaitForSingleObjectEx+0x9c
00000000`2905fbb0 000000c0`0002c980 go_bootstrap+0x6494e
00000000`2905fbb8 00000000`00000000 0xc0`0002c980

.  6  Id: 263c.32f0 Suspend: 1 Teb: 000007ff`fffac000 Unfrozen
      Start: go_bootstrap+0x64d00 (00000000`00464d00) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2925fb08 000007fe`fcc710ac ntdll!ZwWaitForSingleObject+0xa
00000000`2925fb10 00000000`0046494e KERNELBASE!WaitForSingleObjectEx+0x9c
00000000`2925fbb0 000000c0`0002cd00 go_bootstrap+0x6494e
00000000`2925fbb8 00000000`00000164 0xc0`0002cd00
00000000`2925fbc0 7fffffff`00000000 0x164
00000000`2925fbc8 00000000`00000160 0x7fffffff`00000000
00000000`2925fbd0 000000c0`004c6300 0x160
00000000`2925fbd8 00000000`2925fcc0 0xc0`004c6300
00000000`2925fbe0 00000000`0043daa2 0x2925fcc0
00000000`2925fbe8 00000000`00b36bf8 go_bootstrap+0x3daa2
00000000`2925fbf0 00000000`00000000 go_bootstrap+0x736bf8

.  7  Id: 263c.29d8 Suspend: 1 Teb: 000007ff`fffaa000 Unfrozen
      Start: go_bootstrap+0x64d00 (00000000`00464d00) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2945fb08 000007fe`fcc710ac ntdll!ZwWaitForSingleObject+0xa
00000000`2945fb10 00000000`0046494e KERNELBASE!WaitForSingleObjectEx+0x9c
00000000`2945fbb0 000000c0`00080280 go_bootstrap+0x6494e
00000000`2945fbb8 00000000`00000000 0xc0`00080280

.  8  Id: 263c.104c Suspend: 1 Teb: 000007ff`fffa8000 Unfrozen
      Start: go_bootstrap+0x64d00 (00000000`00464d00) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`29a8fb08 000007fe`fcc710ac ntdll!ZwWaitForSingleObject+0xa
00000000`29a8fb10 00000000`0046494e KERNELBASE!WaitForSingleObjectEx+0x9c
00000000`29a8fbb0 000000c0`00206280 go_bootstrap+0x6494e
00000000`29a8fbb8 00000000`00000000 0xc0`00206280

.  9  Id: 263c.31b4 Suspend: 1 Teb: 000007ff`fffa6000 Unfrozen
      Start: go_bootstrap+0x64d00 (00000000`00464d00) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`29ccfb08 000007fe`fcc710ac ntdll!ZwWaitForSingleObject+0xa
00000000`29ccfb10 00000000`0046494e KERNELBASE!WaitForSingleObjectEx+0x9c
00000000`29ccfbb0 000000c0`00207080 go_bootstrap+0x6494e
00000000`29ccfbb8 00000000`00000000 0xc0`00207080

. 10  Id: 263c.2b14 Suspend: 1 Teb: 000007ff`fffa4000 Unfrozen
      Start: go_bootstrap+0x64d00 (00000000`00464d00) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`29eceb68 00000000`76f48f58 ntdll!ZwWaitForSingleObject+0xa
00000000`29eceb70 00000000`76f48e54 ntdll!RtlDeNormalizeProcessParams+0x5a8
00000000`29ecec20 000007fe`fa3b7f0b ntdll!RtlDeNormalizeProcessParams+0x4a4
00000000`29ecec50 000007fe`fa3b8504 TmUmEvt64+0x17f0b
00000000`29eceeb0 000007fe`fa3b8c96 TmUmEvt64+0x18504
00000000`29ecef10 000007fe`fa4565ca TmUmEvt64+0x18c96
00000000`29ecefa0 000007fe`fa455f8e TmUmEvt64+0xb65ca
00000000`29ecf000 000007fe`fa410686 TmUmEvt64+0xb5f8e
00000000`29ecf150 000007fe`fa439730 TmUmEvt64+0x70686
00000000`29ecf260 00000000`7472f146 TmUmEvt64+0x99730
00000000`29ecf4a0 00000000`747e2d7d tmmon64+0x2f146
00000000`29ecf580 00000000`747e29f4 tmmon64+0xe2d7d
00000000`29ecf640 00000000`74733748 tmmon64+0xe29f4
00000000`29ecf6b0 000007fe`fcc77c3f tmmon64+0x33748
00000000`29ecf780 00000000`0046494e KERNELBASE!ResumeThread+0xf
00000000`29ecf7b0 ffffffff`ffffffff go_bootstrap+0x6494e
00000000`29ecf7b8 00000000`00000000 0xffffffff`ffffffff

. 11  Id: 263c.30b4 Suspend: 1 Teb: 000007ff`fffa2000 Unfrozen
      Start: go_bootstrap+0x64d00 (00000000`00464d00) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2a10fb08 000007fe`fcc710ac ntdll!ZwWaitForSingleObject+0xa
00000000`2a10fb10 00000000`0046494e KERNELBASE!WaitForSingleObjectEx+0x9c
00000000`2a10fbb0 000000c0`00195400 go_bootstrap+0x6494e
00000000`2a10fbb8 00000000`00000009 0xc0`00195400
00000000`2a10fbc0 00000000`00000000 0x9

. 12  Id: 263c.30fc Suspend: 1 Teb: 000007ff`fffa0000 Unfrozen
      Start: go_bootstrap+0x64d00 (00000000`00464d00) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2a30f248 00000000`76f48f58 ntdll!ZwWaitForSingleObject+0xa
00000000`2a30f250 00000000`76f48e54 ntdll!RtlDeNormalizeProcessParams+0x5a8
00000000`2a30f300 000007fe`fa3b7f0b ntdll!RtlDeNormalizeProcessParams+0x4a4
00000000`2a30f330 000007fe`fa3b8504 TmUmEvt64+0x17f0b
00000000`2a30f590 000007fe`fa3b8c96 TmUmEvt64+0x18504
00000000`2a30f5f0 000007fe`fa4565ca TmUmEvt64+0x18c96
00000000`2a30f680 000007fe`fa455f8e TmUmEvt64+0xb65ca
00000000`2a30f6e0 000007fe`fa410686 TmUmEvt64+0xb5f8e
00000000`2a30f830 000007fe`fa4363b0 TmUmEvt64+0x70686
00000000`2a30f940 00000000`7472f146 TmUmEvt64+0x963b0
00000000`2a30fb80 00000000`747e2d7d tmmon64+0x2f146
00000000`2a30fc60 00000000`747e29f4 tmmon64+0xe2d7d
00000000`2a30fd20 00000000`74732869 tmmon64+0xe29f4
00000000`2a30fd90 00000000`0046494e tmmon64+0x32869
00000000`2a30fe60 000000c0`00306100 go_bootstrap+0x6494e
00000000`2a30fe68 000000c0`0016b840 0xc0`00306100
00000000`2a30fe70 00000000`2a30fe60 0xc0`0016b840
00000000`2a30fe78 00000000`00000000 0x2a30fe60

. 13  Id: 263c.13a0 Suspend: 2 Teb: 000007ff`fff9e000 Unfrozen
      Start: go_bootstrap+0x64d00 (00000000`00464d00) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2a50ec50 000007fe`fa3b8504 TmUmEvt64+0x17fa0
00000000`2a50eeb0 000007fe`fa3b8c96 TmUmEvt64+0x18504
00000000`2a50ef10 000007fe`fa4565ca TmUmEvt64+0x18c96
00000000`2a50efa0 000007fe`fa455f8e TmUmEvt64+0xb65ca
00000000`2a50f000 000007fe`fa410686 TmUmEvt64+0xb5f8e
00000000`2a50f150 000007fe`fa439730 TmUmEvt64+0x70686
00000000`2a50f260 00000000`7472f146 TmUmEvt64+0x99730
00000000`2a50f4a0 00000000`747e2d7d tmmon64+0x2f146
00000000`2a50f580 00000000`747e29f4 tmmon64+0xe2d7d
00000000`2a50f640 00000000`74733748 tmmon64+0xe29f4
00000000`2a50f6b0 000007fe`fcc77c3f tmmon64+0x33748
00000000`2a50f780 00000000`0046494e KERNELBASE!ResumeThread+0xf
00000000`2a50f7b0 ffffffff`ffffffff go_bootstrap+0x6494e
00000000`2a50f7b8 00000000`00000000 0xffffffff`ffffffff

. 15  Id: 263c.1c34 Suspend: 1 Teb: 000007ff`fff9a000 Unfrozen
      Start: go_bootstrap+0x64d00 (00000000`00464d00) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2a90faf8 000007fe`fcc710ac ntdll!ZwWaitForSingleObject+0xa
00000000`2a90fb00 00000000`0046494e KERNELBASE!WaitForSingleObjectEx+0x9c
00000000`2a90fba0 000000c0`003b9400 go_bootstrap+0x6494e
00000000`2a90fba8 ffffffff`fff85ee0 0xc0`003b9400
00000000`2a90fbb0 00000000`00000000 0xffffffff`fff85ee0

. 16  Id: 263c.2530 Suspend: 1 Teb: 000007ff`fff98000 Unfrozen
      Start: go_bootstrap+0x64d00 (00000000`00464d00) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2ab0fb08 000007fe`fcc710ac ntdll!ZwWaitForSingleObject+0xa
00000000`2ab0fb10 00000000`0046494e KERNELBASE!WaitForSingleObjectEx+0x9c
00000000`2ab0fbb0 000000c0`003b9780 go_bootstrap+0x6494e
00000000`2ab0fbb8 00000000`00000009 0xc0`003b9780
00000000`2ab0fbc0 00000000`00000000 0x9

. 17  Id: 263c.3114 Suspend: 1 Teb: 000007ff`fff96000 Unfrozen
      Start: go_bootstrap+0x64d00 (00000000`00464d00) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2ad0fb08 000007fe`fcc710ac ntdll!ZwWaitForSingleObject+0xa
00000000`2ad0fb10 00000000`0046494e KERNELBASE!WaitForSingleObjectEx+0x9c
00000000`2ad0fbb0 000000c0`00508d00 go_bootstrap+0x6494e
00000000`2ad0fbb8 00000000`00000000 0xc0`00508d00

. 18  Id: 263c.3344 Suspend: 3 Teb: 000007ff`fff94000 Unfrozen
      Start: go_bootstrap+0x64d00 (00000000`00464d00) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2af0fb08 000007fe`fcc71430 ntdll!ZwWaitForMultipleObjects+0xa
00000000`2af0fb10 00000000`76df06c0 KERNELBASE!GetCurrentProcess+0x40
00000000`2af0fc10 00000000`0046494e kernel32!WaitForMultipleObjects+0xb0
00000000`2af0fca0 000007ff`fff94000 go_bootstrap+0x6494e
00000000`2af0fca8 00000000`00000000 0x7ff`fff94000

. 19  Id: 263c.3128 Suspend: 1 Teb: 000007ff`fff92000 Unfrozen
      Start: ntdll!DbgUiRemoteBreakin (00000000`76ff2dd0) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2b10ff28 00000000`76ff2e08 ntdll!DbgBreakPoint
00000000`2b10ff30 00000000`76df59cd ntdll!DbgUiRemoteBreakin+0x38
00000000`2b10ff60 00000000`76f2a561 kernel32!BaseThreadInitThunk+0xd
00000000`2b10ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x21

Total threads: 20
Duplicate callstacks: 1 (windbg thread #s follow):
14
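
A dump of this size is easier to triage mechanically. The following is a minimal sketch in Go, not part of the original report, that parses !uniqstack output and lists threads whose suspend count exceeds a baseline or whose stack mentions a given string. The names threadInfo, parseUniqstack, aboveBaseline and mentioning are hypothetical, and the parsing assumes the three-column frame layout shown above.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// threadInfo holds the parts of one !uniqstack entry that matter for triage.
type threadInfo struct {
	Num     int      // WinDbg thread number
	Suspend int      // suspend count from the header line
	Frames  []string // "Call Site" column, innermost frame first
}

// header matches lines such as ". 13  Id: 263c.13a0 Suspend: 2 Teb: ...".
var header = regexp.MustCompile(`^\.\s*(\d+)\s+Id:\s+\S+\s+Suspend:\s+(\d+)`)

// parseUniqstack extracts thread headers and their call sites from a dump.
func parseUniqstack(dump string) []threadInfo {
	var threads []threadInfo
	for _, line := range strings.Split(dump, "\n") {
		line = strings.TrimSpace(line)
		if m := header.FindStringSubmatch(line); m != nil {
			num, _ := strconv.Atoi(m[1])
			susp, _ := strconv.Atoi(m[2])
			threads = append(threads, threadInfo{Num: num, Suspend: susp})
			continue
		}
		// Frame rows have three columns: Child-SP, RetAddr, Call Site.
		f := strings.Fields(line)
		if len(threads) > 0 && len(f) == 3 && strings.Contains(f[0], "`") {
			last := &threads[len(threads)-1]
			last.Frames = append(last.Frames, f[2])
		}
	}
	return threads
}

// aboveBaseline returns the numbers of threads with Suspend > baseline.
func aboveBaseline(threads []threadInfo, baseline int) []int {
	var nums []int
	for _, t := range threads {
		if t.Suspend > baseline {
			nums = append(nums, t.Num)
		}
	}
	return nums
}

// mentioning returns the numbers of threads with a frame containing substr.
func mentioning(threads []threadInfo, substr string) []int {
	var nums []int
	for _, t := range threads {
		for _, fr := range t.Frames {
			if strings.Contains(fr, substr) {
				nums = append(nums, t.Num)
				break
			}
		}
	}
	return nums
}

func main() {
	// Two abbreviated entries copied from the dump above.
	dump := strings.Join([]string{
		".  5  Id: 263c.3204 Suspend: 1 Teb: 000007ff`fffae000 Unfrozen",
		"00000000`2905fb08 000007fe`fcc710ac ntdll!ZwWaitForSingleObject+0xa",
		". 13  Id: 263c.13a0 Suspend: 2 Teb: 000007ff`fff9e000 Unfrozen",
		"00000000`2a50ec50 000007fe`fa3b8504 TmUmEvt64+0x17fa0",
		"00000000`2a50f780 00000000`0046494e KERNELBASE!ResumeThread+0xf",
	}, "\n")
	threads := parseUniqstack(dump)
	fmt.Println("suspend count above 1:", aboveBaseline(threads, 1))
	fmt.Println("mention ResumeThread:", mentioning(threads, "ResumeThread"))
}
```

Feeding it the full dump and searching for a module name such as tmmon64 would show every thread that is currently inside the third-party code.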

I was able to use Delve to examine this bug once (see #35775 (comment)), but not anymore. Delve just fails to attach now.

I can reproduce this pretty reliably on this particular computer: make.bat never completes. Sometimes it hangs in go_bootstrap.exe and sometimes in compile.exe. Sometimes more than one compile.exe is hung.

I can make the problem go away if I change the source code to set runtime.preemptMSupported to false.
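
For readers who would rather not edit the runtime source, the Go runtime documents a GODEBUG setting, asyncpreemptoff=1, that disables asynchronous preemption (Go 1.14 and later). Whether it avoids this particular hang was not tested in this thread, and the sketch assumes the commit under test already honors the setting. The helper below is a hypothetical wrapper (withGodebug is an illustrative name) that runs a command with that setting added to its environment.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// withGodebug returns a copy of env whose GODEBUG variable includes setting,
// keeping any GODEBUG options that were already present.
func withGodebug(env []string, setting string) []string {
	const prefix = "GODEBUG="
	out := make([]string, 0, len(env)+1)
	found := false
	for _, kv := range env {
		// Environment variable names are case-insensitive on Windows.
		if len(kv) >= len(prefix) && strings.EqualFold(kv[:len(prefix)], prefix) {
			found = true
			if kv[len(prefix):] == "" {
				kv += setting
			} else {
				kv += "," + setting
			}
		}
		out = append(out, kv)
	}
	if !found {
		out = append(out, prefix+setting)
	}
	return out
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: nopreempt command [args...]")
		os.Exit(2)
	}
	cmd := exec.Command(os.Args[1], os.Args[2:]...)
	cmd.Env = withGodebug(os.Environ(), "asyncpreemptoff=1")
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Built with an existing Go installation under a name such as nopreempt.exe (hypothetical), it could be run from the src directory with make.bat as its argument. Setting GODEBUG in the cmd.exe session before running make.bat should be equivalent.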

/cc @aclements

Alex

@ALTree added the NeedsInvestigation and OS-Windows labels on Jan 10, 2020
@aclements
Member

Thanks for the detailed report.

Most of the threads look uninteresting, except, I think, these three:

.  4  Id: 263c.196c Suspend: 1 Teb: 000007ff`fffd4000 Unfrozen
      Start: go_bootstrap+0x64d00 (00000000`00464d00) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`28e5eb68 00000000`76f48f58 ntdll!ZwWaitForSingleObject+0xa
00000000`28e5eb70 00000000`76f48e54 ntdll!RtlDeNormalizeProcessParams+0x5a8
00000000`28e5ec20 000007fe`fa3b7f0b ntdll!RtlDeNormalizeProcessParams+0x4a4
00000000`28e5ec50 000007fe`fa3b8504 TmUmEvt64+0x17f0b
00000000`28e5eeb0 000007fe`fa3b8c96 TmUmEvt64+0x18504
00000000`28e5ef10 000007fe`fa4565ca TmUmEvt64+0x18c96
00000000`28e5efa0 000007fe`fa455f8e TmUmEvt64+0xb65ca
00000000`28e5f000 000007fe`fa410686 TmUmEvt64+0xb5f8e
00000000`28e5f150 000007fe`fa439730 TmUmEvt64+0x70686
00000000`28e5f260 00000000`7472f146 TmUmEvt64+0x99730
00000000`28e5f4a0 00000000`747e2d7d tmmon64+0x2f146
00000000`28e5f580 00000000`747e29f4 tmmon64+0xe2d7d
00000000`28e5f640 00000000`74733748 tmmon64+0xe29f4
00000000`28e5f6b0 000007fe`fcc77c3f tmmon64+0x33748
00000000`28e5f780 00000000`0046494e KERNELBASE!ResumeThread+0xf
00000000`28e5f7b0 ffffffff`ffffffff go_bootstrap+0x6494e
00000000`28e5f7b8 00000000`00000001 0xffffffff`ffffffff
00000000`28e5f7c0 ffffffff`ffffffff 0x1
00000000`28e5f7c8 00000000`28e5f928 0xffffffff`ffffffff
00000000`28e5f7d0 00000000`00000000 0x28e5f928

. 10  Id: 263c.2b14 Suspend: 1 Teb: 000007ff`fffa4000 Unfrozen
      Start: go_bootstrap+0x64d00 (00000000`00464d00) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`29eceb68 00000000`76f48f58 ntdll!ZwWaitForSingleObject+0xa
00000000`29eceb70 00000000`76f48e54 ntdll!RtlDeNormalizeProcessParams+0x5a8
00000000`29ecec20 000007fe`fa3b7f0b ntdll!RtlDeNormalizeProcessParams+0x4a4
00000000`29ecec50 000007fe`fa3b8504 TmUmEvt64+0x17f0b
00000000`29eceeb0 000007fe`fa3b8c96 TmUmEvt64+0x18504
00000000`29ecef10 000007fe`fa4565ca TmUmEvt64+0x18c96
00000000`29ecefa0 000007fe`fa455f8e TmUmEvt64+0xb65ca
00000000`29ecf000 000007fe`fa410686 TmUmEvt64+0xb5f8e
00000000`29ecf150 000007fe`fa439730 TmUmEvt64+0x70686
00000000`29ecf260 00000000`7472f146 TmUmEvt64+0x99730
00000000`29ecf4a0 00000000`747e2d7d tmmon64+0x2f146
00000000`29ecf580 00000000`747e29f4 tmmon64+0xe2d7d
00000000`29ecf640 00000000`74733748 tmmon64+0xe29f4
00000000`29ecf6b0 000007fe`fcc77c3f tmmon64+0x33748
00000000`29ecf780 00000000`0046494e KERNELBASE!ResumeThread+0xf
00000000`29ecf7b0 ffffffff`ffffffff go_bootstrap+0x6494e
00000000`29ecf7b8 00000000`00000000 0xffffffff`ffffffff

. 13  Id: 263c.13a0 Suspend: 2 Teb: 000007ff`fff9e000 Unfrozen
      Start: go_bootstrap+0x64d00 (00000000`00464d00) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2a50ec50 000007fe`fa3b8504 TmUmEvt64+0x17fa0
00000000`2a50eeb0 000007fe`fa3b8c96 TmUmEvt64+0x18504
00000000`2a50ef10 000007fe`fa4565ca TmUmEvt64+0x18c96
00000000`2a50efa0 000007fe`fa455f8e TmUmEvt64+0xb65ca
00000000`2a50f000 000007fe`fa410686 TmUmEvt64+0xb5f8e
00000000`2a50f150 000007fe`fa439730 TmUmEvt64+0x70686
00000000`2a50f260 00000000`7472f146 TmUmEvt64+0x99730
00000000`2a50f4a0 00000000`747e2d7d tmmon64+0x2f146
00000000`2a50f580 00000000`747e29f4 tmmon64+0xe2d7d
00000000`2a50f640 00000000`74733748 tmmon64+0xe29f4
00000000`2a50f6b0 000007fe`fcc77c3f tmmon64+0x33748
00000000`2a50f780 00000000`0046494e KERNELBASE!ResumeThread+0xf
00000000`2a50f7b0 ffffffff`ffffffff go_bootstrap+0x6494e
00000000`2a50f7b8 00000000`00000000 0xffffffff`ffffffff

These are all stuck in ResumeThread, which I didn't know threads could get stuck in.

Thread 13 is also interesting because the "suspend count" is 2, suggesting that some other thread has suspended it and is failing to resume it. This may also be why threads 4 and 10 are stopped in obviously blocking operations, while thread 13 is stopped at a seemingly random point.
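
As background for the suspend-count reasoning: Windows keeps a per-thread suspend count. SuspendThread increments it, ResumeThread decrements it, both return the previous value, and the thread runs only while the count is zero. The Go sketch below models just that bookkeeping. It assumes that the attached debugger itself contributes one suspend to every thread, which would explain why the other threads show Suspend: 1; the type and method names are hypothetical.

```go
package main

import "fmt"

// thread models only the suspend count that Windows keeps for each thread.
type thread struct {
	suspendCount int
}

// suspend mimics SuspendThread: it increments the count and returns the
// previous value.
func (t *thread) suspend() int {
	prev := t.suspendCount
	t.suspendCount++
	return prev
}

// resume mimics ResumeThread: it decrements a nonzero count and returns the
// previous value.
func (t *thread) resume() int {
	prev := t.suspendCount
	if prev > 0 {
		t.suspendCount--
	}
	return prev
}

// runnable reports whether the scheduler may run the thread.
func (t *thread) runnable() bool {
	return t.suspendCount == 0
}

func main() {
	target := &thread{}

	target.suspend() // another thread suspends the target (as preemptM does)
	target.suspend() // assumption: the debugger break-in suspends it again
	fmt.Println("in debugger:    ", target.suspendCount, target.runnable())

	target.resume() // the debugger lets the process continue
	fmt.Println("after debugger: ", target.suspendCount, target.runnable())

	target.resume() // the ResumeThread call that never completes in the trace
	fmt.Println("after resume:   ", target.suspendCount, target.runnable())
}
```

Under that assumption, a thread shown with Suspend: 2 stays stopped even after the debugger detaches, until the outstanding ResumeThread call completes.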

Do you know what "tmmon64" and "TmUmEvt64" are?

Maybe we just need to hold the suspendLock for longer (though I don't have a theory for why this would be). What happens if you move the unlock(&suspendLock) in preemptM to the very bottom of the function?

@alexbrainman
Member Author

> Do you know what "tmmon64" and "TmUmEvt64" are?

This is not my computer, so I cannot properly poke at it. But I suspect this computer has some standard antivirus software installed. This kind of software often installs its own code to intercept real Win32 API calls.

> Maybe we just need to hold the suspendLock for longer (though I don't have a theory for why this would be). What happens if you move the unlock(&suspendLock) in preemptM to the very bottom of the function?

I won't see her on the weekend. But I will try it Monday or Tuesday. Also, please show a diff of your suggested change, because I don't trust myself with reading your mind.

Thank you.

Alex

@alexbrainman
Member Author

> Maybe we just need to hold the suspendLock for longer (though I don't have a theory for why this would be). What happens if you move the unlock(&suspendLock) in preemptM to the very bottom of the function?

I changed the code like this:

diff --git a/src/runtime/os_windows.go b/src/runtime/os_windows.go
index 91e147fca9..9166aeb323 100644
--- a/src/runtime/os_windows.go
+++ b/src/runtime/os_windows.go
@@ -1201,8 +1201,6 @@ func preemptM(mp *m) {
        // GetThreadContext actually blocks until it's suspended.
        stdcall2(_GetThreadContext, thread, uintptr(unsafe.Pointer(c)))

-       unlock(&suspendLock)
-
        // Does it want a preemption and is it safe to preempt?
        gp := gFromTLS(mp)
        if wantAsyncPreempt(gp) && isAsyncSafePoint(gp, c.ip(), c.sp(), c.lr()) {
@@ -1231,6 +1229,8 @@ func preemptM(mp *m) {

        stdcall1(_ResumeThread, thread)
        stdcall1(_CloseHandle, thread)
+
+       unlock(&suspendLock)
 }

 // osPreemptExtEnter is called before entering external code that may

With this change, make.bat ran to successful completion once. But when I ran make.bat a second time, it hung as before. This time it was compile.exe that hung. Here is the stack trace from WinDbg:

0:009> !uniqstack
Processing 10 threads, please wait

.  0  Id: 2c08.3140 Suspend: 1 Teb: 000007ff`fffde000 Unfrozen
      Start: compile+0x67ef0 (00000000`00467ef0) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`0022fd98 000007fe`fcd610ac ntdll!ZwWaitForSingleObject+0xa
00000000`0022fda0 00000000`00467f5e KERNELBASE!WaitForSingleObjectEx+0x9c
00000000`0022fe40 00000000`0160d320 compile+0x67f5e
00000000`0022fe48 00000000`00000001 compile+0x120d320
00000000`0022fe50 00000000`00000000 0x1

.  1  Id: 2c08.243c Suspend: 1 Teb: 000007ff`fffdc000 Unfrozen
      Start: tmmon64+0x8adac (00000000`7480adac) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`03cdfcb8 000007fe`fcd61430 ntdll!ZwWaitForMultipleObjects+0xa
00000000`03cdfcc0 00000000`76d806c0 KERNELBASE!GetCurrentProcess+0x40
00000000`03cdfdc0 00000000`7482eed4 kernel32!WaitForMultipleObjects+0xb0
00000000`03cdfe50 00000000`7480ad07 tmmon64+0xaeed4
00000000`03cdff00 00000000`7480aeae tmmon64+0x8ad07
00000000`03cdff30 00000000`76d859cd tmmon64+0x8aeae
00000000`03cdff60 00000000`76fba561 kernel32!BaseThreadInitThunk+0xd
00000000`03cdff90 00000000`00000000 ntdll!RtlUserThreadStart+0x21

.  2  Id: 2c08.20ec Suspend: 1 Teb: 000007ff`fffda000 Unfrozen
      Start: ntdll!RtlDestroyHandleTable+0x270 (00000000`76faf6f0) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`03fefcc8 00000000`76faed15 ntdll!ZwWaitForWorkViaWorkerFactory+0xa
00000000`03fefcd0 00000000`76d859cd ntdll!RtlValidateHeap+0x155
00000000`03feff60 00000000`76fba561 kernel32!BaseThreadInitThunk+0xd
00000000`03feff90 00000000`00000000 ntdll!RtlUserThreadStart+0x21

.  3  Id: 2c08.2ecc Suspend: 1 Teb: 000007ff`fffd8000 Unfrozen
      Start: ntdll!TpIsTimerSet+0x8b0 (00000000`76faa280) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`041efcb8 00000000`76faa3c7 ntdll!ZwWaitForMultipleObjects+0xa
00000000`041efcc0 00000000`76d859cd ntdll!TpIsTimerSet+0x9f7
00000000`041eff60 00000000`76fba561 kernel32!BaseThreadInitThunk+0xd
00000000`041eff90 00000000`00000000 ntdll!RtlUserThreadStart+0x21

.  4  Id: 2c08.2cf4 Suspend: 1 Teb: 000007ff`fffd4000 Unfrozen
      Start: compile+0x68330 (00000000`00468330) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`299def98 00000000`76fd8f58 ntdll!ZwWaitForSingleObject+0xa
00000000`299defa0 00000000`76fd8e54 ntdll!RtlDeNormalizeProcessParams+0x5a8
00000000`299df050 00000000`76fde0b5 ntdll!RtlDeNormalizeProcessParams+0x4a4
00000000`299df080 00000000`76fdddd8 ntdll!RtlAllocateHeap+0x455
00000000`299df260 000007fe`fcd61635 ntdll!RtlAllocateHeap+0x178
00000000`299df370 000007fe`fa36bf7d KERNELBASE!LocalAlloc+0x75
00000000`299df3e0 000007fe`fa36c187 TmUmEvt64!TmmonDestoryAddonObject+0x63d
00000000`299df430 000007fe`fa36bbf1 TmUmEvt64!TmmonDestoryAddonObject+0x847
00000000`299df4b0 00000000`74862cd3 TmUmEvt64!TmmonDestoryAddonObject+0x2b1
00000000`299df580 00000000`748629f4 tmmon64+0xe2cd3
00000000`299df640 00000000`747b3de4 tmmon64+0xe29f4
00000000`299df6b0 00000000`76d72a4a tmmon64+0x33de4
00000000`299df780 00000000`00467f5e kernel32!GetThreadContext+0xa
00000000`299df7b0 ffffffff`ffffffff compile+0x67f5e
00000000`299df7b8 00000000`00000000 0xffffffff`ffffffff

.  5  Id: 2c08.ea4 Suspend: 1 Teb: 000007ff`fffae000 Unfrozen
      Start: compile+0x68330 (00000000`00468330) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`29bdfb08 000007fe`fcd610ac ntdll!ZwWaitForSingleObject+0xa
00000000`29bdfb10 00000000`00467f5e KERNELBASE!WaitForSingleObjectEx+0x9c
00000000`29bdfbb0 000000c0`0002c980 compile+0x67f5e
00000000`29bdfbb8 00000000`00000000 0xc0`0002c980

.  6  Id: 2c08.558 Suspend: 2 Teb: 000007ff`fffac000 Unfrozen
      Start: compile+0x68330 (00000000`00468330) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`29ddf530 00000000`76ffcf71 ntdll!RtlUnlockHeap+0x6a8
00000000`29ddf5c0 00000000`76fdddd8 ntdll!TpAlpcRegisterCompletionList+0xaf91
00000000`29ddf7a0 000007fe`fcd61635 ntdll!RtlAllocateHeap+0x178
00000000`29ddf8b0 000007fe`fa36bf7d KERNELBASE!LocalAlloc+0x75
00000000`29ddf920 000007fe`fa36c187 TmUmEvt64!TmmonDestoryAddonObject+0x63d
00000000`29ddf970 000007fe`fa36bbf1 TmUmEvt64!TmmonDestoryAddonObject+0x847
00000000`29ddf9f0 00000000`74862cd3 TmUmEvt64!TmmonDestoryAddonObject+0x2b1
00000000`29ddfac0 00000000`748629f4 tmmon64+0xe2cd3
00000000`29ddfb80 00000000`747b0aab tmmon64+0xe29f4
00000000`29ddfbf0 00000000`00467f5e tmmon64+0x30aab
00000000`29ddfcc0 000000c0`005c4000 compile+0x67f5e
00000000`29ddfcc8 00000000`00002000 0xc0`005c4000
00000000`29ddfcd0 00000000`00001000 0x2000
00000000`29ddfcd8 00000000`00000004 0x1000
00000000`29ddfce0 00000000`00432b66 0x4
00000000`29ddfce8 00000000`76d914cf compile+0x32b66
00000000`29ddfcf0 00000000`0000016c kernel32!GetTickCount+0x1f
00000000`29ddfcf8 00000000`ffffffff 0x16c
00000000`29ddfd00 00000000`00000000 0xffffffff

.  7  Id: 2c08.2f14 Suspend: 1 Teb: 000007ff`fffaa000 Unfrozen
      Start: compile+0x68330 (00000000`00468330) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2a01fb08 000007fe`fcd610ac ntdll!ZwWaitForSingleObject+0xa
00000000`2a01fb10 00000000`00467f5e KERNELBASE!WaitForSingleObjectEx+0x9c
00000000`2a01fbb0 000000c0`00080280 compile+0x67f5e
00000000`2a01fbb8 00000000`01e7b5c0 0xc0`00080280
00000000`2a01fbc0 00000000`00000000 0x1e7b5c0

.  8  Id: 2c08.2f64 Suspend: 1 Teb: 000007ff`fffa8000 Unfrozen
      Start: ntdll!RtlDestroyHandleTable+0x270 (00000000`76faf6f0) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2a68f3a8 00000000`76fd8f58 ntdll!ZwWaitForSingleObject+0xa
00000000`2a68f3b0 00000000`76fd8e54 ntdll!RtlDeNormalizeProcessParams+0x5a8
00000000`2a68f460 00000000`76fde0b5 ntdll!RtlDeNormalizeProcessParams+0x4a4
00000000`2a68f490 00000000`76fdddd8 ntdll!RtlAllocateHeap+0x455
00000000`2a68f670 00000000`76fba762 ntdll!RtlAllocateHeap+0x178
00000000`2a68f780 000007fe`fcd62898 ntdll!RtlUserThreadStart+0x222
00000000`2a68f7c0 000007fe`fa38ae5e KERNELBASE!FlsSetValue+0x168
00000000`2a68f7f0 000007fe`fa38a152 TmUmEvt64!GetInterface+0x1e50e
00000000`2a68f820 000007fe`fa389c2d TmUmEvt64!GetInterface+0x1d802
00000000`2a68f850 000007fe`fa385c19 TmUmEvt64!GetInterface+0x1d2dd
00000000`2a68f880 000007fe`fa385fc9 TmUmEvt64!GetInterface+0x192c9
00000000`2a68f8b0 000007fe`fa3861f9 TmUmEvt64!GetInterface+0x19679
00000000`2a68f8e0 00000000`76fba719 TmUmEvt64!GetInterface+0x198a9
00000000`2a68f940 00000000`76fba46f ntdll!RtlUserThreadStart+0x1d9
00000000`2a68fa40 00000000`76fba36e ntdll!LdrInitializeThunk+0x10f
00000000`2a68fab0 00000000`00000000 ntdll!LdrInitializeThunk+0xe

.  9  Id: 2c08.2b84 Suspend: 1 Teb: 000007ff`fffa6000 Unfrozen
      Start: ntdll!DbgUiRemoteBreakin (00000000`77082dd0) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2a88ff28 00000000`77082e08 ntdll!DbgBreakPoint
00000000`2a88ff30 00000000`76d859cd ntdll!DbgUiRemoteBreakin+0x38
00000000`2a88ff60 00000000`76fba561 kernel32!BaseThreadInitThunk+0xd
00000000`2a88ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x21

Total threads: 10
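
One possible reading of this trace, which nobody in the thread has confirmed: thread 6 (Suspend: 2) was stopped inside RtlAllocateHeap, while thread 4 is inside an intercepted GetThreadContext call that also reaches RtlAllocateHeap and waits there. If a suspended thread holds a lock that the intercepting code needs, the suspending thread can never get as far as ResumeThread. The Go sketch below models only that pattern with an ordinary mutex. The function name hookCompletes and the idea that a heap lock is the contended resource are assumptions, and the symbol names in the trace come from export symbols, so they may be imprecise.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// hookCompletes models a suspending thread that calls an intercepted API.
// The interception layer takes heapLock (for example to allocate memory)
// before forwarding the call. If the suspended target already holds
// heapLock, the call cannot finish until the target is resumed, which in
// this model never happens. It reports whether the call finished in time.
func hookCompletes(targetHoldsHeapLock bool, timeout time.Duration) bool {
	var heapLock sync.Mutex

	if targetHoldsHeapLock {
		// The target was suspended in the middle of an allocation and
		// cannot release the lock while it stays suspended.
		heapLock.Lock()
	}

	done := make(chan struct{})
	go func() {
		// Interception layer: allocate first, then forward the call.
		heapLock.Lock()
		heapLock.Unlock()
		close(done)
	}()

	select {
	case <-done:
		return true
	case <-time.After(timeout):
		return false
	}
}

func main() {
	fmt.Println("target idle when suspended:", hookCompletes(false, 200*time.Millisecond))
	fmt.Println("target held the lock:      ", hookCompletes(true, 200*time.Millisecond))
}
```

In this model the hang does not depend on how long suspendLock is held; it depends only on where the target thread happened to be when it was suspended.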

Alex

@alexbrainman
Member Author

I just tried to verify this issue, and it is still broken on af9ab6b. It gets stuck here:

c:\Users\alexb\dev\go\src>make
Building Go cmd/dist using c:\users\alexb\dev\\go1.4
Building Go toolchain1 using c:\users\alexb\dev\\go1.4.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
Building Go toolchain2 using go_bootstrap and Go toolchain1.
Building Go toolchain3 using go_bootstrap and Go toolchain2.

This is what the process tree looks like when stuck:

image

And this is what WinDbg says about the compile.exe process with pid 11956:


Microsoft (R) Windows Debugger Version 6.12.0002.633 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.

*** wait with pending attach
Symbol search path is: *** Invalid ***
****************************************************************************
* Symbol loading may be unreliable without a symbol search path.           *
* Use .symfix to have the debugger choose a symbol path.                   *
* After setting your symbol path, use .reload to refresh symbol locations. *
****************************************************************************
Executable search path is: 
ModLoad: 00000000`00400000 00000000`0175a000   c:\Users\alexb\dev\go\pkg\tool\windows_amd64\compile.exe
ModLoad: 00000000`76fb0000 00000000`7715a000   C:\Windows\SYSTEM32\ntdll.dll
ModLoad: 00000000`76d90000 00000000`76eaf000   C:\Windows\system32\kernel32.dll
ModLoad: 000007fe`fce10000 000007fe`fce7a000   C:\Windows\system32\KERNELBASE.dll
ModLoad: 000007fe`fa450000 000007fe`fa622000   C:\Windows\system32\tmumh\20019\AddOn\8.50.0.2071\TmUmEvt64.dll
ModLoad: 00000000`77160000 00000000`77167000   C:\Windows\system32\PSAPI.DLL
ModLoad: 00000000`76eb0000 00000000`76faa000   C:\Windows\system32\USER32.dll
ModLoad: 000007fe`fd850000 000007fe`fd8b7000   C:\Windows\system32\GDI32.dll
ModLoad: 000007fe`fee80000 000007fe`fee8e000   C:\Windows\system32\LPK.dll
ModLoad: 000007fe`fefa0000 000007fe`ff06b000   C:\Windows\system32\USP10.dll
ModLoad: 000007fe`fee90000 000007fe`fef2f000   C:\Windows\system32\msvcrt.dll
ModLoad: 000007fe`fd0e0000 000007fe`fd1bb000   C:\Windows\system32\ADVAPI32.dll
ModLoad: 000007fe`fee60000 000007fe`fee7f000   C:\Windows\SYSTEM32\sechost.dll
ModLoad: 000007fe`fe920000 000007fe`fea4d000   C:\Windows\system32\RPCRT4.dll
ModLoad: 000007fe`fd2a0000 000007fe`fd2ce000   C:\Windows\system32\IMM32.DLL
ModLoad: 000007fe`fd370000 000007fe`fd479000   C:\Windows\system32\MSCTF.dll
ModLoad: 000007fe`fa440000 000007fe`fa443000   C:\Windows\system32\api-ms-win-core-synch-l1-2-0.DLL
ModLoad: 00000000`747b0000 00000000`7491e000   C:\Windows\system32\tmumh\20019\TmMon\2.8.0.1034\tmmon64.dll
ModLoad: 000007fe`faab0000 000007fe`faaeb000   C:\Windows\system32\winmm.dll
ModLoad: 000007fe`fef50000 000007fe`fef9d000   C:\Windows\system32\ws2_32.dll
ModLoad: 000007fe`ff070000 000007fe`ff078000   C:\Windows\system32\NSI.dll
ModLoad: 000007fe`fcb40000 000007fe`fcb4f000   C:\Windows\system32\cryptbase.dll
ModLoad: 000007fe`fb350000 000007fe`fb37c000   C:\Windows\system32\powrprof.dll
ModLoad: 000007fe`ff080000 000007fe`ff257000   C:\Windows\system32\SETUPAPI.dll
ModLoad: 000007fe`fcd80000 000007fe`fcdb6000   C:\Windows\system32\CFGMGR32.dll
ModLoad: 000007fe`fd1c0000 000007fe`fd29a000   C:\Windows\system32\OLEAUT32.dll
ModLoad: 000007fe`fec60000 000007fe`fee5c000   C:\Windows\system32\ole32.dll
ModLoad: 000007fe`fcdd0000 000007fe`fcdea000   C:\Windows\system32\DEVOBJ.dll
(2eb4.2e3c): Break instruction exception - code 80000003 (first chance)
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\Windows\SYSTEM32\ntdll.dll - 
ntdll!DbgBreakPoint:
00000000`76ffafb0 cc              int     3
0:011> !uniqstack
Processing 12 threads, please wait
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\Windows\system32\tmumh\20019\AddOn\8.50.0.2071\TmUmEvt64.dll - 
*** ERROR: Module load completed but symbols could not be loaded for C:\Windows\system32\tmumh\20019\TmMon\2.8.0.1034\tmmon64.dll
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\Windows\system32\KERNELBASE.dll - 
*** WARNING: Unable to verify timestamp for c:\Users\alexb\dev\go\pkg\tool\windows_amd64\compile.exe
*** ERROR: Module load completed but symbols could not be loaded for c:\Users\alexb\dev\go\pkg\tool\windows_amd64\compile.exe

.  0  Id: 2eb4.270c Suspend: 1 Teb: 000007ff`fffdd000 Unfrozen
      Start: compile+0x69500 (00000000`00469500) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`0022e618 00000000`76ff8f58 ntdll!ZwWaitForSingleObject+0xa
00000000`0022e620 00000000`76ff8e54 ntdll!RtlDeNormalizeProcessParams+0x5a8
00000000`0022e6d0 00000000`76ffe0b5 ntdll!RtlDeNormalizeProcessParams+0x4a4
00000000`0022e700 00000000`76ffddd8 ntdll!RtlAllocateHeap+0x455
00000000`0022e8e0 000007fe`fa553774 ntdll!RtlAllocateHeap+0x178
00000000`0022e9f0 000007fe`fa46427b TmUmEvt64!GetInterface+0x46e24
00000000`0022ea20 000007fe`fa4aef35 TmUmEvt64+0x1427b
00000000`0022ea50 000007fe`fa4cbea0 TmUmEvt64+0x5ef35
00000000`0022eaa0 000007fe`fa4cc560 TmUmEvt64+0x7bea0
00000000`0022eb70 000007fe`fa4d19f1 TmUmEvt64+0x7c560
00000000`0022ebc0 000007fe`fa4d4b8b TmUmEvt64+0x819f1
00000000`0022ec00 000007fe`fa4ceec8 TmUmEvt64+0x84b8b
00000000`0022f0f0 000007fe`fa4c0618 TmUmEvt64+0x7eec8
00000000`0022f120 000007fe`fa4e9730 TmUmEvt64+0x70618
00000000`0022f230 00000000`747df146 TmUmEvt64+0x99730
00000000`0022f470 00000000`74892d7d tmmon64+0x2f146
00000000`0022f550 00000000`748929f4 tmmon64+0xe2d7d
00000000`0022f610 00000000`747e30ac tmmon64+0xe29f4
00000000`0022f680 000007fe`fce17c3f tmmon64+0x330ac
00000000`0022f750 00000000`0046967e KERNELBASE!ResumeThread+0xf
00000000`0022f780 ffffffff`ffffffff compile+0x6967e
00000000`0022f788 00000000`00000000 0xffffffff`ffffffff
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\Windows\system32\kernel32.dll - 

.  1  Id: 2eb4.2ed0 Suspend: 1 Teb: 000007ff`fffdb000 Unfrozen
      Start: tmmon64+0x8adac (00000000`7483adac) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`0395fcb8 000007fe`fce11430 ntdll!ZwWaitForMultipleObjects+0xa
00000000`0395fcc0 00000000`76da06c0 KERNELBASE!GetCurrentProcess+0x40
00000000`0395fdc0 00000000`7485eed4 kernel32!WaitForMultipleObjects+0xb0
00000000`0395fe50 00000000`7483ad07 tmmon64+0xaeed4
00000000`0395ff00 00000000`7483aeae tmmon64+0x8ad07
00000000`0395ff30 00000000`76da59cd tmmon64+0x8aeae
00000000`0395ff60 00000000`76fda561 kernel32!BaseThreadInitThunk+0xd
00000000`0395ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x21

.  2  Id: 2eb4.1b48 Suspend: 1 Teb: 000007ff`fffd9000 Unfrozen
      Start: ntdll!RtlDestroyHandleTable+0x270 (00000000`76fcf6f0) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`03b5fcc8 00000000`76fced15 ntdll!ZwWaitForWorkViaWorkerFactory+0xa
00000000`03b5fcd0 00000000`76da59cd ntdll!RtlValidateHeap+0x155
00000000`03b5ff60 00000000`76fda561 kernel32!BaseThreadInitThunk+0xd
00000000`03b5ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x21

.  3  Id: 2eb4.2b3c Suspend: 1 Teb: 000007ff`fffd7000 Unfrozen
      Start: ntdll!TpIsTimerSet+0x8b0 (00000000`76fca280) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`03d5fcb8 00000000`76fca3c7 ntdll!ZwWaitForMultipleObjects+0xa
00000000`03d5fcc0 00000000`76da59cd ntdll!TpIsTimerSet+0x9f7
00000000`03d5ff60 00000000`76fda561 kernel32!BaseThreadInitThunk+0xd
00000000`03d5ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x21

.  4  Id: 2eb4.125c Suspend: 1 Teb: 000007ff`fffd5000 Unfrozen
      Start: compile+0x69940 (00000000`00469940) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2956e648 00000000`76ff8f58 ntdll!ZwWaitForSingleObject+0xa
00000000`2956e650 00000000`76ff8e54 ntdll!RtlDeNormalizeProcessParams+0x5a8
00000000`2956e700 00000000`76ffe0b5 ntdll!RtlDeNormalizeProcessParams+0x4a4
00000000`2956e730 00000000`76ffddd8 ntdll!RtlAllocateHeap+0x455
00000000`2956e910 000007fe`fa553774 ntdll!RtlAllocateHeap+0x178
00000000`2956ea20 000007fe`fa46427b TmUmEvt64!GetInterface+0x46e24
00000000`2956ea50 000007fe`fa4aef35 TmUmEvt64+0x1427b
00000000`2956ea80 000007fe`fa4cbea0 TmUmEvt64+0x5ef35
00000000`2956ead0 000007fe`fa4cc560 TmUmEvt64+0x7bea0
00000000`2956eba0 000007fe`fa4d19f1 TmUmEvt64+0x7c560
00000000`2956ebf0 000007fe`fa4d4b8b TmUmEvt64+0x819f1
00000000`2956ec30 000007fe`fa4ceec8 TmUmEvt64+0x84b8b
00000000`2956f120 000007fe`fa4c0618 TmUmEvt64+0x7eec8
00000000`2956f150 000007fe`fa4e9730 TmUmEvt64+0x70618
00000000`2956f260 00000000`747df146 TmUmEvt64+0x99730
00000000`2956f4a0 00000000`74892d7d tmmon64+0x2f146
00000000`2956f580 00000000`748929f4 tmmon64+0xe2d7d
00000000`2956f640 00000000`747e30ac tmmon64+0xe29f4
00000000`2956f6b0 000007fe`fce17c3f tmmon64+0x330ac
00000000`2956f780 00000000`0046967e KERNELBASE!ResumeThread+0xf
00000000`2956f7b0 ffffffff`ffffffff compile+0x6967e
00000000`2956f7b8 00000000`00000001 0xffffffff`ffffffff
00000000`2956f7c0 ffffffff`ffffffff 0x1
00000000`2956f7c8 00000000`2956f928 0xffffffff`ffffffff
00000000`2956f7d0 00000000`00000000 0x2956f928

.  5  Id: 2eb4.2a8c Suspend: 1 Teb: 000007ff`fffd3000 Unfrozen
      Start: compile+0x69940 (00000000`00469940) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2976fbc8 000007fe`fce110ac ntdll!ZwWaitForSingleObject+0xa
00000000`2976fbd0 00000000`0046967e KERNELBASE!WaitForSingleObjectEx+0x9c
00000000`2976fc70 000000c0`0002ca78 compile+0x6967e
00000000`2976fc78 00000000`02030000 0xc0`0002ca78
00000000`2976fc80 00000000`00000000 0x2030000

.  6  Id: 2eb4.2670 Suspend: 3 Teb: 000007ff`fffae000 Unfrozen
      Start: compile+0x69940 (00000000`00469940) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2996e638 00000000`7701cf71 ntdll!RtlUnlockHeap+0x90
00000000`2996e640 00000000`76ffddd8 ntdll!TpAlpcRegisterCompletionList+0xaf91
00000000`2996e820 000007fe`fa553774 ntdll!RtlAllocateHeap+0x178
00000000`2996e930 000007fe`fa46427b TmUmEvt64!GetInterface+0x46e24
00000000`2996e960 000007fe`fa4aef35 TmUmEvt64+0x1427b
00000000`2996e990 000007fe`fa4cbea0 TmUmEvt64+0x5ef35
00000000`2996e9e0 000007fe`fa4cc560 TmUmEvt64+0x7bea0
00000000`2996eab0 000007fe`fa4d19f1 TmUmEvt64+0x7c560
00000000`2996eb00 000007fe`fa4d4b8b TmUmEvt64+0x819f1
00000000`2996eb40 000007fe`fa4ceec8 TmUmEvt64+0x84b8b
00000000`2996f030 000007fe`fa4c0618 TmUmEvt64+0x7eec8
00000000`2996f060 000007fe`fa4e9730 TmUmEvt64+0x70618
00000000`2996f170 00000000`747df146 TmUmEvt64+0x99730
00000000`2996f3b0 00000000`74892d7d tmmon64+0x2f146
00000000`2996f490 00000000`748929f4 tmmon64+0xe2d7d
00000000`2996f550 00000000`747e30ac tmmon64+0xe29f4
00000000`2996f5c0 000007fe`fce17c3f tmmon64+0x330ac
00000000`2996f690 00000000`0046967e KERNELBASE!ResumeThread+0xf
00000000`2996f6c0 ffffffff`ffffffff compile+0x6967e
00000000`2996f6c8 00000000`00000001 0xffffffff`ffffffff
00000000`2996f6d0 ffffffff`ffffffff 0x1
00000000`2996f6d8 00000000`2996f840 0xffffffff`ffffffff
00000000`2996f6e0 00000000`00000000 0x2996f840

.  7  Id: 2eb4.2870 Suspend: 1 Teb: 000007ff`fffac000 Unfrozen
      Start: compile+0x69940 (00000000`00469940) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`29e7fb08 000007fe`fce110ac ntdll!ZwWaitForSingleObject+0xa
00000000`29e7fb10 00000000`0046967e KERNELBASE!WaitForSingleObjectEx+0x9c
00000000`29e7fbb0 000000c0`00400278 compile+0x6967e
00000000`29e7fbb8 00000000`0041f9d7 0xc0`00400278
00000000`29e7fbc0 000000c0`00000000 compile+0x1f9d7
00000000`29e7fbc8 00000000`00000188 0xc0`00000000
00000000`29e7fbd0 00000000`00000000 0x188

.  8  Id: 2eb4.2f24 Suspend: 2 Teb: 000007ff`fffaa000 Unfrozen
      Start: compile+0x69940 (00000000`00469940) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2a07fba8 000007fe`fce11945 ntdll!ZwAllocateVirtualMemory+0xa
00000000`2a07fbb0 00000000`747e0631 KERNELBASE!VirtualAlloc+0x45
00000000`2a07fbf0 00000000`0046967e tmmon64+0x30631
00000000`2a07fcc0 000000c0`013aa000 compile+0x6967e
00000000`2a07fcc8 00000000`00002000 0xc0`013aa000
00000000`2a07fcd0 00000000`00001000 0x2000
00000000`2a07fcd8 00000000`00000004 0x1000
00000000`2a07fce0 00000000`00000012 0x4
00000000`2a07fce8 00000000`00000003 0x12
00000000`2a07fcf0 00000000`00000000 0x3

.  9  Id: 2eb4.2188 Suspend: 1 Teb: 000007ff`fffa8000 Unfrozen
      Start: compile+0x69940 (00000000`00469940) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2a28fbd8 000007fe`fce110ac ntdll!ZwWaitForSingleObject+0xa
00000000`2a28fbe0 00000000`0046967e KERNELBASE!WaitForSingleObjectEx+0x9c
00000000`2a28fc80 000000c0`00388278 compile+0x6967e
00000000`2a28fc88 00000000`00000000 0xc0`00388278

. 10  Id: 2eb4.25c8 Suspend: 1 Teb: 000007ff`fffa6000 Unfrozen
      Start: ntdll!RtlDestroyHandleTable+0x270 (00000000`76fcf6f0) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2a56f418 00000000`76ff8f58 ntdll!ZwWaitForSingleObject+0xa
00000000`2a56f420 00000000`76ff8e54 ntdll!RtlDeNormalizeProcessParams+0x5a8
00000000`2a56f4d0 00000000`76ffe0b5 ntdll!RtlDeNormalizeProcessParams+0x4a4
00000000`2a56f500 00000000`76ffddd8 ntdll!RtlAllocateHeap+0x455
00000000`2a56f6e0 000007fe`fa5537eb ntdll!RtlAllocateHeap+0x178
00000000`2a56f7f0 000007fe`fa52a163 TmUmEvt64!GetInterface+0x46e9b
00000000`2a56f820 000007fe`fa529c2d TmUmEvt64!GetInterface+0x1d813
00000000`2a56f850 000007fe`fa525c19 TmUmEvt64!GetInterface+0x1d2dd
00000000`2a56f880 000007fe`fa525fc9 TmUmEvt64!GetInterface+0x192c9
00000000`2a56f8b0 000007fe`fa5261f9 TmUmEvt64!GetInterface+0x19679
00000000`2a56f8e0 00000000`76fda719 TmUmEvt64!GetInterface+0x198a9
00000000`2a56f940 00000000`76fda46f ntdll!RtlUserThreadStart+0x1d9
00000000`2a56fa40 00000000`76fda36e ntdll!LdrInitializeThunk+0x10f
00000000`2a56fab0 00000000`00000000 ntdll!LdrInitializeThunk+0xe

. 11  Id: 2eb4.2e3c Suspend: 1 Teb: 000007ff`fffa4000 Unfrozen
      Start: ntdll!DbgUiRemoteBreakin (00000000`770a2dd0) 
      Priority: 0  Priority class: 32  Affinity: f
Child-SP          RetAddr           Call Site
00000000`2a76ff28 00000000`770a2e08 ntdll!DbgBreakPoint
00000000`2a76ff30 00000000`76da59cd ntdll!DbgUiRemoteBreakin+0x38
00000000`2a76ff60 00000000`76fda561 kernel32!BaseThreadInitThunk+0xd
00000000`2a76ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x21

Total threads: 12

Alex

@aclements
Member

Thanks for the two other dumps.

I did some more searching around and I'm almost positive tmmon64 and TmUmEvt64 are related to Trend Micro anti-virus, which agrees with what you said in #36492 (comment).

Unfortunately, I think its syscall interception is introducing a lock cycle that's leading to a deadlock.

In #36492 (comment), threads 6 and 8 have suspend counts > 1. Threads 0 and 6 (again) are in ResumeThread. So, thread 6 must have suspended thread 8 for preemption, and then when it was trying to resume thread 8, thread 0 suspended thread 6 for preemption. Where this gets interesting is that all three of these threads are in Windows memory allocation functions via tmmon64/TmUmEvt64. My guess is there's a cycle between threads 0 and 6: TmUmEvt64 on thread 6 locked the Windows heap inside ResumeThread, and was then suspended with that lock held. When thread 0 then tried to resume it with ResumeThread, TmUmEvt64 again tried to lock the Windows heap, but it can't get that lock, so it's stuck.

#36492 (comment) shows similar evidence: thread 6 is suspended in RtlUnlockHeap (via TmUmEvt64) and thread 4 is in GetThreadContext -> TmUmEvt64 -> RtlAllocateHeap, indicating that it has thread 6 suspended and is in a lock cycle on the Windows heap lock. Even completely serializing thread suspend/resume by moving the unlock doesn't help enough because TmUmEvt64 can wind up in Windows heap functions through other means.

So, ultimately, this is probably a bug in Trend Micro, but only because we're doing something really unusual with suspending our own threads, which, sadly, probably makes this our problem. The downsides of using SuspendThread keep piling up, but I have no idea what to replace it with. :(

@aclements
Member

Fascinating. It seems .NET uses SuspendThread for driving threads to GC safe-points. It seems they ran into similar problems with Windows heap locks in general, though not specifically related to system call interceptors.

I don't see anything in threadsuspend.cpp itself that's obviously different from what we do, but there's a huge comment about OS resources and SuspendThread that indicates they're carefully synchronizing every transition into and out of managed code (presumably this includes every "system call") so they don't even attempt to suspend a thread that isn't in managed code. If I've followed the twisty passages correctly, this winds up at DisablePreemptiveGC and EnablePreemptiveGC and ultimately RareDisablePreemptiveGC and RareEnablePreemptiveGC, which look like they can block transitions into and out of managed code depending on GC preemption state.

@alexbrainman
Member Author

I did some more searching around and I'm almost positive tmmon64 and TmUmEvt64 are related to Trend Micro anti-virus,

You are probably correct. I use this PC at work. Our admins run whatever software they like on it.

... we're doing something really unusual with suspending our own threads, which, sadly, probably makes this our problem.

I agree. I run a lot of different programs on that computer, and none of them hang.

This makes the Go build tools impossible to use on this computer. I suspect the same can be said about programs built with Go.

The PC is still running Windows 7, which is rare these days. Hopefully this bug is uncommon.

The downsides of using SuspendThread keep piling up, but I have no idea what to replace it with. :(

Personally I don't see any benefit from preemption. I don't run any code that requires preemption on Windows. I would just disable the preemption code on Windows. That would also avoid the Delve problems on Windows.

Fascinating. It seems .NET uses SuspendThread for driving threads to GC safe-points. It seems they ran into similar problems with Windows heap locks in general, though not specifically related to system call interceptors.

It is quite possible. But I would expect restrictions like that to be mentioned in the Windows API documentation. I am not aware of any such thing.

Adding @zx2c4 in case he has some bright ideas.

Alex

@alexbrainman
Member Author

Another attempt to verify this issue. I checked against ec51703 (tagged as go1.17).

This time it was harder to break: it took 3 runs of make.bat before it hung. I am not sure if that is because of the new version of Go, because the software on my OS changed, or because my PC is overloaded or under-loaded with other programs.

Here is how my environment is configured:

set HOME=c:\users\alexb\dev\
set GOROOT=%HOME%\go
set GOROOT_BOOTSTRAP=%HOME%\go1.4
set GOPATH=%HOME%
set MINGW=%HOME%\tdm_gcc_64_5.1.0
set PATH=%PATH%;%MINGW%\bin;%GOROOT%\bin
cd %GOROOT%\src
cmd

Here is what I did:

c:\Users\alexb\dev\go\src>git rev-parse HEAD
ec5170397c724a8ae440b2bc529f857c86f0e6b1

c:\Users\alexb\dev\go\src>make
Building Go cmd/dist using c:\users\alexb\dev\\go1.4
Building Go toolchain1 using c:\users\alexb\dev\\go1.4.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
Building Go toolchain2 using go_bootstrap and Go toolchain1.
Building Go toolchain3 using go_bootstrap and Go toolchain2.
Building packages and commands for windows/amd64.
---
Installed Go for windows/amd64 in c:\Users\alexb\dev\go
Installed commands in c:\Users\alexb\dev\go\bin

c:\Users\alexb\dev\go\src>make
Building Go cmd/dist using c:\users\alexb\dev\\go1.4
Building Go toolchain1 using c:\users\alexb\dev\\go1.4.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
Building Go toolchain2 using go_bootstrap and Go toolchain1.
Building Go toolchain3 using go_bootstrap and Go toolchain2.
Building packages and commands for windows/amd64.
---
Installed Go for windows/amd64 in c:\Users\alexb\dev\go
Installed commands in c:\Users\alexb\dev\go\bin

c:\Users\alexb\dev\go\src>make
Building Go cmd/dist using c:\users\alexb\dev\\go1.4
Building Go toolchain1 using c:\users\alexb\dev\\go1.4.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
Building Go toolchain2 using go_bootstrap and Go toolchain1.
Building Go toolchain3 using go_bootstrap and Go toolchain2.
Building packages and commands for windows/amd64.

Here is a Process Explorer screenshot of the Go build process tree that hung:

image

Running set GODEBUG=asyncpreemptoff=1 appears to mostly solve my problem with building and using Go.

Unfortunately Go still hangs if I run some tests in the runtime package, because some tests silently clear the GODEBUG environment variable. Just grep the runtime package for uses of internal/testenv.CleanCmdEnv for details.

Alex

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 7, 2022
@mknyszek mknyszek moved this to Triage Backlog in Go Compiler / Runtime Jul 15, 2022
@seankhliao seankhliao added this to the Unplanned milestone Aug 27, 2022