Infinite recursion from GetThreadID in Debugger #49
Comments
Yes! It makes perfect sense. I now can't believe I didn't see this before. Thanks for shedding light - again, two issues here. Looks like the underlying issue has been there for a long time - both from when I added the […] While I think the […] Your workaround could in fact be called a fix. I will implement it (or a variant thereof) and get it tested and published asap. I may try and improve […] A massive thanks for this work.
…or LdrpInvertedFunctionTableSRWLock held - thank you @michaelweiser (#49)
Fix now pushed - thank you again!
Hi,
I'm seeing stack overflow exceptions on Windows 10 with even the simplest program doing a single API call:
Unfortunately, I was not able to grab any meaningful backtrace beyond it happening in enter_hook() and operate_on_backtrace().

Through single-stepping the code I think I have found the root cause but can only describe it verbally with links into the capemon source code:
- enter_hook() calls __called_by_hook() to prevent hook recursion (capemon/hooking.c, line 293 in e62f1a4)
- __called_by_hook() runs addr_in_our_dll_range() via operate_on_backtrace() (capemon/hooking.c, line 181 in e62f1a4)
- operate_on_backtrace() in the 64-bit version runs our_stackwalk() to retrieve the number of stack frames to look at (capemon/hooking_64.c, line 1168 in e62f1a4)
- our_stackwalk() will return zero if the SRW lock is held or an EXCEPTION_EXECUTE_HANDLER exception occurs (I'm fuzzy on the details of the latter) (capemon/hooking_64.c, lines 1126 and 1150 in e62f1a4)
- this causes operate_on_backtrace() to never call addr_in_our_dll_range() and to default to returning zero

In the context of __called_by_hook(), that zero means that enter_hook() was not triggered from another hook. This essentially creates potential for unwanted hook recursion whenever the SRW lock is held or that execution exception occurs during stack unwinding.

This seems to quite reliably be triggered and turned into infinite recursion by the Debugger:
- __called_by_hook() having told enter_hook() that it was not called by a hook, api_dispatch() is called
- api_dispatch() may (and in my observation basically always does) call InitNewThreadBreakpoints()
- InitNewThreadBreakpoints() calls CreateThreadBreakpoints()
- CreateThreadBreakpoints() calls GetThreadId()
- GetThreadId() internally (at least on Windows 10) calls NtQueryInformationThread() -> which is hooked

This causes instantaneous infinite hook recursion on any hooked API call (at the very least if the SRW lock is held), leading to the observed stack overflow.
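The steps above can be modelled in portable C. This is only a rough sketch of the control flow as described, not capemon's actual code. The function bodies, the depth cap, the frame count of 3 and the simulate() helper are all assumptions made for illustration. The real GetThreadId() and NtQueryInformationThread() are replaced by empty stand-ins so the model runs anywhere.

```c
#include <assert.h>

/* Portable model of the call chain; every function is a simplified
   stand-in named after the capemon function it imitates. */

#define DEPTH_CAP 50            /* stop the model before a real stack overflow */

static int lock_held;           /* 1 models "the SRW lock is held" */
static int depth;               /* current nesting level of enter_hook() */
static int max_depth;           /* deepest nesting seen in one run */

static void enter_hook(void);   /* forward declaration: the chain is recursive */

/* Returns the number of frames walked; 0 when the stack cannot be walked. */
static int our_stackwalk(void) { return lock_held ? 0 : 3; }

/* Model assumption: a frame inside our DLL exists only when nested in a hook. */
static int addr_in_our_dll_range(void) { return depth > 1; }

static int operate_on_backtrace(void) {
    int frames = our_stackwalk();
    int ret = 0;                /* default when no frame is inspected */
    for (int i = 0; i < frames; i++) {
        ret = addr_in_our_dll_range();
        if (ret) break;
    }
    return ret;
}

/* Named called_by_hook here: a leading double underscore is reserved in C. */
static int called_by_hook(void) { return operate_on_backtrace(); }

static void NtQueryInformationThread_hooked(void) { enter_hook(); }
static void GetThreadId_model(void) { NtQueryInformationThread_hooked(); }
static void CreateThreadBreakpoints(void) { GetThreadId_model(); }
static void InitNewThreadBreakpoints(void) { CreateThreadBreakpoints(); }
static void api_dispatch(void) { InitNewThreadBreakpoints(); }

static void enter_hook(void) {
    depth++;
    if (depth > max_depth) max_depth = depth;
    /* Without the cap, a held lock would recurse until the stack overflows. */
    if (depth < DEPTH_CAP && !called_by_hook())
        api_dispatch();
    depth--;
}

/* Runs one hooked API call and reports the deepest hook nesting reached. */
int simulate(int held) {
    lock_held = held;
    depth = 0;
    max_depth = 0;
    enter_hook();
    assert(depth == 0);         /* every entry was matched by an exit */
    return max_depth;
}
```

In this model, simulate(0) stops at a nesting depth of 2 because the nested hook is detected, while simulate(1) runs all the way to the artificial cap. That corresponds to the unbounded recursion seen in the real process.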
To recap, the call chain is:
/any API call/ -> [recurse: enter_hook() + __called_by_hook() == 0 -> api_dispatch() -> InitNewThreadBreakpoints() -> CreateThreadBreakpoints() -> GetThreadId() -> NtQueryInformationThread()]

My workaround looks like this:
What this does is make our_stackwalk() indicate the inability to walk the stack at all by returning -1. This will still make operate_on_backtrace() not call addr_in_our_dll_range() but the changed return code default of -1 will again indicate that fact to the caller. The only caller evaluating the return code at all is __called_by_hook(). There we now cautiously return 1, meaning "yes, we've been or at least could have been called from a hook". This successfully prevents the infinite recursion and subsequent stack overflow in my tests.

Does any of that make sense?
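The original patch is not reproduced here, so here is a portable C model of the behaviour just described. It is a sketch of one reading of that description, not the actual diff. The function bodies, the frame count of 3 and the simulate() helper are assumptions, and the debugger call chain is collapsed into a single api_dispatch() stand-in.

```c
#include <assert.h>

/* Portable model of the workaround; every function is a simplified
   stand-in named after the capemon function it imitates. */

static int lock_held;           /* 1 models "the SRW lock is held" */
static int depth;               /* current nesting level of enter_hook() */
static int max_depth;           /* deepest nesting seen in one run */

static void enter_hook(void);   /* forward declaration: the chain is recursive */

/* Returns -1 instead of 0 when the stack cannot be walked at all. */
static int our_stackwalk(void) { return lock_held ? -1 : 3; }

/* Model assumption: a frame inside our DLL exists only when nested in a hook. */
static int addr_in_our_dll_range(void) { return depth > 1; }

static int operate_on_backtrace(void) {
    int frames = our_stackwalk();
    if (frames < 0)
        return -1;              /* pass "could not walk" on to the caller */
    for (int i = 0; i < frames; i++)
        if (addr_in_our_dll_range())
            return 1;
    return 0;
}

/* Named called_by_hook here: a leading double underscore is reserved in C. */
static int called_by_hook(void) {
    int ret = operate_on_backtrace();
    if (ret < 0)
        return 1;               /* cautious: we may have been called from a hook */
    return ret;
}

/* Stands in for the debugger path that ends in a hooked API call. */
static void api_dispatch(void) { enter_hook(); }

static void enter_hook(void) {
    depth++;
    if (depth > max_depth) max_depth = depth;
    if (!called_by_hook())
        api_dispatch();
    depth--;
}

/* Runs one hooked API call and reports the deepest hook nesting reached. */
int simulate(int held) {
    lock_held = held;
    depth = 0;
    max_depth = 0;
    enter_hook();
    assert(depth == 0);         /* every entry was matched by an exit */
    return max_depth;
}
```

In this model no depth cap is needed: with the lock held, simulate(1) never goes past the first enter_hook() call, and simulate(0) still stops at depth 2 as before. The trade-off visible in the model is that a hooked call arriving while the stack cannot be walked is treated as nested, so api_dispatch() is skipped for it.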