on win64, use SEH unwind tables to walk callstack #1222

Open
derekbruening opened this issue Nov 28, 2014 · 2 comments

@derekbruening

From [email protected] on May 08, 2013 22:42:06

With the common lack of frame pointers for win64 code, we should use the
SEH unwind tables to construct callstacks.

I'm seeing a lot of callstack issues with extra frames on win64 with
-replace_malloc, and this is one solution.

Original issue: http://code.google.com/p/drmemory/issues/detail?id=1222

@derekbruening

Summary at WIP point Dec 2015:

Unfortunately RtlLookupFunctionTable acquires a lock
(LdrpInvertedFunctionTableSRWLock), making it unsafe for us to use. Thus
we must implement our own RtlLookupFunctionTable and RtlVirtualUnwind
routines.

Fortunately, we already have .pdata parsing code in DR in
add_SEH_to_rct_table().
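
As a rough illustration of what a lock-free lookup over .pdata could look like, here is a minimal sketch. It assumes the module's .pdata section has already been located and is a sorted array of x64 RUNTIME_FUNCTION-style entries (three 32-bit RVAs each). The names `pdata_entry_t` and `lookup_pdata_entry` are hypothetical and are not DR or Dr. Memory identifiers.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors the layout of an x64 RUNTIME_FUNCTION entry in .pdata:
 * three 32-bit RVAs relative to the module base.
 * Hypothetical type name, for illustration only. */
typedef struct {
    uint32_t begin_rva;       /* first byte of the function */
    uint32_t end_rva;         /* one past the last byte (exclusive) */
    uint32_t unwind_info_rva; /* RVA of the UNWIND_INFO structure */
} pdata_entry_t;

/* Hypothetical helper: binary-search a sorted .pdata array for the entry
 * covering pc_rva.  Takes no locks and calls no system routines.
 * Returns NULL if no entry covers the PC (e.g., a leaf function). */
static const pdata_entry_t *
lookup_pdata_entry(const pdata_entry_t *table, size_t count, uint32_t pc_rva)
{
    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (pc_rva < table[mid].begin_rva)
            hi = mid;
        else if (pc_rva >= table[mid].end_rva)
            lo = mid + 1;
        else
            return &table[mid];
    }
    return NULL;
}
```

A NULL result is the natural point to either treat the function as a leaf (return address at the top of the stack) or fall back to stack scanning.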

Xref:

  • win32/module.c add_SEH_to_rct_table()
  • i#1300: investigate new Ki export on Win8.1 x64: KiUserInvertedFunctionTable
  • PR 250395: RCT targets and x64 SEH
  • PR 276527: [x64] VIOLATIONS starting up IE64: SEH64 handlers

Challenges:

  • What about dynamically generated code (DGC) that has called
    RtlInstallFunctionTableCallback or RtlAddFunctionTable? I would say, do
    not handle this -- just fall back to stack scanning.
  • We need the PC and FP now instead of just the SP. This requires
    refactoring all code requesting callstacks. For -replace_malloc, this
    requires drwrap storing the caller's FP. Alternatively, can we start the
    unwind in our replace routine? Is the frame sequence seamless enough?!?
    Or, we can fall back to stack scanning when an FP is needed, which is
    pretty rare (I measured 1 in 500 on Chromium test binaries).
  • Unwinding is fragile: if the SP is not quite right it fails, yet there is
    no clear success indication. We probably need to integrate it into the
    current callstack walk and use heuristics to decide when to fall back to
    stack scanning on each frame.
  • The source of the bogus retaddrs for the bogus callstack frames with the
    current callstack walk (which is not using any unwind data) is not yet
    known -- we should figure out whether we can easily eliminate those.
    Xref #1833 ([x64] CRASH due to bogus callstack frames messing up nosy
    heap handling).
  • Unwinding is failing when inside some ntdll routines -- or is xsp just off
    or something?
  • I've seen other weird things, like a DR routine show up when doing an
    unwind, which makes no sense. Basically, so far this unwinding has been
    fragile and unreliable, at least with our current app state starting points.
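
To make the SP-versus-FP point concrete, here is a minimal sketch of walking a function's x64 unwind codes to compute how far the return address sits above the current SP. It is a simplification under stated assumptions: the PC is past the prolog and not in an epilog, the UNWIND_INFO is version 1 with no chained info, and anything needing a frame pointer (UWOP_SET_FPREG) or a machine frame is reported as unsupported so the caller can fall back to stack scanning. The names `unwind_code_t` and `frame_size_from_codes` are hypothetical; only the opcode values and slot encodings follow the documented x64 unwind format.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode values from the documented x64 unwind-code format. */
enum {
    UWOP_PUSH_NONVOL = 0,
    UWOP_ALLOC_LARGE = 1,
    UWOP_ALLOC_SMALL = 2,
    UWOP_SET_FPREG = 3,
    UWOP_SAVE_NONVOL = 4,
    UWOP_SAVE_NONVOL_FAR = 5,
    UWOP_SAVE_XMM128 = 8,
    UWOP_SAVE_XMM128_FAR = 9,
};

/* One 2-byte unwind-code slot (hypothetical type name). */
typedef struct {
    uint8_t code_offset; /* prolog offset; low byte when the slot is raw data */
    uint8_t op_and_info; /* low 4 bits: opcode; high 4 bits: op info */
} unwind_code_t;

/* Reads a slot as a raw little-endian 16-bit operand. */
static uint32_t
slot_u16(const unwind_code_t *c)
{
    return (uint32_t)c->code_offset | ((uint32_t)c->op_and_info << 8);
}

/* Hypothetical helper: sums the stack bytes the prolog allocated, so the
 * return address would be at SP + *size_out.  Returns false when the frame
 * cannot be unwound from the SP alone (FP needed, unknown op, bad data). */
static bool
frame_size_from_codes(const unwind_code_t *codes, size_t count,
                      uint64_t *size_out)
{
    uint64_t size = 0;
    size_t i = 0;
    while (i < count) {
        uint8_t op = codes[i].op_and_info & 0xf;
        uint8_t info = codes[i].op_and_info >> 4;
        switch (op) {
        case UWOP_PUSH_NONVOL: size += 8; i += 1; break;
        case UWOP_ALLOC_SMALL: size += (uint64_t)info * 8 + 8; i += 1; break;
        case UWOP_ALLOC_LARGE:
            if (info == 0 && i + 1 < count) {
                /* Next slot holds size / 8. */
                size += (uint64_t)slot_u16(&codes[i + 1]) * 8;
                i += 2;
            } else if (info == 1 && i + 2 < count) {
                /* Next two slots hold the unscaled 32-bit size. */
                size += (uint64_t)slot_u16(&codes[i + 1]) |
                    ((uint64_t)slot_u16(&codes[i + 2]) << 16);
                i += 3;
            } else
                return false;
            break;
        /* Register saves do not move the SP; just skip their operands. */
        case UWOP_SAVE_NONVOL:
        case UWOP_SAVE_XMM128: i += 2; break;
        case UWOP_SAVE_NONVOL_FAR:
        case UWOP_SAVE_XMM128_FAR: i += 3; break;
        default: return false; /* UWOP_SET_FPREG, machine frames, unknown */
        }
    }
    *size_out = size;
    return true;
}
```

Because a wrong starting SP still yields a plausible-looking number here, the result would need the same kind of per-frame sanity checks (return address inside a module, SP within the thread's stack) before being trusted over stack scanning.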

@derekbruening

We would want to put this in place for any general library for #823.

derekbruening added a commit that referenced this issue Oct 17, 2021
Updates DR to 53af6c7 for the new drcallstack library.

Adds a new option -callstack_use_unwind which is on by default for
Linux.  This uses drcallstack's libunwind-based callstack walk, which
fixes problems of missing frames.  If the starting PC is not in a
module, the old callstack walking is used.

Updates malloc replacement contexts to include the PC as of the same
point as the captured stack pointer, for proper libunwind input.

Issue: #823, #2399, #2392, #1222
Fixes #2392
derekbruening added a commit that referenced this issue Oct 18, 2021
(same commit message as the Oct 17, 2021 commit)