Clean up return->call address translation for backtraces #33380

c42f · 2019-09-25T06:32:27Z

Here I translate return->call address internally in jl_unw_stepn so that users never have to deal with this subtlety when calling StackTraces.lookup, as discussed on slack with @vtjnash. This is a prelude to bringing in some larger changes from #33277, but which got conflicted with the merge of #33190.

The first commit also rearranges the jl_unw_stepn API a little so that interpreter stack traces can always be fit in to bt_data when calling jl_unw_stepn incrementally for deep stacks + add test cases for that (fixes #29695). For the tests, clean up printing of CodeInfo objects wrapped in InterpreterIP as well.

The second commit reverts the API breaking name change lookupat->lookup from #33190. @timholy I think this resolves your reservations from #33190 (comment). With return->call address translation done inside jl_unw_stepn, there should largely be no need to make the API breaking change, as all backtrace addresses which the user has access to will already be translated and can be passed directly into StackTraces.lookup(), exactly as was done in julia 1.3. (There's one possible exception to this rule: If the user were looking up the debug info for a pointer to a C function not obtained from a backtrace, they previously would have had to use StackTraces.lookup(ptr+1) to undo the the ptr-1 translation that StackTraces.lookup did prior to #33190. This kind of usage is reflected in the test which does StackTraces.lookup(ctestptr) but it seems like an edge case and not worth breaking the API over.)

…_unw_stepn Translate return->call address internally in jl_unw_stepn so that users never have to deal with this subtlety when calling StackTraces.lookup. Also rearrange the jl_unw_stepn API a little so that interpreter stack traces can always be fit in to bt_data when calling jl_unw_stepn incrementally for deep stacks + add test cases for that (Fixes #29695). For the tests, clean up printing of CodeInfo objects wrapped in InterpreterIP.

With return->call address tanslation done inside jl_unw_stepn, there should largely be no need to make this API breaking change, as all backtrace addresses which the user has access to will already be translated and can be passed directly into `StackTraces.lookup()`, as was done previously. There's one possible exception to this rule: If the user were looking up the debug info for a pointer to a C function *not* obtained from a backtrace, they previously would have had to use `StackTraces.lookup(ptr+1)` to "undo" the the `ptr-1` translation that `StackTraces.lookup` used to perform prior to #33190. This seems like an edge case however and probably not worth breaking the API over.

c42f · 2019-09-25T08:10:21Z

src/stackwalk.c

+            // * The LLVM unwinder functions step() and setInfoBasedOnIPRegister()
+            //   https://github.com/llvm/llvm-project/blob/master/libunwind/src/UnwindCursor.hpp
+            // * The way that libunwind handles it in `unw_get_proc_name`:
+            //   https://lists.nongnu.org/archive/html/libunwind-devel/2014-06/msg00025.html


As you can see, I spent a while looking up what people do for this stuff. AFAICT it seems that return_ip - 1 should be sufficient for our use cases. libunwind does something slightly more fancy, but I guess we don't want to trace through signal handler frames. Also libunwind's dwarf.use_prev_instr field is hidden from us behind an opaque pointer in any case (see unw_get_proc_name).

The LLVM unwinder seems to have a special case for ARM in thumb mode and I tried to do the same for our ARMv7 support but I don't know how to test that yet. @yuyichao does the above use of _CPU_ARM_ look right?

As you can see, I spent a while looking up what people do for this stuff. AFAICT it seems that return_ip - 1 should be sufficient for our use cases. libunwind does something slightly more fancy, but I guess we don't want to trace through signal handler frames. Also libunwind's dwarf.use_prev_instr field is hidden from us behind an opaque pointer in any case (see unw_get_proc_name).

As long as we don't give that correct pointer back to libunwind and don't add the offset when we get the backtrace from the signal frame it should be fine.

does the above use of CPU_ARM look right?

LGTM.

don't add the offset when we get the backtrace from the signal frame

Oops, looks like I was doing that part wrong. This should now be cleaned up in the latest commit by avoiding doing the offset for contexts we know have come from a signal handler. I checked that this matches the behavior of libunwind's unw_backtrace() for those contexts on my machine.

…al frames Add an option to jl_unw_stepn to avoid adjusting the instruction pointer when we know the cursor derives from a signal frame. (We could alternatively try to do this with unw_is_signal_frame, but that wouldn't work for windows and would be an extra function call for each jl_unw_step which seems a bit unnecessary when we already know the top frame is the signal frame.) Also generalize skipping of the first few backtrace frames as needed to hide the internal backtrace machinery itself by adding a `skip` option to jl_unw_stepn/jl_backtrace_from_here/record_backtrace. As part of this also move the workaround for 32-bit windows into the backtrace internals. Move Base.backtrace into error.jl as it doesn't need any lookup functionality from StackTraces to work correctly.

c42f · 2019-09-26T05:06:59Z

I've added another commit here to be more careful to not adjust the pointer in the case of signal frames (@vtjnash I think this will also be more correct for use in your Profiler work)

I also generalized the skipping of certain internal frames involved in capturing the backtraces themselves so frames for jl_backtrace_from_here and jl_throw etc can be systematically elided. (For internal debugging it's possible to put them back in by using a negative skip argument to jl_backtrace_from_here.)

base/stacktraces.jl

c42f · 2019-10-02T03:50:04Z

Bump. Any takers for a more detailed look? @JeffBezanson?

Perhaps I should just merge this as it allows us to revert a breaking change, passes tests, and also fixes a long standing bug. But having someone read it closely enough to click "Approve" is always appreciated!

vchuravy

looks good to me.

c42f · 2019-10-02T23:28:40Z

Thanks guys!

c42f added 2 commits September 25, 2019 16:14

c42f requested a review from vtjnash September 25, 2019 06:32

c42f added error handling Handling of exceptions by Julia or the user bugfix This change fixes an existing bug labels Sep 25, 2019

c42f mentioned this pull request Sep 25, 2019

Make backtrace buffer handling more systematic #33277

Merged

Clear ARMv7 instruction pointer encoding bit for use in caller lookup

141ff51

c42f commented Sep 25, 2019

View reviewed changes

c42f force-pushed the cjf/bt-lookup-offset branch from 4c6dc81 to 7528266 Compare September 26, 2019 04:58

ararslan reviewed Sep 29, 2019

View reviewed changes

base/stacktraces.jl Show resolved Hide resolved

ararslan requested a review from JeffBezanson October 2, 2019 04:02

vchuravy approved these changes Oct 2, 2019

View reviewed changes

JeffBezanson merged commit 3f37178 into master Oct 2, 2019

JeffBezanson deleted the cjf/bt-lookup-offset branch October 2, 2019 21:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clean up return->call address translation for backtraces #33380

Clean up return->call address translation for backtraces #33380

c42f commented Sep 25, 2019

c42f Sep 25, 2019 •

edited

Loading

yuyichao Sep 25, 2019

c42f Sep 26, 2019

c42f commented Sep 26, 2019 •

edited

Loading

c42f commented Oct 2, 2019

vchuravy left a comment

c42f commented Oct 2, 2019

Clean up return->call address translation for backtraces #33380

Clean up return->call address translation for backtraces #33380

Conversation

c42f commented Sep 25, 2019

c42f Sep 25, 2019 • edited Loading

Choose a reason for hiding this comment

yuyichao Sep 25, 2019

Choose a reason for hiding this comment

c42f Sep 26, 2019

Choose a reason for hiding this comment

c42f commented Sep 26, 2019 • edited Loading

c42f commented Oct 2, 2019

vchuravy left a comment

Choose a reason for hiding this comment

c42f commented Oct 2, 2019

c42f Sep 25, 2019 •

edited

Loading

c42f commented Sep 26, 2019 •

edited

Loading