problems with interpreter backtraces #29695

vtjnash · 2018-10-17T21:47:05Z

Seems to be two, possibly disjoint problems observed here:

"repeated 387 more times" and the immediately subsequent "repeated 166 more times" are precisely the same stacktrace
the interpreter duplicates f28442 on the callstack

$ ./julia --compile=min -q
julia> begin
           f28442(c) = g28442(c + 1)
           g28442(c) = c > 2000 ? (return backtrace()) : f28442(c + 1)
           bt = f28442(1)
           Base.show_backtrace(stdout, bt)
       end

Stacktrace:
 [1] g28442(::Int64) at REPL[0]:3
 [2] f28442(::Int64) at REPL[0]:2
 ... (the last 2 lines are repeated 442 more times)
 [887] g28442(::Int64) at REPL[0]:3
 [888] f28442(::Int64) at REPL[0]:2 (repeats 2 times)
 [889] g28442(::Int64) at REPL[0]:3
 ... (the last 2 lines are repeated 387 more times)
 [1664] f28442(::Int64) at REPL[0]:2 (repeats 2 times)
 [1665] g28442(::Int64) at REPL[0]:3
 ... (the last 2 lines are repeated 166 more times)
 [1998] f28442(::Int64) at REPL[0]:2
 [1999] top-level scope at REPL[0]:4
 [2000] eval(::Module, ::Any) at ./boot.jl:319
 [2001] eval_user_input(::Any, ::REPL.REPLBackend) at REPL.jl:85
 [2002] macro expansion at REPL.jl:117 [inlined]

The text was updated successfully, but these errors were encountered:

c42f · 2018-10-26T06:48:30Z

the interpreter duplicates f28442 on the callstack

Actually this appears to be the opposite: g28442 is "missing" from the callstack (actually present as the C frame Lenter_interpreter_frame_start_val). Here's the problematic part of the stacktrace with C frames turned on:

g28442(::Int64) at REPL[30]:3
jl_fptr_interpret_call at interpreter.c:861
jl_apply_generic at gf.c:2209
do_call at interpreter.c:323
eval_value at interpreter.c:411
eval_stmt_value at interpreter.c:362 [inlined]
eval_body at interpreter.c:757
jl_interpret_call_callback at interpreter.c:853

f28442(::Int64) at REPL[30]:2
jl_fptr_interpret_call at interpreter.c:861
jl_apply_generic at gf.c:2209
do_call at interpreter.c:323
eval_value at interpreter.c:411
eval_stmt_value at interpreter.c:362 [inlined]
eval_body at interpreter.c:757
jl_interpret_call_callback at interpreter.c:853

Lenter_interpreter_frame_start_val at libjulia.so.1:?
jl_fptr_interpret_call at interpreter.c:861
jl_apply_generic at gf.c:2209
do_call at interpreter.c:323
eval_value at interpreter.c:411
eval_stmt_value at interpreter.c:362 [inlined]
eval_body at interpreter.c:757
jl_interpret_call_callback at interpreter.c:853

f28442(::Int64) at REPL[30]:2
jl_fptr_interpret_call at interpreter.c:861
jl_apply_generic at gf.c:2209
do_call at interpreter.c:323
eval_value at interpreter.c:411
eval_stmt_value at interpreter.c:362 [inlined]
eval_body at interpreter.c:757
jl_interpret_call_callback at interpreter.c:853

Looking at the code, the reason for this is they way that jl_backtrace_from_here captures the stack trace in chunks; when there's an interpreter frame on the boundary of a chunk there's not enough space to capture it inside jl_unw_stepn, but instead of falling back to the next chunk the frame gets captured as a plain C frame instead.

Increase expressibility of what can be stored in backtrace buffers, while ensuring that the GC can find roots without knowing about the detail. To do this, introduce a new "extended backtrace entry" format which carries along the number of roots and other data in a bitpacked format. This allows the backtrace buffer to be traversed and the roots collected in a general way, without the GC knowing about interpreter frames. Use this to add the module to InterperterIP so that the module of interpreted top level thunks can be known. In the future the extended entry format should allow us to be a lot more flexible with what can be stored in a backtrace. For example, we could * Compress the backtrace cycles of runaway recursive functions so that stack overflows are much more likely to fit in the fixed-size bt_data array. * Integrate external or other types of interpreter frames into the backtrace machinery. Also: * Fix some problems with how jl_unw_stepn is used to collect backtraces in chunks in jl_backtrace_from_here (fixes #29695). * Improve printing of InterpreterIP's holding CodeInfo objects so they fit on one line.

Increase expressibility of what can be stored in backtrace buffers, while ensuring that the GC can find roots without knowing about the detail. To do this, introduce a new "extended backtrace entry" format which carries along the number of roots and other data in a bitpacked format. This allows the backtrace buffer to be traversed and the roots collected in a general way, without the GC knowing about interpreter frames. Use this to add the module to InterperterIP so that the module of interpreted top level thunks can be known. In the future the extended entry format should allow us to be a lot more flexible with what can be stored in a backtrace. For example, we could * Compress the backtrace cycles of runaway recursive functions so that stack overflows are much more likely to fit in the fixed-size bt_data array. * Integrate external or other types of interpreter frames into the backtrace machinery. Also: * Add interpreter frames to stack trace dumping of fatal errors. * Fix some problems with how jl_unw_stepn is used to collect backtraces in chunks in jl_backtrace_from_here (fixes #29695). * Improve printing of InterpreterIP's holding CodeInfo objects so they fit on one line.

* Translate return->call address internally in jl_unw_stepn so that users don't have to deal with this subtlety * Rearrange the jl_unw_stepn return code so that interpreter stack traces can always be fit in to bt_data when calling jl_unw_stepn incrementally for deep stacks. Fixes #29695

…_unw_stepn Translate return->call address internally in jl_unw_stepn so that users never have to deal with this subtlety when calling StackTraces.lookup. Also rearrange the jl_unw_stepn API a little so that interpreter stack traces can always be fit in to bt_data when calling jl_unw_stepn incrementally for deep stacks. Fixes #29695

…_unw_stepn Translate return->call address internally in jl_unw_stepn so that users never have to deal with this subtlety when calling StackTraces.lookup. Also rearrange the jl_unw_stepn API a little so that interpreter stack traces can always be fit in to bt_data when calling jl_unw_stepn incrementally for deep stacks + add test cases for that (Fixes #29695).

…_unw_stepn Translate return->call address internally in jl_unw_stepn so that users never have to deal with this subtlety when calling StackTraces.lookup. Also rearrange the jl_unw_stepn API a little so that interpreter stack traces can always be fit in to bt_data when calling jl_unw_stepn incrementally for deep stacks + add test cases for that (Fixes #29695). For the tests, clean up printing of CodeInfo objects wrapped in InterpreterIP.

vtjnash mentioned this issue Oct 18, 2018

tests known to fail in the interpreter #29379

Closed

4 tasks

ararslan added compiler:interpreter error handling Handling of exceptions by Julia or the user labels Oct 21, 2018

ararslan mentioned this issue Dec 1, 2018

try changes backtraces on FreeBSD #30233

Closed

c42f mentioned this issue Sep 16, 2019

Make backtrace buffer handling more systematic #33277

Merged

c42f mentioned this issue Sep 25, 2019

Clean up return->call address translation for backtraces #33380

Merged

JeffBezanson closed this as completed in #33380 Oct 2, 2019

IanButterworth mentioned this issue Mar 2, 2024

backtrace test failing on windows #53548

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

problems with interpreter backtraces #29695

problems with interpreter backtraces #29695

vtjnash commented Oct 17, 2018

c42f commented Oct 26, 2018 •

edited

Loading

problems with interpreter backtraces #29695

problems with interpreter backtraces #29695

Comments

vtjnash commented Oct 17, 2018

c42f commented Oct 26, 2018 • edited Loading

c42f commented Oct 26, 2018 •

edited

Loading