Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

problems with interpreter backtraces #29695

Closed
vtjnash opened this issue Oct 17, 2018 · 1 comment · Fixed by #33380
Closed

problems with interpreter backtraces #29695

vtjnash opened this issue Oct 17, 2018 · 1 comment · Fixed by #33380
Labels
compiler:interpreter error handling Handling of exceptions by Julia or the user

Comments

@vtjnash
Copy link
Member

vtjnash commented Oct 17, 2018

Seems to be two, possibly disjoint problems observed here:

  1. "repeated 387 more times" and the immediately subsequent "repeated 166 more times" are precisely the same stacktrace
  2. the interpreter duplicates f28442 on the callstack
$ ./julia --compile=min -q
julia> begin
           f28442(c) = g28442(c + 1)
           g28442(c) = c > 2000 ? (return backtrace()) : f28442(c + 1)
           bt = f28442(1)
           Base.show_backtrace(stdout, bt)
       end

Stacktrace:
 [1] g28442(::Int64) at REPL[0]:3
 [2] f28442(::Int64) at REPL[0]:2
 ... (the last 2 lines are repeated 442 more times)
 [887] g28442(::Int64) at REPL[0]:3
 [888] f28442(::Int64) at REPL[0]:2 (repeats 2 times)
 [889] g28442(::Int64) at REPL[0]:3
 ... (the last 2 lines are repeated 387 more times)
 [1664] f28442(::Int64) at REPL[0]:2 (repeats 2 times)
 [1665] g28442(::Int64) at REPL[0]:3
 ... (the last 2 lines are repeated 166 more times)
 [1998] f28442(::Int64) at REPL[0]:2
 [1999] top-level scope at REPL[0]:4
 [2000] eval(::Module, ::Any) at ./boot.jl:319
 [2001] eval_user_input(::Any, ::REPL.REPLBackend) at REPL.jl:85
 [2002] macro expansion at REPL.jl:117 [inlined]
@ararslan ararslan added compiler:interpreter error handling Handling of exceptions by Julia or the user labels Oct 21, 2018
@c42f
Copy link
Member

c42f commented Oct 26, 2018

the interpreter duplicates f28442 on the callstack

Actually this appears to be the opposite: g28442 is "missing" from the callstack (actually present as the C frame Lenter_interpreter_frame_start_val). Here's the problematic part of the stacktrace with C frames turned on:

g28442(::Int64) at REPL[30]:3
jl_fptr_interpret_call at interpreter.c:861
jl_apply_generic at gf.c:2209
do_call at interpreter.c:323
eval_value at interpreter.c:411
eval_stmt_value at interpreter.c:362 [inlined]
eval_body at interpreter.c:757
jl_interpret_call_callback at interpreter.c:853

f28442(::Int64) at REPL[30]:2
jl_fptr_interpret_call at interpreter.c:861
jl_apply_generic at gf.c:2209
do_call at interpreter.c:323
eval_value at interpreter.c:411
eval_stmt_value at interpreter.c:362 [inlined]
eval_body at interpreter.c:757
jl_interpret_call_callback at interpreter.c:853

Lenter_interpreter_frame_start_val at libjulia.so.1:?
jl_fptr_interpret_call at interpreter.c:861
jl_apply_generic at gf.c:2209
do_call at interpreter.c:323
eval_value at interpreter.c:411
eval_stmt_value at interpreter.c:362 [inlined]
eval_body at interpreter.c:757
jl_interpret_call_callback at interpreter.c:853

f28442(::Int64) at REPL[30]:2
jl_fptr_interpret_call at interpreter.c:861
jl_apply_generic at gf.c:2209
do_call at interpreter.c:323
eval_value at interpreter.c:411
eval_stmt_value at interpreter.c:362 [inlined]
eval_body at interpreter.c:757
jl_interpret_call_callback at interpreter.c:853

Looking at the code, the reason for this is they way that jl_backtrace_from_here captures the stack trace in chunks; when there's an interpreter frame on the boundary of a chunk there's not enough space to capture it inside jl_unw_stepn, but instead of falling back to the next chunk the frame gets captured as a plain C frame instead.

c42f added a commit that referenced this issue Sep 16, 2019
Increase expressibility of what can be stored in backtrace buffers,
while ensuring that the GC can find roots without knowing about the
detail.

To do this, introduce a new "extended backtrace entry" format which
carries along the number of roots and other data in a bitpacked format.
This allows the backtrace buffer to be traversed and the roots collected
in a general way, without the GC knowing about interpreter frames. Use
this to add the module to InterperterIP so that the module of
interpreted top level thunks can be known.

In the future the extended entry format should allow us to be a lot more
flexible with what can be stored in a backtrace. For example, we could
* Compress the backtrace cycles of runaway recursive functions so that
  stack overflows are much more likely to fit in the fixed-size bt_data
  array.
* Integrate external or other types of interpreter frames into the
  backtrace machinery.

Also:
* Fix some problems with how jl_unw_stepn is used to collect backtraces
  in chunks in jl_backtrace_from_here (fixes #29695).
* Improve printing of InterpreterIP's holding CodeInfo objects so they
  fit on one line.
c42f added a commit that referenced this issue Sep 16, 2019
Increase expressibility of what can be stored in backtrace buffers,
while ensuring that the GC can find roots without knowing about the
detail.

To do this, introduce a new "extended backtrace entry" format which
carries along the number of roots and other data in a bitpacked format.
This allows the backtrace buffer to be traversed and the roots collected
in a general way, without the GC knowing about interpreter frames. Use
this to add the module to InterperterIP so that the module of
interpreted top level thunks can be known.

In the future the extended entry format should allow us to be a lot more
flexible with what can be stored in a backtrace. For example, we could
* Compress the backtrace cycles of runaway recursive functions so that
  stack overflows are much more likely to fit in the fixed-size bt_data
  array.
* Integrate external or other types of interpreter frames into the
  backtrace machinery.

Also:
* Add interpreter frames to stack trace dumping of fatal errors.
* Fix some problems with how jl_unw_stepn is used to collect backtraces
  in chunks in jl_backtrace_from_here (fixes #29695).
* Improve printing of InterpreterIP's holding CodeInfo objects so they
  fit on one line.
c42f added a commit that referenced this issue Sep 17, 2019
Increase expressibility of what can be stored in backtrace buffers,
while ensuring that the GC can find roots without knowing about the
detail.

To do this, introduce a new "extended backtrace entry" format which
carries along the number of roots and other data in a bitpacked format.
This allows the backtrace buffer to be traversed and the roots collected
in a general way, without the GC knowing about interpreter frames. Use
this to add the module to InterperterIP so that the module of
interpreted top level thunks can be known.

In the future the extended entry format should allow us to be a lot more
flexible with what can be stored in a backtrace. For example, we could
* Compress the backtrace cycles of runaway recursive functions so that
  stack overflows are much more likely to fit in the fixed-size bt_data
  array.
* Integrate external or other types of interpreter frames into the
  backtrace machinery.

Also:
* Add interpreter frames to stack trace dumping of fatal errors.
* Fix some problems with how jl_unw_stepn is used to collect backtraces
  in chunks in jl_backtrace_from_here (fixes #29695).
* Improve printing of InterpreterIP's holding CodeInfo objects so they
  fit on one line.
c42f added a commit that referenced this issue Sep 24, 2019
* Translate return->call address internally in jl_unw_stepn so that
  users don't have to deal with this subtlety

* Rearrange the jl_unw_stepn return code so that interpreter stack
  traces can always be fit in to bt_data when calling jl_unw_stepn
  incrementally for deep stacks. Fixes #29695
c42f added a commit that referenced this issue Sep 25, 2019
…_unw_stepn

Translate return->call address internally in jl_unw_stepn so that users
never have to deal with this subtlety when calling StackTraces.lookup.

Also rearrange the jl_unw_stepn API a little so that interpreter stack
traces can always be fit in to bt_data when calling jl_unw_stepn
incrementally for deep stacks. Fixes #29695
c42f added a commit that referenced this issue Sep 25, 2019
…_unw_stepn

Translate return->call address internally in jl_unw_stepn so that users
never have to deal with this subtlety when calling StackTraces.lookup.

Also rearrange the jl_unw_stepn API a little so that interpreter stack
traces can always be fit in to bt_data when calling jl_unw_stepn
incrementally for deep stacks + add test cases for that (Fixes #29695).
c42f added a commit that referenced this issue Sep 25, 2019
…_unw_stepn

Translate return->call address internally in jl_unw_stepn so that users
never have to deal with this subtlety when calling StackTraces.lookup.

Also rearrange the jl_unw_stepn API a little so that interpreter stack
traces can always be fit in to bt_data when calling jl_unw_stepn
incrementally for deep stacks + add test cases for that (Fixes #29695).
For the tests, clean up printing of CodeInfo objects wrapped in
InterpreterIP.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:interpreter error handling Handling of exceptions by Julia or the user
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants