Extend stacktrace for failed tests not directly under testset #49451

serenity4 · 2023-04-21T10:08:52Z

Issue

When tests are factored out into functions, the stacktrace for failed or errored @tests only includes the frames related to the macrocall inside that function, discarding frames higher up. Because information about the callsite of such testing functions is not preserved, it makes finding failing tests much harder as there is no line indicating where the function was called. This is especially inconvenient when these testing functions are called in multiple places.

Basic example:

using Test

function test_smallprime(x)
  @test x < 10
  @test x in (2, 3, 5, 7)
end

@testset "Tests" begin
  test_smallprime(7)
  test_smallprime(1)
  test_smallprime(10)
  test_smallprime(13)
end

Stacktrace on master

Tests: Test Failed at /home/serenity4/julia/src/stdlib/Test/example.jl:5
Expression: x in (2, 3, 5, 7)
 Evaluated: 1 in (2, 3, 5, 7)

Stacktrace:
 [1] macro expansion
   @ ~/.asdf/installs/julia/nightly/share/julia/stdlib/v1.10/Test/src/Test.jl:478 [inlined]
 [2] test_smallprime(x::Int64)
   @ Main ~/julia/src/stdlib/Test/example.jl:5
Tests: Test Failed at /home/serenity4/julia/src/stdlib/Test/example.jl:4
  Expression: x < 10
   Evaluated: 10 < 10

Stacktrace:
 [1] macro expansion
   @ ~/.asdf/installs/julia/nightly/share/julia/stdlib/v1.10/Test/src/Test.jl:478 [inlined]
 [2] test_smallprime(x::Int64)
   @ Main ~/julia/src/stdlib/Test/example.jl:4
Tests: Test Failed at /home/serenity4/julia/src/stdlib/Test/example.jl:5
  Expression: x in (2, 3, 5, 7)
   Evaluated: 10 in (2, 3, 5, 7)

Stacktrace:
 [1] macro expansion
   @ ~/.asdf/installs/julia/nightly/share/julia/stdlib/v1.10/Test/src/Test.jl:478 [inlined]
 [2] test_smallprime(x::Int64)
   @ Main ~/julia/src/stdlib/Test/example.jl:5
Tests: Test Failed at /home/serenity4/julia/src/stdlib/Test/example.jl:4
  Expression: x < 10
   Evaluated: 13 < 10

Stacktrace:
 [1] macro expansion
   @ ~/.asdf/installs/julia/nightly/share/julia/stdlib/v1.10/Test/src/Test.jl:478 [inlined]
 [2] test_smallprime(x::Int64)
   @ Main ~/julia/src/stdlib/Test/example.jl:4
Tests: Test Failed at /home/serenity4/julia/src/stdlib/Test/example.jl:5
  Expression: x in (2, 3, 5, 7)
   Evaluated: 13 in (2, 3, 5, 7)

Stacktrace:
 [1] macro expansion
   @ ~/.asdf/installs/julia/nightly/share/julia/stdlib/v1.10/Test/src/Test.jl:478 [inlined]
 [2] test_smallprime(x::Int64)
   @ Main ~/julia/src/stdlib/Test/example.jl:5
Test Summary: | Pass  Fail  Total  Time
Tests         |    3     5      8  0.6s

As one can see, the stacktraces give no clue about where to look besides that testing function.

Stacktrace on this PR

Tests: Test Failed at /home/serenity4/julia/src/stdlib/Test/example.jl:5
  Expression: x in (2, 3, 5, 7)
   Evaluated: 1 in (2, 3, 5, 7)

Stacktrace:
 [1] macro expansion
   @ ~/julia/src/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:502 [inlined]
 [2] test_smallprime(x::Int64)
   @ Main ~/julia/src/stdlib/Test/example.jl:5
 [3] macro expansion
   @ ~/julia/src/stdlib/Test/example.jl:10 [inlined]
 [4] macro expansion
   @ ~/julia/src/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1536 [inlined]
 [5] top-level scope
   @ ~/julia/src/stdlib/Test/example.jl:9
Tests: Test Failed at /home/serenity4/julia/src/stdlib/Test/example.jl:4
  Expression: x < 10
   Evaluated: 10 < 10

Stacktrace:
 [1] macro expansion
   @ ~/julia/src/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:502 [inlined]
 [2] test_smallprime(x::Int64)
   @ Main ~/julia/src/stdlib/Test/example.jl:4
 [3] macro expansion
   @ ~/julia/src/stdlib/Test/example.jl:11 [inlined]
 [4] macro expansion
   @ ~/julia/src/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1536 [inlined]
 [5] top-level scope
   @ ~/julia/src/stdlib/Test/example.jl:9
Tests: Test Failed at /home/serenity4/julia/src/stdlib/Test/example.jl:5
  Expression: x in (2, 3, 5, 7)
   Evaluated: 10 in (2, 3, 5, 7)

Stacktrace:
 [1] macro expansion
   @ ~/julia/src/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:502 [inlined]
 [2] test_smallprime(x::Int64)
   @ Main ~/julia/src/stdlib/Test/example.jl:5
 [3] macro expansion
   @ ~/julia/src/stdlib/Test/example.jl:11 [inlined]
 [4] macro expansion
   @ ~/julia/src/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1536 [inlined]
 [5] top-level scope
   @ ~/julia/src/stdlib/Test/example.jl:9
Tests: Test Failed at /home/serenity4/julia/src/stdlib/Test/example.jl:4
  Expression: x < 10
   Evaluated: 13 < 10

Stacktrace:
 [1] macro expansion
   @ ~/julia/src/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:502 [inlined]
 [2] test_smallprime(x::Int64)
   @ Main ~/julia/src/stdlib/Test/example.jl:4
 [3] macro expansion
   @ ~/julia/src/stdlib/Test/example.jl:12 [inlined]
 [4] macro expansion
   @ ~/julia/src/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1536 [inlined]
 [5] top-level scope
   @ ~/julia/src/stdlib/Test/example.jl:9
Tests: Test Failed at /home/serenity4/julia/src/stdlib/Test/example.jl:5
  Expression: x in (2, 3, 5, 7)
   Evaluated: 13 in (2, 3, 5, 7)

Stacktrace:
 [1] macro expansion
   @ ~/julia/src/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:502 [inlined]
 [2] test_smallprime(x::Int64)
   @ Main ~/julia/src/stdlib/Test/example.jl:5
 [3] macro expansion
   @ ~/julia/src/stdlib/Test/example.jl:12 [inlined]
 [4] macro expansion
   @ ~/julia/src/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1536 [inlined]
 [5] top-level scope
   @ ~/julia/src/stdlib/Test/example.jl:9
Test Summary: | Pass  Fail  Total  Time
Tests         |    3     5      8  0.1s
ERROR: LoadError: Some tests did not pass: 3 passed, 5 failed, 0 errored, 0 broken.
in expression starting at /home/serenity4/julia/src/stdlib/Test/example.jl:8

This is more informative. Note that we include 3 additional frames instead of just one; that is due to inlining which groups these 3 frames into one stacktrace index. This makes the stacktrace more verbose, so we will have to make sure that we only include more frames when necessary.

Proposed implementation

The solution implemented in this PR improves backtrace scrubbing such that if the source location of a @test macrocall is not directly in scope of an outer @testset macrocall, then all frames higher up until a testset is reached are included.

…-test-stacktraces

timholy

Fantastic work, @serenity4, this is a very welcome improvement.

This looks ready to me. One question, though: lookup is expensive, and to me it looks like it's possible to call lookup on the backtrace twice. If it's a problem (not sure it is), a potential solution is to cache the result. To determine if this matters, have you timed how long Julia's own test suite takes with this change vs the implementation currently on master?

stdlib/Test/test/runtests.jl

serenity4 · 2023-04-23T12:41:11Z

A quick profiling indicates that most of the time of is indeed spent in lookup, and notably for Error test results which always scrub the backtrace (Fail results only do it when printing, if printing is enabled). I think that is enough to motivate a more parsimonious use of this function. I'll see if it is possible to restructure the code in a way that avoids duplicate calls, and if it gets ugly we could indeed use a simple cache.

serenity4 · 2023-04-23T17:03:04Z

I reduced a bunch of potentially duplicated lookups, but there were very few and I did not notice performance improvements in practice. Nevertheless, I think the implementation is slightly cleaner so I'd leave it with the refactors.

The slow part is actually looking up stackframes all the way until the @testset expansion, which is only performed once, and includes a bunch of C frames (I've seen cases range from ~8 to ~20 frames in-between @test and @testset). Before, we were only looking up stackframes until the @test macro expansion; going until @testset requires more work and results roughly in a 30% increase in evaluation time for a replica of the example introduced in this PR (8 failed tests + 1 errored test). On the bright side, the increase in time only affects failed and errored tests which are not directly under a @testset. Local tests confirm that performance remains unaffected otherwise.

I don't think we can really improve performance without trying to further optimize lookup, and given that the difference will not be observed in most cases I think it is a reasonable cost for the extra frames.

timholy · 2023-04-24T10:07:30Z

LGTM. Just need to fix the mingw32 test errors and I think this is good to go.

serenity4 · 2023-04-25T22:43:30Z

Should be all good now.

vtjnash

sgtm

serenity4 · 2023-05-03T10:01:01Z

Thanks for the review @timholy and @vtjnash! And thanks to @Seelengrab who pointed me to the right direction :)

Seelengrab · 2023-05-03T12:35:22Z

I think this is a great step towards making julia testsuites more modular, by making the stacktraces of failed cases easily readable :) Thanks for implementing this!

aviatesk · 2023-05-04T09:43:50Z

I noticed this PR seems to break something like @testset "..." include("path/to/test/file").

MRE would be:

echo "using Test; @test false" >! foo.jl
julia -e 'using Test; @testset "foo" include("foo.jl")'

Before:

foo: Test Failed at /Users/aviatesk/julia/julia1/foo.jl:1
  Expression: false
Stacktrace:
 [1] top-level scope
   @ /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/Test/src/Test.jl:464
 [2] include(fname::String)
   @ Base.MainInclude ./client.jl:476
 [3] top-level scope
   @ /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/Test/src/Test.jl:1363
 [4] eval
   @ ./boot.jl:368 [inlined]
 [5] exec_options(opts::Base.JLOptions)
   @ Base ./client.jl:276
 [6] _start()
   @ Base ./client.jl:522
Test Summary: | Fail  Total  Time
foo           |    1      1  0.4s
ERROR: Some tests did not pass: 0 passed, 1 failed, 0 errored, 0 broken.

Now:

foo: Test Failed at /Users/aviatesk/julia/julia1/foo.jl:1
  Expression: false
foo: Error During Test at none:1
  Got exception outside of a @test
  LoadError: TypeError: in typeassert, expected Int64, got a value of type Nothing
  Stacktrace:
    [1] test_callsite(bt::Vector{Union{Ptr{Nothing}, Base.InterpreterIP}}, file_ts::String, file_t::String)
      @ Test ~/julia/julia1/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:67
    [2] test_location(bt::Vector{Union{Ptr{Nothing}, Base.InterpreterIP}}, file_ts::String, file_t::String)
      @ Test ~/julia/julia1/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:58 [inlined]
    [3] scrub_backtrace(bt::Vector{Union{Ptr{Nothing}, Base.InterpreterIP}}, file_ts::String, file_t::String)
      @ Test ~/julia/julia1/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:93
    [4] record(ts::Test.DefaultTestSet, t::Union{Test.Error, Test.Fail}; print_result::Bool)
      @ Test ~/julia/julia1/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1083
    [5] record(ts::Test.DefaultTestSet, t::Test.Error)
      @ Test ~/julia/julia1/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1076 [inlined]
    [6] do_test(result::Test.ExecutionResult, orig_expr::Any)
      @ Test ~/julia/julia1/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:703
    [7] top-level scope
      @ ~/julia/julia1/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:511
    [8] include(fname::String)
      @ Base.MainInclude ./client.jl:478
    [9] top-level scope
      @ ~/julia/julia1/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1545
   [10] eval(m::Module, e::Expr)
      @ Core ./boot.jl:378 [inlined]
   [11] exec_options(opts::Base.JLOptions)
      @ Base ./client.jl:280
   [12] _start()
      @ Base ./client.jl:541
  in expression starting at /Users/aviatesk/julia/julia1/foo.jl:1
Test Summary: | Error  Total  Time
foo           |     1      1  0.6s
ERROR: Some tests did not pass: 0 passed, 0 failed, 1 errored, 0 broken.

Note that we now have an internal error within Test.jl

serenity4 · 2023-05-04T10:10:49Z

Thanks for reporting this, it seems like originally this case was also not well supported (see how it includes frames until _start()), and I may have been too optimistic about stacktraces being reasonably structured (note the absence of any macro expansion in the original stacktrace).

One quick fix would be to remove the type assertions and be more relaxed, to avoid errors from occurring.

For a better fix, we might want to take a look at why no macro expansions are present in the stacktrace in the first place, and then see how easily we could change that.

I'll take a deeper look this evening and make a PR with either of the fixes depending on how it goes.

KristofferC · 2023-05-04T11:11:44Z

Ref #44757

serenity4 · 2023-05-04T17:01:20Z

For a better fix, we might want to take a look at why no macro expansions are present in the stacktrace in the first place, and then see how easily we could change that.

From #35360 (comment) it seems like that is not going to be easy, so I decided to go with the simpler fix for now in #49633.

vtjnash · 2023-05-25T14:16:08Z

After #44995

$ ./julia -e 'using Test; @testset "foo" include("../julia/foo.jl")'
foo: Test Failed at /data/vtjnash/julia/foo.jl:1
  Expression: false

Stacktrace:
 [1] macro expansion
   @ /data/vtjnash/julia/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:672 [inlined]
 [2] top-level scope
   @ /data/vtjnash/julia/foo.jl:508
 [3] include(fname::String)
   @ Base.MainInclude ./client.jl:478
 [4] macro expansion
   @ /data/vtjnash/julia/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1547 [inlined]
 [5] top-level scope
   @ none:1543
 [6] eval(m::Module, e::Expr)
   @ Core ./boot.jl:383 [inlined]
 [7] exec_options(opts::Base.JLOptions)
   @ Base ./client.jl:280
 [8] _start()
   @ Base ./client.jl:541
Test Summary: | Fail  Total  Time
foo           |    1      1  0.6s
ERROR: Some tests did not pass: 0 passed, 1 failed, 0 errored, 0 broken.

vs. before

$ ./julia -e 'using Test; @testset "foo" include("../julia/foo.jl")'
foo: Test Failed at /data/vtjnash/julia/foo.jl:1
  Expression: false

Stacktrace:
 [1] top-level scope
   @ /data/vtjnash/julia1/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:512
 [2] include(fname::String)
   @ Base.MainInclude ./client.jl:478
 [3] top-level scope
   @ /data/vtjnash/julia1/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1546
 [4] eval(m::Module, e::Expr)
   @ Core ./boot.jl:383 [inlined]
 [5] exec_options(opts::Base.JLOptions)
   @ Base ./client.jl:280
 [6] _start()
   @ Base ./client.jl:541
Test Summary: | Fail  Total  Time
foo           |    1      1  0.7s
ERROR: Some tests did not pass: 0 passed, 1 failed, 0 errored, 0 broken.

notice we now have the missing top-level scope frame for foo.jl, though still appear to struggle a bit with deriving the file name for the @testset (since it is just from -e) and appear to accidentally combine it with the line number from @testset, and then still print the whole stack back to _start

serenity4 added 8 commits April 17, 2023 22:36

Extend backtraces for tests not directly under testset

16f6131

Add tests

90f6a70

Merge remote-tracking branch 'upstream/master' into serenity4/improve…

5f989aa

…-test-stacktraces

Add basic test when test errors

f8f3343

Merge remote-tracking branch 'upstream/master' into serenity4/improve…

4eff54d

…-test-stacktraces

Remove unused function

8b080e7

Add comments

dc8a021

Add NEWS.md entry

d636600

serenity4 changed the title ~~Include frames between test and testset in stacktraces~~ Extend stacktrace for failed tests not directly under testset Apr 21, 2023

timholy reviewed Apr 23, 2023

View reviewed changes

stdlib/Test/test/runtests.jl Show resolved Hide resolved

Add comment for eval tests

5c679a9

serenity4 added 2 commits April 23, 2023 16:57

Avoid duplicating calls to lookup

626ad2a

Rename variable

6bb23d7

serenity4 and others added 2 commits April 24, 2023 19:26

Remove any carriage returns for tests on Windows

70fa83c

Correctly handle Windows filenames

bdb0240

Allow stop_at to be nothing

32d9003

vtjnash reviewed Apr 28, 2023

View reviewed changes

timholy approved these changes Apr 28, 2023

View reviewed changes

timholy added the merge me PR is reviewed. Merge when all tests are passing label Apr 28, 2023

timholy merged commit 5304baa into JuliaLang:master May 3, 2023

serenity4 deleted the serenity4/improve-test-stacktraces branch May 3, 2023 09:57

serenity4 mentioned this pull request May 4, 2023

Fix lack of robustness in stacktrace scrubbing #49633

Merged

DilumAluthge removed the merge me PR is reviewed. Merge when all tests are passing label May 5, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Extend stacktrace for failed tests not directly under testset #49451

Extend stacktrace for failed tests not directly under testset #49451

serenity4 commented Apr 21, 2023 •

edited

Loading

timholy left a comment

serenity4 commented Apr 23, 2023

serenity4 commented Apr 23, 2023

timholy commented Apr 24, 2023

serenity4 commented Apr 25, 2023

vtjnash left a comment

serenity4 commented May 3, 2023

Seelengrab commented May 3, 2023

aviatesk commented May 4, 2023

serenity4 commented May 4, 2023 •

edited

Loading

KristofferC commented May 4, 2023

serenity4 commented May 4, 2023

vtjnash commented May 25, 2023

Extend stacktrace for failed tests not directly under testset #49451

Extend stacktrace for failed tests not directly under testset #49451

Conversation

serenity4 commented Apr 21, 2023 • edited Loading

Issue

Proposed implementation

timholy left a comment

Choose a reason for hiding this comment

serenity4 commented Apr 23, 2023

serenity4 commented Apr 23, 2023

timholy commented Apr 24, 2023

serenity4 commented Apr 25, 2023

vtjnash left a comment

Choose a reason for hiding this comment

serenity4 commented May 3, 2023

Seelengrab commented May 3, 2023

aviatesk commented May 4, 2023

serenity4 commented May 4, 2023 • edited Loading

KristofferC commented May 4, 2023

serenity4 commented May 4, 2023

vtjnash commented May 25, 2023

serenity4 commented Apr 21, 2023 •

edited

Loading

serenity4 commented May 4, 2023 •

edited

Loading