Add TAP tests support #2923

Closed
ePirat opened this issue Jan 14, 2018 · 19 comments
Comments

@ePirat
Contributor

ePirat commented Jan 14, 2018

It would be very useful if Meson supported the TAP protocol for tests; that would make it easy to write tests that do not have to be split up into individual binaries.

I could not find anything about that in the documentation, so I am quite sure this is not supported currently.
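
For reference, a minimal TAP stream looks roughly like this (a made-up example, just to illustrate the format):

1..3
ok 1 - input file opened
not ok 2 - first line of the input is valid
ok 3 - temporary files cleaned up # SKIP no temp directory available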

@bredelings
Contributor

Hmm... I kind of like this idea. I have an external test suite that runs a number of individual tests, and this protocol would work well for communicating the results, assuming it can express XFAIL (expected failures). We could also make the 30-second timeout act as a per-test timeout if test suites had a way to communicate with the test harness like this.

@TingPing
Member

This overlaps with #186

@astavale
Contributor

@bredelings XFAIL is the TODO directive. The test is run and fails, but the TODO means it appears in the XFAIL section of the results, e.g.
not ok 10 # TODO See issue #123456789

If the test passes then it is XPASS:
ok 10 # TODO See issue #123456789
Reference https://testanything.org/tap-version-13-specification.html#todo-tests

TAP is a streaming protocol so there should be no problem with a 30 second delay for a particular test result to be sent. Additional diagnostic information on the test failure can be sent in a YAMLish block: https://testanything.org/tap-version-13-specification.html#yaml-blocks
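
For example, a failing test point could attach structured diagnostics in such a block (illustrative only):

not ok 2 - first line of the input valid
  ---
  message: expected line to start with foo but found food
  severity: fail
  ...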

At present there is no spec for sub-tests. I think one of the JavaScript frameworks came up with a good solution; I can't recall if it was this one: http://www.node-tap.org/subtests/
Discussion: TestAnything/Specification#2

TAP has a large number of tools like pretty printers:
https://github.com/substack/faucet
https://github.com/axross/tap-notify
and converters. For example converters to JUnit for integration with other CI tools:
https://github.com/dhershman1/tap-junit
https://github.com/jmason/tap-to-junit-xml

@ptomato
Contributor

ptomato commented Mar 29, 2018

I have been using https://github.com/endlessm/webhelper/blob/master/test/tap.py to integrate a test harness that outputs TAP (not version 13 though) with Meson. However, Meson only supports outputting the results on a per-file level, not per-test. I'd really like to see per-test, since I find the per-test output in Autotools useful.

This is currently the only blocker for switching one of my projects to Meson.

@astavale
Contributor

FWIW, this is how I get a list of tests from my test runner script and pass it to the Meson test harness. Each Meson test() runs one test:

test_runner = find_program( 'test_runner' )

env = environment()
env.set( 'MESON_SOURCE_ROOT', meson.source_root() )
env.set( 'MESON_BUILD_ROOT', meson.build_root() )

test_list = run_command( test_runner, 'list' ).stdout().split()
foreach test : test_list
	description = run_command( test_runner, 'describe', test ).stdout()
	test( description, test_runner, args: ['run', test], env: env )
endforeach

So my test_runner script has three commands: list, describe and run. I can post the test script if it is of any interest, but it is longer.

This approach has led me to wonder whether Meson's test() function should be left pretty much as is and two new functions introduced: test_runner() and test_harness(). test_runner() would declare an executable that responds to the Meson test protocol and set up the environment for the test runner. test_harness() would take the test_runner() return object and use it to interact with the project's tests. I'd like it to be able to interact with test cases, test suites and test tags.
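
A purely hypothetical sketch of how that could read in a meson.build (neither test_runner() nor test_harness() exists in Meson; the names and keyword arguments here are invented for illustration):

# hypothetical: declare the executable that speaks the Meson test protocol
runner = test_runner( 'test_runner', env: env )
# hypothetical: let Meson query the runner for test cases, suites and tags
test_harness( runner, suites: ['unit', 'integration'], tags: ['slow'] )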

I've also noted that changing a test's description doesn't currently update in the test harness. I assume it is cached by Meson.

bonzini added a commit to bonzini/meson that referenced this issue Feb 21, 2019
This provides an initial support for parsing TAP output.  It detects failures
and skipped tests without relying on exit code, as well as early termination
of the test due to an error or a crash.

For now, subtests are not recorded in the TestRun object.  However, because the
TAP output goes on stdout, it is printed by --print-errorlogs when a test does
not behave as expected.  Handling subtests as TestRuns, and serializing them
to JSON, can be added later.

The parser was written specifically for Meson, and comes with its own
test suite.

Fixes mesonbuild#2923.
bonzini added a commit to bonzini/meson that referenced this issue Feb 21, 2019

bonzini added a commit to bonzini/meson that referenced this issue Feb 27, 2019

bonzini added a commit to bonzini/meson that referenced this issue Feb 27, 2019

tbeloqui pushed a commit to pexip/meson that referenced this issue Aug 22, 2019
@ePirat
Contributor Author

ePirat commented Apr 16, 2020

@jpakkane I believe this is not fully addressed by #4958, as meson does not really integrate the tests the way it does with "native" tests:

$ ninja test
[0/1] Running all tests.
1/1 Example                                 FAIL    0.005408048629760742 s (exit status (0,))

Ok:                 0   
Expected Fail:      0   
Fail:               1   
Unexpected Pass:    0   
Skipped:            0   
Timeout:            0   


The output from the failed tests:

1/1 Example                                 FAIL    0.005408048629760742 s (exit status (0,))

--- command ---
12:57:58 /Users/epirat/Desktop/tap-test/build/tap_test
--- stdout ---
1..4
ok 1 - Input file opened
not ok 2 - First line of the input valid
# Expected line to start with foo but found food!
ok 3 - Read the rest of the file
not ok 4 - Summarized correctly # TODO Not written yet
-------

Full log written to /Users/epirat/Desktop/tap-test/build/meson-logs/testlog.txt
FAILED: meson-test 
/Users/epirat/Library/Python/3.7/bin/meson test --no-rebuild --print-errorlogs
ninja: build stopped: subcommand failed.

While it should look something like:

$ ninja test
[0/1] Running all tests.
1/4 Example: Input file opened              SUCCESS
2/4 Example: First line of the input valid  FAIL
3/4 Example: Read the rest of the file      SUCCESS
4/4 Example: Summarized correctly           FAIL (TODO)

Ok:                 1   
Expected Fail:      1   
Fail:               2   
Unexpected Pass:    0   
Skipped:            0   
Timeout:            0   


The output from the failed tests:

2/4 Example: First line of the input valid  FAIL
--- Diagnostics ---
Expected line to start with foo but found food!
-------

TODO Tests:
4/4 Example: Summarized correctly - Not written yet


Full log written to /Users/epirat/Desktop/tap-test/build/meson-logs/testlog.txt
FAILED: meson-test 
/Users/epirat/Library/Python/3.7/bin/meson test --no-rebuild --print-errorlogs
ninja: build stopped: subcommand failed.

Basically, Meson currently just records success or failure for the whole TAP suite depending on whether any of its tests failed; it does not report the individual results the way I would expect, and it does not print diagnostics properly either, but just dumps the full TAP output.

@ePirat
Contributor Author

ePirat commented Apr 16, 2020

It seems TODO is not handled correctly either, as TODO indicates the test is currently not expected to succeed:

These tests represent a feature to be implemented or a bug to be fixed and act as something of an executable “things to do” list. They are not expected to succeed. Should a todo test point begin succeeding, the harness should report it as a bonus. This indicates that whatever you were supposed to do has been done and you should promote this to a normal test point.
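
To illustrate with a made-up pair of test points:

not ok 4 - frobnicate large inputs # TODO optimize later
ok 5 - frobnicate small inputs # TODO optimize later

Per the spec text above, point 4 is an expected failure (XFAIL), while point 5 is an unexpected pass that the harness should report as a bonus.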

@ePirat
Contributor Author

ePirat commented Apr 20, 2020

Another issue is #6810

@bonzini
Contributor

bonzini commented Apr 20, 2020

Each TAP-producing executable is handled as a single Meson test, because it's normal for a single TAP output to produce hundreds, or even thousands, of output lines. If for example you're using gtest, the way TAP was implemented means you get the same ninja test output for

  • protocol: 'exitcode' where the test executables are launched without arguments

  • protocol: 'tap' where the test executables are launched with the --tap argument.

TODO works as expected: it makes the overall test fail if you have an ok # TODO and pass if you have a not ok # TODO.
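
For context, here is a minimal sketch of the two declarations in a meson.build (assuming a GLib g_test-based executable, which emits TAP when run with --tap; the names are made up):

glib_dep = dependency( 'glib-2.0' )
unit_tests = executable( 'unit_tests', 'unit-tests.c', dependencies: glib_dep )

# Default protocol: only the exit code decides pass/fail.
test( 'unit tests (exitcode)', unit_tests )

# Same executable, but Meson parses the TAP stream it writes to stdout.
test( 'unit tests (tap)', unit_tests, args: ['--tap'], protocol: 'tap' )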

@ePirat
Contributor Author

ePirat commented Apr 20, 2020

I don't really see it being useful at all in the current form. What's the point in TAP reporting detailed test output if Meson does not properly report it either?

I'm comparing the behavior to how Autotools does it, which is similar to what I described above.

@ePirat
Contributor Author

ePirat commented Apr 20, 2020

If basic success/fail parsing of a TAP test, rather than detailed output, is a needed feature for some cases, we could probably add a new kwarg for that. But lots of people who used TAP in Autotools probably want more detailed output, not just a raw TAP dump on failure.

@bonzini
Contributor

bonzini commented Apr 20, 2020

The detailed output is available if you use meson test --verbose.

@ePirat
Contributor Author

ePirat commented Apr 20, 2020

Yes, but I believe the current behavior is not how it should work. It's OK if you disagree, as people have different use cases, and thanks a lot for the initial support; I just think it is not integrated well enough yet and does not work out of the box the way one would expect it to…

@ePirat
Contributor Author

ePirat commented Apr 20, 2020

Maybe it would make more sense to be able to specify that as testsuite() or something, though, so it's clear that it will be multiple tests vs. one test…

@bonzini
Contributor

bonzini commented Apr 20, 2020

Personally I prefer to see grouped results, because the group tells me which executable to run in order to reproduce. So what I do is run "meson test -v", which lets me see which subtests failed and, ideally, cut-and-paste the command line into a shell (see #5025).

But I can see how people might feel differently; these are things that affect the workflow quite directly.

@bonzini
Contributor

bonzini commented Apr 20, 2020

In other words, I would keep the current overall behavior, but add support to mtest for reporting subtests individually.

@ePirat
Contributor Author

ePirat commented Apr 20, 2020

That sounds good. My main issue currently is that on failure it just dumps the whole raw TAP output, which can be hard to read when there are a lot of tests. It would be nicer to have it show only the subtests that actually failed.

@ppaalanen

I too would like to see an explicit and pretty list of the failed tests.

I can understand not listing individual TAP lines when things go as expected, but it would be good to see the counts in a summary, counting each TAP test line rather than only each executable. Seeing the counts of XFAIL, SKIP, etc. separately from just OK and FAIL would be really good, especially the SKIP count: if a few cases inside one test executable skip, that might be unexpected.

@bonzini
Contributor

bonzini commented Nov 26, 2020

@ppaalanen, @ePirat, see #7830 for the plan around further TAP improvements.
