Teardown function's logs not captured #127

Closed
shreyas707 opened this issue Jun 20, 2019 · 6 comments · Fixed by #151

Comments

@shreyas707

shreyas707 commented Jun 20, 2019

Just installed the spyder unittest plugin. When writing and running a simple xunit-style test, the teardown function's output did not get captured.

Simple xunit-style test:

[screenshot: the xunit-style test code]
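
Based on the messages described below, the test presumably looked something like this (a hypothetical reconstruction using pytest's xunit-style module-level setup/teardown hooks, not the original code from the screenshot):

def setup_function(function):
    # runs before each test function in this module
    print("Setting up test1!")

def test1():
    print("Executing test1!")

def teardown_function(function):
    # runs after each test function; its output is what goes missing
    print("Tearing down test1!")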

First "Setting up test1!", then "Executing test1!", and finally "Tearing down test1!" should be displayed, in that order. Instead, I only see "Setting up test1!" and "Executing test1!"; the output from the teardown function does not appear.

[screenshot: captured test output, showing the setup and call messages but not the teardown message]

  • Spyder version: 3.3.4
  • Version of spyder-unittest plugin: 0.3.1
  • Installation method for Spyder and the unittest plugin: conda
  • Python version: 3.6.2
  • Testing framework used: py.test
  • Testing framework version: 4.4.1
  • Operating system: Mac OS Mojave (10.14.5)
@jitseniesen
Member

Thanks for the report. It looks like teardown logs are handled differently internally by pytest. I'll see what to do.

@StefRe
Contributor

StefRe commented Apr 19, 2020

The reason for the above is that output from the setup and teardown test phases is only processed if these phases fail:

if report.when in ['setup', 'teardown'] and report.outcome == 'passed':
    return

This is caused by different approaches taken by pytest and spyder-unittest in reporting test results: pytest has an outcome (passed/failed/skipped) for each test phase and thus a test can,
for instance, pass in the actual test call and have an error in teardown. In this case, pytest yields two results (.E = 1 passed, 1 error). See also here for some examples.

spyder-unittest, on the other hand, has exactly one result per test, i.e. one row in the widget, and the sum of passed + failed + other equals the number of collected tests. Currently, the final result of a test (and hence its color) is the LAST outcome of a non-passing setup/teardown or any outcome of the call phase, i.e. a passing call will be overwritten by a failed teardown outcome. This works fine for most situations but doesn't cover all cases.

So instead of sending the result of each test phase (except for passing setup/teardown phases) from the pytest_runtest_logreport hook to the datatree widget, it would be better to accumulate the results of all three phases and only send the final result. This can be done in the pytest_runtest_logfinish hook. For captured stdout and stderr output this accumulation is already done by pytest itself, i.e. the sections field of the result argument contains all output from the beginning of the test up to and including the current phase (this is also the reason why the output of the setup phase is currently shown in spyder-unittest although a passed setup is not processed in pytestworker).
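
A minimal sketch of this idea (illustration only, not the actual spyder-unittest code; the module-level dictionary and the final print are stand-ins for sending the result to the widget):

# conftest.py -- illustrative sketch of accumulating per-phase reports
_reports = {}  # nodeid -> list of TestReport objects, one per phase

def pytest_runtest_logreport(report):
    # Called once per phase ('setup', 'call', 'teardown'); just store it.
    _reports.setdefault(report.nodeid, []).append(report)

def pytest_runtest_logfinish(nodeid, location):
    # All phases are finished: derive one combined result for the whole test.
    reports = _reports.pop(nodeid, [])
    outcomes = {r.when: r.outcome for r in reports}
    setup_teardown_error = any(r.outcome == 'failed' for r in reports
                               if r.when in ('setup', 'teardown'))
    # report.sections already accumulates captured stdout/stderr up to and
    # including the current phase, so the last report contains all output.
    captured = reports[-1].sections if reports else []
    print(nodeid, outcomes.get('call'), setup_teardown_error, len(captured))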

I think the processing (accumulation) of the results of the individual phases should be done in the pytestworker and not in the datatree widget; the latter should only display whatever comes from the test runner.

As we only have three categories (green = OK, red = failed, gray = skipped, which in fact comprises all other outcomes), we need some convention about how to classify tests with differing outcomes per phase.

My proposal is the following (s/t error = error in setup and/or teardown):

status     with s/t error    without s/t error
passed     red               green
failed     red               red
xfailed    red               green
xpassed    red               red
skipped    (n/a)             gray
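
Expressed as code, the convention in the table could look roughly like this (a hypothetical helper for illustration only; the name and signature are made up, not part of the plugin):

def final_color(call_outcome, setup_teardown_error):
    # Map the call-phase outcome plus a possible setup/teardown error onto
    # the single row color, following the table above.
    if setup_teardown_error:
        return 'red'    # any setup/teardown error turns the whole test red
    if call_outcome in ('passed', 'xfailed'):
        return 'green'  # expected failures count as OK
    if call_outcome in ('failed', 'xpassed'):
        return 'red'    # unexpected passes count as failures
    return 'gray'       # skipped and everything else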

The only controversial question, I guess, is about xfailed and xpassed tests (see also #47).
I prefer the way unittest treats them:

If the test fails it will be considered a success. If the test passes, it will be considered a failure.

Pytest shows xfailed tests as yellow, and xpassed tests as red with strict and as yellow without strict.
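
For reference, the strict flag is set per test via the standard xfail marker; a minimal example (hypothetical test names):

import pytest

@pytest.mark.xfail(strict=False)
def test_known_bug():
    assert False   # fails as expected -> reported as xfailed

@pytest.mark.xfail(strict=True)
def test_fixed_bug():
    assert True    # passes unexpectedly -> xpassed, which strict=True reports as a failure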

@jitseniesen: if you're OK with this approach and with the table I'm going to prepare a PR.

@StefRe
Contributor

StefRe commented Apr 20, 2020

Further to the previous comment:

So it boils down to the following: the status column of the widget shows the outcome of the test itself (the call phase), whereas the color indicates whether I'll have to care about it.

BTW, what is the reason for changing the status names from passed to ok and failed to failure?

if report['outcome'] == 'passed':
    cat = Category.OK
    status = 'ok'
elif report['outcome'] == 'failed':
    cat = Category.FAIL
    status = 'failure'

I'd rather stick to the traditional names passed and failed as used by pytest (unittest and nosetests report the overall result as either OK or FAILED).

@jitseniesen
Member

Steffen, thanks for taking the time to write down your analysis. That all makes sense to me, so please go ahead and implement. Do the colours of xpassed/xfailed as you said.

I think the processing (accumulation) of the results of the individual phases should be done in the pytestworker and not in the datatree widget, the latter should only display whatever comes
from the test runner.

I agree, only processing that all the test frameworks have in common should be done in the datatree widget. I don't know how unittest and pytest handle setup/teardown, but for the moment it makes sense to do as much as possible in the pytestworker.

the status column of the widget shows the outcome of the test itself (the call phase), whereas the color indicates if I'll have to care about it

I am not sure I understand this. Do you mean that if a test passes but the teardown errors, the status will be ok and the colour red? That may be confusing for the user. Or will the message column show some explanation?

BTW, what is the reason for changing the status names from passed to ok and failed to failure?

I don't remember. I think it is to be consistent with what it was before. I agree with changing those status names to 'passed' and 'failed' as you suggest.

What happens if a fixture with module scope errors in the tear down phase? I assume that the last test in the module will get a tear down error?

@StefRe
Contributor

StefRe commented Apr 20, 2020

I am not sure I understand this. Do you mean that if a test passes but the teardown errors, the status will be ok and the colour red?

yes

That may be confusing for the user. Or will the message column show some explanation?

yes, it will

What happens if a fixture with module scope errors in the tear down phase? I assume that the last test in the module will get a tear down error?

Consider the following code:

import pytest

@pytest.fixture(scope="module")
def resource(request):
    yield
    raise Exception('Error in teardown')

def test_1(resource): pass
def test_2(resource): pass

Currently it looks like this:
[screenshot of the current datatree widget]

My proposal is that the colors remain the same but the status of the second test will be ok instead of failure.
[screenshot of the proposed datatree widget]

Tests 1 and 2 are exactly the same - so they should lead to the same result, right? (And self-contained tests should be invariant to the sequence in which they are executed.) The problem with the teardown is signalled by the red color and by the message (which will, of course, in general not always be as obviously linked to the teardown as in this constructed example, but in any case you can expand the row and see the whole output for this item). The ok shows that the test itself passed, but that in addition there is some other problem.
This is the best compromise I could think of for mapping multiple outcomes of the individual test phases onto one final result for the entire test.

@jitseniesen
Member

That's fine, I don't have a better idea. Perhaps in the future we can think about reporting teardown failures separately, but not now.

StefRe added a commit to StefRe/spyder-unittest that referenced this issue Apr 27, 2020
Instead of writing events for each phase (setup, call, teardown) of a
test (and eventually displaying them in the datatree widget) we now
accumulate results over the individual test phases and only then write
the final result of the entire test upon test completion using the
pytest_runtest_logfinish hook.

This ensures the following:
1) Capturing of output from the teardown phase even if it passes
   without an error. This fixes spyder-ide#127.
2) Differentiating between the outcome of the test itself (the call
   phase) and the entire test as a whole. An error during teardown,
   for instance, used to overwrite the (potentially passed) test result
   of the call phase. Now we get a passed test with an error during
   teardown. So in the datatree widget, the test status (first column)
   is no longer 1:1 linked with the color of this line (see also 3)).
3) Better handling of xfailed and xpassed tests which now show the
   correct status `xfailed` or `xpassed` and are colored green and
   red respectively (instead of status `skipped` and gray color for
   both of them). This fixes spyder-ide#47.
4) Better messages: the first message, which is usually the most
   important one, will be placed in the message column of the datatree
   widget. Further messages will be _appended_ to the extra_text field
   (instead of overwriting messages from previous test phases). If the
   first message spans over multiple lines then only its first line
   will be displayed in the message column and the complete message
   will be repeated in the extra_text field for better readability.
   This improves the visual impression of the datatree widget so that
   each (collapsed) row is exactly one line high.