Exception in one test can cause trailing failures #1432

smcclure15 · 2020-02-21T14:52:09Z

Tell us about your runtime:

QUnit version: 2.9.3
What environment are you running QUnit in? (e.g., browser, Node): Observed in both Browser & Node

What are you trying to do?

Code that reproduces the problem:

Simple repro:

QUnit.module('Trailing failures on exception', {
  afterEach: assert => {
    let done = assert.async();
    setTimeout(done, 50);
  }
}, function () {
  QUnit.test('Test A', assert => {
    assert.ok(true);
    let done = assert.async();
    setTimeout(done, 10);
    throw new Error('oops');
  });
  QUnit.test('Test B', assert => {
    assert.ok(true);
  });
});

What did you expect to happen?

Test A should fail, and Test B should pass.

What actually happened?

Both Test A and Test B fail.

This is very similar to #1375 (I was running previously on Qunit 2.9.1, and hopeful that it was already fixed), but that does not resolve this. I avoided returning promises in an attempt to simplify it.

Added debug statements

I interspersed some debug help to better understand the lifecycles; maybe this can shine some light on the root cause and/or solution:

QUnit.module('Trailing failures on exception', {
  afterEach: assert => {
    let name = assert.test.testName;
    assert.ok(true, `entering afterEach for ${name}`);
    let done = assert.async();
    setTimeout(() => {
      // "done" marker from Test 1 resumes QUnit during this timeout
      console.log(`>>> done afterEach for ${name}`);
      done();
    }, 50);
  }
}, function () {
  QUnit.test('Test A', assert => {
    assert.ok(true, 'in test A');
    let done = assert.async();
    setTimeout(() => {
      assert.ok(true, 'done test A');
      done();
    }, 10);
    throw new Error('oops');
  });
  QUnit.test('Test B', assert => {
    // this test fails because afterEach from Test 1 resumes here
    assert.ok(true, 'in test B');
  });
});

QUnit.log(({message}) => {
  // just to help with tracking lifecycles
  console.log(`>>> ${message}`);
});

This produces the following TAP output (snipping my path a bit):

% qunit repro.js
TAP version 13
>>> in test A
>>> Died on test #2     at Object.<anonymous> (/SNIP/repro.js:16:9): oops
>>> entering afterEach for Test A
>>> done test A
not ok 1 Trailing failures on exception > Test A
  ---
  message: "Died on test #2     at Object.<anonymous> (/SNIP/repro.js:16:9): oops"
  severity: failed
  actual: null
  expected: undefined
  stack: Error: oops
    at Object.QUnit.test.assert (/SNIP/repro.js:23:11)
  ...
>>> in test B
>>> entering afterEach for Test B
>>> done afterEach for Test A
>>> assert.async callback called after test finished.
>>> done afterEach for Test B
not ok 2 Trailing failures on exception > Test B
  ---
  message: "assert.async callback called after test finished."
  severity: failed
  actual: null
  expected: undefined
  stack: undefined:undefined
  ...
1..2
# pass 0
# skip 0
# todo 0
# fail 2

Notice that we enter Test B before being done the afterEach associated with Test A.

The text was updated successfully, but these errors were encountered:

trentmwillis · 2020-03-07T22:18:57Z

Thanks for the detailed bug report! Unsure when we'll get around to fixing it, but if you'd like to take a stab at a fix, that would be great 🙂

smcclure15 · 2020-11-05T20:44:55Z

The more I debug, the more I am convinced that this is correct functional behavior.

This comes down to the semaphore handling which makes these async/done's simply as start/stop's.
From the repro steps,

STOP is called in Test A
Test A throws an exception, the semaphore is reset (0), and we start to process afterEach
STOP is called in A's afterEach
START is called from the lingering setTimeout in Test A (semaphore = 0), which advances the framework to process Test B
Test B runs and completes
STOP is called in B's afterEach
START is called from A's afterEach, but we are in Test B still, which throws the error (A's done callback occurred after test A was finished)
START is again called from B's afterEach (no harm done, it seems)

So our semaphore gets offset, and the framework keeps rolling on after toggling these starts/stops.
This all sounds like solid logic, though with maybe some room for improvement? However, we wouldn't know the intent of the user's code, whether they made their done callback exception safe or not, and I worry we'd risk some false-negatives by doing something unexpected here.

Bottom line, the test failed, it is accurately recorded, but it causes later failures because it used setTimeout's to cause disruption down the line. I think this type of cascading effect is forgivable given we can't guarantee a "clean state" if we continue, as much as we can safely try. I see this as effectively the same ill-posed problem as the following:

    QUnit.test('Test A', assert => {
        assert.ok(1);
        setTimeout(() => {
            throw new Error('oops2');
        }, 10);
        throw new Error('oops1')
    });
    QUnit.test('Test B', assert => {
        assert.ok(1);
        return new Promise(resolve => setTimeout(resolve, 100))
    });

Test A fails because of "oop1" being thrown in the test, but Test B tests fail because that "oops2" error is finally thrown during its runtime. Which makes sense, and the framework shouldn't try to intercept this because Test A had previously thrown an exception.

I would be onboard with closing this issue if others agree with the assessment/philosophy.

Krinkle · 2020-11-11T05:06:23Z

I'm inclined to agree, but I do think we should make sure that if this happens to you, that you have the best information available.

When a test fails, we kinda have to move on, since the failure could mean that we will never see a callback or settled promise. This justifies the transition from test-A to afterEach-A, and likewise the transition from afterEach-A to test-B.

What we could check:

Never start a afterEach unless the associated test was just completed (success or fail). In other words, it must not be possible for afterEach-A to begin after test-B has started or after test-B has failed. Is this already true today?
When an async callback or promose-test is resolved after the test ended, the error message should be clearly related to the test that it was started from. Either by mentioning the test in question or, if possible (e.g. in HTML reporter) by modifying the output of the original test and adding the extra error.
When an exception happens from an asynchronous stack trace, we can't easily know what test it relates to as it is caught from the global window.onerror handler. But we should in that case make sure the message clearly says that it was a "global error". It can render chronologically in the test wherever it happened, that seems fine and has the benefit of being right where you want it for the more common case of only one error happening during and it being caught during the test instead of after.

See also: #1377, #1478

Krinkle · 2021-07-28T02:09:54Z

I'm working on a patch that replaces our semaphore counter with a map of unique locks, thus making it impossible for a release callback from assert.async() to release an unrelated lock. I believe this was planned years ago when we implemented the QUnit 2 interface in our usual "fake it until you make it" manner (e.g. we implemented a thin interface in a 2.x release that added module scopes and scoped assert objects, but at the time they were still just aliasees to their global/stateless equivalents). We made them true instances in 2.x, but it seems we never got around to tightening up the semaphore.

The outcome of such patch would be that, in @smcclure15 's example above, the done() from the test's async lock won't (unintentionally) release the lock created in the afterEach() callback. Thus preventing test B from starting too early, and then encountering unexpected issues there.

== Background == When creating two async pauses in a test, it was possible for a test to pass by invoking one of them twice, and the other not at all. Easy scenario (though perhaps not realistic): > Use `assert.async()` twice, assigned as done1 and done2 in the same > `QUnit.test()` case, and then simulate the failure scenario such that > you wrongly call done1 two times, and forget to call done2. Complex scenario across `QUnit.test()` and "afterEach" hooks, since these previously shared a single semaphore: > Use `assert.async()` once in a simple test, and schedule the resume > call in the future, but then fail with an uncaught error. The uncaught > error is found and `Test.run()` would internally kill the pause by > resetting the semaphore to zero (this make sense since we shouldn't > wait for the release once the test is known to have failed). > After this reset, we proceed to the "afterEach" hook. Suppose this > hook is also async, and during its execution, the originally scheduled > resume call happens. This would effectively end up releasing the > afterEach's async pause despite not being finished yet, and then we > proceed to the next test. That test would then fail when the afterEach's > own release call happens, failing as "release during a different test". This is the scenario of #1432. Fix this and numerous other edge cases by making the returned callbacks from `assert.async()` strict about which locks they release. Each lock now adds a unique token to a map, and invoking the release function decrements/removes this token from the map. == Notes == * es6-map.js assigns the fallback in all browsers. This is a bug, to be fixed later. * The `isNaN(semaphore)` logic was originally added in 2015 by ea3e350. At the time, the internal resume function was public, and NaN could emerge through `QUnit.start("bla")` as result of `semaphore += "bla"`. This has not been possible for a while. During PR #1590, I did not trace the origin of this code, and thus did not realize that it was already obsolete (the semaphore itself is not publicly supported). * The "during different test" error is now almost impossible to trigger since we now kill pending locks during test failures and tolerate all late calls equally. This meant the `drooling-done.js` test case now fails in a more limited way. I added a new test case for coverage, that reproduces it still, but it's a lot more obscure – it requires the original test to pass and then also have an unexpected call during a different test. Fixes #1432.

== Background == When creating two async pauses in a test, it was possible for a test to pass by invoking one of them twice, and the other not at all. Easy scenario (though perhaps not realistic): > Use `assert.async()` twice, assigned as done1 and done2 in the same > `QUnit.test()` case, and then simulate the failure scenario such that > you wrongly call done1 two times, and forget to call done2. Complex scenario across `QUnit.test()` and "afterEach" hooks, since these previously shared a single semaphore: > Use `assert.async()` once in a simple test, and schedule the resume > call in the future, but then fail with an uncaught error. The uncaught > error is found and `Test.run()` would internally kill the pause by > resetting the semaphore to zero (this make sense since we shouldn't > wait for the release once the test is known to have failed). > After this reset, we proceed to the "afterEach" hook. Suppose this > hook is also async, and during its execution, the originally scheduled > resume call happens. This would effectively end up releasing the > afterEach's async pause despite not being finished yet, and then we > proceed to the next test. That test would then fail when the afterEach's > own release call happens, failing as "release during a different test". This is the scenario of #1432. Fix this and numerous other edge cases by making the returned callbacks from `assert.async()` strict about which locks they release. Each lock now adds a unique token to a map, and invoking the release function decrements/removes this token from the map. == Notes == * es6-map.js assigns the fallback in all browsers. This is a bug, to be fixed later. * The `isNaN(semaphore)` logic was originally added in 2015 by ea3e350. At the time, the internal resume function was public, and NaN could emerge through `QUnit.start("bla")` as result of `semaphore += "bla"`. This has not been possible for a while. During PR #1590, I did not trace the origin of this code, and thus did not realize that it was already obsolete (the semaphore itself is not publicly supported). * The "during different test" error is now almost impossible to trigger since we now kill pending locks during test failures and tolerate all late calls equally. This meant the `drooling-done.js` test case now fails in a more limited way. I added a new test case for coverage, that reproduces it still, but it's a lot more obscure – it requires the original test to pass and then also have an unexpected call during a different test. * I considered using the phrase "async lock" in the public-facing error messages, but found this perhaps too internal/technical when coming from the perspective of `var done = assert.async();`. In order to keep the code shared between handling of async-await, Promise, and assert.async, but remain friendly and understandable, I went for the phrase "async pause". Fixes #1432.

Closes #1432. Closes #1642.

trentmwillis added Component: Core For module, test, hooks, and reporters. Type: Bug Something isn't working right. labels Mar 7, 2020

This was referenced Apr 15, 2021

Test timeout with async/done can lead to unrelated error #1592

Closed

Produce clearer, contextual failures when assert.async callbacks are called incorrectly #1595

Closed

Krinkle mentioned this issue May 8, 2021

qunit does not run all tests in a module #1322

Closed

Krinkle self-assigned this Jul 28, 2021

Krinkle mentioned this issue Jul 29, 2021

Assert: Improve detection of bad calls to assert.async() callbacks #1642

Merged

Krinkle closed this as completed in #1642 Aug 2, 2021

Krinkle added a commit that referenced this issue Aug 2, 2021

Assert: Improve detection of bad calls to assert.async() callbacks

163c9bc

Closes #1432. Closes #1642.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Exception in one test can cause trailing failures #1432

Exception in one test can cause trailing failures #1432

smcclure15 commented Feb 21, 2020

trentmwillis commented Mar 7, 2020 •

edited

Loading

smcclure15 commented Nov 5, 2020

Krinkle commented Nov 11, 2020 •

edited

Loading

Krinkle commented Jul 28, 2021

Exception in one test can cause trailing failures #1432

Exception in one test can cause trailing failures #1432

Comments

smcclure15 commented Feb 21, 2020

Tell us about your runtime:

What are you trying to do?

What did you expect to happen?

What actually happened?

Added debug statements

trentmwillis commented Mar 7, 2020 • edited Loading

smcclure15 commented Nov 5, 2020

Krinkle commented Nov 11, 2020 • edited Loading

Krinkle commented Jul 28, 2021

trentmwillis commented Mar 7, 2020 •

edited

Loading

Krinkle commented Nov 11, 2020 •

edited

Loading