
Flaky tests #219

Closed
ronkorving opened this issue Dec 7, 2015 · 20 comments

Comments

@ronkorving
Contributor

It would be nice if tape supported the ability to mark a test (or assertion) as flaky. It can be quite a pain when a test that depends on a third-party online service blocks the deployment of my own software because that third party is not cooperating.

@ljharb
Collaborator

ljharb commented Dec 7, 2015

What would marking a test as "flaky" do? What's the value in keeping a flaky test?

@ronkorving
Contributor Author

Ah, I should've explained. Failure of such a test would not result in a non-zero exit code, but in a warning.
(The Node.js project, for example, uses this quite a bit.)

@ljharb
Collaborator

ljharb commented Dec 7, 2015

I feel like this is an antipattern that tape shouldn't encourage. Flaky tests should be fixed immediately or deleted immediately, with prejudice - a test whose failure doesn't fail the build adds no value, it just trains people to ignore test failures.

@ronkorving
Contributor Author

That's what my gut tells me too. But I think there are valid cases that are simply out of one's control by design. I would still prefer to have the test rather than not have it in those cases. Anyway, it's your call whether to keep this open or not. I'd be happy either way; I'm just not sure if I'm alone in this.

@mstade
Contributor

mstade commented Dec 21, 2015

[...] I think there are valid cases that are simply out of ones' control by design.

Do you have any strong examples of such cases?

@yoshuawuyts

I reckon if you want flaky tests, you can add them to a separate run and ignore the exit status. It does require a lot more involvement, but I feel like flaky tests are hazardous by definition. If a test fails: fix it. If you don't fix it straight away, chances are it'll never get fixed. Tape shouldn't encourage bad behavior.

@mstade
Contributor

mstade commented Dec 21, 2015

I'm inclined to agree with you @yoshuawuyts, and I've personally never seen a flaky test that provided any value. But I also don't know everything, that's why I'm hoping @ronkorving can enlighten me.

@Raynos
Collaborator

Raynos commented Dec 26, 2015

@ronkorving I think you're asking for the "SKIP" feature, which we support with `t.skip(str, block)`.

@Raynos
Collaborator

Raynos commented Dec 26, 2015

You can always write code to make tests not flaky:

test('some thingy', function t(assert) {
  doSomethingWeird(function (err, result) {
    if (result) {
      /* some asserts */
    } else {
      assert.ok(true, 'skipping error case because of flakiness');
    }

    assert.end();
  });
});

@awlayton

Isn't a "flaky" test what the TODO directive is for?

@ronkorving
Contributor Author

TODO directive? Well, the point was: you do run a flaky test, but consider its failure a warning, not fatal. So skip doesn't solve that, and TODO (you mean a code comment?) doesn't solve it either. But of course, as @Raynos pointed out, this can be done in userland. It does mean you can no longer really rely on the t.assert* functions, which is a shame. It affects the test's structure a lot, and if a test stops being flaky, you can't just turn it into a normal test by replacing t.flaky with t.test.

@ljharb
Collaborator

ljharb commented Feb 13, 2016

@ronkorving t.test('flaky test', { skip: !process.env.RUN_FLAKY }, function () {…});

@ronkorving
Contributor Author

Wow, one can do that? (That options object is not documented for t.test, by the way, only for the "global" test.)
I don't think it solves this use case, though, as I really want the test to always run, just not be fatal. That's not the same as the on/off switch that skip seems to be.

@awlayton

@ronkorving the TAP spec has a SKIP "directive" for marking a test as skipped. It has another TODO "directive" for tests that are not necessarily expected to pass (features in progress, etc.).
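For reference, a TAP stream using both directives might look like the sketch below (an illustration of the TAP format, not actual tape output; the test names are made up):

```
TAP version 13
ok 1 connects to the service
not ok 2 third-party service returns data # TODO upstream is flaky
ok 3 local parsing # SKIP no device attached
1..3
```

A TAP harness reports the failing TODO point but does not treat it as a hard failure of the run.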

@ronkorving
Contributor Author

Ah, that sounds perfect! Now can tape generate that output? There's nothing in the tape readme on this.

@awlayton

I had never tried to use it. If tape doesn't currently support todo, I think the TAP spec would be a pretty good argument for adding it.

@ronkorving
Contributor Author

Agreed, and rather than naming it t.flaky, it should then more naturally be called t.todo.

@ronkorving
Contributor Author

Turns out it was already implemented as far as the TAP output goes. I just had to hook into it, so I made the option available. See the PR referenced above this comment.
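If the option landed as described, producing the TODO directive boils down to appending it to the test point line. The helper below is a sketch of that formatting, not tape's internal code; the function name and the example description are made up for illustration:

```javascript
// Sketch: how a TAP producer marks a test point with a directive.
// The text after "# TODO" is optional free-form explanation.
function tapPoint(ok, num, description, directive) {
  var line = (ok ? 'ok ' : 'not ok ') + num + ' ' + description;
  return directive ? line + ' # ' + directive : line;
}

console.log(tapPoint(false, 1, 'probe third-party service', 'TODO upstream flaky'));
// prints: not ok 1 probe third-party service # TODO upstream flaky
```

In tape itself the intent, per this thread, is to pass an option on the test (analogous to the documented skip option) and have tape emit this directive for you.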

@axit-joost

I support and welcome this change.

Our use case:
We need to create a test suite for a software project that probes a physical device. We have a device. The customer has a device. But our offsite CI server obviously does not. Having this feature will enable us to create tests that succeed in our environment but are allowed to fail elsewhere.

The alternative would be to create an emulated mock device that serves up faux data, but since we have a plethora of different devices to probe, the test suite will be qualitatively much better if we can simply use the actual devices.

So from me and our team, a welcome +1

@yoshuawuyts

@axit-joost isolate those tests, and don't run them on the CI server that lacks the device. With the todo approach, the tests are allowed to fail on your device too, which is not what you want.
