This repository has been archived by the owner on Oct 7, 2020. It is now read-only.

[Converge] Flaky Tests #32

Closed
jasnell opened this issue May 22, 2015 · 17 comments

Comments

@jasnell
Member

jasnell commented May 22, 2015

Flaky tests were removed in io.js. They've still been useful in node.js's CI. Need a final decision on this.

@Fishrock123
Contributor

I think the original flaky tests were fixed in io.js, but there are still one or two that give sporadic failures. The much larger, but not entirely related, issue is Jenkins messing up certain Windows tests.

@rvagg
Member

rvagg commented May 23, 2015

Please no no no, I have strong objections to this on a philosophical level. Flaky tests should be fixed rather than accepted; that's why the whole thing was removed from io.js. I haven't checked, but has it been put back in this repo?

We have ongoing problems with the io.js CI because of Windows+Jenkins issues that we haven't resolved, but as far as I'm concerned this just means we need to work on it and figure out a fix, not accept it and move on.

@jbergstroem
Member

I have a pretty strong opinion about this as well. Hiding them won't really solve anything.

@orangemocha
Contributor

I think there might have been a bit of miscommunication around the topic of 'flaky tests'. The point is not to accept nor to ignore that tests are failing. The point is to enable an automated merge job, like node-accept-pull-request. (In fact, I think it would be best to rename the issue title to something more along those lines.)

node-accept-pull-request automatically merges changes, after rebasing them on top of the target branch and passing tests. We have found that useful because it pretty much eliminates the chance of anyone introducing regressions in the test set that it runs. Using this merge job has ensured that a) nobody can forget to run tests on all platforms; b) nobody can make a mistake when pushing directly to the repo; c) changes are tested on the same final rebase that will be pushed to origin, so it's not possible for change B to be pushed between the time change A was tested and the time it is rebased/pushed.
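The rebase-test-push flow described above can be sketched roughly as follows; the repository layout, branch names, and the `true` stand-in for the test run are illustrative, not the actual Jenkins job:

```shell
#!/bin/sh
# Hypothetical sketch of a rebase-then-fast-forward merge flow.
set -e
tmp=$(mktemp -d)

git init -q --bare "$tmp/origin.git"
git init -q -b main "$tmp/work"
cd "$tmp/work"
git config user.email ci@example.com
git config user.name ci
git remote add origin "$tmp/origin.git"

echo base > file.txt
git add file.txt
git commit -qm "base"
git push -q origin main

# A contributor's branch with one change
git checkout -qb pr-branch
echo change > pr.txt
git add pr.txt
git commit -qm "pr change"

# The merge job: rebase onto the target, run the tests on that exact
# rebase, then fast-forward push -- so nothing lands untested.
git fetch -q origin main
git rebase -q origin/main
true                                  # stand-in for the test run
git checkout -q main
git merge -q --ff-only pr-branch      # fast-forward only, no merge commit
git push -q origin main
echo "merged $(git log --oneline -1)"
```

Because the push is fast-forward only, a change B pushed after change A was tested would force A's job to rebase and re-test before it could land.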

We could argue whether this process is useful and whether there are good cases for not following it. But in the context of an automated merge job, support for flaky tests is a must. Perhaps it's the terminology 'flaky tests' that stirs up confusion. You may think of it as mere configuration for node-accept-pull-request, to tell it which tests should not prevent an automated merge. Such configuration is versioned with the code, as it should be. But it's not used for anything other than the automated merge job. Even in the automated merge job, failures for the tests that are marked as flaky are not completely hidden. Builds with failures in flaky tests are marked as 'yellow', instead of 'blue', and the TAP test results show the tests that failed, with a 'todo' next to the outcome.

When a test is marked as flaky, a high-priority issue should be filed to follow up on the test flakiness, so that this test can be re-enabled again for the automated merge job.
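For illustration, this kind of configuration lives in status files read by the test runner. The test name and platform section below are hypothetical, and the exact syntax may differ in the converged repo, but it follows the `.status` convention inherited from the old V8 runner:

```
prefix parallel

[true] # This section applies to all platforms

[$system==win32]
test-some-timing-sensitive-case : PASS,FLAKY
```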

In io.js you have been running CI on every PR, which is great. I am pretty sure you have had cases where, say, John's PR was tested and some tests failed that you recognized as not due to John's PR. Even if you fixed all the tests, this can always happen, as tests can be susceptible to environment or timing issues. When this happened, did you prevent John's PR from being merged? Of course not. You made the determination that the failure was independent of John's changes. Flaky test configuration allows node-accept-pull-request to do the same in an automated way.

Having this supported in an automated way allows the process to scale. We can add more tests and more platforms to verify every change that gets pushed to the main repo, and keep it manageable. Otherwise, we'll end up scribbling the same list of tests on a piece of paper. I think we should enhance this functionality, not remove it. It should be noted that this support in our test runner is based on support that was there in an ancient version of the V8 test runner. Since then, the V8 test runner has greatly improved support for this type of configuration.

@orangemocha
Contributor

My suggestion would be to implement a node-accept-pull-request-like job in the converged CI system, without making it mandatory, but giving people a chance to try it and see its usefulness. If people find it useful, we can adopt it in a more systematic way. I think that should be pretty uncontroversial.

But there's no point in disabling support for flaky tests in the test runner. It does nothing different unless you pass --flaky-tests=dontcare as an option.
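As a rough sketch of that semantics (hypothetical helper, not the actual tools/test.py code): the runner can report every failure, but only non-flaky failures, or flaky failures outside `dontcare` mode, turn the build red.

```python
# Hypothetical sketch of how a --flaky-tests flag can affect a test
# runner's exit status. Function and test names are illustrative.

def overall_result(results, flaky_mode="run"):
    """results: iterable of (name, passed, is_flaky) tuples.
    Returns True if the run should be reported as green."""
    for name, passed, is_flaky in results:
        if passed:
            continue
        if is_flaky and flaky_mode == "dontcare":
            # The failure is still reported (e.g. as a TAP 'todo'),
            # but it does not turn the build red.
            continue
        return False
    return True

results = [
    ("test-http-basic", True, False),
    ("test-dgram-timing", False, True),  # a flaky test that failed
]

print(overall_result(results, "run"))       # False: flaky failure counts
print(overall_result(results, "dontcare"))  # True: flaky failure tolerated
```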

@jbergstroem
Member

We just had a quick talk about this in the build meeting; I think the thing "between us" in terms of mentality is the merge-pr job. Since the number of flaky tests we're experiencing is probably around 10, we should see what can be done about them, but to avoid holding things up we should indeed support a flaky flag and make sure we have a high priority on resolving tests once they are flagged as flaky.

@jasnell
Member Author

jasnell commented May 26, 2015

Perhaps this as a compromise: incorporate the flaky tests for day-to-day commits, but require the test failures to be documented and fixed prior to an LTS. That gives a significant amount of leeway.

@bnoordhuis
Member

to avoid holding things up we should indeed support a flaky flag and make sure we have a high priority on resolving them once flagged as flaky

Who is on the hook for flaky tests? If the answer is 'no one in particular', there is no real imperative for anyone to go in and fix them.

I'd love to see automated merges a la Rust, by the way. But with rebase and fast-forward, of course, anything else is blasphemy.

@jbergstroem
Member

@orangemocha perhaps you could elaborate on what we ended yesterday's call with, in regard to how your procedure currently works?

@orangemocha
Contributor

The current procedure is to file a P-1 issue and assign it to someone at the time that we mark a test as flaky. We should probably also set a milestone goal. Not sure if we can do much better than that. If by the time the milestone date hits the flakiness has not been resolved, we can decide to hold up the milestone for it, or to postpone the issue. This is to say that unless a test is invalid, it's better to keep it around as flaky than to delete it. As I mentioned, flaky tests are run and their failures are shown (in yellow instead of red). Deleting a test really hides the failure.

But with rebase and fast-forward, of course, anything else is blasphemy.

@bnoordhuis : yes, that's how node-accept-pull-request does it by default. It also has a merge option, which was intended for branch merges.

@jasnell
Member Author

jasnell commented May 27, 2015

Given that this is largely a build/CI issue, I'm inclined to defer this to the @nodejs/build WG for resolution.

@rvagg
Member

rvagg commented May 27, 2015

Agreed. We'll be discussing this at our next meeting and possibly beyond. The current plan is to experiment with the Jenkins merge job, understand the value of flaky test marking, and then attempt to move towards a compromise that satisfies everyone's concerns.

@jasnell
Member Author

jasnell commented Jul 9, 2015

@rvagg @misterdjules ... do we need to keep this issue open?

@misterdjules

@orangemocha and @rvagg know more about whether the flaky tests mechanism will be kept in the converged CI platform, so deferring to them.

@rvagg
Member

rvagg commented Jul 10, 2015

Yes, I believe the current status is that we've capitulated and agreed that flaky tests will need to be in the converged repo in order to make the node-accept-pull-request Jenkins job work satisfactorily, and in fact we may even want them to come back into io.js for the same reason. Will let @orangemocha update on progress there.

@orangemocha
Contributor

Update on progress:

  1. node-accept-pull-request is already implemented in the converged CI, and will support the current joyent/node v0.12 workflow as soon as this PR is landed (which should happen in the next day or two). I plan on taking a shortcut for 0.10 and porting a simpler job to support the current workflow, since we don't use node-accept-pull-request for it. After that, we can start transitioning joyent/node collaborators to the new CI.
  2. node-accept-pull-request is technically ready to work in io.js, though there are two work items (both on my plate) necessary to make it practical for wide adoption:
  • Adding support for flaky tests in io.js.
  • Reducing the time spent running the ARM builds.

In terms of process for marking tests as flaky, I think the current process that we have used in joyent/node is a reasonable starting point: we open a P-1 issue to investigate the flakiness and assign it to the next milestone, cc'ing the project collaborators. I will document this in the wiki by the time we announce the availability of node-accept-pull-request. If we find that this process is not sufficient, we might want to consider assigning to specific collaborators, either the test/area owner or, if that's not available, any collaborator in a round-robin fashion.

With regard to this issue, I think the original misalignment that it was meant to track has been resolved, so it should be fine to close it. 😄

@jasnell
Member Author

jasnell commented Jul 10, 2015

Excellent!
