[TEST] Make sure we restart the suite cluster after each test failure #9015

javanna · 2014-12-19T15:15:55Z

CurrentTestFailedMarker is a RunListener that gets notified whenever a test fails, and we were using it to be able to restart the global cluster after each failure. We were checking whether a test had failed in the @After method though, which runs before the listener gets notified, so the failed flag would always be false.

This PR makes sure that the suite/global cluster gets restarted not only when there are problems in the afterInternal method, but also after each test failure. In order to achieve this, we need to reset the cluster afterwards, when we get to know about both of the events (problem in afterInternal and test failure), and before resetting the currentCluster. Introduced a TestRule that keeps track of test failures and allows to execute arbitrary tasks when a test fails and when a test is completed (regardless of its result). Allows also to force the execution of the failure task (used in case of afterInternal issues rather than actual test failure).

Also updated ElasticsearchRestTests to make sure that the RestClient gets re-initialized in case we restart the cluster, otherwise all the subsequent tests fail. Improved this mechanism also to relate it directly to the restart of the cluster instead of checking whether the addresses have changed, which doesn't work anyway as the new cluster will use the same addresses but the client needs to be recreated anyway.

mikemccand · 2014-12-19T15:41:12Z

src/test/java/org/elasticsearch/test/ElasticsearchIntegrationTest.java

@@ -582,7 +582,7 @@ private TestCluster buildAndPutCluster(Scope currentClusterScope, long seed) thr
        return testCluster;
    }

-   private static void clearClusters() throws IOException {
+    private static void clearClusters() throws IOException {


Extra space :)

it was missing, I fixed it :)

Oh, woops, good :)

mikemccand · 2014-12-19T16:31:26Z

I left some minor comments, otherwise LGTM.

This does adds more complexity to our already complex test infra; I wonder if there is a simpler way to "run something after the test failed" in JUnit. E.g. is tearDown invoked after the listeners see the test failure? Or maybe instead of a Listener we should be making a TestRule and insert it at the right place, to record the failure? I don't know JUnit/RandomizedTest well enough but it seems like there could be a simpler fix here ...

javanna · 2014-12-19T16:42:15Z

I totally agree @mikemccand I spent some time trying to think of alternatives but this is the only one I came up with, and it does add complexity. This is kind of a catch-22 problem, since we want to do things after test, depending on two events: 1) whether something went wrong in the after test 2) whether the test failed, which we get to know only later on either through listeners or rules.

is tearDown invoked after the listeners see the test failure?

the @after methods in the class are invoked before any other after method (either rules or listeners)

Or maybe instead of a Listener we should be making a TestRule and insert it at the right place, to record the failure?

I did turn the Listener into a TestRule, but we can't just record the failure and use it elsewhere because the rule is invoked after the @after methods in the class (which is why I delayed the execution to the rule, so we know everything we need to know). I considered flipping things around a bit and moving some parts to the before test phase, e.g. in the before of the following method we would know if the previous test has failed, but it didn't make much sense in the end, cause this is all supposed to happen right after test and that's where we should also reset the currentCluster to null.

That said I am definitely open to alternative solutions!

mikemccand · 2014-12-19T17:00:39Z

Thanks @javanna I think we should just commit this approach since it works, and simplify it later.

After, tearDown, RunListener, TestRule/RuleChain: it's too much ;)

javanna · 2015-02-03T16:00:02Z

I just rebased this, still waiting for a few more opinions given that the change is not very easy to read, yet needed at the same time since the current infra never restarts the global/suite cluster after a failure. Thoughts @bleskes @s1monw ?

javanna · 2015-02-26T15:46:24Z

I just rebased this again and applied feedback. Also removed any mention of the global cluster since it was removed. This PR is still valid for suite level clusters though, as we currently never restart the cluster after any failure although the code suggests that we should. I would like to get this in if it still makes sense to restart the cluster after each failure, to make sure that subsequent tests get a clean new cluster. Thoughts? Review please?

s1monw · 2015-03-02T10:32:12Z

LGTM

@after

CurrentTestFailedMarker is a RunListener that gets notified whenever a test fails, and we were using it to be able to restart the global cluster after each failure. We were checking whether a test had failed in the @after method though, which runs before the listener gets notified, so the failed flag would always be false. This commit makes sure that the global cluster gets restarted not only when there are problems in the afterInternal method, but also after each test failure. In order to achieve this, we need to reset the cluster afterwards, when we get to know about both of the events (problem in afterInternal and test failure), and before resetting the currentCluster. Introduced a TestRule that keeps track of test failures and allows to execute arbitrary tasks when a test fails and when a test is completed (regardless of its result). Allows also to force the execution of the failure task (used in case of afterInternal issues rather than actual test failure). Also updated ElasticsearchRestTests to make sure that the RestClient gets re-initialized in case we restart the global cluster, otherwise all the subsequent tests fail. Improved this mechanism also to relate it directly to the restart of the cluster instead of checking whether the addresses have changed, which doesn't work anyway as the new cluster will use the same addresses but the client needs to be recreated anyway. Closes elastic#9015

@after

…failure CurrentTestFailedMarker is a RunListener that gets notified whenever a test fails, and we were using it to be able to restart the global/suite cluster after each failure. We were checking whether a test had failed in the @after method though, which runs before the listener gets notified, so the failed flag would always be false. This commit makes sure that the global/suite cluster gets restarted not only when there are problems in the afterInternal method, but also after each test failure. In order to achieve this, we need to reset the cluster afterwards, when we get to know about both of the events (problem in afterInternal and test failure), and before resetting the currentCluster. Introduced a TestRule that keeps track of test failures and allows to execute arbitrary tasks when a test fails and when a test is completed (regardless of its result). Allows also to force the execution of the failure task (used in case of afterInternal issues rather than actual test failure). Also updated ElasticsearchRestTests to make sure that the RestClient gets re-initialized in case we restart the global/suite cluster, otherwise all the subsequent tests fail. Improved this mechanism also to relate it directly to the restart of the cluster instead of checking whether the addresses have changed, which doesn't work anyway as the new cluster will use the same addresses but the client needs to be recreated anyway. Closes elastic#9015

@after

CurrentTestFailedMarker is a RunListener that gets notified whenever a test fails, and we were using it to be able to restart the suite cluster after each failure. We were checking whether a test had failed in the @after method though, which runs before the listener gets notified, so the failed flag would always be false. This commit makes sure that the suite cluster gets restarted not only when there are problems in the afterInternal method, but also after each test failure. In order to achieve this, we need to reset the cluster afterwards, when we get to know about both of the events (problem in afterInternal and test failure), and before resetting the currentCluster. Introduced a TestRule that keeps track of test failures and allows to execute arbitrary tasks when a test fails and when a test is completed (regardless of its result). Allows also to force the execution of the failure task (used in case of afterInternal issues rather than actual test failure). Also updated ElasticsearchRestTests to make sure that the RestClient gets re-initialized in case we restart the suite cluster, otherwise all the subsequent tests fail. Improved this mechanism also to relate it directly to the restart of the cluster instead of checking whether the addresses have changed, which doesn't work anyway as the new cluster will use the same addresses but the client needs to be recreated anyway. Closes elastic#9015

@after

…failure CurrentTestFailedMarker is a RunListener that gets notified whenever a test fails, and we were using it to be able to restart the global/suite cluster after each failure. We were checking whether a test had failed in the @after method though, which runs before the listener gets notified, so the failed flag would always be false. This commit makes sure that the global/suite cluster gets restarted not only when there are problems in the afterInternal method, but also after each test failure. In order to achieve this, we need to reset the cluster afterwards, when we get to know about both of the events (problem in afterInternal and test failure), and before resetting the currentCluster. Introduced a TestRule that keeps track of test failures and allows to execute arbitrary tasks when a test fails and when a test is completed (regardless of its result). Allows also to force the execution of the failure task (used in case of afterInternal issues rather than actual test failure). Also updated ElasticsearchRestTests to make sure that the RestClient gets re-initialized in case we restart the global/suite cluster, otherwise all the subsequent tests fail. Improved this mechanism also to relate it directly to the restart of the cluster instead of checking whether the addresses have changed, which doesn't work anyway as the new cluster will use the same addresses but the client needs to be recreated anyway. Closes elastic#9015

javanna added >test Issues or PRs that are addressing/adding tests >bug v2.0.0-beta1 v1.5.0 v1.4.3 review labels Dec 19, 2014

javanna changed the title ~~[TEST] Make sure we restart the global cluster after each test failure~~ [TEST] Make sure we restart the suite/global cluster after each test failure Dec 19, 2014

mikemccand reviewed Dec 19, 2014
View reviewed changes

clintongormley assigned bleskes Jan 23, 2015

clintongormley added v1.4.4 and removed v1.4.3 labels Jan 27, 2015

javanna force-pushed the test/global_cluster_reset branch from 045a675 to 60bca6b Compare February 3, 2015 15:57

spinscale added v1.4.5 and removed v1.4.4 labels Feb 19, 2015

drewr force-pushed the master branch from dcc3da0 to 7c20a8a Compare February 20, 2015 16:48

javanna changed the title ~~[TEST] Make sure we restart the suite/global cluster after each test failure~~ [TEST] Make sure we restart the suite cluster after each test failure Feb 26, 2015

javanna force-pushed the test/global_cluster_reset branch 2 times, most recently from 18d2f16 to 09944b5 Compare February 26, 2015 15:42

javanna force-pushed the test/global_cluster_reset branch from 09944b5 to 8b2d6a9 Compare February 26, 2015 16:07

javanna force-pushed the test/global_cluster_reset branch from 8b2d6a9 to 2957066 Compare March 2, 2015 14:05

javanna removed the review label Mar 2, 2015

javanna closed this in b053fc9 Mar 2, 2015

clintongormley removed the >bug label Mar 19, 2015

javanna mentioned this pull request Apr 20, 2015

Integrate better with lucene test framework and mockfilesystems #10656

Merged

exalate-issue-sync bot mentioned this pull request Nov 15, 2022

Cartesian support for aggregations #87804

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[TEST] Make sure we restart the suite cluster after each test failure #9015

[TEST] Make sure we restart the suite cluster after each test failure #9015

javanna commented Dec 19, 2014

mikemccand Dec 19, 2014

javanna Dec 19, 2014

mikemccand Dec 19, 2014

mikemccand commented Dec 19, 2014

javanna commented Dec 19, 2014

mikemccand commented Dec 19, 2014

javanna commented Feb 3, 2015

javanna commented Feb 26, 2015

s1monw commented Mar 2, 2015

[TEST] Make sure we restart the suite cluster after each test failure #9015

[TEST] Make sure we restart the suite cluster after each test failure #9015

Conversation

javanna commented Dec 19, 2014

mikemccand Dec 19, 2014

Choose a reason for hiding this comment

javanna Dec 19, 2014

Choose a reason for hiding this comment

mikemccand Dec 19, 2014

Choose a reason for hiding this comment

mikemccand commented Dec 19, 2014

javanna commented Dec 19, 2014

mikemccand commented Dec 19, 2014

javanna commented Feb 3, 2015

javanna commented Feb 26, 2015

s1monw commented Mar 2, 2015