[CI] BasicWatcherTests.testModifyWatches fails in CI #35503

jaymode · 2018-11-13T16:51:34Z

BasicWatcherTests testModify watches failed in CI. I looked at the test and one possible cause could be a concurrent reloading of the watch service, which will remove all watches from the trigger engine since the trigger service gets paused. Looking at the logs we see a reload get triggered right after adding the watch due to new local watcher shard allocation ids. Maybe we should wait for a green cluster health or use ensureClusterStateConsistency in this test?

Failure link: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+release-tests/92/consoleFull

Reproduce line:

./gradlew :x-pack:plugin:watcher:test -Dtests.seed=CE0D14FE35FDBFFD -Dtests.class=org.elasticsearch.xpack.watcher.test.integration.BasicWatcherTests -Dtests.method="testModifyWatches" -Dtests.security.manager=true -Dbuild.snapshot=false -Dtests.jvm.argline="-Dbuild.snapshot=false" -Dtests.locale=pt-PT -Dtests.timezone=Asia/Bangkok -Dcompiler.java=11 -Druntime.java=8

  1> [2018-11-13T19:26:33,676][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testModifyWatches] [BasicWatcherTests#testModifyWatches]: all set up test
  1> [2018-11-13T19:26:33,677][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testModifyWatches] [BasicWatcherTests#testModifyWatches]: freezing time on nodes
  1> [2018-11-13T19:26:33,680][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_s0] [.watches] creating index, cause [api], templates [.watches, random_index_template], shards [2]/[0], mappings [doc]
  1> [2018-11-13T19:26:33,680][INFO ][o.e.c.r.a.AllocationService] [node_s0] updating number_of_replicas to [1] for indices [.watches]
  1> [2018-11-13T19:26:33,721][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_s0] [.triggered_watches] creating index, cause [api], templates [.triggered_watches, random_index_template], shards [1]/[1], mappings [doc]
  1> [2018-11-13T19:26:33,766][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_s0] [.watcher-history-9-2018.11.13] creating index, cause [api], templates [.watch-history-9, random_index_template], shards [1]/[0], mappings [doc]
  1> [2018-11-13T19:26:33,766][INFO ][o.e.c.r.a.AllocationService] [node_s0] updating number_of_replicas to [1] for indices [.watcher-history-9-2018.11.13]
  1> [2018-11-13T19:26:33,797][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testModifyWatches] creating watch history index [.watcher-history-9-2018.11.13]
  1> [2018-11-13T19:26:33,819][INFO ][o.e.c.r.a.AllocationService] [node_s0] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.watcher-history-9-2018.11.13][0]] ...]).
  1> [2018-11-13T19:26:33,824][DEBUG][o.e.x.w.t.i.BasicWatcherTests] [testModifyWatches] indices [.watcher-history-9-2018.11.13, .watches, .triggered_watches] are green
  1> [2018-11-13T19:26:33,824][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testModifyWatches] waiting to start watcher, current states [Tuple [v1=node_s1, v2=STOPPED], Tuple [v1=node_s2, v2=STOPPED], Tuple [v1=node_s0, v2=STOPPED]]
  1> [2018-11-13T19:26:33,830][DEBUG][o.e.x.w.WatcherService   ] [node_s2] watch service has been reloaded, reason [starting]
  1> [2018-11-13T19:26:33,830][DEBUG][o.e.x.w.WatcherService   ] [node_s1] watch service has been reloaded, reason [starting]
  1> [2018-11-13T19:26:33,831][DEBUG][o.e.x.w.WatcherService   ] [node_s0] watch service has been reloaded, reason [starting]
  1> [2018-11-13T19:26:33,831][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testModifyWatches] waiting to start watcher, current states [Tuple [v1=node_s1, v2=STARTED], Tuple [v1=node_s2, v2=STARTED], Tuple [v1=node_s0, v2=STARTING]]
  1> [2018-11-13T19:26:33,834][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testModifyWatches] waiting to start watcher, current states [Tuple [v1=node_s1, v2=STARTED], Tuple [v1=node_s2, v2=STARTED], Tuple [v1=node_s0, v2=STARTED]]
  1> [2018-11-13T19:26:33,836][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_s0] [idx] creating index, cause [api], templates [random_index_template], shards [7]/[1], mappings []
  1> [2018-11-13T19:26:33,904][DEBUG][o.e.x.w.WatcherIndexingListener] [node_s2] adding watch [_name] to trigger service
  1> [2018-11-13T19:26:33,904][DEBUG][o.e.x.w.t.ScheduleTriggerEngineMock] [node_s2] adding watch [_name]
  1> [2018-11-13T19:26:33,917][DEBUG][o.e.x.w.WatcherIndexingListener] [node_s0] watch [_name] should not be triggered
  1> [2018-11-13T19:26:33,926][DEBUG][o.e.x.w.t.ScheduleTriggerEngineMock] [testModifyWatches] firing watch [_name] at [2018-11-13T12:26:38.677Z]
  1> [2018-11-13T19:26:33,927][DEBUG][o.e.x.w.e.ExecutionService] [testModifyWatches] saving watch records [1]
  1> [2018-11-13T19:26:33,931][DEBUG][o.e.x.w.e.ExecutionService] [testModifyWatches] executing watch [_name]
  1> [2018-11-13T19:26:33,933][DEBUG][o.e.x.w.i.s.ExecutableSearchInput] [testModifyWatches] [_name_887065ee-0c7f-4ae6-9947-269be07f5479-2018-11-13T12:26:38.677Z] found [0] hits
  1> [2018-11-13T19:26:33,936][DEBUG][o.e.x.w.WatcherIndexingListener] [node_s2] adding watch [_name] to trigger service
  1> [2018-11-13T19:26:33,936][DEBUG][o.e.x.w.t.ScheduleTriggerEngineMock] [node_s2] adding watch [_name]
  1> [2018-11-13T19:26:33,939][DEBUG][o.e.x.w.WatcherIndexingListener] [node_s0] watch [_name] should not be triggered
  1> [2018-11-13T19:26:33,948][INFO ][o.e.c.m.MetaDataMappingService] [node_s0] [.watcher-history-9-2018.11.13/CmR9DRQkQD2WEcvGyaQC7A] update_mapping [doc]
  1> [2018-11-13T19:26:33,966][DEBUG][o.e.x.w.e.ExecutionService] [testModifyWatches] finished [_name]/[_name_887065ee-0c7f-4ae6-9947-269be07f5479-2018-11-13T12:26:38.677Z]
  1> [2018-11-13T19:26:34,008][DEBUG][o.e.x.w.WatcherIndexingListener] [node_s2] adding watch [_name] to trigger service
  1> [2018-11-13T19:26:34,008][DEBUG][o.e.x.w.t.ScheduleTriggerEngineMock] [node_s2] adding watch [_name]
  1> [2018-11-13T19:26:34,026][INFO ][o.e.c.r.a.AllocationService] [node_s0] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[idx][6]] ...]).
  1> [2018-11-13T19:26:34,027][DEBUG][o.e.x.w.WatcherIndexingListener] [node_s0] watch [_name] should not be triggered
  1> [2018-11-13T19:26:34,029][INFO ][o.e.x.w.WatcherService   ] [node_s1] reloading watcher, reason [new local watcher shard allocation ids], cancelled [0] queued tasks
  1> [2018-11-13T19:26:34,030][INFO ][o.e.x.w.WatcherService   ] [node_s2] reloading watcher, reason [new local watcher shard allocation ids], cancelled [0] queued tasks
  1> [2018-11-13T19:26:34,038][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testModifyWatches] [#testModifyWatches]: clearing watcher state
  1> [2018-11-13T19:26:34,039][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testModifyWatches] waiting to stop watcher, current states [Tuple [v1=node_s1, v2=STARTED], Tuple [v1=node_s2, v2=STARTED], Tuple [v1=node_s0, v2=STARTED]]
  1> [2018-11-13T19:26:34,042][INFO ][o.e.x.w.WatcherService   ] [node_s1] stopping watch service, reason [watcher manually marked to shutdown by cluster state update]
  1> [2018-11-13T19:26:34,042][INFO ][o.e.x.w.WatcherService   ] [node_s2] stopping watch service, reason [watcher manually marked to shutdown by cluster state update]
  1> [2018-11-13T19:26:34,045][INFO ][o.e.x.w.WatcherService   ] [node_s0] stopping watch service, reason [watcher manually marked to shutdown by cluster state update]
  1> [2018-11-13T19:26:34,048][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testModifyWatches] waiting to stop watcher, current states [Tuple [v1=node_s1, v2=STOPPED], Tuple [v1=node_s2, v2=STOPPED], Tuple [v1=node_s0, v2=STOPPED]]
  1> [2018-11-13T19:26:34,048][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testModifyWatches] [BasicWatcherTests#testModifyWatches]: cleaning up after test
  1> [2018-11-13T19:26:34,048][INFO ][o.e.t.InternalTestCluster] [testModifyWatches] Clearing active scheme time frozen, expected healing time 0s
  1> [2018-11-13T19:26:34,055][DEBUG][o.e.x.w.WatcherService   ] [node_s1] Loaded [0] watches for execution
  1> [2018-11-13T19:26:34,055][DEBUG][o.e.x.w.WatcherService   ] [node_s2] Loaded [1] watches for execution
  1> [2018-11-13T19:26:34,056][DEBUG][o.e.x.w.WatcherService   ] [node_s1] watch service has been reloaded, reason [new local watcher shard allocation ids]
  1> [2018-11-13T19:26:34,056][DEBUG][o.e.x.w.t.ScheduleTriggerEngineMock] [node_s2] adding watch [_name]
  1> [2018-11-13T19:26:34,056][DEBUG][o.e.x.w.WatcherService   ] [node_s2] watch service has been reloaded, reason [new local watcher shard allocation ids]
  1> [2018-11-13T19:26:34,065][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s0] [idx/dvrs-NoDRECbdygx_jZlBA] deleting index
  1> [2018-11-13T19:26:34,065][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s0] [.watches/4_uvYvRAQSabLevglsnYQg] deleting index
  1> [2018-11-13T19:26:34,066][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s0] [.triggered_watches/HipCpVXNSWuosdZwXrIFBA] deleting index
  1> [2018-11-13T19:26:34,066][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s0] [.watcher-history-9-2018.11.13/CmR9DRQkQD2WEcvGyaQC7A] deleting index
  1> [2018-11-13T19:26:34,123][INFO ][o.e.c.m.MetaDataIndexTemplateService] [node_s0] removing template [random_index_template]
  1> [2018-11-13T19:26:34,128][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testModifyWatches] [BasicWatcherTests#testModifyWatches]: cleaned up after test
  1> [2018-11-13T19:26:34,129][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testModifyWatches] after test
FAILURE 0.50s J2 | BasicWatcherTests.testModifyWatches <<< FAILURES!
   > Throwable #1: java.lang.AssertionError: could not find watch [_name] to trigger
   > Expected: is <true>
   >      but: was <false>
   > 	at __randomizedtesting.SeedInfo.seed([CE0D14FE35FDBFFD:13C578BEB78E4397]:0)
   > 	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
   > 	at org.elasticsearch.xpack.watcher.test.AbstractWatcherIntegrationTestCase$TimeWarp.trigger(AbstractWatcherIntegrationTestCase.java:569)
   > 	at org.elasticsearch.xpack.watcher.test.AbstractWatcherIntegrationTestCase$TimeWarp.trigger(AbstractWatcherIntegrationTestCase.java:559)
   > 	at org.elasticsearch.xpack.watcher.test.integration.BasicWatcherTests.testModifyWatches(BasicWatcherTests.java:195)
   > 	at java.lang.Thread.run(Thread.java:748)

The text was updated successfully, but these errors were encountered:

elasticmachine · 2018-11-13T16:51:36Z

Pinging @elastic/es-core-infra

gwbrown · 2019-02-19T22:55:12Z

After lots of trying, (10s of thousands of runs of this test) under various conditions, including on a heavily-loaded system, on 6.x and master, I have not been able to reproduce this failure once. This test isn't muted, and has not failed on CI since this issue was opened 3 months ago. While it has failed a few times on CI within the past year, it has only failed with this particular failure once, and the other times were before some significant Watcher refactorings/bug fixes, so I'm not sure if those failures are relevant.

I'm very hesitant to make any changes to this test to try to address this failure given that I can't reproduce it, and thus have no way of verifying if any changes actually increased the stability of the test, so I'm wondering what we should do with this issue: Unless someone else has some insight into how we might be able to better reproduce this, I don't think we'll be able to do much for the moment.

markharwood · 2019-02-22T09:14:23Z

Build-stats link

jtibshirani · 2019-03-05T22:28:01Z

I dug into these failures while investigating #37882, which seems to have the same root cause as both this issue and #39306.

From looking at logs and stepping through code, I saw the same cause that @jaymode identified above. There is a change in shard routing that causes the watch service to be reloaded, which then clears the list of registered watches from the mock trigger engine. The assertion errors we're seeing can be consistently reproduced if we simulate a change in shard routing, and repeat a test case ~10 times:

--- a/x-pack/plugin/watcher/src/main/java/org/elasticsearch/xpack/watcher/WatcherLifeCycleService.java
+++ b/x-pack/plugin/watcher/src/main/java/org/elasticsearch/xpack/watcher/WatcherLifeCycleService.java
@@ -137,7 +137,7 @@ public class WatcherLifeCycleService implements ClusterStateListener {
             .sorted(Comparator.comparing(ShardRouting::hashCode))
             .collect(Collectors.toList());
 
-        if (previousShardRoutings.get().equals(localAffectedShardRoutings) == false) {
+        if (Math.random() < 0.1 || previousShardRoutings.get().equals(localAffectedShardRoutings) == false) {
             if (watcherService.validate(event.state())) {
                 previousShardRoutings.set(localAffectedShardRoutings);
                 if (state.get() == WatcherState.STARTED) {

I don't think that waiting on a green cluster health will improve the situation, since we already wait for the watcher indices to be green in AbstractWatcherIntegrationTestCase#_setup. It looks like the reload is caused by a shard routing change that happens after the indices are created and watcher has already started:

  1> [2018-11-13T19:26:33,830][DEBUG][o.e.x.w.WatcherService   ] [node_s2] watch service has been reloaded, reason [starting]
  ...
  1> [2018-11-13T19:26:34,029][INFO ][o.e.x.w.WatcherService   ] [node_s1] reloading watcher, reason [new local watcher shard allocation ids], cancelled [0] queued tasks

From looking at the mock trigger engine, I actually don't think it's correct to be clearing the list of watches when watcher is reloaded. I opened a PR that proposes to remove this behavior, which could also help with these test failures: #39724. With that change, I can no longer reproduce the failures when simulating a change in shard routing.

jtibshirani · 2019-03-26T20:09:19Z

I'm closing this now that #39724 has merged. Feel free to reopen the issue if similar failures pop up again.

benwtrent · 2019-03-27T13:20:49Z

Previous issues around these tests (but not convinced that they are the same cause or not). #35503 #39306

Previous investigations have brought about changes in this PR #39724

Previous guesses at the cause: reloading causes watches to be cleared (fixed via the PR), internalCluster not shutting down properly.

Failing builds:

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+periodic/951/consoleFull
Reproduction:

./gradlew :x-pack:plugin:watcher:unitTest -Dtests.seed=CF3EAFDCCBFC8BED -Dtests.class=org.elasticsearch.xpack.watcher.test.integration.BasicWatcherTests -Dtests.method="testConditionSearchWithIndexedTemplate" -Dtests.security.manager=true -Dtests.locale=ca -Dtests.timezone=America/Atka -Dcompiler.java=12 -Druntime.java=8

Relevant trace:

1> [2019-03-26T22:30:43,355][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testIndexWatch] before test
  1> [2019-03-26T22:30:43,355][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testIndexWatch] [BasicWatcherTests#testIndexWatch]: setting up test
  1> [2019-03-26T22:30:43,360][INFO ][o.e.c.m.MetaDataIndexTemplateService] [node_s2] adding template [random_index_template] for index patterns [*]
  1> [2019-03-26T22:30:43,370][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testIndexWatch] [BasicWatcherTests#testIndexWatch]: all set up test
  1> [2019-03-26T22:30:43,371][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testIndexWatch] [BasicWatcherTests#testIndexWatch]: freezing time on nodes
  1> [2019-03-26T22:30:43,373][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testIndexWatch] waiting to stop watcher, current states [Tuple [v1=node_s2, v2=STOPPED], Tuple [v1=node_s0, v2=STOPPED], Tuple [v1=node_s1, v2=STOPPED]]
  1> [2019-03-26T22:30:43,389][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_s2] [.watchesyvkwv] creating index, cause [Index to test aliases with .watches index], templates [.watches, random_index_template], shards [1]/[0], mappings [_doc]
  1> [2019-03-26T22:30:43,389][INFO ][o.e.c.r.a.AllocationService] [node_s2] updating number_of_replicas to [1] for indices [.watchesyvkwv]
  1> [2019-03-26T22:30:43,465][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_s2] [vuebvrpaq] creating index, cause [api], templates [random_index_template], shards [1]/[1], mappings [_doc]
  1> [2019-03-26T22:30:43,626][INFO ][o.e.c.r.a.AllocationService] [node_s2] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[vuebvrpaq][0]] ...]).
  1> [2019-03-26T22:30:43,636][DEBUG][o.e.x.w.t.i.BasicWatcherTests] [testIndexWatch] indices [vuebvrpaq] are green
  1> [2019-03-26T22:30:43,638][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s2] [.watchesyvkwv/YxVv6AmvQH6ojNeIB4BeCw] deleting index
  1> [2019-03-26T22:30:43,676][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testIndexWatch] set alias for .watches index to [vuebvrpaq]
  1> [2019-03-26T22:30:43,680][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_s2] [.triggered_watches] creating index, cause [api], templates [.triggered_watches, random_index_template], shards [1]/[0], mappings [_doc]
  1> [2019-03-26T22:30:43,681][INFO ][o.e.c.r.a.AllocationService] [node_s2] updating number_of_replicas to [1] for indices [.triggered_watches]
  1> [2019-03-26T22:30:43,785][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_s2] [.watcher-history-9-2019.03.27] creating index, cause [api], templates [.watch-history-9, random_index_template], shards [1]/[0], mappings [_doc]
  1> [2019-03-26T22:30:43,786][INFO ][o.e.c.r.a.AllocationService] [node_s2] updating number_of_replicas to [1] for indices [.watcher-history-9-2019.03.27]
  1> [2019-03-26T22:30:43,874][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testIndexWatch] creating watch history index [.watcher-history-9-2019.03.27]
  1> [2019-03-26T22:30:43,967][INFO ][o.e.c.r.a.AllocationService] [node_s2] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.watcher-history-9-2019.03.27][0]] ...]).
  1> [2019-03-26T22:30:43,978][DEBUG][o.e.x.w.t.i.BasicWatcherTests] [testIndexWatch] indices [.watcher-history-9-2019.03.27, vuebvrpaq, .triggered_watches] are green
  1> [2019-03-26T22:30:43,979][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testIndexWatch] waiting to start watcher, current states [Tuple [v1=node_s2, v2=STOPPED], Tuple [v1=node_s0, v2=STOPPED], Tuple [v1=node_s1, v2=STOPPED]]
  1> [2019-03-26T22:30:43,995][DEBUG][o.e.x.w.WatcherService   ] [node_s0] watch service has been reloaded, reason [starting]
  1> [2019-03-26T22:30:44,005][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testIndexWatch] waiting to start watcher, current states [Tuple [v1=node_s2, v2=STARTING], Tuple [v1=node_s0, v2=STARTED], Tuple [v1=node_s1, v2=STARTED]]
  1> [2019-03-26T22:30:44,009][DEBUG][o.e.x.w.WatcherService   ] [node_s2] watch service has been reloaded, reason [starting]
  1> [2019-03-26T22:30:44,009][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testIndexWatch] waiting to start watcher, current states [Tuple [v1=node_s2, v2=STARTING], Tuple [v1=node_s0, v2=STARTED], Tuple [v1=node_s1, v2=STARTED]]
  1> [2019-03-26T22:30:44,015][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testIndexWatch] waiting to start watcher, current states [Tuple [v1=node_s2, v2=STARTED], Tuple [v1=node_s0, v2=STARTED], Tuple [v1=node_s1, v2=STARTED]]
  1> [2019-03-26T22:30:44,018][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_s2] [idx] creating index, cause [api], templates [random_index_template], shards [2]/[0], mappings []
  1> [2019-03-26T22:30:44,082][INFO ][o.e.c.r.a.AllocationService] [node_s2] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[idx][1]] ...]).
  1> [2019-03-26T22:30:44,096][INFO ][o.e.c.m.MetaDataMappingService] [node_s2] [idx/HYDL5JI3RrWT65ANheMyHw] create_mapping [type]
  1> [2019-03-26T22:30:44,135][DEBUG][o.e.x.w.WatcherIndexingListener] [node_s2] adding watch [_name] to trigger service
  1> [2019-03-26T22:30:44,135][DEBUG][o.e.x.w.t.ScheduleTriggerEngineMock] [node_s2] adding watch [_name]
  1> [2019-03-26T22:30:44,149][DEBUG][o.e.x.w.WatcherIndexingListener] [node_s0] watch [_name] should not be triggered
  1> [2019-03-26T22:30:44,163][DEBUG][o.e.x.w.t.ScheduleTriggerEngineMock] [testIndexWatch] firing watch [_name] at [2019-03-27T07:30:43.372Z]
  1> [2019-03-26T22:30:44,166][DEBUG][o.e.x.w.e.ExecutionService] [testIndexWatch] watcher execution service paused, not processing [1] events
  1> [2019-03-26T22:30:54,339][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testIndexWatch] Found [0] records for watch [_name]
  1> [2019-03-26T22:30:54,339][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testIndexWatch] [#testIndexWatch]: clearing watcher state
  1> [2019-03-26T22:30:54,341][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testIndexWatch] waiting to stop watcher, current states [Tuple [v1=node_s2, v2=STARTED], Tuple [v1=node_s0, v2=STARTED], Tuple [v1=node_s1, v2=STARTED]]
  1> [2019-03-26T22:30:54,353][INFO ][o.e.x.w.WatcherService   ] [node_s0] stopping watch service, reason [watcher manually marked to shutdown by cluster state update]
  1> [2019-03-26T22:30:54,353][INFO ][o.e.x.w.WatcherService   ] [node_s1] stopping watch service, reason [watcher manually marked to shutdown by cluster state update]
  1> [2019-03-26T22:30:54,357][INFO ][o.e.x.w.WatcherService   ] [node_s2] stopping watch service, reason [watcher manually marked to shutdown by cluster state update]
  1> [2019-03-26T22:30:54,362][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testIndexWatch] waiting to stop watcher, current states [Tuple [v1=node_s2, v2=STOPPED], Tuple [v1=node_s0, v2=STOPPED], Tuple [v1=node_s1, v2=STOPPED]]
  1> [2019-03-26T22:30:54,362][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testIndexWatch] [BasicWatcherTests#testIndexWatch]: cleaning up after test
  1> [2019-03-26T22:30:54,362][INFO ][o.e.t.InternalTestCluster] [testIndexWatch] Clearing active scheme time frozen, expected healing time 0s
  1> [2019-03-26T22:30:54,403][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s2] [idx/HYDL5JI3RrWT65ANheMyHw] deleting index
  1> [2019-03-26T22:30:54,403][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s2] [vuebvrpaq/sE_znfh3QHedE0HaGoXQXQ] deleting index
  1> [2019-03-26T22:30:54,403][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s2] [.watcher-history-9-2019.03.27/sIe7e7WWSaa4Xi6RwCtI7Q] deleting index
  1> [2019-03-26T22:30:54,403][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s2] [.triggered_watches/gu6jqmx3QguzdYDSU3dTgg] deleting index
  1> [2019-03-26T22:30:54,462][INFO ][o.e.c.m.MetaDataIndexTemplateService] [node_s2] removing template [random_index_template]
  1> [2019-03-26T22:30:54,474][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testIndexWatch] [BasicWatcherTests#testIndexWatch]: cleaned up after test
  1> [2019-03-26T22:30:54,474][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testIndexWatch] after test
FAILURE 11.2s J3 | BasicWatcherTests.testIndexWatch <<< FAILURES!
   > Throwable #1: java.lang.AssertionError: could not find executed watch record for watch _name
   > Expected: a value equal to or greater than <1L>
   >      but: <0L> was less than <1L>

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+intake/2761/console

Reproduction:

./gradlew :x-pack:plugin:watcher:unitTest -Dtests.seed=27A4E6E50AAC7BBF -Dtests.class=org.elasticsearch.xpack.watcher.test.integration.BasicWatcherTests -Dtests.method="testConditionSearchWithIndexedTemplate" -Dtests.security.manager=true -Dtests.locale=is-IS -Dtests.timezone=America/Santarem -Dcompiler.java=12 -Druntime.java=8

Trace:

1> [2019-03-27T10:01:24,459][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] before test
  1> [2019-03-27T10:01:24,459][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] [BasicWatcherTests#testConditionSearchWithIndexedTemplate]: setting up test
  1> [2019-03-27T10:01:24,466][INFO ][o.e.c.m.MetaDataIndexTemplateService] [node_sm2] adding template [random_index_template] for index patterns [*]
  1> [2019-03-27T10:01:24,475][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] [BasicWatcherTests#testConditionSearchWithIndexedTemplate]: all set up test
  1> [2019-03-27T10:01:24,477][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] [BasicWatcherTests#testConditionSearchWithIndexedTemplate]: freezing time on nodes
  1> [2019-03-27T10:01:24,478][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] waiting to stop watcher, current states [Tuple [v1=node_sd3, v2=STOPPED], Tuple [v1=node_sm2, v2=STOPPED], Tuple [v1=node_sm1, v2=STOPPED], Tuple [v1=node_sm0, v2=STOPPED], Tuple [v1=node_sd5, v2=STOPPED], Tuple [v1=node_sd4, v2=STOPPED]]
  1> [2019-03-27T10:01:24,482][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_sm2] [.watchesxyrni] creating index, cause [Index to test aliases with .watches index], templates [.watches, random_index_template], shards [1]/[0], mappings [_doc]
  1> [2019-03-27T10:01:24,483][INFO ][o.e.c.r.a.AllocationService] [node_sm2] updating number_of_replicas to [1] for indices [.watchesxyrni]
  1> [2019-03-27T10:01:24,536][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_sm2] [frabdmxca] creating index, cause [api], templates [random_index_template], shards [1]/[1], mappings [_doc]
  1> [2019-03-27T10:01:24,608][INFO ][o.e.c.r.a.AllocationService] [node_sm2] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[frabdmxca][0]] ...]).
  1> [2019-03-27T10:01:24,616][DEBUG][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] indices [frabdmxca] are green
  1> [2019-03-27T10:01:24,619][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_sm2] [.watchesxyrni/VLE2R2FqSgy5i1duOszs7g] deleting index
  1> [2019-03-27T10:01:24,652][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] set alias for .watches index to [frabdmxca]
  1> [2019-03-27T10:01:24,655][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_sm2] [.triggered_watches-alias-index] creating index, cause [Index to test aliases with .triggered-watches index], templates [.triggered_watches, random_index_template], shards [1]/[1], mappings [_doc]
  1> [2019-03-27T10:01:24,696][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_sm2] [murwfypk] creating index, cause [api], templates [random_index_template], shards [1]/[1], mappings [_doc]
  1> [2019-03-27T10:01:24,760][INFO ][o.e.c.r.a.AllocationService] [node_sm2] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[murwfypk][0]] ...]).
  1> [2019-03-27T10:01:24,770][DEBUG][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] indices [murwfypk] are green
  1> [2019-03-27T10:01:24,773][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_sm2] [.triggered_watches-alias-index/8rnRreMKTyqgOEQnCiQaSA] deleting index
  1> [2019-03-27T10:01:24,797][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] set alias for .triggered-watches index to [murwfypk]
  1> [2019-03-27T10:01:24,802][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_sm2] [.watcher-history-9-2019.03.27] creating index, cause [api], templates [.watch-history-9, random_index_template], shards [1]/[0], mappings [_doc]
  1> [2019-03-27T10:01:24,802][INFO ][o.e.c.r.a.AllocationService] [node_sm2] updating number_of_replicas to [1] for indices [.watcher-history-9-2019.03.27]
  1> [2019-03-27T10:01:24,845][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] creating watch history index [.watcher-history-9-2019.03.27]
  2> NOTE: leaving temporary files on disk at: /var/lib/jenkins/workspace/elastic+elasticsearch+7.x+intake/x-pack/plugin/watcher/build/testrun/unitTest/J7/temp/org.elasticsearch.xpack.watcher.test.integration.BasicWatcherTests_27A4E6E50AAC7BBF-001
  1> [2019-03-27T10:01:24,876][INFO ][o.e.c.r.a.AllocationService] [node_sm2] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.watcher-history-9-2019.03.27][0]] ...]).
  1> [2019-03-27T10:01:24,884][DEBUG][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] indices [.watcher-history-9-2019.03.27, frabdmxca, murwfypk] are green
  1> [2019-03-27T10:01:24,885][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] waiting to start watcher, current states [Tuple [v1=node_sd3, v2=STOPPED], Tuple [v1=node_sm2, v2=STOPPED], Tuple [v1=node_sm1, v2=STOPPED], Tuple [v1=node_sm0, v2=STOPPED], Tuple [v1=node_sd5, v2=STOPPED], Tuple [v1=node_sd4, v2=STOPPED]]
  1> [2019-03-27T10:01:24,891][DEBUG][o.e.x.w.WatcherService   ] [node_sm0] watch service has been reloaded, reason [starting]
  1> [2019-03-27T10:01:24,892][DEBUG][o.e.x.w.WatcherService   ] [node_sm1] watch service has been reloaded, reason [starting]
  1> [2019-03-27T10:01:24,895][DEBUG][o.e.x.w.WatcherService   ] [node_sd5] watch service has been reloaded, reason [starting]
  1> [2019-03-27T10:01:24,895][DEBUG][o.e.x.w.WatcherService   ] [node_sd4] watch service has been reloaded, reason [starting]
  1> [2019-03-27T10:01:24,896][DEBUG][o.e.x.w.WatcherService   ] [node_sm2] watch service has been reloaded, reason [starting]
  1> [2019-03-27T10:01:24,897][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] waiting to start watcher, current states [Tuple [v1=node_sd3, v2=STARTED], Tuple [v1=node_sm2, v2=STARTED], Tuple [v1=node_sm1, v2=STARTED], Tuple [v1=node_sm0, v2=STARTED], Tuple [v1=node_sd5, v2=STARTED], Tuple [v1=node_sd4, v2=STARTED]]
  1> [2019-03-27T10:01:24,914][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] using custom data_path for index: [xZlFgQZDuX]
  1> [2019-03-27T10:01:24,917][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_sm2] [events] creating index, cause [api], templates [random_index_template], shards [3]/[0], mappings [event]
  1> [2019-03-27T10:01:24,952][INFO ][o.e.c.r.a.AllocationService] [node_sm2] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[events][1], [events][2]] ...]).
  1> [2019-03-27T10:01:24,961][WARN ][o.e.d.x.w.s.s.WatcherSearchTemplateRequest] [testConditionSearchWithIndexedTemplate] [types removal] Specifying types in a watcher search request is deprecated.
  1> [2019-03-27T10:01:24,963][DEBUG][o.e.x.w.WatcherIndexingListener] [node_sd4] watch [_name] should not be triggered
  2> NOTE: All tests run in this JVM: [WatchAckTests, YearlyScheduleTests, JiraAccountTests, TextTemplateTests, BasicWatcherTests]
  1> [2019-03-27T10:01:24,974][DEBUG][o.e.x.w.WatcherIndexingListener] [node_sd5] adding watch [_name] to trigger service
  1> [2019-03-27T10:01:24,974][DEBUG][o.e.x.w.t.ScheduleTriggerEngineMock] [node_sd5] adding watch [_name]
  1> [2019-03-27T10:01:24,984][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] created watch [_name] at [2019-03-27T13:01:24.984Z]
  1> [2019-03-27T10:01:25,002][DEBUG][o.e.x.w.t.ScheduleTriggerEngineMock] [testConditionSearchWithIndexedTemplate] firing watch [_name] at [2019-03-27T13:01:25.914Z]
  1> [2019-03-27T10:01:25,004][DEBUG][o.e.x.w.e.ExecutionService] [testConditionSearchWithIndexedTemplate] watcher execution service paused, not processing [1] events
  1> [2019-03-27T10:01:35,109][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] Found [0] records for watch [_name]
  1> [2019-03-27T10:01:35,109][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] [#testConditionSearchWithIndexedTemplate]: clearing watcher state
  1> [2019-03-27T10:01:35,110][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] waiting to stop watcher, current states [Tuple [v1=node_sd3, v2=STARTED], Tuple [v1=node_sm2, v2=STARTED], Tuple [v1=node_sm1, v2=STARTED], Tuple [v1=node_sm0, v2=STARTED], Tuple [v1=node_sd5, v2=STARTED], Tuple [v1=node_sd4, v2=STARTED]]
  1> [2019-03-27T10:01:35,116][INFO ][o.e.x.w.WatcherService   ] [node_sm1] stopping watch service, reason [watcher manually marked to shutdown by cluster state update]
  1> [2019-03-27T10:01:35,117][INFO ][o.e.x.w.WatcherService   ] [node_sm0] stopping watch service, reason [watcher manually marked to shutdown by cluster state update]
  1> [2019-03-27T10:01:35,119][INFO ][o.e.x.w.WatcherService   ] [node_sd4] stopping watch service, reason [watcher manually marked to shutdown by cluster state update]
  1> [2019-03-27T10:01:35,119][INFO ][o.e.x.w.WatcherService   ] [node_sd3] stopping watch service, reason [watcher manually marked to shutdown by cluster state update]
  1> [2019-03-27T10:01:35,119][INFO ][o.e.x.w.WatcherService   ] [node_sd5] stopping watch service, reason [watcher manually marked to shutdown by cluster state update]
  1> [2019-03-27T10:01:35,121][INFO ][o.e.x.w.WatcherService   ] [node_sm2] stopping watch service, reason [watcher manually marked to shutdown by cluster state update]
  1> [2019-03-27T10:01:35,124][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] waiting to stop watcher, current states [Tuple [v1=node_sd3, v2=STOPPED], Tuple [v1=node_sm2, v2=STOPPED], Tuple [v1=node_sm1, v2=STOPPED], Tuple [v1=node_sm0, v2=STOPPED], Tuple [v1=node_sd5, v2=STOPPED], Tuple [v1=node_sd4, v2=STOPPED]]
  1> [2019-03-27T10:01:35,124][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] [BasicWatcherTests#testConditionSearchWithIndexedTemplate]: cleaning up after test
  1> [2019-03-27T10:01:35,124][INFO ][o.e.t.InternalTestCluster] [testConditionSearchWithIndexedTemplate] Clearing active scheme time frozen, expected healing time 0s
  1> [2019-03-27T10:01:35,148][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_sm2] [murwfypk/MDpXU-3TRmuH2S8doaI00g] deleting index
  1> [2019-03-27T10:01:35,148][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_sm2] [frabdmxca/ks9SYzn5RGafbPcbzp2ccg] deleting index
  1> [2019-03-27T10:01:35,148][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_sm2] [.watcher-history-9-2019.03.27/N21v22jHSy6h1-sJGbReyg] deleting index
  1> [2019-03-27T10:01:35,148][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_sm2] [events/P5a_l-A2R1eNHAXV92bjjA] deleting index
  1> [2019-03-27T10:01:35,181][INFO ][o.e.c.m.MetaDataIndexTemplateService] [node_sm2] removing template [random_index_template]
  1> [2019-03-27T10:01:35,192][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] [BasicWatcherTests#testConditionSearchWithIndexedTemplate]: cleaned up after test
  1> [2019-03-27T10:01:35,192][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithIndexedTemplate] after test
FAILURE 10.8s J7 | BasicWatcherTests.testConditionSearchWithIndexedTemplate <<< FAILURES!
   > Throwable #1: java.lang.AssertionError: 
   > Expected: a value equal to or greater than <1L>

Looking at the two traces, it seems that the watcher reload occurs BEFORE the watch is created. So, I am not sure it is the same suspected cause as before (reloading causing the watches to be removed).

Similar failure:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+7.x+intake/746/console

./gradlew :x-pack:plugin:watcher:unitTest -Dtests.seed=27A4E6E50AAC7BBF -Dtests.class=org.elasticsearch.xpack.watcher.test.integration.BasicWatcherTests -Dtests.method="testConditionSearchWithIndexedTemplate" -Dtests.security.manager=true -Dtests.locale=is-IS -Dtests.timezone=America/Santarem -Dcompiler.java=12 -Druntime.java=8

  1> [2019-03-27T05:12:08,662][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithSource] before test
  1> [2019-03-27T05:12:08,662][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithSource] [BasicWatcherTests#testConditionSearchWithSource]: setting up test
  1> [2019-03-27T05:12:08,669][INFO ][o.e.c.m.MetaDataIndexTemplateService] [node_s0] adding template [random_index_template] for index patterns [*]
  1> [2019-03-27T05:12:08,678][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithSource] [BasicWatcherTests#testConditionSearchWithSource]: all set up test
  1> [2019-03-27T05:12:08,679][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithSource] [BasicWatcherTests#testConditionSearchWithSource]: freezing time on nodes
  1> [2019-03-27T05:12:08,680][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithSource] waiting to stop watcher, current states [Tuple [v1=node_s1, v2=STOPPED], Tuple [v1=node_s2, v2=STOPPED], Tuple [v1=node_s0, v2=STOPPED]]
  1> [2019-03-27T05:12:08,686][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_s0] [.watchesauonl] creating index, cause [Index to test aliases with .watches index], templates [.watches, random_index_template], shards [1]/[0], mappings [_doc]
  1> [2019-03-27T05:12:08,687][INFO ][o.e.c.r.a.AllocationService] [node_s0] updating number_of_replicas to [1] for indices [.watchesauonl]
  1> [2019-03-27T05:12:08,776][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_s0] [vrurzbbrej] creating index, cause [api], templates [random_index_template], shards [1]/[1], mappings [_doc]
  1> [2019-03-27T05:12:08,874][INFO ][o.e.c.r.a.AllocationService] [node_s0] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[vrurzbbrej][0]] ...]).
  1> [2019-03-27T05:12:08,888][DEBUG][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithSource] indices [vrurzbbrej] are green
  1> [2019-03-27T05:12:08,890][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s0] [.watchesauonl/QcgNNCikRqeU9toD6E6MvQ] deleting index
  1> [2019-03-27T05:12:08,922][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithSource] set alias for .watches index to [vrurzbbrej]
  1> [2019-03-27T05:12:08,927][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_s0] [.triggered_watches] creating index, cause [api], templates [.triggered_watches, random_index_template], shards [1]/[1], mappings [_doc]
  1> [2019-03-27T05:12:09,006][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_s0] [.watcher-history-9-2019.03.27] creating index, cause [api], templates [.watch-history-9, random_index_template], shards [1]/[0], mappings [_doc]
  1> [2019-03-27T05:12:09,007][INFO ][o.e.c.r.a.AllocationService] [node_s0] updating number_of_replicas to [1] for indices [.watcher-history-9-2019.03.27]
  1> [2019-03-27T05:12:09,095][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithSource] creating watch history index [.watcher-history-9-2019.03.27]
  1> [2019-03-27T05:12:09,198][INFO ][o.e.c.r.a.AllocationService] [node_s0] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.watcher-history-9-2019.03.27][0]] ...]).
  1> [2019-03-27T05:12:09,212][DEBUG][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithSource] indices [.watcher-history-9-2019.03.27, vrurzbbrej, .triggered_watches] are green
  1> [2019-03-27T05:12:09,213][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithSource] waiting to start watcher, current states [Tuple [v1=node_s1, v2=STOPPED], Tuple [v1=node_s2, v2=STOPPED], Tuple [v1=node_s0, v2=STOPPED]]
  1> [2019-03-27T05:12:09,228][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithSource] waiting to start watcher, current states [Tuple [v1=node_s1, v2=STARTING], Tuple [v1=node_s2, v2=STARTING], Tuple [v1=node_s0, v2=STARTED]]
  1> [2019-03-27T05:12:09,229][DEBUG][o.e.x.w.WatcherService   ] [node_s1] watch service has been reloaded, reason [starting]
  1> [2019-03-27T05:12:09,230][DEBUG][o.e.x.w.WatcherService   ] [node_s2] watch service has been reloaded, reason [starting]
  1> [2019-03-27T05:12:09,233][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithSource] waiting to start watcher, current states [Tuple [v1=node_s1, v2=STARTED], Tuple [v1=node_s2, v2=STARTED], Tuple [v1=node_s0, v2=STARTED]]
  1> [2019-03-27T05:12:09,237][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_s0] [events] creating index, cause [api], templates [random_index_template], shards [7]/[1], mappings [event]
  1> [2019-03-27T05:12:09,499][DEBUG][o.e.x.w.WatcherIndexingListener] [node_s2] watch [_name] should not be triggered
  1> [2019-03-27T05:12:09,539][DEBUG][o.e.x.w.WatcherIndexingListener] [node_s1] adding watch [_name] to trigger service
  1> [2019-03-27T05:12:09,539][DEBUG][o.e.x.w.t.ScheduleTriggerEngineMock] [node_s1] adding watch [_name]
  1> [2019-03-27T05:12:09,592][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithSource] created watch [_name] at [2019-03-27T09:12:09.592Z]
  1> [2019-03-27T05:12:09,622][INFO ][o.e.c.r.a.AllocationService] [node_s0] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[events][1]] ...]).
  1> [2019-03-27T05:12:09,651][DEBUG][o.e.x.w.t.ScheduleTriggerEngineMock] [testConditionSearchWithSource] firing watch [_name] at [2019-03-27T09:12:10.233Z]
  1> [2019-03-27T05:12:09,654][DEBUG][o.e.x.w.e.ExecutionService] [testConditionSearchWithSource] watcher execution service paused, not processing [1] events
  1> [2019-03-27T05:12:19,785][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithSource] Found [0] records for watch [_name]
  1> [2019-03-27T05:12:19,785][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithSource] [#testConditionSearchWithSource]: clearing watcher state
  1> [2019-03-27T05:12:19,786][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithSource] waiting to stop watcher, current states [Tuple [v1=node_s1, v2=STARTED], Tuple [v1=node_s2, v2=STARTED], Tuple [v1=node_s0, v2=STARTED]]
  1> [2019-03-27T05:12:19,791][INFO ][o.e.x.w.WatcherService   ] [node_s2] stopping watch service, reason [watcher manually marked to shutdown by cluster state update]
  1> [2019-03-27T05:12:19,791][INFO ][o.e.x.w.WatcherService   ] [node_s1] stopping watch service, reason [watcher manually marked to shutdown by cluster state update]
  1> [2019-03-27T05:12:19,793][INFO ][o.e.x.w.WatcherService   ] [node_s0] stopping watch service, reason [watcher manually marked to shutdown by cluster state update]
  1> [2019-03-27T05:12:19,796][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithSource] waiting to stop watcher, current states [Tuple [v1=node_s1, v2=STOPPED], Tuple [v1=node_s2, v2=STOPPED], Tuple [v1=node_s0, v2=STOPPED]]
  1> [2019-03-27T05:12:19,796][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithSource] [BasicWatcherTests#testConditionSearchWithSource]: cleaning up after test
  1> [2019-03-27T05:12:19,796][INFO ][o.e.t.InternalTestCluster] [testConditionSearchWithSource] Clearing active scheme time frozen, expected healing time 0s
  1> [2019-03-27T05:12:19,821][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s0] [events/UURKiggsQcm349t1qHMpNQ] deleting index
  1> [2019-03-27T05:12:19,821][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s0] [.triggered_watches/8dqLO65eQASBHboBUPB3yQ] deleting index
  1> [2019-03-27T05:12:19,821][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s0] [.watcher-history-9-2019.03.27/YxxObJ3OThmY9cGQSSKF3w] deleting index
  1> [2019-03-27T05:12:19,821][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s0] [vrurzbbrej/7zLSzJHzQryFj0_XE14I5g] deleting index
  1> [2019-03-27T05:12:19,885][INFO ][o.e.c.m.MetaDataIndexTemplateService] [node_s0] removing template [random_index_template]
  1> [2019-03-27T05:12:19,892][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithSource] [BasicWatcherTests#testConditionSearchWithSource]: cleaned up after test
  1> [2019-03-27T05:12:19,892][INFO ][o.e.x.w.t.i.BasicWatcherTests] [testConditionSearchWithSource] after test
FAILURE 11.3s J7 | BasicWatcherTests.testConditionSearchWithSource <<< FAILURES!
   > Throwable #1: java.lang.AssertionError:

benwtrent · 2019-03-27T13:31:32Z

Another failure around the same thing:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+intake/2765/console

FAILURE 13.1s J6 | WebhookIntegrationTests.testWebhookWithBasicAuth <<< FAILURES!
   > Throwable #1: java.lang.AssertionError: could not find executed watch record for watch _id
   > Expected: a value equal to or greater than <1L>
   >      but: <0L> was less than <1L>
   > 	at __randomizedtesting.SeedInfo.seed([552D2F2E1DEA46BD:BE10AAB71FC19159]:0)
   > 	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
   > 	at

Muting these BasicWatcherTests and WebhookIntegrationTests as many of them appear to be failing across master and 7.x

* Muting watcher tests for issue #35503 * blocking the two troublsome suites

Due to #35503

droberts195 · 2019-04-01T13:33:55Z

Another test suffering from the same could not find executed watch record for watch _id error is WebhookHttpsIntegrationTests.testHttps.

For example:

I muted this test too:

master: dbbad81
7.x: 54b0ff8
7.0: 6b7a607
6.7: 3e5fe29

It didn't reproduce locally for me with:

./gradlew :x-pack:plugin:watcher:unitTest \
  -Dtests.seed=3AFF7F9BF0A9BF85 \
  -Dtests.class=org.elasticsearch.xpack.watcher.actions.webhook.WebhookHttpsIntegrationTests \
  -Dtests.method="testHttps" \
  -Dtests.security.manager=true \
  -Dtests.locale=lv-LV \
  -Dtests.timezone=Europe/Monaco \
  -Dcompiler.java=12 \
  -Druntime.java=8

gwbrown · 2019-04-09T14:49:50Z

This should be fixed in master by #40658, if anyone runs across this failure let me know. I'll wait for the backport before closing this.

gwbrown · 2019-04-12T22:55:48Z

#40658 has been merged to all currently maintained branches after a week of running in master with no failures in this test, so I'm closing this issue.

danielmitterdorfer · 2019-05-06T08:28:44Z

ActionErrorIntegrationTests.testErrorInAction which has been tracked in #39306 has failed again but #39306 (comment) suggests to reopen this issue instead if additional test failures occur.

Test failure link
Test failure: java.lang.AssertionError: watch was triggered on [0] schedulers, expected [1] (same as was mentioned in Improve Watcher test framework resiliency #40658 (comment)).
Full test log
Reproduction line (does not reproduce locally (I tried 100 iterations)):

./gradlew :x-pack:plugin:watcher:test --tests "org.elasticsearch.xpack.watcher.actions.ActionErrorIntegrationTests.testErrorInAction" \
  -Dtests.seed=930476443B322E24 \
  -Dtests.security.manager=true \
  -Dtests.locale=pt-MZ \
  -Dtests.timezone=America/Tegucigalpa \
  -Dcompiler.java=12 \
  -Druntime.java=12

Note: Later in the logs we see the output along the lines of non-failed nodes do not form a quorum. As far as I understand it this is related to nodes being shutdown as part of the suite's teardown process (i.e. after the test failure has occured) so I think that is unrelated.

Due to elastic#35503

jakelandis · 2019-10-08T15:46:48Z

Closing this issue as I can not find any actionable items here. None of the tests referenced here are currently disabled and (outside a very small handful of PRs) could not find any evidence of failures in the past 90 days. The small handful of PRs appear to have larger problems and include many other test failures.

If any of the referenced tests fail please log a new issue.

jaymode added >test-failure Triaged test failures from CI :Data Management/Watcher labels Nov 13, 2018

jaymode mentioned this issue Nov 13, 2018

[CI] WatchAckTests testAckAllActions failure in CI #35506

Closed

jakelandis self-assigned this Nov 15, 2018

gwbrown self-assigned this Feb 6, 2019

This was referenced Mar 5, 2019

TransformIntegrationTests#testSearchTransform can fail with an assertion error #37882

Closed

Stop clearing all watches in the mock trigger engine. #39724

Merged

jtibshirani closed this as completed Mar 26, 2019

jtibshirani mentioned this issue Mar 26, 2019

BasicWatcherTests.testConditionSearchWithIndexedTemplate failure in CI #39306

Closed

benwtrent reopened this Mar 27, 2019

jpountz mentioned this issue Mar 27, 2019

[CI] BasicWatcherTests failures #40524

Closed

benwtrent added a commit to benwtrent/elasticsearch that referenced this issue Mar 27, 2019

Muting watcher tests for issue elastic#35503

85b90bd

benwtrent mentioned this issue Mar 27, 2019

Muting watcher tests for issue #35503 #40526

Merged

benwtrent added a commit to benwtrent/elasticsearch that referenced this issue Mar 27, 2019

Muting watcher tests for issue elastic#35503

bcfb9db

benwtrent mentioned this issue Mar 27, 2019

Muting watcher tests for issue #35503 #40527

Merged

benwtrent added a commit that referenced this issue Mar 27, 2019

Muting watcher tests for issue #35503 (#40527)

22e688a

* Muting watcher tests for issue #35503 * blocking the two troublsome suites

benwtrent added a commit that referenced this issue Mar 27, 2019

Muting watcher tests for issue #35503 (#40526)

8df07d6

* Muting watcher tests for issue #35503 * blocking the two troublsome suites

jpountz mentioned this issue Mar 27, 2019

[CI] WebhookIntegrationTests failures #40522

Closed

jpountz pushed a commit that referenced this issue Mar 27, 2019

Muting watcher tests for issue #35503 (#40527)

43892ce

* Muting watcher tests for issue #35503 * blocking the two troublsome suites

benwtrent mentioned this issue Mar 27, 2019

muting watcher tests for issue #35503 #40558

Merged

benwtrent added a commit that referenced this issue Mar 27, 2019

muting watcher tests for issue #35503 (#40558)

1a0a696

droberts195 added a commit that referenced this issue Apr 1, 2019

[TEST] Mute WebhookHttpsIntegrationTests.testHttps

54b0ff8

Due to #35503

droberts195 added a commit that referenced this issue Apr 1, 2019

[TEST] Mute WebhookHttpsIntegrationTests.testHttps

dbbad81

Due to #35503

droberts195 added a commit that referenced this issue Apr 1, 2019

[TEST] Mute WebhookHttpsIntegrationTests.testHttps

6b7a607

Due to #35503

droberts195 added a commit that referenced this issue Apr 1, 2019

[TEST] Mute WebhookHttpsIntegrationTests.testHttps

3e5fe29

Due to #35503

gwbrown mentioned this issue Apr 2, 2019

Improve Watcher test framework resiliency #40658

Merged

gwbrown closed this as completed Apr 12, 2019

danielmitterdorfer reopened this May 6, 2019

jakelandis mentioned this issue May 22, 2019

Meta: Fix Watcher Test Failures #42409

Closed

gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this issue May 27, 2019

[TEST] Mute WebhookHttpsIntegrationTests.testHttps

4e2a768

Due to elastic#35503

jakelandis closed this as completed Oct 8, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[CI] BasicWatcherTests.testModifyWatches fails in CI #35503

[CI] BasicWatcherTests.testModifyWatches fails in CI #35503

jaymode commented Nov 13, 2018

elasticmachine commented Nov 13, 2018

gwbrown commented Feb 19, 2019 •

edited

Loading

markharwood commented Feb 22, 2019

jtibshirani commented Mar 5, 2019

jtibshirani commented Mar 26, 2019

benwtrent commented Mar 27, 2019 •

edited

Loading

benwtrent commented Mar 27, 2019 •

edited

Loading

droberts195 commented Apr 1, 2019

gwbrown commented Apr 9, 2019

gwbrown commented Apr 12, 2019

danielmitterdorfer commented May 6, 2019

jakelandis commented Oct 8, 2019

[CI] BasicWatcherTests.testModifyWatches fails in CI #35503

[CI] BasicWatcherTests.testModifyWatches fails in CI #35503

Comments

jaymode commented Nov 13, 2018

elasticmachine commented Nov 13, 2018

gwbrown commented Feb 19, 2019 • edited Loading

markharwood commented Feb 22, 2019

jtibshirani commented Mar 5, 2019

jtibshirani commented Mar 26, 2019

benwtrent commented Mar 27, 2019 • edited Loading

benwtrent commented Mar 27, 2019 • edited Loading

droberts195 commented Apr 1, 2019

gwbrown commented Apr 9, 2019

gwbrown commented Apr 12, 2019

danielmitterdorfer commented May 6, 2019

jakelandis commented Oct 8, 2019

gwbrown commented Feb 19, 2019 •

edited

Loading

benwtrent commented Mar 27, 2019 •

edited

Loading

benwtrent commented Mar 27, 2019 •

edited

Loading