[CI] SmokeTestWatcherWithSecurityIT.testSearchInputWithInsufficientPrivileges Failure #29893
Comments
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.3+multijob-unix-compatibility/os=debian/18/console looks like another instance of this.
This commit unmutes the org.elasticsearch.smoketest.SmokeTestWatcherWithSecurityIT test suite, fixes a bug [1] that was introduced while the test was muted, adds some additional debug logging, and enables debug logging for the ES instance used in this Watcher test.

The bug fixed here is minor and unlikely to occur. It requires ES to be started with ILM disabled and Watcher enabled, and Watcher to be explicitly stopped and restarted. Due to validation, Watcher does not fully start and can end up in a partially started state. This is an unlikely scenario outside of the testing framework.

Optimistically closing the following: Fixes elastic#35361 Fixes elastic#30777 Fixes elastic#33291 Fixes elastic#29893

If this does not fully fix the issue, there will now be better debug logging.
There are likely multiple root causes for the seemingly random failures generated by SmokeTestWatcherWithSecurityIT. This commit un-mutes this test, addresses one known cause, and adds debug logging for this test.

The known root cause for one failure is that a Watch can be running that reads data from an index. Before we stop Watcher, we delete that index. If Watcher happens to execute after the index is deleted but before Watcher is stopped, the test can fail. The fix here is simply to move the index deletion after the stop of Watcher.

Related elastic#35361 Related elastic#30777 Related elastic#33291 Related elastic#29893
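The ordering problem described in this commit message can be modelled in a few lines. The following is only a minimal, deterministic sketch: `FakeCluster`, `triggerWatch`, `racyTeardown`, `safeTeardown` and the index name `my_test_index` are all hypothetical names invented for illustration, not code from the Elasticsearch test suite. It shows why a watch that fires between the index deletion and the Watcher stop can fail, and why stopping Watcher first closes that window.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TeardownOrderSketch {

    /** Hypothetical stand-in for a cluster: a set of index names plus a Watcher on/off flag. */
    public static final class FakeCluster {
        public final Set<String> indices = new HashSet<>();
        public final List<String> failures = new ArrayList<>();
        public boolean watcherRunning = true;

        /** Simulates one scheduled watch execution that searches the given index. */
        public void triggerWatch(String index) {
            if (!watcherRunning) {
                return; // a stopped Watcher executes nothing
            }
            if (!indices.contains(index)) {
                failures.add("index_not_found: " + index);
            }
        }
    }

    /** Old order: delete the index, then stop Watcher. A watch can fire in between. */
    public static List<String> racyTeardown(FakeCluster cluster, String index) {
        cluster.indices.remove(index);
        cluster.triggerWatch(index);      // the unlucky interleaving
        cluster.watcherRunning = false;
        return cluster.failures;
    }

    /** New order: stop Watcher first, then delete the index. */
    public static List<String> safeTeardown(FakeCluster cluster, String index) {
        cluster.watcherRunning = false;
        cluster.triggerWatch(index);      // same trigger, now a no-op
        cluster.indices.remove(index);
        return cluster.failures;
    }

    public static void main(String[] args) {
        FakeCluster racy = new FakeCluster();
        racy.indices.add("my_test_index");
        System.out.println("racy failures: " + racyTeardown(racy, "my_test_index"));

        FakeCluster safe = new FakeCluster();
        safe.indices.add("my_test_index");
        System.out.println("safe failures: " + safeTeardown(safe, "my_test_index"));
    }
}
```

In the real test the trigger is asynchronous, so the bad interleaving only happens occasionally, which matches the "seemingly random" failures mentioned above.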
Un-muted this test on PR #42409 to obtain additional logs. If (when?) this test fails again please obtain the following information before muting the test:
Copy of the relevant failure:
Expected: is "execution_not_needed" but: was "not_executed_already_queued"
Stacktrace: java.lang.AssertionError
Standard Output: [2019-05-31T07:04:45,233][INFO ][o.e.s.SmokeTestWatcherWithSecurityIT] [testSearchInputWithInsufficientPrivileges] before test
Copy of the reproduce line: (does not reproduce)
The Jenkins build link:
The Gradle scan link: https://gradle.com/s/kgzehspj7rn22
The relevant cluster logs from "Google Cloud Storage Upload Report" (link found in Jenkins build): Link to full logs: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+intake/3818/gcsObjects
I have a fix for this on #42764. Note: the "fix" does not guarantee this won't happen again; it just greatly reduces the odds by triggering fewer watches (to prevent concurrent watches) and adding a busy wait for the expected state.
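For readers unfamiliar with the "busy wait for expected state" idea, here is a minimal standalone sketch. It is not the code from #42764; `BusyWaitSketch` and `awaitState` are hypothetical names, and the timeout and poll interval are arbitrary. It polls a state supplier until it returns the expected value or a deadline passes, instead of asserting on the first value observed.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class BusyWaitSketch {

    /**
     * Polls {@code current} until it equals {@code expected} or the timeout elapses.
     * Returns true if the expected state was observed, false on timeout or interrupt.
     */
    public static <T> boolean awaitState(Supplier<T> current, T expected,
                                         long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (expected.equals(current.get())) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up: state never matched within the timeout
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Simulated watch state: reports "already queued" twice, then settles.
        AtomicInteger polls = new AtomicInteger();
        Supplier<String> state = () -> polls.incrementAndGet() < 3
                ? "not_executed_already_queued"
                : "execution_not_needed";

        boolean reached = awaitState(state, "execution_not_needed", 1000, 10);
        System.out.println("reached expected state: " + reached);
    }
}
```

As the comment above says, this kind of wait only lowers the odds of a spurious failure: if the state never settles within the timeout, the check still fails.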
Original comment by @martijnvg:
A different failure than in LINK REDACTED.
I was unable to reproduce this failure locally.
LINK REDACTED
Build url: LINK REDACTED