[Security Solution][Entity Analytics] WIP: determining cypress test flake #169714
base: main
Seeing if this is a timing issue, or whether data from another test is to blame.
This should reduce the time and noise in the flaky test runner. With no other tests running, these should definitely pass if data from another test is the cause.
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/3712 was all green, but on closer inspection of @jpdjere's flaky run, it looks like the tests only failed legitimately 2/150 times. I'm going to run this one more time (well, 150 more times) to see if I can't reproduce the failure in isolation like this: follow along here
Previous test run succeeded (with one random failure unrelated to the above issue). HOWEVER, taking an even closer look at @jpdjere's flaky run, it appears that the failing test there is NOT the one that had been skipped 🤷‍♂️. I think this invalidates the above run. I'm going to run both tests in this file and see how the 150 runs behave: https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/3734
No (legit/expected) failures on the isolated EA FTR run; running again with all risk engine cypress tests to see if we can't get a failure: https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/3744
Just to be safe, do it before every test, too.
We had another 2/150 legit failures on the "run all EA cypress tests" build. I'm now adding some data guards to the failing tests and rerunning them. If these pass, it will confirm that data from other tests is causing the issue. At that point, we'll either just keep the guards (good) or try to track down the contaminating tests (better).
The above tests did not fail, which is a good sign. Since the failure rate is so low (1/75), though, I'm running them another 200 times to try to surface an error: https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/3768
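As a rough sanity check on the run count, here is a minimal sketch of how likely a flake at roughly the observed rate is to show up at least once in a given number of runs. It assumes each run fails independently with the same probability, which may not hold if the failures depend on shared data; the function name is hypothetical.

```typescript
// Probability of at least one failure in `runs` runs, ASSUMING every run
// fails independently with the same per-run probability `p`.
function probabilityOfAtLeastOneFailure(p: number, runs: number): number {
  return 1 - Math.pow(1 - p, runs);
}

// 1/75 is the failure rate quoted in the comment above (2 failures in 150 runs).
const observedRate = 1 / 75;

console.log(probabilityOfAtLeastOneFailure(observedRate, 150).toFixed(3)); // ≈ 0.866
console.log(probabilityOfAtLeastOneFailure(observedRate, 200).toFixed(3)); // ≈ 0.932
```

Under that assumption, a clean 200-run build is decent but not conclusive evidence: a flake at this rate would still stay hidden in roughly 7% of such builds.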
Conflicts: x-pack/test/security_solution_cypress/cypress/e2e/entity_analytics/enrichments.cy.ts x-pack/test/security_solution_cypress/package.json
Tests failed above, so we're not quite there. It occurred to me in the interim, however, that the behavior we're seeing may not just be due to old risk scores, but also due to alerts containing risk enrichments. Based on that theory, I'm going to try another run that additionally deletes alerts. If those pass, I'll probably keep the potentially unnecessary data guards from before as "just in case" test setup. New run: https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/3952
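For readers following along, this is roughly the shape of a data guard that clears both risk scores and alerts before each test. It is a hypothetical sketch, not the code in this PR: the helper name, the index patterns, and the `ELASTICSEARCH_URL` env variable are all assumptions. It only builds the request options for Elasticsearch's `_delete_by_query` API, so it can be checked without a running Cypress instance.

```typescript
// HYPOTHETICAL sketch of a "data guard"; names and index patterns below are
// assumptions for illustration and are not taken from this PR.
interface CleanupRequest {
  method: 'POST';
  url: string;
  body: { query: { match_all: Record<string, unknown> } };
  failOnStatusCode: boolean;
}

// Assumed index patterns for risk score data and security alerts.
const GUARDED_INDICES = ['risk-score.risk-score-*', '.alerts-security.alerts-*'];

// Builds one delete-by-query request per index pattern.
function buildCleanupRequests(esUrl: string, indices: string[]): CleanupRequest[] {
  return indices.map(
    (index): CleanupRequest => ({
      method: 'POST',
      // conflicts=proceed: keep going on version conflicts;
      // refresh=true: make the deletions visible to the next search;
      // ignore_unavailable=true: tolerate indices that do not exist yet.
      url: `${esUrl}/${index}/_delete_by_query?conflicts=proceed&refresh=true&ignore_unavailable=true`,
      body: { query: { match_all: {} } },
      // Do not fail test setup on a non-2xx response from the cleanup call.
      failOnStatusCode: false,
    })
  );
}

console.log(buildCleanupRequests('http://localhost:9200', GUARDED_INDICES).map((r) => r.url));
```

In a spec file, something like `beforeEach(() => buildCleanupRequests(Cypress.env('ELASTICSEARCH_URL'), GUARDED_INDICES).forEach((req) => cy.request(req)))` would run the guard before every test, assuming that env variable is set in the Cypress config.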
It was removed in elastic#170636, and appears not to have been replaced.
💔 Build Failed
Tests continue to fail, seemingly due to the presence of "old" risk score data on alerts. However, even after deleting all alerts AND all risk score data before each test, they continue to fail. I'm stumped as to what's going on here, so I'm going to have to rope in @nkhristinin, as the original author, for help.
Relates to #169154.