[CI] XPackRestIT test {p0=ml/validate/Test job config that is invalid only because of the job ID} failing #79635

Closed
fcofdez opened this issue Oct 21, 2021 · 2 comments
Labels
:ml Machine learning Team:ML Meta label for the ML team >test-failure Triaged test failures from CI

Comments

@fcofdez
Contributor

fcofdez commented Oct 21, 2021

Build scan:
https://gradle-enterprise.elastic.co/s/f56oj4ypkuqyo/tests/:x-pack:plugin:yamlRestTestV7CompatTest/org.elasticsearch.xpack.test.rest.XPackRestIT/test%20%7Bp0=ml%2Fvalidate%2FTest%20job%20config%20that%20is%20invalid%20only%20because%20of%20the%20job%20ID%7D

Reproduction line:
./gradlew ':x-pack:plugin:yamlRestTestV7CompatTest' --tests "org.elasticsearch.xpack.test.rest.XPackRestIT.test {p0=ml/validate/Test job config that is invalid only because of the job ID}" -Dtests.seed=92F388027EF5C9C0 -Dtests.locale=es-GT -Dtests.timezone=Asia/Tbilisi -Druntime.java=11

Applicable branches:
master

Reproduces locally?:
Didn't try

Failure history:
https://gradle-enterprise.elastic.co/scans/tests?tests.container=org.elasticsearch.xpack.test.rest.XPackRestIT&tests.test=test%20%7Bp0%3Dml/validate/Test%20job%20config%20that%20is%20invalid%20only%20because%20of%20the%20job%20ID%7D

Failure excerpt:

java.lang.Exception: Test abandoned because suite timeout was reached.

  at __randomizedtesting.SeedInfo.seed([92F388027EF5C9C0]:0)

@fcofdez fcofdez added Team:ML Meta label for the ML team :ml Machine learning >test-failure Triaged test failures from CI labels Oct 21, 2021
@elasticmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@droberts195
Contributor

The ML test only took 1.2 seconds, so it was not responsible for the timeout. It just happened to be the one running when the suite timed out.

Looking at the test timings, the tests that took a long time were the snapshot/restore tests.

But also, this is a Darwin CI run, where we're running a workload designed for machines with 128GB of RAM on machines with 32GB of RAM, so it's not surprising it fails most of the time. This problem is covered by #58286.
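
For anyone triaging a similar suite timeout, here is a minimal sketch of the reasoning above: rank tests by duration instead of blaming whichever test was running when the timeout hit. The `TestTiming` type, the helper names, and all durations other than the 1.2 seconds quoted above are hypothetical placeholders, not values from the build scan.

```python
from typing import List, NamedTuple


class TestTiming(NamedTuple):
    """One test's name and wall-clock duration (hypothetical structure)."""
    name: str
    seconds: float


def slowest_tests(timings: List[TestTiming], top_n: int = 3) -> List[TestTiming]:
    """Return the top_n longest-running tests, slowest first."""
    return sorted(timings, key=lambda t: t.seconds, reverse=True)[:top_n]


def share_of_suite(timing: TestTiming, timings: List[TestTiming]) -> float:
    """Fraction of the total recorded suite time spent in one test."""
    total = sum(t.seconds for t in timings)
    return timing.seconds / total if total else 0.0


# Only the 1.2 s figure comes from the issue; the other entries are made up
# purely to illustrate the comparison.
timings = [
    TestTiming("ml/validate/Test job config that is invalid only because of the job ID", 1.2),
    TestTiming("snapshot/restore example A (hypothetical)", 600.0),
    TestTiming("snapshot/restore example B (hypothetical)", 450.0),
]

reported = timings[0]
for t in slowest_tests(timings):
    print(f"{t.seconds:8.1f}s  {t.name}")
print(f"reported test share of suite time: {share_of_suite(reported, timings):.2%}")
```

With real timings exported from the build scan, the reported test would only be a plausible culprit if it appeared near the top of this ranking.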


3 participants