[ML] Relax assertion to expect at least one trained model #67710

Conversation

dimitris-athanasiou (Contributor)

... for data frame analytics native tests. This fixes assertion
failures seen in #67581.

This commit also enables debug logging for `RegressionIT` and
unmutes `test_stop_and_restart` in order to collect more information
about why progress sometimes gets stuck after `loading_data`.

dimitris-athanasiou added the >test (Issues or PRs that are addressing/adding tests), :ml (Machine learning), v8.0.0, and v7.12.0 labels on Jan 19, 2021.
elasticmachine (Collaborator)

Pinging @elastic/ml-core (:ml)

dimitris-athanasiou (Contributor, Author)

@elasticmachine update branch

benwtrent (Member) left a comment


We should make sure that only one model is persisted. I think your new state machine changes might facilitate that.

przemekwitek (Contributor) left a comment


LGTM

```diff
- assertThat("Hits were: " + Strings.toString(searchResponse.getHits()), searchResponse.getHits().getHits(), arrayWithSize(1));
+ // If the job is stopped during writing_results phase and it is then restarted, there is a chance two trained models
+ // were persisted as there is no way currently for the process to be certain the model was persisted.
+ assertThat("Hits were: " + Strings.toString(searchResponse.getHits()), searchResponse.getHits().getHits().length,
```

[optional] Another way to express this would be:

`...getHits(), is(arrayWithSize(greaterThanOrEqualTo(1))));`
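
To make the comparison concrete, here is a minimal, self-contained sketch of both assertion forms. The tail of the changed line is truncated in the snippet above, so completing it with `greaterThanOrEqualTo(1)` is an inference from this suggestion, and the wrapper class and helper method names are hypothetical:

```java
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.common.Strings;

import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.arrayWithSize;
import static org.hamcrest.Matchers.greaterThanOrEqualTo;
import static org.hamcrest.Matchers.is;

class TrainedModelAssertions {

    // Hypothetical helper: both assertions pass whenever the search returned at least one trained model document.
    static void assertAtLeastOneTrainedModelHit(SearchResponse searchResponse) {
        // Relaxed form from the diff: match on the hit array's length.
        assertThat("Hits were: " + Strings.toString(searchResponse.getHits()),
            searchResponse.getHits().getHits().length, greaterThanOrEqualTo(1));

        // Equivalent form suggested here: match on the hit array itself.
        assertThat("Hits were: " + Strings.toString(searchResponse.getHits()),
            searchResponse.getHits().getHits(), is(arrayWithSize(greaterThanOrEqualTo(1))));
    }
}
```

Both variants fail with the same "Hits were: ..." description when no documents come back; the second form simply keeps the assertion on the array rather than dropping down to its `length`.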

dimitris-athanasiou (Contributor, Author) commented Jan 20, 2021

> We should make sure that only one model is persisted. I think your new state machine changes might facilitate that.

We can ensure that when the job runs without problems. But when we stop and resume, it's not possible to guarantee it unless we do additional work. For example, we could add a model checksum and, when we read a model from the process, check whether we already have a model with that checksum to avoid storing the same model twice. But that is quite an edge case.
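
For illustration only, here is a minimal sketch of that checksum idea; it is not the actual data frame analytics persistence code, and the class name, method names, and the choice of hashing the serialized model definition with SHA-256 are all assumptions:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical de-duplicator: only persist a model read back from the process
// if no model with the same checksum has been stored already.
class TrainedModelDeduplicator {

    private final Set<String> persistedChecksums = new HashSet<>();

    /** Returns true if the model should be persisted, false if it duplicates one we already stored. */
    boolean persistIfNew(String serializedModelDefinition) {
        String checksum = sha256Hex(serializedModelDefinition);
        if (persistedChecksums.contains(checksum)) {
            return false; // the same model was already persisted before the stop/restart
        }
        persistedChecksums.add(checksum);
        // ... index the trained model document here (omitted) ...
        return true;
    }

    private static String sha256Hex(String input) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should always be available", e);
        }
    }
}
```

For this to survive a stop/restart, the checksum set would of course have to be persisted alongside the models rather than held in memory, which is the additional work referred to above.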

However, your comment, @benwtrent, made me realize I shouldn't relax the assertion for all tests, just the stop/restart tests. I'll update the PR.

dimitris-athanasiou merged commit 2468639 into elastic:master on Jan 20, 2021.
dimitris-athanasiou deleted the relax-assertion-to-expect-at-least-one-trained-model branch on January 20, 2021 at 11:52.
dimitris-athanasiou added a commit that referenced this pull request on Jan 20, 2021 (#67758):

Backport of #67710
Labels: :ml Machine learning, >test Issues or PRs that are addressing/adding tests, v7.12.0, v8.0.0-alpha1