Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Fix upgrade test assertion #95748

Merged
merged 3 commits into from
May 3, 2023
Merged

Conversation

davidkyle
Copy link
Member

The failure described in #95501 is reproducible when run in isolation and is likely to fail depending where the deployment is running after before the last node is upgraded.

The failure was in an assertion that the trained model metadata has been renamed trained_model_allocation -> trained_model_assignment but this can only happen if there is a cluster change event after all nodes have been upgraded. Other activity in test suite usually ensures that this happens but after #95360 was muted the test started to fail. The fix is to relax the assertion.

Closes #95501

@davidkyle davidkyle added >test Issues or PRs that are addressing/adding tests :ml Machine learning auto-backport-and-merge v8.8.0 v8.9.0 labels May 2, 2023
@elasticsearchmachine elasticsearchmachine added the Team:ML Meta label for the ML team label May 2, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/ml-core (Team:ML)

Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but it would be good to include the explanation from the PR description about why the test is asserting what it's now asserting in a comment in the test itself. The code in the test no longer matches the name of the test method, so it would be good to explain why within the code itself rather than forcing a future reader to look up this PR to find out why.

@davidkyle
Copy link
Member Author

@elasticmachine update branch

@davidkyle davidkyle merged commit dde2506 into elastic:main May 3, 2023
@davidkyle davidkyle deleted the upgrade-meta branch May 3, 2023 12:01
davidkyle added a commit to davidkyle/elasticsearch that referenced this pull request May 3, 2023
)

The failure was in an assertion that the trained model metadata
has been renamed trained_model_allocation -> 
trained_model_assignment but this can only happen if there is 
a cluster change event after all nodes have been upgraded.
Relaxes the assertion to account for this case.
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.8

elasticsearchmachine pushed a commit that referenced this pull request May 3, 2023
…95779)

The failure was in an assertion that the trained model metadata
has been renamed trained_model_allocation -> 
trained_model_assignment but this can only happen if there is 
a cluster change event after all nodes have been upgraded.
Relaxes the assertion to account for this case.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:ml Machine learning Team:ML Meta label for the ML team >test Issues or PRs that are addressing/adding tests v8.8.0 v8.9.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CI] MLModelDeploymentsUpgradeIT testTrainedModelDeploymentStopOnMixedCluster failing
4 participants