[DOC] [ml-commons] Support model auto-reload (fault tolerance) #2654

Closed
1 of 4 tasks
ylwu-amzn opened this issue Feb 4, 2023 · 1 comment · Fixed by #3803
Labels: 2 - In progress, Machine Learning, v2.7.0

Comments

@ylwu-amzn
Contributor

What do you want to do?

  • Request a change to existing documentation
  • Add new documentation
  • Report a technical problem with the documentation
  • Other

Tell us about your request. Provide a summary of the request and all versions that are affected.
The current ml-commons release (2.5) does not support model auto-reload. If a node crashes and then restarts, all models that were deployed on it are gone, and the user has to manually redeploy them to that node. We are going to support model auto-reload in the 2.6 release: when a node restarts, we will check which models were deployed to it and automatically reload them.
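
For documentation purposes, the reload-on-restart idea described above can be sketched as follows. This is only an illustration of the general approach, not the ml-commons implementation: `DeploymentRegistry`, `reload_models_on_restart`, and the `load_model` callback are hypothetical names, and the sketch assumes that a record of which models were deployed to which nodes survives the node crash.

```python
from typing import Callable, Dict, List, Set


class DeploymentRegistry:
    """Hypothetical record of which models were deployed to which nodes.

    Assumption: this record is stored somewhere that survives a node crash.
    Here it is kept in memory only to keep the sketch self-contained.
    """

    def __init__(self) -> None:
        self._models_by_node: Dict[str, Set[str]] = {}

    def record_deploy(self, node_id: str, model_id: str) -> None:
        # Remember that model_id was deployed to node_id.
        self._models_by_node.setdefault(node_id, set()).add(model_id)

    def models_for(self, node_id: str) -> List[str]:
        # Return the models previously deployed to node_id, in a stable order.
        return sorted(self._models_by_node.get(node_id, set()))


def reload_models_on_restart(
    node_id: str,
    registry: DeploymentRegistry,
    load_model: Callable[[str, str], None],
) -> List[str]:
    """On node restart, look up the node's models and load each one again."""
    reloaded: List[str] = []
    for model_id in registry.models_for(node_id):
        load_model(node_id, model_id)  # hypothetical loader callback
        reloaded.append(model_id)
    return reloaded
```

A restarted node would call `reload_models_on_restart` with its own ID. A node with no recorded deployments gets an empty list back and loads nothing.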

What other resources are available? Provide links to related issues, POCs, steps for testing, etc.

@hdhalter added the "1 - Backlog" and "v2.6.0" labels and removed the "untriaged" label on Feb 4, 2023
@hdhalter added this to the v2.6 milestone on Feb 4, 2023
@hdhalter changed the title from "[DOC] [ml-commons] Support model auto-reload" to "[DOC] [ml-commons] Support model auto-reload (fault tolerance)" on Feb 11, 2023
@Naarcha-AWS
Collaborator

Issue moved to 2.7

@Naarcha-AWS changed the milestone from v2.6 to v2.7 on Feb 24, 2023
@Naarcha-AWS added the "v2.7.0" label and removed the "v2.6.0" label on Feb 27, 2023
@Naarcha-AWS added the "2 - In progress" label and removed the "1 - Backlog" label on Apr 25, 2023
3 participants