[ML] Start and stop model deployments #70713
Pinging @elastic/ml-core (Team:ML)
LGTM
    }

    public void setTimeout(TimeValue timeout) {
        this.timeout = timeout;
-        this.timeout = timeout;
+        this.timeout = ExceptionsHelper.requireNonNull(timeout, TIMEOUT);
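The suggestion above makes the setter reject a null timeout at the point of assignment instead of failing later. A minimal, self-contained sketch of that pattern follows. Note the assumptions: the `requireNonNull` helper here is a local stand-in written for illustration (it is not the `ExceptionsHelper` class from the PR), `java.time.Duration` stands in for `TimeValue`, and the default value is made up.

```java
import java.time.Duration;

final class TimeoutSetterSketch {

    // Name of the field, used only to build the error message.
    static final String TIMEOUT = "timeout";

    // Hypothetical stand-in for the helper named in the suggestion:
    // returns the value unchanged, or throws if it is null.
    static <T> T requireNonNull(T value, String fieldName) {
        if (value == null) {
            throw new IllegalArgumentException("[" + fieldName + "] must not be null");
        }
        return value;
    }

    static final class Request {
        // Assumed default, chosen for illustration only.
        private Duration timeout = Duration.ofSeconds(20);

        void setTimeout(Duration timeout) {
            // The null check happens before the field is overwritten,
            // so a rejected call leaves the previous value in place.
            this.timeout = requireNonNull(timeout, TIMEOUT);
        }

        Duration getTimeout() {
            return timeout;
        }
    }

    public static void main(String[] args) {
        Request request = new Request();
        request.setTimeout(Duration.ofSeconds(30));
        System.out.println(request.getTimeout());
        try {
            request.setTimeout(null);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the check runs before the assignment, a rejected call cannot leave the request with a null timeout.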
    public static class TaskParams implements PersistentTaskParams {

        public static final Version VERSION_INTRODUCED = Version.V_7_13_0;
Ambitious
Will change to 8 and we'll see :-)
            listener.onResponse(new StopTrainedModelDeploymentAction.Response(true));
            return;
        }
        if (models.size() > 1) {
In future we may have more than one model config using the deployment. It might be that we don't do the GetTrainedModelsAction here and just look for persistent tasks that match the model ID.
Agreed. That's how we typically implement stop actions where we handle stopping multiple tasks at once. Just thought this was simpler for now.
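The alternative discussed in this thread, looking up running tasks by model ID and stopping every match, could look roughly like the sketch below. It is an illustration only: `DeploymentTask` and `stopDeployments` are hypothetical names, and an in-memory list stands in for the cluster's persistent task state.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class StopByTaskLookupSketch {

    // Hypothetical, simplified view of a persistent task:
    // its own ID plus the ID of the model it serves.
    static final class DeploymentTask {
        final String taskId;
        final String modelId;

        DeploymentTask(String taskId, String modelId) {
            this.taskId = taskId;
            this.modelId = modelId;
        }
    }

    // Removes every running task that matches the model ID and returns the
    // IDs of the tasks that were stopped. An empty result means nothing was
    // running for that model, which a caller can treat as already stopped.
    static List<String> stopDeployments(List<DeploymentTask> runningTasks, String modelId) {
        List<String> stopped = new ArrayList<>();
        Iterator<DeploymentTask> it = runningTasks.iterator();
        while (it.hasNext()) {
            DeploymentTask task = it.next();
            if (task.modelId.equals(modelId)) {
                it.remove();
                stopped.add(task.taskId);
            }
        }
        return stopped;
    }

    public static void main(String[] args) {
        List<DeploymentTask> running = new ArrayList<>();
        running.add(new DeploymentTask("task-1", "model-a"));
        running.add(new DeploymentTask("task-2", "model-b"));
        running.add(new DeploymentTask("task-3", "model-a"));
        System.out.println(stopDeployments(running, "model-a"));
    }
}
```

Matching on tasks rather than on model configs means the stop path does not need a `models.size() > 1` special case: zero, one, or many matching tasks all go through the same loop.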
    }

    private void doStartDeployment(TrainedModelDeploymentTask task) {
        logger.info("[{}] Starting model deployment", task.getModelId());
-        logger.info("[{}] Starting model deployment", task.getModelId());
+        logger.debug("[{}] Starting model deployment", task.getModelId());
The feature branch contains changes to configure PyTorch models with a TrainedModelConfig and defines a format for storing the binary models. The _start and _stop deployment actions control the model lifecycle, and the model can be evaluated directly with the _infer endpoint. Two types of NLP tasks are supported: Named Entity Recognition and Fill Mask. The feature branch consists of these PRs: #73523, #72218, #71679, #71323, #71035, #71177, #70713
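As a rough mental model of the lifecycle described above, the sketch below treats a deployment as a two-state machine. Everything in it is an assumption made for illustration: the class and method names are hypothetical, the rule that inference is rejected unless the deployment is started is not stated in the PR, and `infer` returns a placeholder rather than a real model result.

```java
final class DeploymentLifecycleSketch {

    enum State { STOPPED, STARTED }

    // A deployment starts out stopped until the start action is called.
    private State state = State.STOPPED;

    void start() {
        state = State.STARTED;
    }

    void stop() {
        state = State.STOPPED;
    }

    State state() {
        return state;
    }

    // Assumption: inference is only accepted while the deployment is started.
    String infer(String input) {
        if (state != State.STARTED) {
            throw new IllegalStateException("model deployment is not started");
        }
        // Placeholder: a real deployment would hand the input to the model process.
        return "evaluated: " + input;
    }

    public static void main(String[] args) {
        DeploymentLifecycleSketch deployment = new DeploymentLifecycleSketch();
        deployment.start();
        System.out.println(deployment.infer("some text"));
        deployment.stop();
        System.out.println(deployment.state());
    }
}
```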