Add batch inference API #7853
Conversation
Thank you for submitting your PR. The PR states are In progress (or Draft) -> Tech review -> Doc review -> Editorial review -> Merged. Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a maintainer. When you're ready for doc review, tag the assignee of this PR. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference). The doc reviewer will arrange for an editorial review.
For information about user access for this API, see [Model access control considerations]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/index/#model-access-control-considerations).

For information about connectors and remote models, see [Connecting to externally hosted models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/index/). For more details of the connector blurprints for batch predict, see [GitHub docs](https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/batch_inference_openAI_connector_blueprint.md)
I think the best action item would be to put this blueprint under a subfolder for batch_prediction
and link to that subfolder. That way, if we add blueprints for SageMaker and Cohere later, customers will still find them in the subfolder.
Right now we are saying this will work for SageMaker or Cohere, but there's no example for this. Also, why does the customer need to go to another link to get the same blueprint that is here?
I agree with Dhrubo. This will avoid having to maintain a list of blueprints here in the documentation. @Zhangxunmt could you create a subfolder so we can link to that from the docs?
It's fine to have a subfolder for offline actions. But please note that this is the API page showing how this API works, so let's keep this page direct and to the point. Using OpenAI as an example to show this API is enough. Other details will be documented elsewhere in blueprints or tutorials.
I agree that this API page should be simple. Ideally, I would even remove the prerequisite steps from this API page. However, our users have a very disjointed experience when going back and forth between the doc website and the ML repo. I didn't realize blueprints contained the workflow and not just the blueprint itself. I think we should port all ML blueprints and tutorials to the doc repo and have them on the doc site. I can take this on once this version is released. For now, it's fine to leave this API page with the current information.
Thank you, @Zhangxunmt! Please see my comments below.
---
layout: default
title: Batch inference
parent: Model APIs
We currently have the Predict API under the train-predict directory, not model-apis. Either we need to move this one to train-predict, or we can move the predict API into the model-apis section. What do you think?
I think it makes more sense to move Predict to the model-apis section. The training part doesn't matter much because most of the cases are remote models or pre-trained models that can be used for prediction directly.
@Zhangxunmt Should we move train APIs to the model-apis section as well so the train-predict and model-apis sections are combined?
# Batch inference

ML Commons can predict large datasets in an offline asynchronous mode with your remote model deployed in external model servers. To use the Batch_Predict API, the `model_id` for a remote model is required. This new API is released as an experimental feature in the OpenSearch version 2.16, and only SageMaker, Cohere, and OpenAI are verified as the external servers that support this features.
Suggested change:
- ML Commons can predict large datasets in an offline asynchronous mode with your remote model deployed in external model servers. To use the Batch_Predict API, the `model_id` for a remote model is required. This new API is released as an experimental feature in the OpenSearch version 2.16, and only SageMaker, Cohere, and OpenAI are verified as the external servers that support this features.
+ ML Commons can perform inference on large datasets in an offline asynchronous mode using a model deployed on external model servers. To use the Batch Predict API, you must provide the `model_id` for an externally hosted model. This new API is released as experimental in OpenSearch version 2.16, and only Amazon SageMaker, Cohere, and OpenAI are verified as the external servers that support this feature.
grand_parent: ML Commons APIs
nav_order: 20
---
Please add an experimental header (https://github.com/opensearch-project/documentation-website/blob/main/templates/EXPERIMENTAL_TEMPLATE.md) and provide either a link to an issue where users can track the progress of the feature or a link to the OpenSearch forum.
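For reference, a minimal sketch of what that banner could look like, assuming the wording in the linked template (the issue link is a placeholder, not from this PR):

```
This is an experimental feature and is not recommended for use in a production environment. To track progress or leave feedback, see the associated [GitHub issue](<link to tracking issue>).
{: .warning}
```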
For information about user access for this API, see [Model access control considerations]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/index/#model-access-control-considerations).

For information about connectors and remote models, see [Connecting to externally hosted models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/index/). For more details of the connector blurprints for batch predict, see [GitHub docs](https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/batch_inference_openAI_connector_blueprint.md)
Suggested change:
- For information about connectors and remote models, see [Connecting to externally hosted models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/index/). For more details of the connector blurprints for batch predict, see [GitHub docs](https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/batch_inference_openAI_connector_blueprint.md)
+ For information about externally hosted models, see [Connecting to externally hosted models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/index/). For the batch predict operation connector blueprints, see:
+ - [Amazon SageMaker batch predict connector blueprint](https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/batch_inference_sagemaker_connector_blueprint.md).
+ - [OpenAI batch predict connector blueprint](https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/batch_inference_openAI_connector_blueprint.md).
Is there a Cohere blueprint for batch predict?
"model_id": "lyjxwZABNrAVdFa9zrcZ" | ||
} | ||
``` | ||
|
To check the status of the operation, provide the task ID to the [Tasks API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/tasks-apis/get-task/). Once the registration is complete, the task `state` changes to `COMPLETED`.
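For example, a status check could look like the following (the task ID is a placeholder); a completed task includes `"state": "COMPLETED"` in the response:

```
GET /_plugins/_ml/tasks/<task_id>
```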
```

#### Example request
Suggested change:
+ Once you have completed the prerequisite steps, you can call the Batch Predict API. The parameters in the batch predict request override those defined in the connector:
POST /_plugins/_ml/models/lyjxwZABNrAVdFa9zrcZ/_batch_predict
{
  "parameters": {
    "model": "text-embedding-ada-002"
This parameter has the same value as the one in the connector. Can we show users how to change this or any other parameter to a different value? (See the sketch after the suggestion below.)
  }
}
```
{% include copy-curl.html %} | ||
The parameters in the batch_predict request will override those defined in the connector. |
Suggested change (this sentence was moved before the example request in the suggestion above):
- The parameters in the batch_predict request will override those defined in the connector.
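To address the question above about overriding parameters, a hypothetical request that overrides the connector's `model` value could look like the following (`text-embedding-3-small` is an assumed example value, not taken from this PR):

```
POST /_plugins/_ml/models/lyjxwZABNrAVdFa9zrcZ/_batch_predict
{
  "parameters": {
    "model": "text-embedding-3-small"
  }
}
```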
{
  "inference_results": [
    {
      "output": [
We normally need to provide the descriptions of all response fields in the API doc. Is this the format of all batch predict responses? And where is the actual predict result? Maybe this API page should just show the API itself, and we need to add a complete end-to-end example under the remote-models section?
The actual predict results are in the `output_file_id` in the response, as this is offline async prediction. I provided some descriptions of these results in the OpenAI blueprint, which is linked in this API. I think this page should just show the API itself, and we should keep it simple and straightforward. The end-to-end example/explanation should be done in another tutorial page somewhere else?
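For context, with the OpenAI Batch API the results referenced by `output_file_id` are retrieved from the files endpoint; a minimal sketch, assuming an API key in `$OPENAI_API_KEY` and a placeholder file ID:

```
curl https://api.openai.com/v1/files/<output_file_id>/content \
  -H "Authorization: Bearer $OPENAI_API_KEY"
```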
I think we should have a complete example in a file under the remote-models directory.
"request_body": "{ \"input_file_id\": \"${parameters.input_file_id}\", \"endpoint\": \"${parameters.endpoint}\", \"completion_window\": \"24h\" }" | ||
} | ||
] | ||
} |
Suggested change:
- }
+ }
+ {% include copy-curl.html %}
"function_name": "remote", | ||
"description": "OpenAI text embedding model", | ||
"connector_id": "XU5UiokBpXT9icfOM0vt" | ||
} |
Suggested change:
- }
+ }
+ {% include copy-curl.html %}
Signed-off-by: Xun Zhang <[email protected]>
Description
Add documentation for batch inference as a new API under ML Commons > Model APIs.
Issues Resolved
Closes #7848
Version
2.16
Frontend features
If you're submitting documentation for an OpenSearch Dashboards feature, add a video that shows how a user will interact with the UI step by step. A voiceover is optional.
Checklist
For more information on following Developer Certificate of Origin and signing off your commits, please check here.