From 55b2e9a7631dd8905d169655024a6fb384e5f8c0 Mon Sep 17 00:00:00 2001
From: "opensearch-trigger-bot[bot]"
 <98922864+opensearch-trigger-bot[bot]@users.noreply.github.com>
Date: Wed, 12 Apr 2023 12:12:44 -0500
Subject: [PATCH] Add config parameters for traced models (#3456) (#3758)

* Add config parameters for traced models


* Update model-serving-framework.md

---------


(cherry picked from commit fa643f904066209913c4d281cd45768b7ca7afdc)

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
---
 _ml-commons-plugin/model-serving-framework.md | 23 ++++++++++++-------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/_ml-commons-plugin/model-serving-framework.md b/_ml-commons-plugin/model-serving-framework.md
index 11deb62560..3a4808ea9b 100644
--- a/_ml-commons-plugin/model-serving-framework.md
+++ b/_ml-commons-plugin/model-serving-framework.md
@@ -53,20 +53,27 @@ The URL upload method requires the following request fields.
 
 Field | Data type | Description
 :---  | :--- | :--- 
-`name`| string | The name of the model. |
-`version` | string | The version number of the model. Since OpenSearch does not enforce a specific version schema for models, you can choose any number or format that makes sense for your models. |
-`model_format` | string | The portable format of the model file. Currently only supports `TORCH_SCRIPT`. |
-[`model_config`](#the-model_config-object) | json object | The model's configuration, including the `model_type`, `embedding_dimension`, and `framework_type`. |
+`name`| String | The name of the model. |
+`version` | String | The version number of the model. Since OpenSearch does not enforce a specific version schema for models, you can choose any number or format that makes sense for your models. |
+`model_format` | String | The portable format of the model file. Currently, only `TORCH_SCRIPT` is supported. |
+[`model_config`](#the-model_config-object) | JSON object | The model's configuration, including the `model_type`, `embedding_dimension`, and `framework_type`. |
 `url` | string | The URL where the model is located. |
 
 ### The `model_config` object
 
 | Field | Data type | Description |
 | :--- | :--- | :--- |
-| `model_type` | string | The model type, such as `bert`. For a Huggingface model, the model type is specified in `config.json`. For an example, see the [`all-MiniLM-L6-v2` Huggingface model `config.json`](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/blob/main/config.json#L15).|
-| `embedding_dimension` | integer | The dimension of the model-generated dense vector. For a Huggingface model, the dimension is specified in the model card. For example, in the [`all-MiniLM-L6-v2` Huggingface model card](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2), the statement `384 dimensional dense vector space` specifies 384 as the embedding dimension. |
-| `framework_type` | string  | The framework the model is using. Currently, we support `sentence_transformers` and `huggingface_transformers` frameworks. The `sentence_transformers` model outputs text embeddings directly, so ML Commons does not perform any post processing. For `huggingface_transformers`, ML Commons performs post processing by applying mean pooling to get text embeddings. See the example [`all-MiniLM-L6-v2` Huggingface model](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) for more details. |
-| `all_config` _(Optional)_ | string | This field is used for reference purposes. You can specify all model configurations in this field. For example, if you are using a Huggingface model, you can minify the `config.json` file to one line and save its contents in the `all_config` field. Once the model is uploaded, you can use the get model API operation to get all model configurations stored in this field. |
+| `model_type` | String | The model type, such as `bert`. For a Hugging Face model, the model type is specified in `config.json`. For an example, see the [`all-MiniLM-L6-v2` Hugging Face model `config.json`](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/blob/main/config.json#L15).|
+| `embedding_dimension` | Integer | The dimension of the model-generated dense vector. For a Hugging Face model, the dimension is specified in the model card. For example, in the [`all-MiniLM-L6-v2` Hugging Face model card](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2), the statement `384 dimensional dense vector space` specifies 384 as the embedding dimension. |
+| `framework_type` | String  | The framework the model is using. Currently, the `sentence_transformers` and `huggingface_transformers` frameworks are supported. A `sentence_transformers` model outputs text embeddings directly, so ML Commons does not perform any post-processing. For a `huggingface_transformers` model, ML Commons performs post-processing by applying mean pooling to get text embeddings. See the example [`all-MiniLM-L6-v2` Hugging Face model](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) for more details. |
+| `all_config` _(Optional)_ | String | This field is used for reference purposes. You can specify all model configurations in this field. For example, if you are using a Hugging Face model, you can minify the `config.json` file to one line and save its contents in the `all_config` field. Once the model is uploaded, you can use the get model API operation to get all model configurations stored in this field. |
+
+You can further customize a pre-trained sentence transformer model's post-processing logic with the following optional fields in the `model_config` object.
+
+| Field | Data type | Description |
+| :--- | :--- | :--- |
+| `pooling_mode` | String | The pooling method applied to the model output during post-processing. Valid values are `mean`, `mean_sqrt_len`, `max`, `weightedmean`, and `cls`.|
+| `normalize_result` | Boolean | When set to `true`, normalizes the model output so that it is scaled to a standard range. |
 
 #### Example request