Apply suggestions from code review
Signed-off-by: kolchfa-aws <[email protected]>
kolchfa-aws authored Jun 20, 2024
1 parent 6cde747 commit 8894c8c
Showing 2 changed files with 8 additions and 7 deletions.
2 changes: 1 addition & 1 deletion _ingest-pipelines/processors/index-processors.md
@@ -28,7 +28,7 @@ Processor types and their required or optional parameters vary depending on your

### Batch-enabled processors

-Some processors support batch ingestion in a way that they can handle multiple docs in a batch at the same time. Usually, these batch-enabled processors could have better performance through batch processing. The batch processing needs to be triggered through `_bulk` API with `batch_size` parameter. All batched processors are implemented with both batch mode and single mode. When ingesting via the `PUT` API, the processor's single mode will be activated and process the document in series. Currently, `text_embedding` and `sparse_encoding` processors are batch-enabled processors. For other processors, even documents are ingested through `_bulk` API with `batch_size` parameter set, documents are processed one by one by them.
+Some processors support batch ingestion: they can process multiple documents at the same time as a batch. Usually, these batch-enabled processors provide better performance through batch processing. For batch processing, use the [Bulk API]({{site.url}}{{site.baseurl}}/api-reference/document-apis/bulk/) and provide a `batch_size` parameter. All batch-enabled processors have a batch mode and a single-document mode. When you ingest documents using the `PUT` method, the processor functions in a single-document mode and processes documents in series. Currently, only the `text_embedding` and `sparse_encoding` processors are batch-enabled. All other processors process documents one by one.
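For illustration, an ingest request such as the following sketch would run a pipeline's processors in single-document mode; the index name `my-index` and pipeline name `nlp-ingest-pipeline` are assumed placeholders, not values taken from this change:

```json
PUT /my-index/_doc/1?pipeline=nlp-ingest-pipeline
{
  "passage_text": "hello world"
}
```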


Processor type | Description
13 changes: 7 additions & 6 deletions _ml-commons-plugin/remote-models/batch-ingestion.md
@@ -7,12 +7,12 @@ parent: Connecting to externally hosted models
grand_parent: Integrating ML models
---

-# Using externally hosted ML model for batch ingestion
+# Using externally hosted ML models for batch ingestion

**Introduced 2.15**
{: .label .label-purple }

-To ingest multiple documents which involve generating embeddings through external machine learning services, you can use OpenSearch's batch ingestion feature to achieve improved ingestion performance.
+If you are ingesting multiple documents and generating embeddings by invoking an externally hosted model, you can use batch ingestion to improve performance.

The [Bulk API]({{site.url}}{{site.baseurl}}/api-reference/document-apis/bulk/) accepts a `batch_size` parameter that tells OpenSearch to process documents in batches of the specified size. Processors that support batch ingestion send each batch of documents to the externally hosted model in a single request.

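As a sketch of what this looks like in practice, a bulk request that opts in to batching might look like the following; the index name `testindex1`, the pipeline name `nlp-ingest-pipeline`, and the batch size are illustrative assumptions, not values taken from this change:

```json
POST _bulk?batch_size=5&pipeline=nlp-ingest-pipeline
{ "create": { "_index": "testindex1", "_id": "1" } }
{ "passage_text": "hello world" }
{ "create": { "_index": "testindex1", "_id": "2" } }
{ "passage_text": "big apple" }
```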
@@ -49,7 +49,8 @@ To learn more about model groups, see [Model access control]({{site.url}}{{site.

## Step 2: Create a connector

-You can create a standalone connector that can be reused for multiple models. Alternatively, you can specify a connector when creating a model so that it can be used only for that model. For more information and example connectors, see [Connectors](https://github.com/opensearch-project/documentation-website/blob/7c4fe91ec9a16bb75e33726c2c86441edd56e08a/_ml-commons-plugin/remote-models/%7B%7Bsite.url%7D%7D%7B%7Bsite.baseurl%7D%7D/ml-commons-plugin/remote-models/connectors).
+You can create a standalone connector that can be reused for multiple models. Alternatively, you can specify a connector when creating a model so that it can be used only for that model. For more information and example connectors, see [Connectors]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/connectors/).
+
The Connectors Create API, `/_plugins/_ml/connectors/_create`, creates connectors that facilitate registering and deploying external models in OpenSearch. Using the `endpoint` parameter, you can connect ML Commons to any supported ML tool by using its specific API endpoint. For example, you can connect to a ChatGPT model by using the `api.openai.com` endpoint:
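The body of that example is collapsed in this diff view. A minimal sketch of such a request, modeled on the publicly documented OpenAI chat connector blueprint and using a placeholder API key, might look like the following:

```json
POST /_plugins/_ml/connectors/_create
{
  "name": "OpenAI Chat Connector",
  "description": "The connector to the OpenAI chat model service",
  "version": 1,
  "protocol": "http",
  "parameters": {
    "endpoint": "api.openai.com",
    "model": "gpt-3.5-turbo"
  },
  "credential": {
    "openAI_key": "<YOUR_OPENAI_KEY>"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://${parameters.endpoint}/v1/chat/completions",
      "headers": {
        "Authorization": "Bearer ${credential.openAI_key}"
      },
      "request_body": "{ \"model\": \"${parameters.model}\", \"messages\": ${parameters.messages} }"
    }
  ]
}
```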

```json
@@ -159,14 +159,14 @@ PUT _cluster/settings

To undeploy the model, use the [Undeploy API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/undeploy-model/). To deploy the model, provide its model ID in the `_deploy` endpoint:

-```
+```json
POST /_plugins/_ml/models/cleMb4kBJ1eYAeTMFFg4/_deploy
```
{% include copy-curl.html %}

The response contains the task ID that you can use to check the status of the deploy operation:

-```
+```json
{
"task_id": "vVePb4kBJ1eYAeTM7ljG",
"status": "CREATED"
@@ -201,7 +201,7 @@ When the operation is complete, the state changes to `COMPLETED`:
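The response example itself is collapsed in this diff view. To check the task status, you can send the task ID returned above to the ML Commons Tasks API; the `COMPLETED` response below is a trimmed sketch, not the diff's actual output:

```json
GET /_plugins/_ml/tasks/vVePb4kBJ1eYAeTM7ljG
```

A completed deployment returns a response similar to the following:

```json
{
  "model_id": "cleMb4kBJ1eYAeTMFFg4",
  "task_type": "DEPLOY_MODEL",
  "state": "COMPLETED"
}
```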

The following example request creates an ingest pipeline with a `text_embedding` processor. The processor converts the text in the `passage_text` field into text embeddings and stores the embeddings in `passage_embedding`:

-```
+```json
PUT /_ingest/pipeline/nlp-ingest-pipeline
{
"description": "A text embedding pipeline",
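The rest of the pipeline definition is collapsed in this diff view. Based on the field names in the surrounding paragraph, a complete version of the request presumably resembles the following sketch; the model ID is a placeholder taken from the deployment example above:

```json
PUT /_ingest/pipeline/nlp-ingest-pipeline
{
  "description": "A text embedding pipeline",
  "processors": [
    {
      "text_embedding": {
        "model_id": "cleMb4kBJ1eYAeTMFFg4",
        "field_map": {
          "passage_text": "passage_embedding"
        }
      }
    }
  ]
}
```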
