From 61360d082555454d0fe4d48b1ff651d917cf8cc4 Mon Sep 17 00:00:00 2001 From: Naarcha-AWS Date: Mon, 10 Jul 2023 13:08:33 -0500 Subject: [PATCH 1/4] Add document limits to index and bulk pages Signed-off-by: Naarcha-AWS --- _api-reference/document-apis/bulk.md | 3 +++ _im-plugin/index.md | 4 ++++ 2 files changed, 7 insertions(+) diff --git a/_api-reference/document-apis/bulk.md b/_api-reference/document-apis/bulk.md index a4b6370629..6fe6c781e7 100644 --- a/_api-reference/document-apis/bulk.md +++ b/_api-reference/document-apis/bulk.md @@ -14,6 +14,9 @@ Introduced 1.0 The bulk operation lets you add, update, or delete multiple documents in a single request. Compared to individual OpenSearch indexing requests, the bulk operation has significant performance benefits. Whenever practical, we recommend batching indexing operations into bulk requests. +Beginning in OpenSearch 2.9, the bulk operation will contain a memory limit for each document in the request of 512mb or less. +{: .note} + ## Example ```json diff --git a/_im-plugin/index.md b/_im-plugin/index.md index fb6cc8c980..a06f0b375d 100644 --- a/_im-plugin/index.md +++ b/_im-plugin/index.md @@ -16,6 +16,8 @@ You index data using the OpenSearch REST API. Two APIs exist: the index API and For situations in which new data arrives incrementally (for example, customer orders from a small business), you might use the index API to add documents individually as they arrive. For situations in which the flow of data is less frequent (for example, weekly updates to a marketing website), you might prefer to generate a file and send it to the `_bulk` API. For large numbers of documents, lumping requests together and using the `_bulk` API offers superior performance. If your documents are enormous, however, you might need to index them individually. +To make sure that many documents can be indexed through both index API and bulk API are manageable, each document inside an index must be less than 512mb. 
+ ## Introduction to indexing @@ -91,6 +93,8 @@ OpenSearch indexes have the following naming restrictions: `:`, `"`, `*`, `+`, `/`, `\`, `|`, `?`, `#`, `>`, or `<` + + ## Read data After you index a document, you can retrieve it by sending a GET request to the same endpoint that you used for indexing: From 29816427ee4ad88c7bfe036a89ccf6466b03db6c Mon Sep 17 00:00:00 2001 From: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Date: Tue, 18 Jul 2023 13:36:14 -0700 Subject: [PATCH 2/4] Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> --- _api-reference/document-apis/bulk.md | 2 +- _im-plugin/index.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/_api-reference/document-apis/bulk.md b/_api-reference/document-apis/bulk.md index 6fe6c781e7..e94b04d603 100644 --- a/_api-reference/document-apis/bulk.md +++ b/_api-reference/document-apis/bulk.md @@ -14,7 +14,7 @@ Introduced 1.0 The bulk operation lets you add, update, or delete multiple documents in a single request. Compared to individual OpenSearch indexing requests, the bulk operation has significant performance benefits. Whenever practical, we recommend batching indexing operations into bulk requests. -Beginning in OpenSearch 2.9, the bulk operation will contain a memory limit for each document in the request of 512mb or less. +Beginning in OpenSearch 2.9, the bulk operation will contain a memory limit for each document in the request of 512 mb or less. {: .note} ## Example diff --git a/_im-plugin/index.md b/_im-plugin/index.md index a06f0b375d..4945114a0b 100644 --- a/_im-plugin/index.md +++ b/_im-plugin/index.md @@ -16,7 +16,7 @@ You index data using the OpenSearch REST API. Two APIs exist: the index API and For situations in which new data arrives incrementally (for example, customer orders from a small business), you might use the index API to add documents individually as they arrive. 
For situations in which the flow of data is less frequent (for example, weekly updates to a marketing website), you might prefer to generate a file and send it to the `_bulk` API. For large numbers of documents, lumping requests together and using the `_bulk` API offers superior performance. If your documents are enormous, however, you might need to index them individually. -To make sure that many documents can be indexed through both index API and bulk API are manageable, each document inside an index must be less than 512mb. +To make sure that many documents can be indexed through both the index API and the bulk API are manageable, each document inside an index must be less than 512 mb. ## Introduction to indexing From fc1e8b4491deb26f455980babf5fb902437eff21 Mon Sep 17 00:00:00 2001 From: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Date: Tue, 18 Jul 2023 14:45:15 -0700 Subject: [PATCH 3/4] Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> --- _api-reference/document-apis/bulk.md | 2 +- _im-plugin/index.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/_api-reference/document-apis/bulk.md b/_api-reference/document-apis/bulk.md index e94b04d603..4d4b0cbf30 100644 --- a/_api-reference/document-apis/bulk.md +++ b/_api-reference/document-apis/bulk.md @@ -14,7 +14,7 @@ Introduced 1.0 The bulk operation lets you add, update, or delete multiple documents in a single request. Compared to individual OpenSearch indexing requests, the bulk operation has significant performance benefits. Whenever practical, we recommend batching indexing operations into bulk requests. -Beginning in OpenSearch 2.9, the bulk operation will contain a memory limit for each document in the request of 512 mb or less. +Beginning in OpenSearch 2.9, the bulk operation will contain a memory limit for the document `_id` in the request of 512 MB or less. 
{: .note} ## Example diff --git a/_im-plugin/index.md b/_im-plugin/index.md index 4945114a0b..4b92c7b7a0 100644 --- a/_im-plugin/index.md +++ b/_im-plugin/index.md @@ -16,7 +16,7 @@ You index data using the OpenSearch REST API. Two APIs exist: the index API and For situations in which new data arrives incrementally (for example, customer orders from a small business), you might use the index API to add documents individually as they arrive. For situations in which the flow of data is less frequent (for example, weekly updates to a marketing website), you might prefer to generate a file and send it to the `_bulk` API. For large numbers of documents, lumping requests together and using the `_bulk` API offers superior performance. If your documents are enormous, however, you might need to index them individually. -To make sure that many documents can be indexed through both the index API and the bulk API are manageable, each document inside an index must be less than 512 mb. +When indexing documents, the document `_id` must be less than 512 MB. ## Introduction to indexing From 2deabc8728cf37e39cb460225e688a0614ca43a5 Mon Sep 17 00:00:00 2001 From: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Date: Wed, 19 Jul 2023 15:43:50 -0700 Subject: [PATCH 4/4] Apply suggestions from code review Co-authored-by: Nathan Bower Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> --- _api-reference/document-apis/bulk.md | 2 +- _im-plugin/index.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/_api-reference/document-apis/bulk.md b/_api-reference/document-apis/bulk.md index 4d4b0cbf30..efb52db7c1 100644 --- a/_api-reference/document-apis/bulk.md +++ b/_api-reference/document-apis/bulk.md @@ -14,7 +14,7 @@ Introduced 1.0 The bulk operation lets you add, update, or delete multiple documents in a single request. Compared to individual OpenSearch indexing requests, the bulk operation has significant performance benefits. 
Whenever practical, we recommend batching indexing operations into bulk requests. -Beginning in OpenSearch 2.9, the bulk operation will contain a memory limit for the document `_id` in the request of 512 MB or less. +Beginning in OpenSearch 2.9, when indexing documents using the bulk operation, the document `_id` must be 512 MB or less in size. {: .note} ## Example diff --git a/_im-plugin/index.md b/_im-plugin/index.md index 4b92c7b7a0..5804a23666 100644 --- a/_im-plugin/index.md +++ b/_im-plugin/index.md @@ -16,7 +16,7 @@ You index data using the OpenSearch REST API. Two APIs exist: the index API and For situations in which new data arrives incrementally (for example, customer orders from a small business), you might use the index API to add documents individually as they arrive. For situations in which the flow of data is less frequent (for example, weekly updates to a marketing website), you might prefer to generate a file and send it to the `_bulk` API. For large numbers of documents, lumping requests together and using the `_bulk` API offers superior performance. If your documents are enormous, however, you might need to index them individually. -When indexing documents, the document `_id` must be less than 512 MB. +When indexing documents, the document `_id` must be 512 MB or less in size. ## Introduction to indexing