Merge branch 'main' into drop-processor
vagimeli authored Mar 7, 2024
2 parents 4477666 + a742d47 commit e996085
Showing 36 changed files with 416 additions and 165 deletions.
1 change: 1 addition & 0 deletions .github/vale/styles/Vocab/OpenSearch/Words/accept.txt
@@ -77,6 +77,7 @@ Levenshtein
[Mm]ultiword
[Nn]amespace
[Oo]versamples?
[Oo]nboarding
pebibyte
[Pp]erformant
[Pp]luggable
1 change: 1 addition & 0 deletions _about/version-history.md
@@ -27,6 +27,7 @@ OpenSearch version | Release highlights | Release date
[2.0.1](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.0.1.md) | Includes bug fixes and maintenance updates for Alerting and Anomaly Detection. | 16 June 2022
[2.0.0](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.0.0.md) | Includes document-level monitors for alerting, OpenSearch Notifications plugins, and Geo Map Tiles in OpenSearch Dashboards. Also adds support for Lucene 9 and bug fixes for all OpenSearch plugins. For a full list of release highlights, see the Release Notes. | 26 May 2022
[2.0.0-rc1](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.0.0-rc1.md) | The Release Candidate for 2.0.0. This version allows you to preview the upcoming 2.0.0 release before the GA release. The preview release adds document-level alerting, support for Lucene 9, and the ability to use term lookup queries in document level security. | 03 May 2022
[1.3.15](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.15.md) | Includes bug fixes and maintenance updates for cross-cluster replication, SQL, OpenSearch Dashboards reporting, and alerting. | 05 March 2024
[1.3.14](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.14.md) | Includes bug fixes and maintenance updates for OpenSearch security and OpenSearch Dashboards security. | 12 December 2023
[1.3.13](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.13.md) | Includes bug fixes for Anomaly Detection, adds maintenance updates and infrastructure enhancements. | 21 September 2023
[1.3.12](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.12.md) | Adds maintenance updates for OpenSearch security and OpenSearch Dashboards observability. Includes bug fixes for observability, OpenSearch Dashboards visualizations, and OpenSearch security. | 10 August 2023
9 changes: 6 additions & 3 deletions _automating-configurations/api/get-workflow-status.md
@@ -83,7 +83,8 @@ While provisioning is in progress, OpenSearch returns a partial resource list:
{
"workflow_step_name": "create_connector",
"workflow_step_id": "create_connector_1",
"connector_id": "NdjCQYwBLmvn802B0IwE"
"resource_type": "connector_id",
"resource_id": "NdjCQYwBLmvn802B0IwE"
}
]
}
@@ -99,12 +100,14 @@ Upon provisioning completion, OpenSearch returns the full resource list:
{
"workflow_step_name": "create_connector",
"workflow_step_id": "create_connector_1",
"connector_id": "NdjCQYwBLmvn802B0IwE"
"resource_type": "connector_id",
"resource_id": "NdjCQYwBLmvn802B0IwE"
},
{
"workflow_step_name": "register_remote_model",
"workflow_step_id": "register_model_2",
"model_id": "N9jCQYwBLmvn802B0oyh"
"resource_type": "model_id",
"resource_id": "N9jCQYwBLmvn802B0oyh"
}
]
}
1 change: 1 addition & 0 deletions _automating-configurations/api/index.md
@@ -19,5 +19,6 @@ OpenSearch supports the following workflow APIs:
* [Get workflow status]({{site.url}}{{site.baseurl}}/automating-configurations/api/get-workflow-status/)
* [Get workflow steps]({{site.url}}{{site.baseurl}}/automating-configurations/api/get-workflow-steps/)
* [Search workflow]({{site.url}}{{site.baseurl}}/automating-configurations/api/search-workflow/)
* [Search workflow state]({{site.url}}{{site.baseurl}}/automating-configurations/api/search-workflow-state/)
* [Deprovision workflow]({{site.url}}{{site.baseurl}}/automating-configurations/api/deprovision-workflow/)
* [Delete workflow]({{site.url}}{{site.baseurl}}/automating-configurations/api/delete-workflow/)
63 changes: 63 additions & 0 deletions _automating-configurations/api/search-workflow-state.md
@@ -0,0 +1,63 @@
---
layout: default
title: Search for a workflow state
parent: Workflow APIs
nav_order: 65
---

# Search for a workflow state

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/flow-framework/issues/475).
{: .warning}

You can search for resources created by workflows by matching a query to a field. The fields you can search correspond to those returned by the [Get Workflow Status API]({{site.url}}{{site.baseurl}}/automating-configurations/api/get-workflow-status/).

## Path and HTTP methods

```json
GET /_plugins/_flow_framework/workflow/state/_search
POST /_plugins/_flow_framework/workflow/state/_search
```

#### Example request: All workflows with a state of `NOT_STARTED`

```json
GET /_plugins/_flow_framework/workflow/state/_search
{
"query": {
"match": {
"state": "NOT_STARTED"
}
}
}
```
{% include copy-curl.html %}

#### Example request: All workflows that have a `resources_created` field with a `workflow_step_id` of `register_model_2`

```json
GET /_plugins/_flow_framework/workflow/state/_search
{
"query": {
"nested": {
"path": "resources_created",
"query": {
"bool": {
"must": [
{
"match": {
"resources_created.workflow_step_id": "register_model_2"
}
}
]
}
}
}
}
}
```
{% include copy-curl.html %}

#### Example response

The response contains documents matching the search parameters.
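For illustration, a response to the `NOT_STARTED` query above might resemble the following sketch. The index name, document ID, and field values here are assumptions for illustration only, not output from a real cluster:

```json
{
  "took": 3,
  "timed_out": false,
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "hits": [
      {
        "_index": ".plugins-flow-framework-state",
        "_id": "8xL8bowB8y25Tqfenm50",
        "_source": {
          "workflow_id": "8xL8bowB8y25Tqfenm50",
          "state": "NOT_STARTED",
          "resources_created": []
        }
      }
    ]
  }
}
```

Once provisioning begins, the `state` and `resources_created` fields are updated, so the same query returns different documents over time.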
4 changes: 2 additions & 2 deletions _automating-configurations/api/search-workflow.md
@@ -24,7 +24,7 @@ POST /_plugins/_flow_framework/workflow/_search
```json
GET /_plugins/_flow_framework/workflow/_search
{
"query": {
"query": {
"match_all": {}
}
}
@@ -36,7 +36,7 @@ GET /_plugins/_flow_framework/workflow/_search
```json
GET /_plugins/_flow_framework/workflow/_search
{
"query": {
"query": {
"match": {
"use_case": "REMOTE_MODEL_DEPLOYMENT"
}
2 changes: 1 addition & 1 deletion _benchmark/index.md
@@ -18,7 +18,7 @@ OpenSearch Benchmark is a macrobenchmark utility provided by the [OpenSearch Pro
- Informing decisions about when to upgrade your cluster to a new version.
- Determining how changes to your workflow---such as modifying mappings or queries---might impact your cluster.

OpenSearch Benchmark can be installed directly on a compatible host running Linux and macOS. You can also run OpenSearch Benchmark in a Docker container. See [Installing OpenSearch Benchmark]({{site.url}}{{site.baseurl}}/benchmark/installing-benchmark/) for more information.
OpenSearch Benchmark can be installed directly on a compatible host running Linux or macOS. You can also run OpenSearch Benchmark in a Docker container. See [Installing OpenSearch Benchmark]({{site.url}}{{site.baseurl}}/benchmark/installing-benchmark/) for more information.

The following diagram visualizes how OpenSearch Benchmark works when run against a local host:

57 changes: 57 additions & 0 deletions _benchmark/user-guide/contributing-workloads.md
@@ -0,0 +1,57 @@
---
layout: default
title: Sharing custom workloads
nav_order: 11
parent: User guide
---

# Sharing custom workloads

You can share a custom workload with other OpenSearch users by uploading it to the [workloads repository](https://github.com/opensearch-project/opensearch-benchmark-workloads/) on GitHub.

Make sure that any data included in the workload's dataset does not contain proprietary data or personally identifiable information (PII).

To share a custom workload, follow these steps.

## Create a README.md

Provide a detailed `README.md` file that includes the following:

- The purpose of the workload. When creating a description for the workload, consider its specific use and how that use case differs from others in the [workloads repository](https://github.com/opensearch-project/opensearch-benchmark-workloads/).
- An example document from the dataset that helps users understand the data's structure.
- The workload parameters that can be used to customize the workload.
- A list of default test procedures included in the workload as well as other test procedures that the workload can run.
- An output sample produced by the workload after a test is run.
- A copy of the open-source license that gives the user and OpenSearch Benchmark permission to use the dataset.

For an example workload README file, go to the `http_logs` [README](https://github.com/opensearch-project/opensearch-benchmark-workloads/blob/main/http_logs/README.md).

## Verify the workload's structure

The workload must include the following files:

- `workload.json`
- `index.json`
- `files.txt`
- `test_procedures/default.json`
- `operations/default.json`

Both `default.json` file names can be customized to have a descriptive name. The workload can include an optional `workload.py` file to add more dynamic functionality. For more information about a file's contents, go to [Anatomy of a workload]({{site.url}}{{site.baseurl}}/benchmark/user-guide/understanding-workloads/anatomy-of-a-workload/).
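As a rough sketch of how these files fit together, a minimal `workload.json` might look like the following. All names, file paths, and counts here are hypothetical; the authoritative schema is whatever the existing workloads in the repository use:

```json
{
  "version": 2,
  "description": "Hypothetical workload for benchmarking example log data",
  "indices": [
    {
      "name": "example-logs",
      "body": "index.json"
    }
  ],
  "corpora": [
    {
      "name": "example-logs",
      "documents": [
        {
          "source-file": "documents.json.bz2",
          "document-count": 1000000
        }
      ]
    }
  ]
}
```

Before submitting, compare your file against an established workload such as `http_logs` to confirm the exact keys your version of OpenSearch Benchmark expects.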

## Testing the workload

All workloads contributed to OpenSearch Benchmark must fulfill the following testing requirements:

- All tests run to explore and produce an example from the workload must target an OpenSearch cluster.
- The workload must pass all integration tests. Follow these steps to ensure that the workload passes the integration tests:
  1. Add the workload to your forked copy of the [workloads repository](https://github.com/opensearch-project/opensearch-benchmark-workloads/). Make sure that you've forked both the `opensearch-benchmark-workloads` repository and the [OpenSearch Benchmark](https://github.com/opensearch-project/opensearch-benchmark) repository.
  2. In your forked OpenSearch Benchmark repository, update the `benchmark-os-it.ini` and `benchmark-in-memory.ini` files in the `/osbenchmark/it/resources` directory to point to the forked workloads repository containing your workload.
  3. After you've modified the `.ini` files, commit your changes to a branch for testing.
  4. Run your integration tests using GitHub Actions by selecting the branch to which you committed your changes. Verify that the tests run as expected.
  5. If your integration tests run as expected, go to your forked workloads repository and merge your workload changes into branches `1` and `2`. This allows your workload to appear in both major versions of OpenSearch Benchmark.

## Create a PR

After testing the workload, create a pull request (PR) from your fork to the `opensearch-project` [workloads repository](https://github.com/opensearch-project/opensearch-benchmark-workloads/). Add a sample output and summary result to the PR description. The OpenSearch Benchmark maintainers will review the PR.

Once the PR is approved, you must share your dataset's data corpora. The OpenSearch Benchmark team can then add the dataset to a shared S3 bucket. If your data corpora are stored in an S3 bucket, you can use [AWS DataSync](https://docs.aws.amazon.com/datasync/latest/userguide/create-s3-location.html) to share them. Otherwise, you must inform the maintainers of where the data corpora reside.
2 changes: 1 addition & 1 deletion _benchmark/user-guide/distributed-load.md
@@ -1,7 +1,7 @@
---
layout: default
title: Running distributed loads
nav_order: 10
nav_order: 15
parent: User guide
---

2 changes: 1 addition & 1 deletion _benchmark/user-guide/telemetry.md
@@ -1,7 +1,7 @@
---
layout: default
title: Enabling telemetry devices
nav_order: 15
nav_order: 30
parent: User guide
---

2 changes: 1 addition & 1 deletion _clients/java.md
@@ -344,7 +344,7 @@ client.delete(b -> b.index(index).id("1"));
The following sample code deletes an index:

```java
DeleteIndexRequest deleteIndexRequest = new DeleteRequest.Builder().index(index).build();
DeleteIndexRequest deleteIndexRequest = new DeleteIndexRequest.Builder().index(index).build();
DeleteIndexResponse deleteIndexResponse = client.indices().delete(deleteIndexRequest);
```
{% include copy.html %}
2 changes: 1 addition & 1 deletion _dashboards/discover/time-filter.md
@@ -5,7 +5,7 @@ parent: Analyzing data
nav_order: 20
redirect_from:
- /dashboards/get-started/time-filter/
-/dashboards/discover/time-filter/
- /dashboards/discover/time-filter/
---

# Time filter
6 changes: 3 additions & 3 deletions _dashboards/management/management-index.md
@@ -9,7 +9,7 @@ has_children: true
Introduced 2.10
{: .label .label-purple }

Dashboards Management serves as the command center for customizing OpenSearch Dashboards to your needs. A view of the interface is shown in the following image.
**Dashboards Management** serves as the command center for customizing OpenSearch Dashboards to your needs. A view of the interface is shown in the following image.

<img src="{{site.url}}{{site.baseurl}}/images/dashboards/dashboards-management-ui.png" alt="Dashboards Management interface" width="700"/>

@@ -18,9 +18,9 @@ Dashboards Management serves as the command center for customizing OpenSearch Da

## Applications

The following applications are available in Dashboards Management:
The following applications are available in **Dashboards Management**:

- **[Index Patterns]({{site.url}}{{site.baseurl}}/dashboards/management/index-patterns/):** To access OpenSearch data, you need to create an index pattern so that you can select the data you want to use and define the properties of the fields. The Index Pattern tool gives you the ability to create an index pattern from within the UI. Index patterns point to one or more indexes, data streams, or index aliases.
- **[Data Sources]({{site.url}}{{site.baseurl}}/dashboards/management/multi-data-sources/):** The Data Sources tool is used to configure and manage the data sources that OpenSearch uses to collect and analyze data. You can use the tool to specify the source configuration in your copy of the [OpenSearch Dashboards configuration file]({{site.url}}{{site.baseurl}}https://github.com/opensearch-project/OpenSearch-Dashboards/blob/main/config/opensearch_dashboards.yml).
- **Saved Objects:** The Saved Objects tool helps you organize and manage your saved objects. Saved objects are files that store data, such as dashboards, visualizations, and maps, for later use.
- **[Saved Objects](https://opensearch.org/blog/enhancement-multiple-data-source-import-saved-object/):** The Saved Objects tool helps you organize and manage your saved objects. Saved objects are files that store data, such as dashboards, visualizations, and maps, for later use.
- **[Advanced Settings]({{site.url}}{{site.baseurl}}/dashboards/management/advanced-settings/):** The Advanced Settings tool gives you the flexibility to personalize the behavior of OpenSearch Dashboards. The tool is divided into settings sections, such as General, Accessibility, and Notifications, and you can use it to customize and optimize many of your Dashboards settings.
2 changes: 1 addition & 1 deletion _data-prepper/common-use-cases/anomaly-detection.md
@@ -2,7 +2,7 @@
layout: default
title: Anomaly detection
parent: Common use cases
nav_order: 30
nav_order: 5
---

# Anomaly detection
@@ -2,7 +2,7 @@
layout: default
title: Codec processor combinations
parent: Common use cases
nav_order: 25
nav_order: 10
---

# Codec processor combinations
2 changes: 1 addition & 1 deletion _data-prepper/common-use-cases/event-aggregation.md
@@ -2,7 +2,7 @@
layout: default
title: Event aggregation
parent: Common use cases
nav_order: 40
nav_order: 25
---

# Event aggregation
2 changes: 1 addition & 1 deletion _data-prepper/common-use-cases/log-analytics.md
@@ -2,7 +2,7 @@
layout: default
title: Log analytics
parent: Common use cases
nav_order: 10
nav_order: 30
---

# Log analytics
6 changes: 3 additions & 3 deletions _data-prepper/common-use-cases/log-enrichment.md
@@ -1,11 +1,11 @@
---
layout: default
title: Log enrichment with Data Prepper
title: Log enrichment
parent: Common use cases
nav_order: 50
nav_order: 35
---

# Log enrichment with Data Prepper
# Log enrichment

You can perform different types of log enrichment with Data Prepper, including:

2 changes: 1 addition & 1 deletion _data-prepper/common-use-cases/metrics-traces.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
layout: default
title: Deriving metrics from traces
parent: Common use cases
nav_order: 60
nav_order: 20
---

# Deriving metrics from traces
2 changes: 1 addition & 1 deletion _data-prepper/common-use-cases/s3-logs.md
@@ -2,7 +2,7 @@
layout: default
title: S3 logs
parent: Common use cases
nav_order: 20
nav_order: 40
---

# S3 logs
2 changes: 1 addition & 1 deletion _data-prepper/common-use-cases/text-processing.md
@@ -2,7 +2,7 @@
layout: default
title: Text processing
parent: Common use cases
nav_order: 35
nav_order: 55
---

# Text processing
6 changes: 3 additions & 3 deletions _data-prepper/common-use-cases/trace-analytics.md
@@ -2,7 +2,7 @@
layout: default
title: Trace analytics
parent: Common use cases
nav_order: 5
nav_order: 60
---

# Trace analytics
@@ -15,7 +15,7 @@ When using Data Prepper as a server-side component to collect trace data, you ca

The following flowchart illustrates the trace analytics workflow, from running OpenTelemetry Collector to using OpenSearch Dashboards for visualization.

<img src="{{site.url}}{{site.baseurl}}/images/data-prepper/trace-analytics/trace-analytics-components.jpg" alt="Trace analyticis component overview">{: .img-fluid}
<img src="{{site.url}}{{site.baseurl}}/images/data-prepper/trace-analytics/trace-analytics-components.jpg" alt="Trace analytics component overview">{: .img-fluid}

To monitor trace analytics, you need to set up the following components in your service environment:
- Add **instrumentation** to your application so it can generate telemetry data and send it to an OpenTelemetry collector.
@@ -322,7 +322,7 @@ For other configurations available for OpenSearch sinks, see [Data Prepper OpenS

## OpenTelemetry Collector

You need to run OpenTelemetry Collector in your service environment. Follow [Getting Started](https://opentelemetry.io/docs/collector/getting-started/#getting-started) to install an OpenTelemetry collector. Ensure that you configure the collector with an exporter configured for your Data Prepper instance. The following example `otel-collector-config.yaml` file receives data from various instrumentations and exports it to Data Prepper.
You need to run OpenTelemetry Collector in your service environment. Follow [Getting Started](https://opentelemetry.io/docs/collector/getting-started/#getting-started) to install an OpenTelemetry collector. Ensure that you configure the collector with an exporter configured for your Data Prepper instance. The following example `otel-collector-config.yaml` file receives data from various instrumentations and exports it to Data Prepper.

### Example otel-collector-config.yaml file

3 changes: 2 additions & 1 deletion _ingest-pipelines/processors/index-processors.md
@@ -30,7 +30,8 @@ Processor type | Description
:--- | :---
`append` | Adds one or more values to a field in a document.
`bytes` | Converts a human-readable byte value to its value in bytes.
`convert` | Changes the data type of a field in a document.
`convert` | Changes the data type of a field in a document.
`copy` | Copies an entire object in an existing field to another field.
`csv` | Extracts CSVs and stores them as individual fields in a document.
`date` | Parses dates from fields and then uses the date or timestamp as the timestamp for a document.
`date_index_name` | Indexes documents into time-based indexes based on a date or timestamp field in a document.
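To illustrate how table entries translate into a pipeline definition, the following sketch chains the `convert` processor with the newly added `copy` processor. The pipeline name and field names are hypothetical, and the `copy` parameter names shown are assumptions; check the processor's reference page for the exact parameters:

```json
PUT _ingest/pipeline/hypothetical-pipeline
{
  "description": "Sketch: convert a string price to a float, then copy it to a second field",
  "processors": [
    {
      "convert": {
        "field": "price",
        "type": "float"
      }
    },
    {
      "copy": {
        "source_field": "price",
        "target_field": "price_float"
      }
    }
  ]
}
```

A document indexed through this pipeline would end up with both `price` (as a float) and a duplicated `price_float` field.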
