Add images; rename tutorial
Signed-off-by: Tyler Ohlsen <[email protected]>
ohltyler committed Nov 19, 2024
1 parent 446e570 commit 80d2f77
Showing 27 changed files with 23 additions and 20 deletions.
Binary file added documentation/images/advanced-input-ingest.png
Binary file added documentation/images/advanced-output-ingest.png
Binary file added documentation/images/buttons.png
Binary file added documentation/images/edit-query-term.png
Binary file added documentation/images/enrich-data.png
Binary file added documentation/images/enrich-query-request.png
Binary file added documentation/images/enrich-query-results.png
Binary file added documentation/images/export-modal.png
Binary file added documentation/images/form.png
Binary file added documentation/images/import-data-populated.png
Binary file added documentation/images/import-data.png
Binary file added documentation/images/index-settings-updated.png
Binary file added documentation/images/index-settings.png
Binary file added documentation/images/input-config-ingest.png
Binary file added documentation/images/inspector.png
Binary file added documentation/images/ml-config-ingest.png
Binary file added documentation/images/output-config-ingest.png
Binary file added documentation/images/override-query.png
Binary file added documentation/images/presets-page.png
Binary file added documentation/images/search-response.png
Binary file added documentation/images/sidenav.png
Binary file added documentation/images/workspace.png
43 changes: 23 additions & 20 deletions documentation/tutorial.md → documentation/tutorial-11-18-2024.md
@@ -1,3 +1,5 @@
The following tutorial is an accurate representation of the experimental OpenSearch Flow OSD Plugin as of 11/18/2024.

# Overview

The OpenSearch Flow plugin on OpenSearch Dashboards (OSD) gives users the ability to iteratively build out search and ingest pipelines, initially focusing on ease-of-use for AI/ML-enhanced use cases via [ML inference processors](https://opensearch.org/docs/latest/ingest-pipelines/processors/ml-inference/). Behind the scenes, the plugin uses the [Flow Framework OpenSearch plugin](https://opensearch.org/docs/latest/automating-configurations/index/) for resource management for each use case / workflow a user creates. For example, most use cases involve configuring and creating indices, ingest pipelines, and search pipelines. All of these resources are created, updated, deleted, and maintained by the Flow Framework plugin. When users are satisfied with a use case they have built out, they can export the produced [Workflow Template](https://opensearch.org/docs/latest/automating-configurations/workflow-templates/) to re-create resources for their use cases across different clusters / data sources.
@@ -12,23 +14,23 @@ This plugin is not responsible for connector/model creation, this should be done

The "OpenSearch Flow" plugin will be under "Search" in the side navigation on OSD. Click to enter the plugin home page.

[[image:sidenav.png||height="240" width="94"]]
![sidenav](./images/sidenav.png)

## 3. Select your use case

Start by selecting a preset template for your particular use case. If you want to test out some basic use cases first, choose one of the preset templates; you can fill out some initial information, such as the model and some of the different input fields. This is all optional, but providing it helps auto-populate parts of the configuration. If you anticipate a more advanced or custom use case, choose "Custom", which provides a blank slate, letting you build out all of your configuration from scratch.

The below screenshots will illustrate a basic semantic search use case starting from scratch.

[[image:presets-page.png||height="166" width="332"]]
![presets-page](./images/presets-page.png)

## 4. Get familiar with the Workflow Details page

After selecting, you will enter the Workflow Details page. This page is broken down into 3 main sections:

1. The form. This is where you will spend most of your time, configuring your ingest and search pipelines. It is split into 2 main steps: first configuring your ingest flow, then your search flow. We will go into more detail on these later.

[[image:form.png||height="207" width="136"]]
![form](./images/form.png)

2. The preview workspace. This is a read-only workspace, provided as a visual helper to see how your data flows & is transformed across ingest & search. You can toggle to the JSON view to get more details on the underlying resource configurations as you build your flows out.

@@ -40,19 +42,19 @@ After selecting, you will enter the Workflow Details page. This page is broken d

4. Header buttons

These allow you to undo current changes, save your current form, export your workflow, or exit and return to the homepage. NOTE: depending on the OSD configuration ((% style="font-family:Courier New,Courier,monospace" %)useNewHomePage (%%)feature flag), these buttons may look different.
These allow you to undo current changes, save your current form, export your workflow, or exit and return to the homepage. NOTE: depending on the OSD configuration (the `useNewHomePage` feature flag), these buttons may look different.

[[image:buttons.png||height="42" width="202"]]
![buttons](./images/buttons.png)

## 5. Provide some sample data

Now we can begin building the use case! Let's start by providing some sample data in JSON array format. Three options are provided for your convenience: manual input, importing from a file, or taking sample data from an existing index. _Note: if you already have sample data and are only interested in adding search functionality, you can skip this step entirely by unchecking the "Enabled" checkbox, which lets you navigate directly to the search flow_.

For this example, we will manually input some sample data containing various clothing items.

[[image:import-data.png||height="210" width="258"]]
![import-data](./images/import-data.png)

==== [[image:import-data-populated.png||height="223" width="256"]] ====
![import-data-populated](./images/import-data-populated.png)
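
For reference, a minimal dataset in the expected JSON array format might look like the following sketch (the `item_text` field name and document contents are illustrative, matching the clothing-item example used throughout this tutorial):

```python
import json

# Illustrative sample documents for the manual-input option; the
# "item_text" field name matches the clothing-item example in this tutorial.
sample_docs = [
    {"item_text": "red running shoes with mesh upper"},
    {"item_text": "waterproof hiking boots"},
    {"item_text": "cotton crew-neck t-shirt"},
]

# The import box expects a JSON array, so serialize the list as-is.
payload = json.dumps(sample_docs, indent=2)
print(payload)
```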

## 6. Enrich your data

@@ -62,31 +64,32 @@ You can now enrich your data by building out an ingest pipeline & chaining toget

Continuing with the semantic search example, we will now select and configure an ML inference processor to embed the input text. We have a deployed Amazon Bedrock Titan text embedding model, which expects a single input called "inputText" and returns a single output called "embedding".

[[image:ml-config-ingest.png||height="214" width="268"]]
![ml-config-ingest](./images/ml-config-ingest.png)

This is where you can now flexibly configure your data via the "Inputs" and "Outputs" sections. "Inputs" allows you to select and transform your data to conform to the expected model inputs. "Outputs" allows you to select and transform your model outputs into new document fields. You can either select a document field from the dropdown, or perform a more detailed transformation using dot notation or [JSONPath](https://en.wikipedia.org/wiki/JSONPath). _(Behind the scenes, this configures the "input_map" and "output_map" settings for [ML inference ingest processors](https://opensearch.org/docs/latest/ingest-pipelines/processors/ml-inference/).)_

For this example, we can just select the "item_text" field to map to the "inputText" model input, and create a new document field called "my_embedding" to persist the generated embedding returned by the model:

[[image:input-config-ingest.png||height="92" width="378"]] [[image:output-config-ingest.png||height="98" width="379"]]
![input-config-ingest](./images/input-config-ingest.png)
![output-config-ingest](./images/output-config-ingest.png)
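
As a rough sketch, the mappings above correspond to an ingest pipeline definition along these lines (the `model_id` value is a placeholder for your own deployed model ID; the exact `input_map`/`output_map` shape is an assumption based on the ML inference processor documentation):

```python
# Sketch of the ingest pipeline the plugin generates for this step.
# "<your-deployed-model-id>" is a placeholder; the field names follow
# the mappings configured above.
ingest_pipeline = {
    "description": "Embed item_text with an ML inference processor",
    "processors": [
        {
            "ml_inference": {
                "model_id": "<your-deployed-model-id>",
                # Map the "item_text" document field to the model's
                # expected "inputText" input...
                "input_map": [{"inputText": "item_text"}],
                # ...and persist the model's "embedding" output as a new
                # "my_embedding" document field.
                "output_map": [{"my_embedding": "embedding"}],
            }
        }
    ],
}
```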

For a more detailed look into these transformations, and to verify that they will be valid, you can click the associated "Preview inputs/outputs" button on the right-hand side. "Preview inputs" shows how the input data (the source document) will be transformed. You can click "Fetch data" to fetch a sample document. There are helpful visual elements to determine whether the transformed input meets the model interface requirements. You can also view the explicit [JSON Schema](https://json-schema.org/) input interface by clicking "Input schema" on the right-hand side. The top "Define transform" section allows you to edit the transformation directly. You can cancel, or save and update the transformation after testing. "Preview outputs" is very similar: it allows you to fetch the model outputs (NOTE: this executes actual model inference and incurs costs, so run it with caution) and see how they are transformed. You can also view the explicit output interface by clicking the "Output schema" button.

The below images show how the transforms map the "item_text" field into an "inputText" field expected by the model, and how the "embedding" model output is saved as a new "my_embedding" field in the document.

[[image:advanced-input-ingest.png||height="257" width="259"]]
![advanced-input-ingest](./images/advanced-input-ingest.png)

[[image:advanced-output-ingest.png||height="266" width="259"]]
![advanced-output-ingest](./images/advanced-output-ingest.png)

## 7. Ingest data

Ensure your index configurations are up-to-date, and optionally enter an index name. For vector search use cases like this one, ensure any vector fields are mapped as such, with appropriate vector dimensions. Additionally, the index settings should mark this as a knn index. Note that for preset (non-"Custom") use cases, much of this will be automatically populated for your convenience.

[[image:index-settings-updated.png||height="225" width="258"]]
![index-settings-updated](./images/index-settings-updated.png)
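
A concrete sketch of such an index configuration is below. The index and field names follow this tutorial's example; the dimension value is a placeholder you must set to your embedding model's actual output size:

```python
# Sketch of index settings/mappings for the vector field configured above.
# Python True serializes to JSON true when the request body is built.
index_config = {
    "settings": {"index.knn": True},  # mark this as a knn index
    "mappings": {
        "properties": {
            "item_text": {"type": "text"},
            "my_embedding": {
                "type": "knn_vector",
                "dimension": 1536,  # placeholder; match your model's dimension
            },
        }
    },
}
```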

After configuring, click "Build and run ingestion". This will build out your index and ingest pipeline, and finally bulk ingest your sample documents. The OpenSearch response will be visible under the Inspector panel, along with any errors that occur.

[[image:build-and-run-ingestion-response.png||height="120" width="262"]]
![build-and-run-ingestion-response](./images/build-and-run-ingestion-response.png)
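
The bulk ingestion this step performs can be sketched as the standard `_bulk` NDJSON body below ("my-index" is a hypothetical index name, not one the plugin assigns):

```python
import json

# Sketch of the NDJSON body for the _bulk request issued by
# "Build and run ingestion"; "my-index" is a placeholder index name.
docs = [
    {"item_text": "red running shoes with mesh upper"},
    {"item_text": "waterproof hiking boots"},
]
lines = []
for doc in docs:
    # Each document gets an action line followed by its source line.
    lines.append(json.dumps({"index": {"_index": "my-index"}}))
    lines.append(json.dumps(doc))
bulk_body = "\n".join(lines) + "\n"  # _bulk bodies must end with a newline
print(bulk_body)
```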

You have now completed your ingest flow! Let's move on to configuring search by clicking the "Search pipeline >" button.

@@ -100,39 +103,39 @@ The query is the starting point for your search flow. Note the index is already

So, we will provide a basic term query with the input data to be vectorized here:

[[image:edit-query-term.png||height="207" width="263"]]
![edit-query-term](./images/edit-query-term.png)
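
The basic term query used here might look like this sketch, where "shoes" is the text that the search pipeline will later embed:

```python
# Sketch of the starting term query against the raw text field.
search_request = {
    "query": {
        "term": {
            "item_text": {"value": "shoes"}
        }
    }
}
```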

## 9. Enrich query request

Similar to Step 6 - Enrich data, this allows you to enrich the query request by configuring a series of processors - in this case, [search request processors](https://opensearch.org/docs/latest/search-plugins/search-pipelines/search-processors/#search-request-processors). Currently, only the ML inference processor is supported. Continuing with the semantic search example, we will configure an ML processor using the same Titan text embedding model. First, configure the input and output mappings to generate the vector, similar to what was done on the ingest side. Specifically, here we select the query value containing the text we want to embed, "shoes". And, we map the embedding to some field called "vector".

[[image:enrich-query-request.png||height="215" width="266"]]
![enrich-query-request](./images/enrich-query-request.png)
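
A sketch of the corresponding ML inference search request processor is below. This is an assumption about the generated configuration: the `model_id` is a placeholder, and the `input_map` value is assumed to be a dot-notation path into the incoming term query, while the `output_map` key names the "vector" variable referenced in the next step:

```python
# Sketch of the ml_inference *search request* processor for this step.
# "<your-deployed-model-id>" is a placeholder.
request_processor = {
    "ml_inference": {
        "model_id": "<your-deployed-model-id>",
        # Pull the query text ("shoes") out of the incoming term query...
        "input_map": [{"inputText": "query.term.item_text.value"}],
        # ...and expose the returned embedding under the name "vector".
        "output_map": [{"vector": "embedding"}],
    }
}
```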

Next, we need to update our query to use this generated vector embedding. Click "Override query" to open the modal. We can select a knn query preset to start.

[[image:override-query-with-placeholders.png||height="256" width="265"]]

From there, populate any placeholder values, such as "${vector_field}", with the associated vector field in your index - in this case, "my_embedding", which we configured on ingest. To use the vector produced by the model, you can see the list of available model outputs under "Model outputs". There is a utility copy button on the right-hand side to copy the template variable. Paste this variable anywhere in the query to dynamically inject the model output at runtime. In this example, "${vector}" has already been populated as the "vector" value for the knn query, so there is nothing left to do. The final query should contain no placeholders, aside from any model-output dynamic variables that will be populated at runtime.

[[image:override-query.png||height="259" width="266"]]
![override-query](./images/override-query.png)
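
After the placeholders are filled in, the final overridden query could look roughly like this sketch; `${vector}` is kept as a literal template variable that the search pipeline substitutes with the generated embedding at runtime (the `k` value is illustrative):

```python
# Sketch of the overridden knn query. "${vector}" stays a literal
# template variable; the search pipeline injects the real embedding
# at query time.
overridden_query = {
    "query": {
        "knn": {
            "my_embedding": {
                "vector": "${vector}",
                "k": 10,  # illustrative number of nearest neighbors
            }
        }
    }
}
```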

## 10. Enrich query results

Similar to Step 9 - Enrich query request, we can configure a series of [search response processors](https://opensearch.org/docs/latest/search-plugins/search-pipelines/search-processors/#search-response-processors) to enrich/transform the returned matching documents. For this particular example, this is not needed. _For more examples using search response processors, see "More examples" below, including RAG & reranking use cases which involve processing & manipulating the search response._

[[image:enrich-query-results.png||height="98" width="566"]]
![enrich-query-results](./images/enrich-query-results.png)

## 11. Execute search

We are finished configuring! Now click "Build and run query" to build out the search pipeline and execute the search request against the index. The final results will pop up in the "Inspector" panel. For this example, we see the top results pertaining to shoes.

[[image:search-response.png||height="248" width="201"]]
![search-response](./images/search-response.png)

## 12. Export workflow

If you are satisfied with the final workflow and the results it is producing, you can click the "Export" button in the header. This will open a modal, showing you the end-to-end [workflow template](https://opensearch.org/docs/latest/automating-configurations/workflow-templates/) containing all of the configuration details for your index, ingest pipeline, and search pipeline, as well as associated UI metadata (for example, certain things like the search request are not concrete resources - we persist them here for ease-of-use if importing this template on the UI). It can be copied in JSON or YAML format. Note: any cluster-specific IDs, such as model IDs, will need to be updated, if importing into a different cluster.

[[image:export-modal.png||height="289" width="204"]]
![export-modal](./images/export-modal.png)

And that's it! If you have followed all of these steps, you now have a successful semantic search use case, with all of the required resources bundled up into a single template. You can import this template on the UI and rebuild for different clusters, or execute directly using the [Flow Framework Provision API](https://opensearch.org/docs/latest/automating-configurations/api/provision-workflow/).

@@ -1306,7 +1309,7 @@ Optionally store the rescored result in the model output under a new field. You
],
```

Rerank processor config: under target_field, select the model score field - continuing with this example, we set it to (% style="font-family:Courier New,Courier,monospace" %)new_score(%%).
Rerank processor config: under target_field, select the model score field - continuing with this example, we set it to `new_score`.
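
As a sketch, the resulting rerank (by-field) response processor configuration might look like the following, assuming the model score was stored under `new_score` as above (the `remove_target_field` option is illustrative):

```python
# Sketch of a rerank-by-field search response processor.
# "new_score" matches the target field chosen in this example.
rerank_processor = {
    "rerank": {
        "by_field": {
            "target_field": "new_score",
            # Optionally drop the scoring field from hits after reranking.
            "remove_target_field": True,
        }
    }
}
```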

---
