From 12eac85bc87142d5f9b0f33bfa0f19c08498861d Mon Sep 17 00:00:00 2001
From: ron-unstructured <138828701+ron-unstructured@users.noreply.github.com>
Date: Wed, 4 Oct 2023 14:38:32 -0700
Subject: [PATCH] add embedding brick in TOC & fix sphinx warnings

---
 docs/source/bricks.rst                                 |  1 +
 docs/source/bricks/embedding.rst                       | 12 ++++++------
 .../azure_cognitive_search.rst                         |  6 ++++--
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/docs/source/bricks.rst b/docs/source/bricks.rst
index 71dbf337db..f09718f322 100644
--- a/docs/source/bricks.rst
+++ b/docs/source/bricks.rst
@@ -19,3 +19,4 @@ After reading this section, you should understand the following:
    bricks/extracting
    bricks/staging
    bricks/chunking
+   bricks/embedding
diff --git a/docs/source/bricks/embedding.rst b/docs/source/bricks/embedding.rst
index ef51c7c364..29481a4215 100644
--- a/docs/source/bricks/embedding.rst
+++ b/docs/source/bricks/embedding.rst
@@ -1,21 +1,21 @@
-########
+#########
 Embedding
-########
+#########

 EmbeddingEncoder classes in ``unstructured`` use document elements detected with ``partition`` or document elements grouped with ``chunking`` to obtain embeddings for each element, for uses cases such as Retrieval Augmented Generation (RAG).

-``BaseEmbeddingEncoder``
-------------------
+BaseEmbeddingEncoder
+--------------------

 The ``BaseEmbeddingEncoder`` is an abstract base class that defines the methods to be implemented for each ``EmbeddingEncoder`` subclass.

-``OpenAIEmbeddingEncoder``
-------------------
+OpenAIEmbeddingEncoder
+----------------------

 The ``OpenAIEmbeddingEncoder`` class uses langchain OpenAI integration under the hood to connect to the OpenAI Text&Embedding API to obtain embeddings for pieces of text.
diff --git a/docs/source/destination_connectors/azure_cognitive_search.rst b/docs/source/destination_connectors/azure_cognitive_search.rst
index 9cd0eb09db..e08a1d6b8b 100644
--- a/docs/source/destination_connectors/azure_cognitive_search.rst
+++ b/docs/source/destination_connectors/azure_cognitive_search.rst
@@ -1,5 +1,6 @@
 Azure Cognitive Search
-==========
+======================
+
 Batch process all your records using ``unstructured-ingest`` to store structured outputs locally on your filesystem and upload those local files to an Azure Cognitive Search index.

 First you'll need to install the azure cognitive search dependencies as shown here.
@@ -72,7 +73,8 @@ For a full list of the options the CLI accepts check ``unstructured-ingest `_.

 Sample Index Schema
------------
+-------------------
+
 To make sure the schema of the index matches the data being written to it, a sample schema json can be used:

 .. literalinclude:: azure_cognitive_sample_index_schema.json
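
Note on the heading fixes: most of the changes in this patch lengthen section adornments so that each underline (or overline) is at least as long as its title, which is the condition behind the Sphinx warnings named in the subject line. The check can be sketched in Python as below. This is a minimal illustration and not part of the patch; the helper names ``is_adornment`` and ``short_underlines`` are hypothetical, and the sketch assumes plain character counts and unindented titles, whereas docutils applies its own, more detailed rules.

```python
import string

# Assumption: treat any ASCII punctuation character as a possible adornment character.
ADORNMENT_CHARS = set(string.punctuation)


def is_adornment(line: str) -> bool:
    """Return True if the line is a run of one repeated punctuation character."""
    stripped = line.rstrip()
    return (
        len(stripped) >= 2
        and stripped[0] in ADORNMENT_CHARS
        and len(set(stripped)) == 1
    )


def short_underlines(text: str) -> list[tuple[int, str]]:
    """Return (1-based line number, title) for titles longer than their underline."""
    lines = text.splitlines()
    problems = []
    for i in range(len(lines) - 1):
        title = lines[i].rstrip()
        underline = lines[i + 1].rstrip()
        # Skip blank lines, adornment lines used as titles, and non-adornment followers.
        if not title or is_adornment(title) or not is_adornment(underline):
            continue
        if len(underline) < len(title):
            problems.append((i + 1, title))
    return problems
```

Run against the pre-patch text of ``embedding.rst``, a check like this would flag ``Embedding`` (9 characters over 8 ``#``) and the two double-backquoted encoder titles; against the post-patch headings it reports nothing.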