Commit

add embedding brick in TOC & fix sphinx warnings
ron-unstructured committed Oct 4, 2023
1 parent c39be92 commit 12eac85
Showing 3 changed files with 11 additions and 8 deletions.
1 change: 1 addition & 0 deletions docs/source/bricks.rst
@@ -19,3 +19,4 @@ After reading this section, you should understand the following:
bricks/extracting
bricks/staging
bricks/chunking
+bricks/embedding
12 changes: 6 additions & 6 deletions docs/source/bricks/embedding.rst
@@ -1,21 +1,21 @@
-########
+#########
 Embedding
-########
+#########

EmbeddingEncoder classes in ``unstructured`` use document elements detected
with ``partition`` or document elements grouped with ``chunking`` to obtain
embeddings for each element, for uses cases such as Retrieval Augmented Generation (RAG).


-``BaseEmbeddingEncoder``
-------------------
+BaseEmbeddingEncoder
+--------------------

The ``BaseEmbeddingEncoder`` is an abstract base class that defines the methods to be implemented
for each ``EmbeddingEncoder`` subclass.


-``OpenAIEmbeddingEncoder``
-------------------
+OpenAIEmbeddingEncoder
+----------------------

The ``OpenAIEmbeddingEncoder`` class uses langchain OpenAI integration under the hood
to connect to the OpenAI Text&Embedding API to obtain embeddings for pieces of text.
6 changes: 4 additions & 2 deletions docs/source/destination_connectors/azure_cognitive_search.rst
@@ -1,5 +1,6 @@
Azure Cognitive Search
-==========
+======================
+
Batch process all your records using ``unstructured-ingest`` to store structured outputs locally on your filesystem and upload those local files to an Azure Cognitive Search index.

First you'll need to install the azure cognitive search dependencies as shown here.
@@ -72,7 +73,8 @@ For a full list of the options the CLI accepts check ``unstructured-ingest <upst
NOTE: Keep in mind that you will need to have all the appropriate extras and dependencies for the file types of the documents contained in your data storage platform if you're running this locally. You can find more information about this in the `installation guide <https://unstructured-io.github.io/unstructured/installing.html>`_.

Sample Index Schema
------------
+-------------------
+
To make sure the schema of the index matches the data being written to it, a sample schema json can be used:

.. literalinclude:: azure_cognitive_sample_index_schema.json