Update Metadata and Installation Documentation #1646

Merged 20 commits on Oct 5, 2023.
1 change: 1 addition & 0 deletions docs/source/bricks.rst
@@ -19,3 +19,4 @@ After reading this section, you should understand the following:
bricks/extracting
bricks/staging
bricks/chunking
bricks/embedding
10 changes: 5 additions & 5 deletions docs/source/bricks/embedding.rst
@@ -1,21 +1,21 @@
-########
#########
Embedding
-########
#########

-EmbeddingEncoder classes in ``unstructured`` use document elements detected
Embedding encoder classes in ``unstructured`` use document elements detected
with ``partition`` or document elements grouped with ``chunking`` to obtain
embeddings for each element, for use cases such as Retrieval Augmented Generation (RAG).


``BaseEmbeddingEncoder``
-------------------
------------------------

The ``BaseEmbeddingEncoder`` is an abstract base class that defines the methods to be implemented
for each ``EmbeddingEncoder`` subclass.


``OpenAIEmbeddingEncoder``
-------------------
--------------------------

The ``OpenAIEmbeddingEncoder`` class uses the LangChain OpenAI integration under the hood
to connect to the OpenAI embeddings API and obtain embeddings for pieces of text.
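The abstract-base-class design described above can be sketched in a few lines. This is a minimal, self-contained illustration of the pattern, not the library's actual code; the method names ``embed_query`` and ``embed_documents`` and the toy subclass are assumptions made for the sketch:

```python
from abc import ABC, abstractmethod
from typing import List


class BaseEmbeddingEncoder(ABC):
    """Sketch of an abstract embedding encoder: subclasses supply the model."""

    @abstractmethod
    def embed_query(self, query: str) -> List[float]:
        """Return the embedding vector for a single piece of text."""

    @abstractmethod
    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """Return one embedding vector per input text."""


class DummyEmbeddingEncoder(BaseEmbeddingEncoder):
    """Toy subclass that 'embeds' text as [length, vowel count]."""

    def embed_query(self, query: str) -> List[float]:
        vowels = sum(ch in "aeiou" for ch in query.lower())
        return [float(len(query)), float(vowels)]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self.embed_query(t) for t in texts]


encoder = DummyEmbeddingEncoder()
vectors = encoder.embed_documents(["hello", "world"])  # one vector per element
```

A real subclass would replace the toy arithmetic with a call to an embedding model, keeping the same interface so callers can swap encoders freely.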
6 changes: 4 additions & 2 deletions docs/source/destination_connectors/azure_cognitive_search.rst
@@ -1,5 +1,6 @@
Azure Cognitive Search
-==========
======================

Batch process all your records using ``unstructured-ingest`` to store structured outputs locally on your filesystem and upload those local files to an Azure Cognitive Search index.

First, you'll need to install the Azure Cognitive Search dependencies as shown below.
@@ -72,7 +73,8 @@ For a full list of the options the CLI accepts, check ``unstructured-ingest <upst
NOTE: Keep in mind that you will need to have all the appropriate extras and dependencies for the file types of the documents contained in your data storage platform if you're running this locally. You can find more information about this in the `installation guide <https://unstructured-io.github.io/unstructured/installing.html>`_.

Sample Index Schema
------------
-------------------

To make sure the schema of the index matches the data being written to it, the following sample schema JSON can be used:

.. literalinclude:: azure_cognitive_sample_index_schema.json
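For orientation, an index schema of the kind that file contains might look like the sketch below. This is an illustrative fragment following Azure Cognitive Search's index definition format, not the bundled sample file; the index and field names are assumptions:

```json
{
  "name": "my-index",
  "fields": [
    { "name": "id", "type": "Edm.String", "key": true },
    { "name": "text", "type": "Edm.String", "searchable": true },
    { "name": "embeddings", "type": "Collection(Edm.Single)", "searchable": false }
  ]
}
```

Exactly one ``Edm.String`` field must be marked ``"key": true``; it uniquely identifies each document written to the index.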
44 changes: 36 additions & 8 deletions docs/source/installation/full_installation.rst
@@ -1,28 +1,45 @@
.. role:: raw-html(raw)
:format: html

Full Installation
=================

-1. **Installing Extras for Specific Document Types**:
-If you're processing document types beyond the basics, you can install the necessary extras:
**Basic Usage**

For a complete set of extras catering to every document type, use:

.. code-block:: bash

pip install "unstructured[all-docs]"

**Installation for Specific Document Types**

If you're processing document types beyond the basics, you can install the necessary extras:

.. code-block:: bash

pip install "unstructured[docx,pptx]"

-For a complete set of extras catering to every document type, use:
*Available document types:*

.. code-block:: bash

-pip install "unstructured[all-docs]"
"csv", "doc", "docx", "epub", "image", "md", "msg", "odt", "org", "pdf", "ppt", "pptx", "rtf", "rst", "tsv", "xlsx"

-2. **Note on Older Versions**:
-For versions earlier than `unstructured<0.9.0`, the following installation pattern was recommended:
:raw-html:`<br />`
**Installation for Specific Data Connectors**

To use any of the data connectors, you must install the specific dependency:

.. code-block:: bash

-pip install "unstructured[local-inference]"
pip install "unstructured[s3]"

-While "local-inference" remains supported in newer versions for backward compatibility, it might be deprecated in future releases. It's advisable to transition to the "all-docs" extra for comprehensive support.
*Available data connectors:*

.. code-block:: bash

"airtable", "azure", "azure-cognitive-search", "biomed", "box", "confluence", "delta-table", "discord", "dropbox", "elasticsearch", "gcs", "github", "gitlab", "google-drive", "jira", "notion", "onedrive", "outlook", "reddit", "s3", "sharepoint", "salesforce", "slack", "wikipedia"

Installation with ``conda`` on Windows
--------------------------------------
@@ -155,3 +172,14 @@ library. This is not included as an ``unstructured`` dependency because it only
to some tokenizers. See the
`sentencepiece install instructions <https://github.com/google/sentencepiece#installation>`_ for
information on how to install ``sentencepiece`` if your tokenizer requires it.

Note on Older Versions
----------------------
For ``unstructured`` versions earlier than 0.9.0, the following installation pattern was recommended:

.. code-block:: bash

pip install "unstructured[local-inference]"

While "local-inference" remains supported in newer versions for backward compatibility, it might be deprecated in future releases. It's advisable to transition to the "all-docs" extra for comprehensive support.

49 changes: 35 additions & 14 deletions docs/source/introduction/getting_started.rst
@@ -101,20 +101,41 @@ Document elements
When we partition a document, the output is a list of document ``Element`` objects.
These element objects represent different components of the source document. Currently, the ``unstructured`` library supports the following element types:

-* ``Element``
-* ``Text``
-* ``FigureCaption``
-* ``NarrativeText``
-* ``ListItem``
-* ``Title``
-* ``Address``
-* ``Table``
-* ``PageBreak``
-* ``Header``
-* ``Footer``
-* ``EmailAddress``
-* ``CheckBox``
-* ``Image``
**Elements**
^^^^^^^^^^^^

* ``type``

  * ``FigureCaption``
  * ``NarrativeText``
  * ``ListItem``
  * ``Title``
  * ``Address``
  * ``Table``
  * ``PageBreak``
  * ``Header``
  * ``Footer``
  * ``UncategorizedText``
  * ``Image``
  * ``Formula``

* ``element_id``

* ``metadata`` - see: :ref:`Metadata page <metadata-label>`

* ``text``


Additional element types may be added in the future.
Different partitioning functions use different methods for determining the element type and extracting the associated content.
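The element structure above can be illustrated with a small self-contained sketch. A plain dataclass stands in for the library's element classes here; the field names mirror the attribute list above, while everything else (the sample values, the filtering step) is illustrative:

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Element:
    """Sketch of a document element: a type tag, an id, text, and metadata."""
    type: str
    element_id: str
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


# A toy "partition output": a list of elements of different types.
elements = [
    Element("Title", "e1", "Getting Started"),
    Element("NarrativeText", "e2",
            "Partitioning splits a document into elements.",
            metadata={"page_number": 1}),
]

# Filter by element type, as you might when post-processing partition output.
titles = [el.text for el in elements if el.type == "Title"]
```

Working with partition output typically follows this shape: iterate the element list, branch on ``type``, and read ``text`` and ``metadata`` as needed.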