Commit

Apply suggestions from code review
Co-authored-by: qued <[email protected]>
Co-authored-by: shreyanid <[email protected]>
3 people authored Oct 4, 2023
1 parent ccef105 commit 441c7be
Showing 1 changed file with 5 additions and 5 deletions.
10 changes: 5 additions & 5 deletions docs/source/metadata.rst
@@ -32,7 +32,7 @@ the source file:
+-----------------------------+----------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``filetype`` | File Type | |
+-----------------------------+----------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-| ``type`` | Element Type | Categorizes elements into types such as Title, NarrativeText. Not a metadata field |
+| ``type`` | Element Type | Categorizes elements into types such as Title, NarrativeText. Not a metadata field. |
+-----------------------------+----------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``coordinates`` | XY Bounding Box Coordinates | |
+-----------------------------+----------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
@@ -44,18 +44,18 @@ the source file:
+-----------------------------+----------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``text_as_html`` | HTML representation of extracted tables | Only applicable to ``Table`` Elements |
+-----------------------------+----------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-| ``languages`` | Document Languages | At document level or element level |
+| ``languages`` | Document Languages | At document level or element level. List is ordered by probability of being the primary language of the text. |
+-----------------------------+----------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``emphasized_text_contents``| Emphasized text (bold or italic) in the original document| |
+-----------------------------+----------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``emphasized_text_tags`` | Tags on text that is emphasized in the original document | |
+-----------------------------+----------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-| ``num_characters`` | The number of characters used | Used for chunking |
+| ``num_characters`` | The number of characters used | Used for chunking. |
| | for max_characters in add_chunking_strategy | |
+-----------------------------+----------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``is_continuation`` | True if element is a continuation of a previous element | Only relevant for chunking, if an element was divided into two due to ``max_characters`` |
+-----------------------------+----------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-| ``detection_class_prob`` | Detection Model Class Probabilities | From unstructured-inference, hi-res strategy |
+| ``detection_class_prob`` | Detection model class probabilities | From unstructured-inference, hi-res strategy. |
+-----------------------------+----------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

:raw-html:`<br />`
@@ -110,7 +110,7 @@ Additional Metadata Fields by Document Type
###########################################

+-------------------------+---------------------+--------------------------------------------------------+
-| ``Field Name`` | Applicable Doc Types| Short Description |
+| Field Name | Applicable Doc Types| Short Description |
+=========================+=====================+========================================================+
| ``page_number`` | DOCX,PDF, PPT,XLSX | Page Number |
+-------------------------+---------------------+--------------------------------------------------------+
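
The revised descriptions of ``languages`` and ``is_continuation`` can be illustrated with a minimal sketch. It assumes elements are available as plain dictionaries with ``type`` at the top level and the other fields under ``metadata``. The sample records, the language codes, and the helper names ``primary_language`` and ``merge_continuations`` are hypothetical and are not part of the library. Joining continuation text with a single space is also an assumption.

```python
# Hypothetical element records; field names come from the tables in this diff,
# the values (including the language codes) are made up for illustration.
elements = [
    {"type": "Title", "text": "Rapport annuel",
     "metadata": {"languages": ["fra", "eng"]}},
    {"type": "NarrativeText", "text": "First half of a long paragraph",
     "metadata": {"languages": ["eng"]}},
    {"type": "NarrativeText", "text": "second half of the same paragraph",
     "metadata": {"languages": ["eng"], "is_continuation": True}},
]


def primary_language(element):
    """Return the first entry of ``languages`` (the most probable one), or None."""
    languages = element.get("metadata", {}).get("languages") or []
    return languages[0] if languages else None


def merge_continuations(elements):
    """Rejoin the text of elements flagged ``is_continuation`` with the previous text."""
    merged = []
    for element in elements:
        is_continuation = element.get("metadata", {}).get("is_continuation", False)
        if is_continuation and merged:
            # Assumption: a single space is an acceptable separator when rejoining.
            merged[-1] = merged[-1] + " " + element["text"]
        else:
            merged.append(element["text"])
    return merged


if __name__ == "__main__":
    print(primary_language(elements[0]))
    print(merge_continuations(elements))
```

Because the list is ordered by probability, taking the first entry is enough to pick the likely primary language; no sorting is needed.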