Commit
Update documentation to remove extra level for text in markdown
benbrandt committed Mar 25, 2024
1 parent fb61aae commit abf7777
Showing 5 changed files with 35 additions and 40 deletions.
15 changes: 7 additions & 8 deletions README.md
@@ -164,14 +164,13 @@ Markdown is parsed according to the CommonMark spec, along with some optional fe
 3. [Unicode Word Boundaries](https://www.unicode.org/reports/tr29/#Word_Boundaries)
 4. [Unicode Sentence Boundaries](https://www.unicode.org/reports/tr29/#Sentence_Boundaries)
 5. Soft line breaks (single newline) which isn't necessarily a new element in Markdown.
-6. Text nodes within elements
-7. Inline elements such as: emphasis, strong, strikethrough, link, image, table cells, inline code, footnote references, task list markers, and inline html.
-8. Block elements suce as: paragraphs, code blocks, and footnote definitions.
-9. Container blocks such as: table rows, block quotes, list items, and HTML blocks.
-10. Meta containers such as: lists and tables.
-11. Thematic breaks or horizontal rules.
-12. Headings by level
-13. Metadata at the beginning of the document
+6. Inline elements such as: text nodes, emphasis, strong, strikethrough, link, image, table cells, inline code, footnote references, task list markers, and inline html.
+7. Block elements suce as: paragraphs, code blocks, and footnote definitions.
+8. Container blocks such as: table rows, block quotes, list items, and HTML blocks.
+9. Meta containers such as: lists and tables.
+10. Thematic breaks or horizontal rules.
+11. Headings by level
+12. Metadata at the beginning of the document

 Splitting doesn't occur below the character level, otherwise you could get partial bytes of a char, which may not be a valid unicode str.

15 changes: 7 additions & 8 deletions bindings/python/README.md
@@ -121,14 +121,13 @@ Markdown is parsed according to the CommonMark spec, along with some optional fe
 3. [Unicode Word Boundaries](https://www.unicode.org/reports/tr29/#Word_Boundaries)
 4. [Unicode Sentence Boundaries](https://www.unicode.org/reports/tr29/#Sentence_Boundaries)
 5. Soft line breaks (single newline) which isn't necessarily a new element in Markdown.
-6. Text nodes within elements
-7. Inline elements such as: emphasis, strong, strikethrough, link, image, table cells, inline code, footnote references, task list markers, and inline html.
-8. Block elements suce as: paragraphs, code blocks, and footnote definitions.
-9. Container blocks such as: table rows, block quotes, list items, and HTML blocks.
-10. Meta containers such as: lists and tables.
-11. Thematic breaks or horizontal rules.
-12. Headings by level
-13. Metadata at the beginning of the document
+6. Inline elements such as: text nodes, emphasis, strong, strikethrough, link, image, table cells, inline code, footnote references, task list markers, and inline html.
+7. Block elements suce as: paragraphs, code blocks, and footnote definitions.
+8. Container blocks such as: table rows, block quotes, list items, and HTML blocks.
+9. Meta containers such as: lists and tables.
+10. Thematic breaks or horizontal rules.
+11. Headings by level
+12. Metadata at the beginning of the document

 Splitting doesn't occur below the character level, otherwise you could get partial bytes of a char, which may not be a valid unicode str.

15 changes: 7 additions & 8 deletions bindings/python/semantic_text_splitter.pyi
@@ -392,14 +392,13 @@ class MarkdownSplitter:
 3. [Unicode Word Boundaries](https://www.unicode.org/reports/tr29/#Word_Boundaries)
 4. [Unicode Sentence Boundaries](https://www.unicode.org/reports/tr29/#Sentence_Boundaries)
 5. Soft line breaks (single newline) which isn't necessarily a new element in Markdown.
-6. Text nodes within elements
-7. Inline elements such as: emphasis, strong, strikethrough, link, image, table cells, inline code, footnote references, task list markers, and inline html.
-8. Block elements suce as: paragraphs, code blocks, and footnote definitions.
-9. Container blocks such as: table rows, block quotes, list items, and HTML blocks.
-10. Meta containers such as: lists and tables.
-11. Thematic breaks or horizontal rules.
-12. Headings by level
-13. Metadata at the beginning of the document
+6. Inline elements such as: text nodes, emphasis, strong, strikethrough, link, image, table cells, inline code, footnote references, task list markers, and inline html.
+7. Block elements suce as: paragraphs, code blocks, and footnote definitions.
+8. Container blocks such as: table rows, block quotes, list items, and HTML blocks.
+9. Meta containers such as: lists and tables.
+10. Thematic breaks or horizontal rules.
+11. Headings by level
+12. Metadata at the beginning of the document
 Markdown is parsed according to the Commonmark spec, along with some optional features such as GitHub Flavored Markdown.
15 changes: 7 additions & 8 deletions bindings/python/src/lib.rs
@@ -632,14 +632,13 @@ impl PyMarkdownSplitter {
 3. [Unicode Word Boundaries](https://www.unicode.org/reports/tr29/#Word_Boundaries)
 4. [Unicode Sentence Boundaries](https://www.unicode.org/reports/tr29/#Sentence_Boundaries)
 5. Soft line breaks (single newline) which isn't necessarily a new element in Markdown.
-6. Text nodes within elements
-7. Inline elements such as: emphasis, strong, strikethrough, link, image, table cells, inline code, footnote references, task list markers, and inline html.
-8. Block elements suce as: paragraphs, code blocks, and footnote definitions.
-9. Container blocks such as: table rows, block quotes, list items, and HTML blocks.
-10. Meta containers such as: lists and tables.
-11. Thematic breaks or horizontal rules.
-12. Headings by level
-13. Metadata at the beginning of the document
+6. Inline elements such as: text nodes, emphasis, strong, strikethrough, link, image, table cells, inline code, footnote references, task list markers, and inline html.
+7. Block elements suce as: paragraphs, code blocks, and footnote definitions.
+8. Container blocks such as: table rows, block quotes, list items, and HTML blocks.
+9. Meta containers such as: lists and tables.
+10. Thematic breaks or horizontal rules.
+11. Headings by level
+12. Metadata at the beginning of the document
 Markdown is parsed according to the Commonmark spec, along with some optional features such as GitHub Flavored Markdown.
15 changes: 7 additions & 8 deletions src/markdown.rs
@@ -93,14 +93,13 @@ where
 /// 3. [Unicode Word Boundaries](https://www.unicode.org/reports/tr29/#Word_Boundaries)
 /// 4. [Unicode Sentence Boundaries](https://www.unicode.org/reports/tr29/#Sentence_Boundaries)
 /// 5. Soft line breaks (single newline) which isn't necessarily a new element in Markdown.
-/// 6. Text nodes within elements
-/// 7. Inline elements such as: emphasis, strong, strikethrough, link, image, table cells, inline code, footnote references, task list markers, and inline html.
-/// 8. Block elements suce as: paragraphs, code blocks, and footnote definitions.
-/// 9. Container blocks such as: table rows, block quotes, list items, and HTML blocks.
-/// 10. Meta containers such as: lists and tables.
-/// 11. Thematic breaks or horizontal rules.
-/// 12. Headings by level
-/// 13. Metadata at the beginning of the document
+/// 6. Inline elements such as: text nodes, emphasis, strong, strikethrough, link, image, table cells, inline code, footnote references, task list markers, and inline html.
+/// 7. Block elements suce as: paragraphs, code blocks, and footnote definitions.
+/// 8. Container blocks such as: table rows, block quotes, list items, and HTML blocks.
+/// 9. Meta containers such as: lists and tables.
+/// 10. Thematic breaks or horizontal rules.
+/// 11. Headings by level
+/// 12. Metadata at the beginning of the document
 ///
 /// Splitting doesn't occur below the character level, otherwise you could get partial bytes of a char, which may not be a valid unicode str.
 ///
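
Note on the unchanged "Splitting doesn't occur below the character level" sentence in the hunks above: the sketch below illustrates, using only the Python standard library, why cutting inside a multi-byte character can produce text that no longer decodes. It does not call this library; the sample string and the helper name `is_valid_utf8` are assumptions made up for the illustration.

```python
# Illustration only: standard-library Python, not this project's API.
# The sample text and helper name are hypothetical.

def is_valid_utf8(chunk: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        chunk.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


text = "h\u00e9llo"          # "héllo": U+00E9 takes two bytes in UTF-8
data = text.encode("utf-8")  # b"h\xc3\xa9llo"

# Cutting at byte offset 2 lands in the middle of the two-byte character,
# so neither half is valid UTF-8 on its own.
byte_split = (data[:2], data[2:])

# Cutting at character offset 2 keeps every character whole,
# so both halves still encode and decode cleanly.
char_split = (text[:2].encode("utf-8"), text[2:].encode("utf-8"))

print([is_valid_utf8(part) for part in byte_split])
print([is_valid_utf8(part) for part in char_split])
```

The character level is therefore the lowest level at which every chunk is guaranteed to remain a valid string, which is consistent with the fallback order listed in the documentation.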