Commit

Add soft break
benbrandt committed Feb 10, 2024
1 parent 59f7616 commit 5bd9c29
Showing 11 changed files with 420 additions and 353 deletions.
25 changes: 23 additions & 2 deletions src/unstable_markdown.rs
@@ -153,6 +153,8 @@ enum SemanticLevel {
     /// Split by [unicode sentences](https://www.unicode.org/reports/tr29/#Sentence_Boundaries)
     /// Falls back to [`Self::Word`]
     Sentence,
+    /// Single line break, which isn't necessarily a new element in Markdown
+    SoftBreak,
     /// thematic break/horizontal rule
     Rule,
 }
@@ -181,8 +183,16 @@ impl SemanticSplit for Markdown {
         let ranges = Parser::new(text)
             .into_offset_iter()
             .filter_map(|(event, range)| match event {
+                Event::Start(_)
+                | Event::End(_)
+                | Event::Text(_)
+                | Event::Code(_)
+                | Event::Html(_)
+                | Event::FootnoteReference(_)
+                | Event::HardBreak
+                | Event::TaskListMarker(_) => None,
+                Event::SoftBreak => Some((SemanticLevel::SoftBreak, range)),
                 Event::Rule => Some((SemanticLevel::Rule, range)),
-                _ => None,
             })
             .collect::<Vec<_>>();

@@ -228,7 +238,7 @@ impl SemanticSplit for Markdown {
             SemanticLevel::Sentence => text
                 .split_sentence_bound_indices()
                 .map(move |(i, str)| (offset + i, str)),
-            SemanticLevel::Rule => split_str_by_separator(
+            SemanticLevel::Rule | SemanticLevel::SoftBreak => split_str_by_separator(
                 text,
                 self.ranges_after_offset(offset, semantic_level)
                     .map(move |(_, sep)| sep.start - offset..sep.end - offset),
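
The `SemanticLevel::Rule | SemanticLevel::SoftBreak` arm above hands `split_str_by_separator` a set of separator ranges shifted to be relative to `offset`. The body of that function is not part of this diff, so the following is only a rough sketch of the general idea, under the assumption that splitting yields the text between separators as well as the separators themselves, each paired with its byte offset. The name `split_by_ranges` and its signature are hypothetical, not the crate's API.

```rust
use std::ops::Range;

/// Hypothetical helper (NOT the crate's `split_str_by_separator`):
/// splits `text` around the given separator byte ranges and returns every
/// non-empty piece, separators included, paired with its starting offset.
/// Assumes `separators` is sorted, non-overlapping, and on char boundaries.
fn split_by_ranges<'a>(text: &'a str, separators: &[Range<usize>]) -> Vec<(usize, &'a str)> {
    let mut pieces = Vec::new();
    let mut cursor = 0;
    for sep in separators {
        // Text between the previous separator (or the start) and this one.
        if cursor < sep.start {
            pieces.push((cursor, &text[cursor..sep.start]));
        }
        // The separator itself, kept so that no bytes are dropped.
        pieces.push((sep.start, &text[sep.clone()]));
        cursor = sep.end;
    }
    // Any text remaining after the last separator.
    if cursor < text.len() {
        pieces.push((cursor, &text[cursor..]));
    }
    pieces
}
```

Called with the text from the new `test_softbreak` test and its single soft-break range `9..10`, this sketch would return the piece before the newline, the newline itself, and the piece after it.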
@@ -429,6 +439,17 @@ mod tests {
         assert_eq!(SemanticLevel::Sentence, markdown.max_level());
     }
 
+    #[test]
+    fn test_softbreak() {
+        let markdown = Markdown::new("Some text\nwith a softbreak");
+
+        assert_eq!(
+            vec![&(SemanticLevel::SoftBreak, 9..10)],
+            markdown.ranges().collect::<Vec<_>>()
+        );
+        assert_eq!(SemanticLevel::SoftBreak, markdown.max_level());
+    }
+
     #[test]
     fn test_with_rule() {
         let markdown = Markdown::new("Some text\n\n---\n\nwith a rule");
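
To see where the `9..10` range in `test_softbreak` comes from, it can help to locate the newline by hand. The sketch below is a deliberately crude stand-in for the parser, not what the commit does: it treats any single `\n` with non-newline bytes on both sides as a soft-break candidate, which is an assumption that ignores code blocks, headings, and other Markdown structure. The function name `lone_newline_ranges` is hypothetical.

```rust
use std::ops::Range;

/// Hypothetical approximation (NOT a Markdown parser): returns the byte
/// range of every `\n` that has a non-newline byte directly before and
/// directly after it. Blank-line paragraph breaks (`\n\n`) are skipped.
fn lone_newline_ranges(text: &str) -> Vec<Range<usize>> {
    let bytes = text.as_bytes();
    bytes
        .iter()
        .enumerate()
        .filter(|&(i, &b)| {
            b == b'\n'
                && i > 0
                && bytes[i - 1] != b'\n'
                && i + 1 < bytes.len()
                && bytes[i + 1] != b'\n'
        })
        .map(|(i, _)| i..i + 1)
        .collect()
}
```

For the two test inputs shown in this hunk, the sketch finds one candidate at `9..10` in the soft-break text and none in the rule text, where every newline belongs to a blank-line pair.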
@@ -18,22 +18,22 @@ input_file: tests/inputs/markdown/markdown_basics.md
 - "are written in a before/after style, showing example syntax and the\n"
 - "HTML output produced by Markdown.\n\n"
 - "It's also helpful to simply try Markdown out; the [Dingus] [d] is a\n"
-- "web application that allows you type your own Markdown-formatted text\nand translate it to XHTML.\n\n"
-- "**Note:** This document is itself written using Markdown; you\n"
+- "web application that allows you type your own Markdown-formatted text\n"
+- "and translate it to XHTML.\n\n**Note:** This document is itself written using Markdown; you\n"
 - "can [see the source for it by adding '.text' to the URL] [src].\n\n"
 - " [s]: /projects/markdown/syntax \"Markdown Syntax\"\n"
 - " [d]: /projects/markdown/dingus \"Markdown Dingus\"\n [src]: /projects/markdown/basics.text\n\n\n"
 - "## Paragraphs, Headers, Blockquotes ##\n\n"
-- "A paragraph is simply one or more consecutive lines of text, separated\nby one or more blank lines. "
-- "(A blank line is any line that looks like\n"
-- "a blank line -- a line containing nothing but spaces or tabs is\nconsidered blank.) "
-- "Normal paragraphs should not be indented with\nspaces or tabs.\n\n"
-- "Markdown offers two styles of headers: *Setext* and *atx*.\n"
+- "A paragraph is simply one or more consecutive lines of text, separated\n"
+- "by one or more blank lines. (A blank line is any line that looks like\n"
+- "a blank line -- a line containing nothing but spaces or tabs is\n"
+- "considered blank.) Normal paragraphs should not be indented with\n"
+- "spaces or tabs.\n\nMarkdown offers two styles of headers: *Setext* and *atx*.\n"
 - "Setext-style headers for `<h1>` and `<h2>` are created by\n"
 - "\"underlining\" with equal signs (`=`) and hyphens (`-`), respectively.\n"
 - "To create an atx-style header, you put 1-6 hash marks (`#`) at the\n"
-- "beginning of the line -- the number of hashes equals the resulting\nHTML header level.\n\n"
-- "Blockquotes are indicated using email-style '`>`' angle brackets.\n\nMarkdown:\n\n"
+- "beginning of the line -- the number of hashes equals the resulting\n"
+- "HTML header level.\n\nBlockquotes are indicated using email-style '`>`' angle brackets.\n\nMarkdown:\n\n"
 - " A First Level Header\n ====================\n\n A Second Level Header\n"
 - " ---------------------\n\n Now is the time for all good men to come to\n"
 - " the aid of their country. This is just a\n regular paragraph.\n\n"
@@ -54,21 +54,24 @@ input_file: tests/inputs/markdown/markdown_basics.md
 - " Some of these words <em>are emphasized also</em>.</p>\n\n"
 - " <p>Use two asterisks for <strong>strong emphasis</strong>.\n"
 - " Or, if you prefer, <strong>use two underscores instead</strong>.</p>\n\n\n\n## Lists ##\n\n"
-- "Unordered (bulleted) lists use asterisks, pluses, and hyphens (`*`,\n`+`, and `-`) as list markers. "
-- "These three markers are\ninterchangable; this:\n\n * Candy.\n * Gum.\n * Booze.\n\nthis:\n\n"
-- " + Candy.\n + Gum.\n + Booze.\n\nand this:\n\n - Candy.\n - Gum.\n - Booze.\n\n"
+- "Unordered (bulleted) lists use asterisks, pluses, and hyphens (`*`,\n"
+- "`+`, and `-`) as list markers. These three markers are\n"
+- "interchangable; this:\n\n * Candy.\n * Gum.\n * Booze.\n\nthis:\n\n + Candy.\n"
+- " + Gum.\n + Booze.\n\nand this:\n\n - Candy.\n - Gum.\n - Booze.\n\n"
 - "all produce the same output:\n\n <ul>\n <li>Candy.</li>\n <li>Gum.</li>\n <li>Booze.</li>\n"
-- " </ul>\n\nOrdered (numbered) lists use regular numbers, followed by periods, as\nlist markers:\n\n"
-- " 1. Red\n 2. Green\n 3. Blue\n\nOutput:\n\n <ol>\n <li>Red</li>\n <li>Green</li>\n"
-- " <li>Blue</li>\n </ol>\n\nIf you put blank lines between items, you'll get `<p>` tags for the\n"
+- " </ul>\n\nOrdered (numbered) lists use regular numbers, followed by periods, as\n"
+- "list markers:\n\n 1. Red\n 2. Green\n 3. Blue\n\nOutput:\n\n <ol>\n <li>Red</li>\n"
+- " <li>Green</li>\n <li>Blue</li>\n </ol>\n\n"
+- "If you put blank lines between items, you'll get `<p>` tags for the\n"
 - "list item text. You can create multi-paragraph list items by indenting\n"
 - "the paragraphs by 4 spaces or 1 tab:\n\n * A list item.\n\n With multiple paragraphs.\n\n"
 - " * Another item in the list.\n\nOutput:\n\n <ul>\n <li><p>A list item.</p>\n"
 - " <p>With multiple paragraphs.</p></li>\n <li><p>Another item in the list.</p></li>\n </ul>\n\n\n"
-- "\n### Links ###\n\nMarkdown supports two styles for creating links: *inline* and\n*reference*. "
-- "With both styles, you use square brackets to delimit the\ntext you want to turn into a link.\n\n"
-- "Inline-style links use parentheses immediately after the link text.\nFor example:\n\n"
-- " This is an [example link](http://example.com/).\n\nOutput:\n\n"
+- "\n### Links ###\n\nMarkdown supports two styles for creating links: *inline* and\n"
+- "*reference*. With both styles, you use square brackets to delimit the\n"
+- "text you want to turn into a link.\n\n"
+- "Inline-style links use parentheses immediately after the link text.\n"
+- "For example:\n\n This is an [example link](http://example.com/).\n\nOutput:\n\n"
 - " <p>This is an <a href=\"http://example.com/\">\n example link</a>.</p>\n\n"
 - "Optionally, you may include a title attribute in the parentheses:\n\n"
 - " This is an [example link](http://example.com/ \"With a Title\").\n\nOutput:\n\n"
@@ -81,17 +84,17 @@ input_file: tests/inputs/markdown/markdown_basics.md
 - " title=\"Google\">Google</a> than from <a href=\"http://search.yahoo.com/\"\n"
 - " title=\"Yahoo Search\">Yahoo</a> or <a href=\"http://search.msn.com/\"\n"
 - " title=\"MSN Search\">MSN</a>.</p>\n\nThe title attribute is optional. "
-- "Link names may contain letters,\nnumbers and spaces, but are *not* case sensitive:\n\n"
-- " I start my morning with a cup of coffee and\n [The New York Times][NY Times].\n\n"
-- " [ny times]: http://www.nytimes.com/\n\nOutput:\n\n"
+- "Link names may contain letters,\n"
+- "numbers and spaces, but are *not* case sensitive:\n\n I start my morning with a cup of coffee and\n"
+- " [The New York Times][NY Times].\n\n [ny times]: http://www.nytimes.com/\n\nOutput:\n\n"
 - " <p>I start my morning with a cup of coffee and\n"
 - " <a href=\"http://www.nytimes.com/\">The New York Times</a>.</p>\n\n\n### Images ###\n\n"
 - "Image syntax is very much like link syntax.\n\nInline (titles are optional):\n\n !["
 - "alt text](/path/to/img.jpg \"Title\")\n\nReference-style:\n\n ![alt text][id]\n\n"
 - " [id]: /path/to/img.jpg \"Title\"\n\nBoth of the above examples produce the same output:\n\n"
 - " <img src=\"/path/to/img.jpg\" alt=\"alt text\" title=\"Title\" />\n\n\n\n### Code ###\n\n"
-- "In a regular paragraph, you can create code span by wrapping text in\nbacktick quotes. "
-- "Any ampersands (`&`) and angle brackets (`<` or\n"
+- "In a regular paragraph, you can create code span by wrapping text in\n"
+- "backtick quotes. Any ampersands (`&`) and angle brackets (`<` or\n"
 - "`>`) will automatically be translated into HTML entities. This makes\n"
 - "it easy to use Markdown to write about HTML example code:\n\n"
 - " I strongly recommend against using any `<blink>` tags.\n\n"