Incorrect emphasis handling #383

mity · 2021-05-12T06:07:03Z

(Distilled from https://talk.commonmark.org/t/i-dont-understand-how-emphasis-is-parsed/3866)

Input:

*****Hello*world****

Actual Output:

<p>*****Hello<em>world</em>***</p>

Expected Output:

<p>**<em><strong>Hello<em>world</em></strong></em></p>

More detailed rationale can be found in this comment: https://talk.commonmark.org/t/i-dont-understand-how-emphasis-is-parsed/3866/8

jgm · 2021-06-15T16:21:32Z

Reading the algorithm at the end of the spec, I think I see the issue. We have an openers_bottom table that limits how far back you have to look for an opener. It is indexed to the type of delimiter (_, *) and the length of the closing delimiter mod 3. So after we fail to match the opener ***** to *, we set the openers_bottom for (*, 1) to the location of *, effectively removing the ***** as a possible opener for any run of *s with a length mod 3 of 1, including the final **** in this example. This procedure ignores the fact that the length mod 3 thing only matters if one of the delimiters can be both an opener and a closer.

The problem arose as follows. The input was

```
*****Hello*world****
```

We have an `openers_bottom` table that limits how far back you have to look for an opener. It is indexed to the type of delimiter (`_` or `*`) and the length of the closing delimiter mod 3. So after we fail to match the opener `*****` to `*`, we set the `openers_bottom` for `(*, 1)` to the location of `*`, effectively removing the `*****` as a possible opener for any run of `*`s with a length mod 3 of 1, including the final `****` in this example. This procedure ignores the fact that the length mod 3 restriction only matters if one of the delimiters can be both an opener and a closer.

To fix this problem, we index the `openers_bottom` table not just to the type of delimiter and the length of the closing delimiter mod 3, but to whether the closing delimiter can also be an opener.
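The bucket collision described above can be illustrated with a minimal Python sketch. This is not cmark's code: `Run`, `can_match`, `old_bucket`, and `new_bucket` are hypothetical names introduced here, and the `can_open`/`can_close` flags are assigned by hand instead of being computed from the flanking rules. It only shows that the middle `*` and the final `****` share a key when the table is indexed by (delimiter, length mod 3), although only the middle `*` is subject to the mod 3 restriction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """A delimiter run; flags are hand-assigned for this example (assumption)."""
    char: str
    length: int
    can_open: bool
    can_close: bool


def can_match(opener: Run, closer: Run) -> bool:
    """Whether opener and closer may pair up, including the mod 3 restriction."""
    if opener.char != closer.char:
        return False
    if not opener.can_open or not closer.can_close:
        return False
    # The mod 3 restriction applies only if one run can both open and close.
    either_is_both = (opener.can_open and opener.can_close) or (
        closer.can_open and closer.can_close
    )
    if not either_is_both:
        return True
    total = opener.length + closer.length
    both_multiples = opener.length % 3 == 0 and closer.length % 3 == 0
    return total % 3 != 0 or both_multiples


def old_bucket(closer: Run) -> tuple:
    """Key used before the fix: delimiter type and closer length mod 3."""
    return (closer.char, closer.length % 3)


def new_bucket(closer: Run) -> tuple:
    """Key after the fix: also records whether the closer can be an opener."""
    return (closer.char, closer.can_open, closer.length % 3)


# The three runs in *****Hello*world****
opening = Run("*", 5, can_open=True, can_close=False)
middle = Run("*", 1, can_open=True, can_close=True)
final = Run("*", 4, can_open=False, can_close=True)

if __name__ == "__main__":
    print(can_match(opening, middle), can_match(opening, final))
    print(old_bucket(middle), old_bucket(final))
    print(new_bucket(middle), new_bucket(final))
```

With these flags, `can_match(opening, middle)` is false and `can_match(opening, final)` is true, yet both closers map to the old key `('*', 1)`, so a failed search for the middle `*` also hides `*****` from the final `****`. Adding `can_open` to the key puts the two closers in different buckets.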

0.2.1.1

* Fix bug in prettyShow for SourceRange (#80). The bug led to an infinite loop in certain cases.

0.2.1

* Use official 0.30 spec.txt.
* Update HTML block parser for recent spec changes.
* Fix test case from commonmark/cmark#383. We need to index the list of stack bottoms not just by the length mod 3 of the closer but by whether it can be an opener, since this goes into the calculation of whether the delimiters can match.

0.2

* Commonmark.Inlines: export LinkInfo(..) [API change].
* Commonmark.Inlines: export pLink [API change].
* Commonmark.ReferenceMap: Add linkPos field to LinkInfo [API change].
* Commonmark.Tokens: normalize unicode to NFC before tokenizing (#57). Normalization might affect detection of flankingness, recognition of reference links, etc.
* Commonmark.Html: add `data-` prefix to non-HTML5 attributes, as pandoc does.
* Remove unnecessary build-depends.
* Use lightweight tasty-bench instead of criterion for benchmarks.

jgm added a commit that referenced this issue Jun 15, 2021

Add failing regression test for #383.

cd26ccb

jgm added a commit that referenced this issue Jun 16, 2021

Fix regression test for #383

ba699f9

jgm closed this as completed in dc9366c Jun 16, 2021

jgm added a commit that referenced this issue Jun 17, 2021

A more elegant fix for #383.

ae7ead2

jgm added a commit to commonmark/commonmark-spec that referenced this issue Jun 17, 2021

Update description of emphasis parsing algorithm in spec.

a3ab74d

See commonmark/cmark#383.

jgm added a commit to commonmark/commonmark.js that referenced this issue Jun 17, 2021

Fix counterpart to commonmark/cmark#383.

10ed0d0

jgm added a commit that referenced this issue Jun 19, 2021

Revert "A more elegant fix for #383."

d5c3a34

This reverts commit ae7ead2.

jgm added a commit that referenced this issue Jun 19, 2021

Revert "Fix #383."

2d8a2f8

This reverts commit dc9366c.

jgm added a commit to commonmark/commonmark.js that referenced this issue Jun 19, 2021

Revert "Fix counterpart to commonmark/cmark#383."

79d7756

This reverts commit 10ed0d0.

snyk-bot mentioned this issue Jun 21, 2021

[Snyk] Upgrade commonmark from 0.29.3 to 0.30.0 jstransformers/jstransformer-commonmark#18

Open

rlidwka added a commit to markdown-it/markdown-it that referenced this issue Jun 30, 2021

Fix emphasis algorithm as per 0.30 spec

eed156e

commonmark/cmark#383

snyk-bot mentioned this issue Jul 16, 2021

[Snyk] Upgrade commonmark from 0.28.1 to 0.30.0 ekmixon/pptr.dev#3

Open

jgm mentioned this issue Nov 3, 2022

Deterministic table iteration order LuaJIT/LuaJIT#719

Closed

Randyblo7 mentioned this issue Aug 21, 2023

[Snyk] Upgrade commonmark from 0.27.0 to 0.30.0 Randyblo7/autorest#3

Open

mettle-priya mentioned this issue Dec 6, 2023

[Snyk] Upgrade commonmark from 0.29.2 to 0.30.0 eeveebank/graphql-voyager#6

Open
