Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(markdown-widget): support arbitrary component order #5597

Merged
merged 14 commits into from
Aug 17, 2021

Conversation

smoores-dev
Copy link
Contributor

Fixes #5258

Reopen of #5259

Summary

Updates remarkShortcodes to handle custom editor components occurring in
any order in a given markdown field.

Previously, remarkShortcodes (the remark tokenizer plugin in the markdown widget) was relying on Map.find to attempt to match the registered editor components' patterns against the remaining uneaten markdown. Because Map.find returns after the first instance that returns true, some valid shortcode matches can be "skipped" by the plugin, and parsed as plain text instead.

This made it almost impossible to use multiple shortcodes in a single markdown field. At least one of them regularly gets "skipped" by the tokenizer.

This also updates the initial value returned by getEditorComponents() to correctly be an instance of Map, instead of an empty array.

Test plan

Unit Tests

I've updated the unit tests in the following ways:

  1. I'm now passing an instance of Map as the second argument to process, instead of an array, because that's what is actually passed in the code
  2. I've added a new unit test that covers this case, relying on an OrderedMap to force a specific coincidental state that would have triggered the bug in the previous code (I verified that this new test does in fact fail under the previous code)

Manual Inspection

I built this package locally and linked it into the project that I first identified this bug in, and verified that it resolved this issue and that I could not identify any new issues:

Before:
image

After:
image

A picture of a cute animal (not mandatory but encouraged)

Some very tired dogs, as tribute:

63794410038__8567FE98-D096-4B00-A216-F180FCFCD1F7

@smoores-dev smoores-dev requested a review from a team July 8, 2021 22:55
@erezrokah erezrokah added the type: bug code to address defects in shipped code label Jul 11, 2021
@smoores-dev
Copy link
Contributor Author

Hey @erezrokah! Thanks for fixing the linting error. Any other thoughts on this guy? Would love to get it merged!

@erezrokah
Copy link
Contributor

Hey @erezrokah! Thanks for fixing the linting error. Any other thoughts on this guy? Would love to get it merged!

Hi @smores, I've been wanting to figure why this doesn't break the image widget like the previous PR and wasn't able to do it yet.

Also, I haven't tested it thoroughly which I'd like to do as previously we broke the markdown widget for some users when doing similar changes.

@smoores-dev
Copy link
Contributor Author

Ok... I took another swing at this. Because we now care about the index in the document that a shortcode is found, it wasn't sufficient to fallback to searching individual lines to handle line anchors: if a shortcode was found in an individual line, it would basically always have a match.index of 0, meaning it would get sorted to the top and incorrectly supersede shortcodes that might have come first in the multiline search.

Instead, this goes back to modifying patterns as we get them, albeit in a more explicit way that I think accurately captures the intent. We now replace line anchors that might occur in a plugin pattern with a zero-width lookaround that accepts the line anchor or a newline, which I think captures the API that we want to support. For example:

/^<<my fancy shortcode>>$/ will be replaced with /(?<=^|\n)<<my fancy shortcode>>(?=$|\n)/

This will properly be found in any of the following documents:

<<my fancy shortcode>>
# This is a fancy shortcode

<<my fancy shortcode>>
# This is a fancy shortcode

<<my fancy shortcode>>

That's all

And will not match a document like this:

You can add a fancy shortcode by typing <<my fancy shortcode>>!

We keep intact the flags passed in by the consumer. This is safe; the lookaround replacement is a noop for a regex with a multiline flag, which is the only flag that changes the meaning of line anchor tokens.

@erezrokah would love to get your thoughts!

@stefanprobst
Copy link
Contributor

does it makes sense to maybe also add some tips/recommendations to the editor widget docs about pattern? what would have helped me when trying to figure this out:

  • the pattern is matched against the whole remaining string
  • try to avoid $
  • when needing to match across line breaks use something like [^]*?
  • example: /^<SideNote>\n([^]*?)\n<\/SideNote>/

@smoores-dev
Copy link
Contributor Author

I actually think it's fine to use $, but when you do, you should use m! But yes, generally I think some more guidance would be good. I can draft something!

@erezrokah
Copy link
Contributor

Ok... I took another swing at this. Because we now care about the index in the document that a shortcode is found, it wasn't sufficient to fallback to searching individual lines to handle line anchors: if a shortcode was found in an individual line, it would basically always have a match.index of 0, meaning it would get sorted to the top and incorrectly supersede shortcodes that might have come first in the multiline search.

Can we somehow account for that in the sort? Either calculate the correct index ourselves or give less weight to per line matches?

Instead, this goes back to modifying patterns as we get them, albeit in a more explicit way that I think accurately captures the intent. We now replace line anchors that might occur in a plugin pattern with a zero-width lookaround that accepts the line anchor or a newline, which I think captures the API that we want to support.

I would try to avoid modifying users' patterns as we might break those or create a behavior that's unexpected.

@smoores-dev
Copy link
Contributor Author

We can definitely account for the offsets in the the matching function, it's just more complex than I initially drafted. My thought here is mostly that this is already fairly hard-to-predict behavior, and that manipulating user's patterns might be the most robust approach. Like the proposed "single-line fallback match" approach doesn't match /^<MyComponent>(\s\S*?)<\/MyComponent>$/ if the content is multi-line, but does if it's only a single line, which feels confusing to me, but I guess more closely matches the current behavior (if we pass this pattern through the manipulation I wrote, we end up with /(?<=^|\n)<MyComponent>(\s\S*?)<\/MyComponent>(?=$|\n)/, which does match in both single and multi line contexts).

I can work on an approach that keeps track of line start positions, though; I do agree that it feels risky to manipulate patterns. Now that we've updated the documentation to note that folks should use the multiline flag for multiline contexts maybe all that matters is being narrowly backwards compatible.

@smoores-dev
Copy link
Contributor Author

Hey @erezrokah! Any further thoughts on this?

@erezrokah
Copy link
Contributor

Hey @erezrokah! Any further thoughts on this?

Sorry for the delay. This is fully on my end and I'll try to follow up this week.

Copy link
Contributor

@erezrokah erezrokah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay @smores.

This looks good. I'll do some manual tests and probably release on Monday to minimize the risk

@smoores-dev
Copy link
Contributor Author

Still planning on releasing this week @erezrokah?

@erezrokah
Copy link
Contributor

Still planning on releasing this week @erezrokah?

Yes! Dependencies updates took a bit longer than usual yesterday...

@erezrokah erezrokah merged commit fbfab7c into decaporg:master Aug 17, 2021
@erezrokah
Copy link
Contributor

Released in https://github.com/netlify/netlify-cms/releases/tag/netlify-cms%402.10.158, thanks @smores for you patience and follow ups.

@smoores-dev
Copy link
Contributor Author

Thanks for all your feedback and support @erezrokah!! This is great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug code to address defects in shipped code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Markdown widget doesn't parse shortcodes if they're "out of order"
3 participants