Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YAML Highlight] # inside strings is treated as comment #1075

Closed
mjrussell opened this issue Jan 4, 2017 · 8 comments
Closed

[YAML Highlight] # inside strings is treated as comment #1075

mjrussell opened this issue Jan 4, 2017 · 8 comments

Comments

@mjrussell
Copy link

In YAML, the # for comments can appear inside strings which are enclosed with " or '

Github formatting:

# comment
foo: "#bar"
foo: "bar"

Prism:
image

I understand it is because https://github.com/PrismJS/prism/blob/gh-pages/components/prism-yaml.js#L7 is too greedy. But Im not sure how to change it. Would look behind for " or ' be sufficient?

@mjrussell
Copy link
Author

Ah looks like the key is to have greedy: true on the string. Will submit a PR

@mjrussell
Copy link
Author

Actually its pretty tricky to support all these edge cases so not going to submit a PR yet.

The current behavior supports multi line strings, but Im having trouble keeping that functionality while enabling comments inside of strings. Hoping someone more experienced with Prism might have some insights

@Golmote
Copy link
Contributor

Golmote commented Jan 4, 2017

@mjrussell: thanks for reporting.
greedy: true on the string pattern seems to be the right solution here, indeed. Could you provide the problematic edge case you tested, please?

(Poke @hason: you might want to check this issue)

@mjrussell
Copy link
Author

mjrussell commented Jan 4, 2017

Mutli line's with comments i.e.:

mutli: "mutli line
  with # inside" # and out

will fail because the regex (probably not new behavior actually with greedy):
/([:\-,[{]\s*(![^\s]+)?[ \t]*)("(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*')(?=[ \t]*($|,|]|}))/m

Expects the last capture group $|,|]|} at the end in order to be treated as a string. Seems like it needs one more alternative to eat whitespace until another comment (but let it be parsed as a comment)

@Golmote
Copy link
Contributor

Golmote commented Feb 7, 2017

As you can see in my commit above, the issue with # inside strings should be solved. That is, foo: "#bar" should get highlighted properly.

But we are still facing two issues here.

The first one, related to YAML, is the abusive use of that group $|,|]|} everywhere. I'm pretty sure there is a better way to handle those highlightings. (If we fix it, YAML would end up with the same behaviour as the other components, descibed below.)

The second one is a flaw in the current greedy behaviour, which can be reproduced with nearly every other component:
JS:
https://puu.sh/tSVId/a394453b85.png
https://puu.sh/tSVKc/cd5b6e06ec.png

CSS:
https://puu.sh/tSVLM/8bea2e8705.png
https://puu.sh/tSVNn/ba73a79f5b.png

even LOLCODE:
https://puu.sh/tSVRB/9bf1245395.png
https://puu.sh/tSVSP/0dff84caa9.png

... and probably nearly all other components...

The greedy string, when processed, invalidates the previously matched comment (the large one, beginning at the middle of the string) but does not force it to be reconsidered. So even the smaller comment, the one that actually is to be highligted, is ignored...

Obviously, the greedy flag, how it is currently implemented, won't allow that.

Poke @LeaVerou, @zeitgeist87: Should I open an issue for this? Will we go on that road again? Are we willing to improve the greedy behaviour?

@zeitgeist87
Copy link
Collaborator

Hi @Golmote,

Thanks for your great work. I've quickly thrown together a fix for this bug #1095. It seems to work for your examples, and it passes all tests in our test suite. I haven't tested it further though.

The PR refactors the tokenize() method, so that a subset of it can be called recursively. That way the greedy flag can cleanup after itself. The diff looks worse than it is, because I had to move a few things around.

@mjrussell
Copy link
Author

Awesome to see some progress on this! Agree I had many issues with greedy behavior before so excited to see how @zeitgeist87's tweaks work

@zeitgeist87
Copy link
Collaborator

This should be fixed by #1095

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants