-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
enh(r) Add operators and punctuation #3195
Conversation
This change adds highlighting for operators and punctuation, and fixes the issues described in highlightjs#3194.
Does it really need to be in BOTH? Often spaces will be used (separating it from number) and we'll match (and color it) either way with the number rule.. so perhaps the relevancy boost can just handle the "on it's own" case... in which case it's just 2-3 lines of code much like exactly what we had before. Perhaps that would fix the auto-detect? |
I’ll try! |
Co-authored-by: Josh Goebel <[email protected]>
src/languages/r.js
Outdated
begin: '%', | ||
end: '%' | ||
scope: 'operator', | ||
match: /<-/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add the full assignment regex back or is there some reason this had to change? More explicit is better than less typically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous assignment regex was only used as a relevance boost, and only matches a small subset of all possible assignments, namely a subset of all legal identifiers, plus whitespace, plus assign, plus whitespace.
We can’t use this for highlighting since it misses many cases:
- Legal identifiers not included in the subset used here
- Lack of whitespace before or after the assignment
- Assignment to a complex expression rather than an identifier (e.g., at its simplest,
x[1] <- 2
)
I’m actually not sure why this regex was so overly specific even for relevance boosting: was it to avoid clashing with other languages? At any rate, assignment expressions in R can be (and routinely are, in actual code) quite complex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was it to avoid clashing with other languages?
Yes. The boost was intended to be narrow so as not to trigger false positives. If it only catches some common cases that's sufficient. It's not trying to be exhaustive. (relevance boosts often don't need to be to help)
Legal identifiers not included in the subset used here
If you wanted to use a proper IDENT vs SIMPLE_IDENT that would be an improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! |
@@ -1,26 +1,26 @@ | |||
<span class="hljs-comment"># Valid names</span> | |||
|
|||
a1_foo, A1_FOO, .foo_, ._foo, Bar.42, foo..1, ., ._, .., ..., ..1, <span class="hljs-built_in">c</span>, <span class="hljs-built_in">T</span>, <span class="hljs-built_in">F</span>, ._1 | |||
a1_foo<span class="hljs-punctuation">,</span> A1_FOO<span class="hljs-punctuation">,</span> .foo_<span class="hljs-punctuation">,</span> ._foo<span class="hljs-punctuation">,</span> Bar.42<span class="hljs-punctuation">,</span> foo..1<span class="hljs-punctuation">,</span> .<span class="hljs-punctuation">,</span> ._<span class="hljs-punctuation">,</span> ..<span class="hljs-punctuation">,</span> ...<span class="hljs-punctuation">,</span> ..1<span class="hljs-punctuation">,</span> <span class="hljs-built_in">c</span><span class="hljs-punctuation">,</span> <span class="hljs-built_in">T</span><span class="hljs-punctuation">,</span> <span class="hljs-built_in">F</span><span class="hljs-punctuation">,</span> ._1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Long-term I do worry this makes the tests much harder to read, but I don't have a quick solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it hasn’t been great during debugging. :-(
Maybe I should just replace the commas by spaces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe. That wouldn't help in actual code examples though... :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. You happy?
Thanks for the patient review! There were a lot of details that required attention. |
How does this kind of solution feel to the original look-behind problem? |
It’s not too bad but of course there’s still quite a bit of duplication here. I also keep racking my brain whether I’ve missed a situation in which numbers are preceded by something other than operators and whitespace (I don’t think so). |
I'm more interested in overall complexity than boilerplate. Feels like a pretty clean solution overall I think. Not perfect, but not ugly.
Problem for another day. :-) |
This change adds highlighting for operators and punctuation, and fixes #3194.
Changes
->
as illegalclassName
=>scope
)Unfortunately this introduces a slight regression, hence a draft PR: the auto-detection sample now fails for R, since
<-
is no longer used as a hint, instead it now has relevance 0 along with all other operators (without this, many languages incorrectly auto-detect as R, presumably since they don’t define operators).Is there a way of adding an auto-detection hint for
<-
assignment without adding yet another variant to the two separate definitions of operators (L161 and L202)?Also, is the formatting OK as it is? I’m no JS guy, and the linter I’m using is going slightly crazy, though it doesn’t seem entirely consistent in its complaints.
Checklist
CHANGES.md