Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incremental line-by-line highlighting breaks some languages #2259

Closed
davidben opened this issue Nov 7, 2019 · 23 comments · Fixed by #2274
Closed

Incremental line-by-line highlighting breaks some languages #2259

davidben opened this issue Nov 7, 2019 · 23 comments · Fixed by #2274

Comments

@davidben
Copy link
Contributor

davidben commented Nov 7, 2019

highlight.js supports highlighting text incrementally via the continuation parameter to the highlight function. Gerrit uses this to highlight line-by-line. However, this causes some features, notably C++ raw string matching, to break. See the discussion here.

(Ironically, raw string matching partially worked in #1825 and then I broke it again for us when I tried to make it more featureful in #1897.)

I changed the tests to feed all the markup samples line-by-line to demonstrate the issue. Things mostly work, but when a begin or end element crosses a line boundary, things break.
master...davidben:line-by-line-tests

This suggests that the strategy in #1897 was wrong and probably C++ raw strings should have separate begin and end units. But endSameAsBegin doesn't seem quite flexible enough, unless there's some trick I'm missing. One thought is begin matches R"<blah>( and then we have a endFromBegin attribute which takes a callback and will compute )<blah>" for the end pattern given a begin.

But it's not entirely clear what the general API contract here is. Supposing incremental highlight is a thing highlight.js wants to support (I assume yes, given the continuation parameter), it seems there's basically no hope in allowing a begin or end unit to span an input boundary. The API needs to return something for the text that was passed in, so you can't lookahead across chunks. That means highlight.js needs to have opinions about what input boundaries need to be supported by languages (One call per character obviously cannot do anything useful. Line-by-line seems reasonable?), or we change the incremental highlight API altogether.

Thoughts?

@joshgoebel
Copy link
Member

But endSameAsBegin doesn't seem quite flexible enough, unless there's some trick I'm missing. One thought is begin matches R"( and then we have a endFromBegin attribute which takes a callback and will compute )" for the end pattern given a begin.

Currently you can do this by nesting, but yes it's annoying. You might also see my work on Elixir:

https://github.com/highlightjs/highlight.js/pull/2257/files

It doesn't use endSameAsBegin, but same concept with first matching on the prefix and then the child matchers can focus on just the begin/end syntax, not the prefix.


continuation really is NOT meant as a way to do line-by-line processing. This would be better handled by a post-highlight or post-parse plugin as mentioned over in the plugin thread.

it seems there's basically no hope in allowing a begin or end unit to span an input boundary. The API needs to return something for the text that was passed in, so you can't lookahead across chunks.

I'm not sure I completely follow, I will follow your link and see if it's helpful. I've found the "outside" edge of the continuation stuff to be more robust than I expected... ie it remembers the state the parser is in re-opens those spans and continues the processing, even after handling a sublanguage in the middle... but again I'm speaking for what continuation were INTENDED for... they weren't intended for line-by-line processing.

@joshgoebel
Copy link
Member

  // When marking up an input line-by-line, elements which span multiple lines
  // will be divided up into several spans. For instance, a multi-line C++
  // comment may be marked up as:
  //
  //   <span class="hljs-comment">/*</span>
  //   <span class="hljs-comment">Hello</span>
  //   <span class="hljs-comment">*/</span>

This looks like the behavior I've come to expect with continuations.

@joshgoebel
Copy link
Member

Perhaps it'd be useful to show a "failing case" and what happens and what you would hope to happen, etc...

@joshgoebel
Copy link
Member

I'd suggest you take a look at what's going on here:

#1086

I think doing anything line-by-line would be done by walking the tree and dealing with \n inside text nodes however you saw fit... Doing something simple like wrapping every line in it's own span would probably be pretty trivial one that foundation.

There has also has been mention of a line-by-line callback especially, but honestly I'd probably rather see this first done by a plugin... no reason plugins couldn't extend the callback system or even have their own.

@davido
Copy link
Contributor

davido commented Nov 8, 2019

@joshgoebel
Copy link
Member

joshgoebel commented Nov 8, 2019

Commented on that thread just to clear up the fact that we do NOT have a line by line mode. In fact the original author was actually pretty against any type of "line number" feature and such things. And it's still mentioned in our docs today, etc... not that I think we're still THAT hostile to it today, but definitely sounds like a good use for a plugin.

We just need better hooks to make these type of things easier to do.

@davidben
Copy link
Contributor Author

davidben commented Nov 8, 2019

Ah, interesting. Hopefully one of the Gerrit devs will respond either here or on the Gerrit bug you commented on. (I don't actually work on Gerrit. I work on Chromium and BoringSSL, which use Gerrit, and highlight.js was easy enough to hack on, so I'd been poking at the highlighting bugs I was running into. :-) )

You mentioned in the Gerrit bug that the continuation feature is primarily for internal use. Should it be removed from the documentation? The documentation reads to me like feeding in chunks of partial data is supposed to work.

It doesn't use endSameAsBegin, but same concept with first matching on the prefix and then the child matchers can focus on just the begin/end syntax, not the prefix.

Hrm. I toyed with that a bit but it seemed there'd be issues with C++'s raw strings ending in )<delimiter>", so merely looking for <delimiter> might end the string early.

Of course, if line-by-line isn't supposed to work anyway, that's no longer a reason to change the current C++ definition. That said, hljs currently won't match R"(half a raw string, whereas it will match /* half a block comment, so perhaps there's something to be said for changing the C++ mode anyway?

This looks like the behavior I've come to expect with continuations.

It seems the issue may have since been clarified, but in case there's still confusion, yeah that behavior makes sense. The "issue" is the lineByLineExceptions list of tests that don't work even after accounting for it. (I say "issue" in quotes because it sounds like line-by-line mode isn't meant to work anyway.)

@joshgoebel
Copy link
Member

You mentioned in the Gerrit bug that the continuation feature is primarily for internal use. Should it be removed from the documentation? The documentation reads to me like feeding in chunks of partial data is supposed to work.

I'd actually be for that if @egor-rogov had no objections. It really has all sorts of caveats and really it's more of a limited thing to work with sublanguages. I was not aware we were encouraging its broader use. Either way I'll still say: Not really intended for line to line.

Of course, if line-by-line isn't supposed to work anyway, that's no longer a reason to change the current C++ definition.

Continuation (in the core library) is intended to support subLanguages (pretty sure that's the only reason it's there, but I'm open to being corrected)... and personally I don't think we should REQUIRE that any languages parse in a certain way if they naturally find it beneficial to parse in larger chunks. IE, we shouldn't do things that limits peoples choices in writing grammars without VERY good reasons to do so.

@joshgoebel
Copy link
Member

That said, hljs currently won't match R"(half a raw string, whereas it will match /* half a block comment, so perhaps there's something to be said for changing the C++ mode anyway?

You mean when parsing tiny bits at a time only, yes?

@davidben
Copy link
Contributor Author

davidben commented Nov 8, 2019

Nah, maybe the file is truncated or the author forgot to close the string. Usually syntax highlighters will highlight the unfinished thing.

@joshgoebel
Copy link
Member

Ah, now that might be worth thinking about. That would be one case where doing the full match would not turn out so well...

@egor-rogov
Copy link
Collaborator

You mentioned in the Gerrit bug that the continuation feature is primarily for internal use. Should it be removed from the documentation? The documentation reads to me like feeding in chunks of partial data is supposed to work.

I'd actually be for that if @egor-rogov had no objections.

I'd rather not remove it ('cause it is a part of the API), but explicitly explain that this is intended for internal use and may not behave as one might expect.

@davidben
Copy link
Contributor Author

davidben commented Nov 8, 2019

Ah, now that might be worth thinking about. That would be one case where doing the full match would not turn out so well...

What are your thoughts on something like:

endCallback: function(begin) {
  // Compute ')<delimiter>"' from 'R"<delimiter>('
},

Returning a match object for the begin regex would be smoother, but probably tricky given how the regexes get all joined up.

@joshgoebel
Copy link
Member

What are your thoughts on something like:

I'm receptive to the idea, though it'd probably be a more generic match callback that fires on match or perhaps start/finish... then you could do what you wanted. Ie, hook "start" to generate end on the fly, this would allow the current endSameAsBegin to be built with only a plug-in, as it's a much simpler version of what you're suggesting.

Although it's actually not that simple because the begin/end matchers are all compiled together and are not modified at run-time. endSameAsBegin currently only works because it's a MORE SPECIFIC match of the same type... IE, end (a copy of begin) will always match, but then we double check to see if the more specific end match was triggered.

Otherwise we might have to recursively recompile the regexs every time the regexes were dynamically changed... not sure what type of performance implications that would have.

Returning a match object for the begin regex would be smoother, but probably tricky given how the regexes get all joined up.

Huh? Not sure I follow? Accessing the invividual capture groups isn't that hard... but I'm not sure exactly what you're suggesting with this comment.

@davidben
Copy link
Contributor Author

davidben commented Nov 8, 2019

Although it's actually not that simple because the begin/end matchers are all compiled together and are not modified at run-time. endSameAsBegin currently only works because it's a MORE SPECIFIC match of the same type... IE, end (a copy of begin) will always match, but then we double check to see if the more specific end match was triggered.

Ah, interesting. Okay, I think I can do something with that. I'll try my hand at this and we'll see if you like the result.

Huh? Not sure I follow? Accessing the invividual capture groups isn't that hard... but I'm not sure exactly what you're suggesting with this comment.

I mean that you could imagine the function(begin) { ... } above being passed in a regex match object rather than the string. My recollection was that, since you combine all the regexes into one jumbo regex, the match groups are all shifted and such. Or maybe I'm entirely confused and misremembering. Anyway, not important, I'll poke around and see what's cleanest to do.

@joshgoebel
Copy link
Member

I mean that you could imagine the function(begin) { ... } above being passed in a regex match object rather than the string.

Yeah, that'd make sense... you'd have to pass it a reference to mode also or give it some API to effect things if that was the idea. Then you also run into the issue that modes can be recursive so do the parser handle the call stack or force that problem onto the user, etc... may not be a problem for C string, but I'm always thinking of the full scope of features.

My recollection was that, since you combine all the regexes into one jumbo regex, the match groups are all shifted and such.

Yes, but we keep track of the match offsets (or else we'd have know way to know which rule matched)... so if we had to "rewrite" them back into a sequential array starting at 1 that wouldn't be that hard to do.

@joshgoebel
Copy link
Member

Not to mention the security implications of something like this. :-) Right now grammars are technically code, but are only evaluated once, and what you see is what you get - reasonably easy to audit. They can't really be altered or exploited by specific input (other than bad regexes perhaps going into infinite loops)... so something like this would add a whole other potential vector for attacks.

davidben added a commit to davidben/highlight.js that referenced this issue Nov 8, 2019
PR highlightjs#1897 switched C++ raw strings to use backreferences, however this
breaks souce files where raw strings are truncated. Like comments, it
would be preferable to highlight them.

Instead, go back to using separate begin and end regexps, but introduce
an endFilter feature to filter out false positive matches. This
internally works similarly to endSameAsBegin.

See also issue highlightjs#2259.
davidben added a commit to davidben/highlight.js that referenced this issue Nov 8, 2019
PR highlightjs#1897 switched C++ raw strings to use backreferences, however this
breaks souce files where raw strings are truncated. Like comments, it
would be preferable to highlight them.

Instead, go back to using separate begin and end regexps, but introduce
an endFilter feature to filter out false positive matches. This
internally works similarly to endSameAsBegin.

See also issue highlightjs#2259.
@joshgoebel
Copy link
Member

joshgoebel commented Nov 12, 2019

We can continue the discussion on the PR as I'm interested in more dynamic grammars in general - and if we figure that out we can then decide where it might and might not make sense to use such dynamism in existing grammars. But I just want to be clear this grammar (cpp, and also the others you flag) are NOT considered broken or buggy because your custom line-by-line parsing code fails to work with them - continuations are not intended to be used in this fashion - and this issue is the reason why. :-)

If this type of CPP string really doesn't contain any escape sequences then the simplest way of handling it is probably to just match it with a single regex and take advantage of the group matching capabilities, as was done in #1897.

Adding docs clarifying that continuation is NOT intended for be using in the fashion you're using it:

#2274

I simply don't think it's reasonable to force grammas to always care about WHERE line breaks are at... when it's easier for a grammar to match against line breaks, we're not opposed to them doing so.

@joshgoebel
Copy link
Member

joshgoebel commented Nov 12, 2019

Closing this issue as resolved - existing behavior is expected behavior. More than happy to continue the discussion on a BETTER and more reliable way to do line by line though, just not with continuations.

@joshgoebel
Copy link
Member

joshgoebel commented Nov 12, 2019

@davidben Have you followed the discussion here:

#1086

Given a raw parse tree (of nodes), wouldn't writing a line-by-line outputter be a few lines of code? You merely have to track the current node "stack" (which you have to do for a normal render anyways) and each time you see a new line in text you output the closing tags... then the newline, then re-open the tags, then output the text portion that comes AFTER the new line.

Am I missing something?

This would be the "correct" way to do line-by-line stuff off the top of my head.

@joshgoebel
Copy link
Member

joshgoebel commented Nov 12, 2019

Also, if you're target is the browser you can apply the line stuff BEFORE the rendering also... when using highlightBlock, HTML is preserved and merged seamlessly... so if you wanted to do something like:

<pre><code id="zeroed">
<span class="line1">actual code</span>
<span class="line2">goes here</span>
</code></pre>

<script>
highlightBlock(document.querySelector('pre code#zecode'))
</script>

Etc... and it should "just work".

@joshgoebel
Copy link
Member

I think a lot of people make this type of thing WAY TOO complex though... I'd consider just having two entirely separately code blocks inside the pre (or a similar div grouping) and having one be the line numbers and the other be the source... no need for the parser to care about lines at all... though I'm not sure what your use case is exactly.

joshgoebel pushed a commit to joshgoebel/highlight.js that referenced this issue Mar 2, 2020
PR highlightjs#1897 switched C++ raw strings to use backreferences, however this
breaks souce files where raw strings are truncated. Like comments, it
would be preferable to highlight them.

Instead, go back to using separate begin and end regexps, but introduce
an endFilter feature to filter out false positive matches. This
internally works similarly to endSameAsBegin.

See also issue highlightjs#2259.
joshgoebel pushed a commit to davidben/highlight.js that referenced this issue Mar 12, 2020
PR highlightjs#1897 switched C++ raw strings to use backreferences, however this
breaks souce files where raw strings are truncated. Like comments, it
would be preferable to highlight them.

Instead, go back to using separate begin and end regexps, but introduce
an endFilter feature to filter out false positive matches. This
internally works similarly to endSameAsBegin.

See also issue highlightjs#2259.
joshgoebel pushed a commit to joshgoebel/highlight.js that referenced this issue Apr 27, 2020
PR highlightjs#1897 switched C++ raw strings to use backreferences, however this
breaks souce files where raw strings are truncated. Like comments, it
would be preferable to highlight them.

Instead, go back to using separate begin and end regexps, but introduce
an endFilter feature to filter out false positive matches. This
internally works similarly to endSameAsBegin.

See also issue highlightjs#2259.
joshgoebel pushed a commit to davidben/highlight.js that referenced this issue Apr 27, 2020
PR highlightjs#1897 switched C++ raw strings to use backreferences, however this
breaks souce files where raw strings are truncated. Like comments, it
would be preferable to highlight them.

- Add `on:begin` and `on:end` to allow more granular matching when
  then end match is dynamic and based on a part of the begin match
- This deprecates the `endSameAsBegin` attribute. That attribute was
  a very specific way to solve this problem, but now we have a much
  more general solution in these added callbacks.

Also related: highlightjs#2259.
joshgoebel added a commit to davidben/highlight.js that referenced this issue Apr 27, 2020
PR highlightjs#1897 switched C++ raw strings to use backreferences, however this
breaks souce files where raw strings are truncated. Like comments, it
would be preferable to highlight them.

- Add `on:begin` and `on:end` to allow more granular matching when
  then end match is dynamic and based on a part of the begin match
- This deprecates the `endSameAsBegin` attribute. That attribute was
  a very specific way to solve this problem, but now we have a much
  more general solution in these added callbacks.

Also related: highlightjs#2259.

Co-authored-by: Josh Goebel <[email protected]>
@joshgoebel
Copy link
Member

Given a raw parse tree (of nodes), wouldn't writing a line-by-line outputter be a few lines of code? You merely have to track the current node "stack" (which you have to do for a normal render anyways) and each time you see a new line in text you output the closing tags... then the newline, then re-open the tags, then output the text portion that comes AFTER the new line.

Now that parse tree support is in 10.0 someone really should try doing it this way and see how much simpler it is. :-)

joshgoebel added a commit that referenced this issue Apr 27, 2020
PR #1897 switched C++ raw strings to use backreferences, however this
breaks souce files where raw strings are truncated. Like comments, it
would be preferable to highlight them.

- Add `on:begin` and `on:end` to allow more granular matching when
  then end match is dynamic and based on a part of the begin match
- This deprecates the `endSameAsBegin` attribute. That attribute was
  a very specific way to solve this problem, but now we have a much
  more general solution in these added callbacks.

Also related: #2259.

Co-authored-by: Josh Goebel <[email protected]>
0xflotus added a commit to 0xflotus/highlight.js that referenced this issue Jun 12, 2020
* (docs) Add Chaos to supported languages (highlightjs#2510)

* fix(parser) fixes sublanguage with no rule matches (highlightjs#2506)

* fix(parser) fixes sublanguage with no rule matches

Resolves highlightjs#2504.

* (chore) Add ESLint config and clean up the major stuff (highlightjs#2503)

* (chore) eslint:recommended
* (chore): eslint_standard
* relax eslint rules for language grammars (to discourage rewriting them in one fell swoop; I'd rather have the blame history intact)
* remove extra escaping
* clean up variables
* more camelcase

* (docs) Add Visual Basic for Applications (VBA) to supported languages (highlightjs#2512)

* (yaml) improve tag support; add verbatim tags (highlightjs#2487)

* YAML parse non-word characters as part of tags
* adds support for verbatim tags

Co-authored-by: Josh Goebel <[email protected]>

* fix(javascript/typescript): lambda with parens in parameters fails (highlightjs#2502)

* fix(javascript/typescript): lambda with parens in parameters fails

- Fixes both JavaScript and TypeScript grammars

Fixes samples like:

    const bad = ((a, b) => [...a, b]);
    sides.every((length,width=(3+2+(4/5))) => length > 0 );

This is done by counting parens in the regex that finds arrows
functions. Currently we can only handle 2 levels of nesting as
shown in the second example above.

* allow much richer highlighting inside params
* improve highlighting inside arguments on typescript

* enh(cpp): Improve highlighting of unterminated raw strings

PR highlightjs#1897 switched C++ raw strings to use backreferences, however this
breaks souce files where raw strings are truncated. Like comments, it
would be preferable to highlight them.

- Add `on:begin` and `on:end` to allow more granular matching when
  then end match is dynamic and based on a part of the begin match
- This deprecates the `endSameAsBegin` attribute. That attribute was
  a very specific way to solve this problem, but now we have a much
  more general solution in these added callbacks.

Also related: highlightjs#2259.

Co-authored-by: Josh Goebel <[email protected]>

* (chore) C-like uses the new END_SAME_AS_BEGIN mode

* (chore) Ruby uses END_SAME_AS_BEGIN mode/rule

* (parser) make END_SAME_AS_BEGIN a function helper

Adds a mode helper to replace the deprecated `endSameAsBegin` attribute.

The first match group from the begin regex will be compared to the first
match group from the end regex and the end regex will only match if both
strings are identical.

Note this is more advanced functionality than before since now you can
match a larger selection of text yet only use a small portion of it for
the actual "end must match begin" portion.

* (pgsql) add test for $$ quoting existing behavior

- even if that existing behavior is questionable
- the ending span should really close before the $$, not after

Fixing this would involve delving into the sublanguage behavior and I'm
not sure we have time to tackle that right this moment.

* (chore) pgsql uses END_SAME_AS_BEGIN mode/rule now also

* (docs) rename to `mode_reference`; docs for callbacks

- I can never find this file because it's name didn't fully match.
- rename callbacks to `on:begin` and `on:end`

* prevented setter keyword conflicting with setTimeout|setInterval and highlighted them (highlightjs#2514) (highlightjs#2515)

* fix(javascript) prevent setter keyword 'set' conflicting with setTimeout|setInterval (highlightjs#2514)
* enh(javascript) setTimeout|setInterval now highlighted (highlightjs#2514)
* enh (javascript) clearInterval and clearTimeout now highlighted
* add keywords to TypeScript also

* (docs) add TLDR instructions for building and testing

* (dev) improve developer tool UI

* (parser) Build common EMCAscript foundation

Builds a common keyword foundation for any grammar that is
building on top of JavaScript:

- LiveScript
- CoffeeScript
- TypeScript

Also uses this common foundation for JS itself.

* (parser) Adds SHEBANG mode

* (yaml) Add support for inline sequences and mappings (highlightjs#2513)

* Use containers to match inline sequences and mappings
* Add string type for inside inline elements
* Handle nested inline sequences and mappings
* Disallow all braces brackets and commas from strings inside inline mappings or sequences
* clean up implementation
* feed the linter

Co-authored-by: Josh Goebel <[email protected]>

* [enh] Add `OPTIMIZE:` and `HACK:` to comment doctags

* (build) browser build is CommonJS and IIFE, no more AMD (highlightjs#2511)

* (build) browser build is CommonJS and IIFE (global) now
* (build) dropping support for AMD, which we never truly supported
  properly in the first place
* (build) add test to make sure browser build works as commonJS module

  Resolves highlightjs#2505

* fix(parser) Fix freezing issue with illegal 0 width matches (highlightjs#2524)

* fix[parser] add edge case handle for illegal 0 width matches
* add last ditch catch all that tries to detect other uncaught freezes

* (docs) added unicorn-rails-log as an 3rd-party language (highlightjs#2528)

- (docs) Add syntax highlighting for Rails Unicorn logging to supported languages.

* (docs) fix supported languages link: it moved again! (highlightjs#2533)

* fix(ts/js) use identifier to match potential keywords (highlightjs#2519)

- (parser) Adds `keywords.$pattern` key to grammar definitions
- `lexemes` is now deprecated in favor of `keywords.$pattern` key
- enh(typescript) use identifier to match potential keywords, preventing false positives 
- enh(javascript) use identifier to match potential keywords, preventing false positives

* fix(javascript) fix regex inside parens after a non-regex (highlightjs#2531)

* make the object attr container smarter
* deal with multi-line comments also
* comments in any order, spanning multiple lines

Essentially makes the object attr container much more sensitive by allowing it to look-ahead thru comments to find object keys - and therefore prevent them from being incorrectly matched by the "value container" rule.

* (parser) Add hljs.registerAlias() public API (highlightjs#2540)

* Add hljs.registerAlias(alias, languageName) public API
* Add .registerAlias() test

* enh(cpp) add `pair`, `make_pair`, `priority_queue` as built-ins (highlightjs#2538)

* (fix) `fixMarkup` would rarely destroy markup when `useBR` was enabled (highlightjs#2532)

* enh(cpp) Recognize `priority_queue`, `pair` as containers (highlightjs#2541)

* chore: rename `registerAlias` to `registerAliases`

Plural form is clearly better as it's not surprising to the reader
to see it being passed an array - where as the singular form might
have been.  Meanwhile it's also easy to assume that it also supports
arrays of any size - including an array with a singular alias.

The fact that it can magically accept a string as the first argument
might not be obvious, but we document it and even if one didn't know
this they could still use the array form of the API without any issue
by passing a one item array.

* (swift) @objcMembers not completely highlighted (highlightjs#2543)

* Fixed @objcMembers in Swift

Would match `@objc` first, and the `Members` part would be unhighlighted

* Update CHANGES.md

* Update swift.js

* (docs) add OCL to list of supported languages (highlightjs#2547)

* (docs) Add Svelte to list of supported languages (highlightjs#2549)

* enh(dart) Add `late` and `required` keywords, and `Never` built-in type (Dart 2.9) (highlightjs#2551)

* Add new Dart 2.9 keywords for Null Safety language feature

* enh(erlang) add support for underscore separators in numeric literals (highlightjs#2554)

* (erlang) add support for underscore separators in numeric literals
* (erlang) add tests

* (docs) add Jolie to Supported Languages (highlightjs#2556)

* (parser/docs) Add jsdoc annotations and TypeScript type file (highlightjs#2517)

Adds JSDoc annotations and a .tsconfig that allows TypeScript to be run in it's "allowJS" mode and apply type and sanity checking to JavaScript code also. See Type Checking JavaScript Files.

I've been using TypeScript a lot lately and finding it very beneficial and wanted to get those same benefits here but without converting the whole project to TypeScript. It was rough at the beginning but now that this is finished I think it's about 80%-90% of the benefits without any of the TS compilation pipeline. The big difference in being JSDoc for adding typing information vs inline types with TypeScript.

Should be super helpful for maintainers using an editor with tight TypeScript integration and the improved docs/comments should help everyone else.

- Adds types/index.d.ts to NPM build (should be useful for TypeScript peeps)
- Improves documentation of many functions
- Adds JSDoc annotations to almost all functions
- Adds JSDoc type annotations to variables that can't be inferred
- Refactors a few smaller things to allow the TypeScript compiler to better infer what 
   is happening (and usually also made the code clearer)

* (parser) highlightBlock result key `re` => `relevance` (highlightjs#2553)

* enh(handlebars) Support for sub-expressions, path-expressions, hashes, block-parameters and literals (highlightjs#2344)

- `htmlbars` grammar is now deprecated. Use `handlebars` instead.

A stub is included so that anyone literally referencing the old `htmlbars` file (say manually requiring it in Node.js, etc) is still covered, but everyone should transition to `handlebars` now.

* fix(typescript) Add missing `readonly` keyword (highlightjs#2562)

* (docs) Mention `c` is a possible class for C (highlightjs#2577)

* fix(groovy) strings are not allowed inside ternary clauses (highlightjs#2565)

* fix(groovy) strings are not allowed inside ternary clauses
* whitespace can also include tabs

* Update @typescript-eslint/parser to the latest version 🚀 (highlightjs#2575)

* chore(package): update @typescript-eslint/parser to version 3.0.0
* chore(package): update lockfile package-lock.json

Co-authored-by: greenkeeper[bot] <23040076+greenkeeper[bot]@users.noreply.github.com>
Co-authored-by: Josh Goebel <[email protected]>

* Update @typescript-eslint/eslint-plugin to the latest version 🚀 (highlightjs#2576)

* chore(package): update @typescript-eslint/eslint-plugin to version 3.0.0
* chore(package): update lockfile package-lock.json

Co-authored-by: greenkeeper[bot] <23040076+greenkeeper[bot]@users.noreply.github.com>
Co-authored-by: Josh Goebel <[email protected]>

* (parser) properly escape ' and " in HTML output (highlightjs#2564)

* escape quotes also in final HTML output
* [style] update test coding style
* update markup tests with new escaping

This shouldn't be a security issue -- we've always escaped double quotes inside of HTML attribute values (where they could be used to break out of context) - and we've always used double quotes for enclosing attribute values. 

This just goes all the way and now properly escapes quotes everywhere.  Better safe than sorry.

* (docs) add changelog entry for last PR

* add nnfx theme (highlightjs#2571)

* (themes) Add new lioshi theme (highlightjs#2581)

* Added Cisco Command Line to SUPPORTED_LANGUAGES.md (highlightjs#2583)

* (themes) add `nnfx-dark` theme (highlightjs#2584)

* enh(protobuf) Support multiline comments  (highlightjs#2597)

* enh(java) added support for hexadecimal floating point literals (highlightjs#2509)

- Added support for many additional types of floating point literals
- Added related tests

There still may be a few gaps, but this is a pretty large improvement.

Co-authored-by: Josh Goebel <[email protected]>

* (chore) Update issue templates (highlightjs#2574)

Co-authored-by: Vladimir Jimenez <[email protected]>

* enh(toml)(ini) Improve parsing of complex keys (highlightjs#2595)

Fixes: highlightjs#2594

* (chore) add `.js` extension to import statements (highlightjs#2601)

Adds file extensions to all import specifiers in ./src/ files.  This is useful to run the files straight from source with a web browser , Node.js ESM or Deno.

- Also add eslint rules regarding extensions for imports

* enh(dart) highlight built-in nullable types (highlightjs#2598)

* Dart: allow built-in nullable types with trailing ? to be highlighted

* enh(csharp) highlight generics in more cases (highlightjs#2599)

* (chore) fix tiny style issues, add linting npm task

- fixes tiny style issues
- adds `npm run lint` for linting the main library source
  (not languages which are still much messier)

* (chore) bump dev dependencies

* (chore) upgrade some dev stuff to newer versions

* bump v10.1.0

* (chore) bump copyright

* (chore) more import below metadata comment

Co-authored-by: M. Mert Yıldıran <[email protected]>
Co-authored-by: Josh Goebel <[email protected]>
Co-authored-by: Hugo Leblanc <[email protected]>
Co-authored-by: Peter Massey-Plantinga <[email protected]>
Co-authored-by: David Benjamin <[email protected]>
Co-authored-by: Vania Kucher <[email protected]>
Co-authored-by: SweetPPro <[email protected]>
Co-authored-by: Alexandre ZANNI <[email protected]>
Co-authored-by: Taufik Nurrohman <[email protected]>
Co-authored-by: Lin <[email protected]>
Co-authored-by: nicked <[email protected]>
Co-authored-by: Nicolas Homble <[email protected]>
Co-authored-by: Ryandi Tjia <[email protected]>
Co-authored-by: Sam Rawlins <[email protected]>
Co-authored-by: Sergey Prokhorov <[email protected]>
Co-authored-by: Brian Alberg <[email protected]>
Co-authored-by: Nils Knappmeier <[email protected]>
Co-authored-by: Martin <[email protected]>
Co-authored-by: Derek Lewis <[email protected]>
Co-authored-by: greenkeeper[bot] <23040076+greenkeeper[bot]@users.noreply.github.com>
Co-authored-by: Jim Mason <[email protected]>
Co-authored-by: lioshi <[email protected]>
Co-authored-by: BMatheas <[email protected]>
Co-authored-by: Pavel Evstigneev <[email protected]>
Co-authored-by: Vladimir Jimenez <[email protected]>
Co-authored-by: Antoine du Hamel <[email protected]>
Co-authored-by: TupikovVladimir <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants