use markdown parser to identify code blocks in literate coffeescript #3924

billymoon · 2015-03-31T02:57:19Z

The current coffee-script parser produces many false-positive matches for code blocks: indented text that is not really a code block in the Markdown. This fixes it by using a Markdown parser to identify code blocks.

This should help to handle legitimate Markdown with indentation correctly.

carlsmith · 2015-03-31T16:32:15Z

+1 if this stops indented code being classed as CoffeeScript when it's inside html tags. That's a pain point for CoffeeShop.

billymoon · 2015-04-06T20:00:02Z

@carlsmith this most definitely stops indented code within HTML from being treated as CoffeeScript, but I had a look at how it would fit in your library, and realised I have missed something important. I used a require statement to include the marked library, assuming CoffeeScript is running in Node.js. There needs to be a way for this dependency to be included in the browser build. I don't know what the best approach for this is. CoffeeScript currently has no dependencies, but it used to, so there might be a clue as to how best to do it in the source code history.

lydell · 2015-04-07T05:26:37Z

It used to have mkdirp as a dependency, but it was only used in the CLI, not in the browser version of the compiler.

carlsmith · 2015-04-07T06:54:33Z

It'd be nice if you could pass a function to the compiler, so people could pass adapters for Marked or whatever library has the features they need, and let it fall back to the current behaviour by default.

Nice work by the way.
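
A minimal sketch of the adapter idea suggested above, in JavaScript. The `options.findCodeLines` hook and the `defaultFindCodeLines` helper are hypothetical names invented for illustration; nothing in this thread shows the compiler exposing such an option, and the default below is only a simplified stand-in for the current indentation-based behaviour.

```javascript
// Hypothetical sketch: none of these names are part of the CoffeeScript API.

// Stand-in for the current behaviour: any line indented by four spaces
// or a tab counts as code. Returns a Set of zero-based line indices.
function defaultFindCodeLines(markdown) {
  const indices = new Set();
  markdown.split('\n').forEach((line, i) => {
    if (/^( {4}|\t)/.test(line)) indices.add(i);
  });
  return indices;
}

// Comment out every non-blank line that the adapter does not report as code.
function invertLiterate(markdown, options = {}) {
  const findCodeLines = options.findCodeLines || defaultFindCodeLines;
  const codeLines = findCodeLines(markdown);
  return markdown
    .split('\n')
    .map((line, i) => (codeLines.has(i) || line.trim() === '' ? line : '# ' + line))
    .join('\n');
}
```

An adapter for Marked, or for any other Markdown parser, would walk that parser's output and return the indices of the lines that belong to real code blocks. Callers that pass nothing keep the default behaviour.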

GeoffreyBooth · 2016-10-22T19:22:36Z

@billymoon I don’t know if you’re still interested in this PR, but I figured out how to make it work in browsers too:

diff --git a/Cakefile b/Cakefile
index cf538b8..1e822ce 100644
--- a/Cakefile
+++ b/Cakefile
@@ -140,6 +140,14 @@ task 'build:parser', 'rebuild the Jison parser (run build first)', ->

 task 'build:browser', 'rebuild the merged script for inclusion in the browser', ->
   code = ''
+  for {name, src} in [{name: 'marked', src: 'lib/marked.js'}]
+    code += """
+      require['#{name}'] = (function() {
+        var exports = {}, module = {exports: exports};
+        #{fs.readFileSync "node_modules/#{name}/#{src}"}
+        return module.exports;
+      })();
+    """
   for name in ['helpers', 'rewriter', 'lexer', 'parser', 'scope', 'nodes', 'sourcemap', 'coffee-script', 'browser']
     code += """
       require['./#{name}'] = (function() {
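
The wrapping technique in this diff can be sketched in plain JavaScript. This is an illustration under stated assumptions, not the Cakefile's actual output: `wrapModule` and `buildBundle` are hypothetical helpers, and the outer `require` function that simply looks modules up on itself is assumed.

```javascript
// Hypothetical sketch of bundling CommonJS-style modules for the browser.

// Wrap one module's source so that its exports land on require[name].
function wrapModule(name, source) {
  return `require[${JSON.stringify(name)}] = (function() {
  var exports = {}, module = {exports: exports};
  ${source}
  return module.exports;
})();
`;
}

// Concatenate the wrapped modules inside a scope with a tiny `require`.
// Dependencies must come before the modules that require them.
function buildBundle(modules) {
  const body = modules.map(({name, source}) => wrapModule(name, source)).join('');
  return `(function() {
  function require(path) { return require[path]; }
  ${body}
  return require;
})()`;
}
```

Because each wrapper runs immediately, a dependency such as the Markdown parser has to be registered before the compiler modules that call `require` on it, which is why the diff adds its loop ahead of the existing one.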

GeoffreyBooth · 2016-10-22T23:47:07Z

Also I improved invertLiterate so that the token it uses as a placeholder for tabs is ensured to not appear in code, rather than a magic string that we’re gambling won’t appear in code:

diff --git a/src/helpers.coffee b/src/helpers.coffee
index 0ac49d7..0f9f4e8 100644
--- a/src/helpers.coffee
+++ b/src/helpers.coffee
@@ -82,16 +82,19 @@ exports.some = Array::some ? (fn) ->
 # out all non-code blocks,  producing a string of CoffeeScript code that can
 # be compiled "normally".
 exports.invertLiterate = (code) ->
-  # don't know how to avoid this hack using token as placeholder for tabs, then
-  # re-inserting tabs after code extraction. The token has been split in two
-  # so that it does not end up getting parsed in this src code
-  token = '9ddb1d26184bdaf8'+'d4de55835d82eb56'
+  # Create a placeholder for tabs, that isn’t used anywhere in `code`, and then
+  # re-insert the tabs after code extraction.
+  generateRandomToken = ->
+    "#{Math.random() * Date.now()}"
+  while token is undefined or code.indexOf(token) isnt -1
+    token = generateRandomToken()
+
   code = code.replace "\t", token
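
The placeholder approach can be sketched in JavaScript as follows. `makePlaceholder` and `withTabsProtected` are hypothetical names, and the token format is an arbitrary choice. One detail worth noting: in JavaScript, `replace` with a string pattern only substitutes the first match, so this sketch uses `split`/`join` to cover every tab.

```javascript
// Hypothetical sketch: protect tabs with a placeholder that is guaranteed
// not to occur in the input, run a transform, then restore the tabs.

// Keep generating random tokens until one is absent from `code`.
function makePlaceholder(code) {
  let token;
  do {
    token = `__TAB_${Math.random().toString(36).slice(2)}__`;
  } while (code.includes(token));
  return token;
}

// `transform` must leave the placeholder text itself untouched.
function withTabsProtected(code, transform) {
  const token = makePlaceholder(code);
  const protectedCode = code.split('\t').join(token); // replaces every tab
  return transform(protectedCode).split(token).join('\t');
}
```

For example, `withTabsProtected(source, extractCode)` would let a hypothetical `extractCode` step run on tab-free text while the result still contains the original tabs.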

GeoffreyBooth · 2016-10-26T05:35:52Z

Since we merged this code via #4345 into the 2 branch, have we decided that this won’t be implemented in the 1.x branch, and that we should therefore close this PR?

lydell · 2016-10-26T11:56:00Z

Yes.

GeoffreyBooth · 2017-04-14T17:40:15Z

@billymoon do you mind providing examples of the false positives you mentioned when you started this thread? So that I can verify that the current implementation does, in fact, fix them. Or if we want to try yet another implementation that only looks at indentation and doesn’t try to parse Markdown, it would be good to have test cases that need to pass.

billymoon · 2017-04-14T18:22:34Z

@GeoffreyBooth I think that was done in this commit: bf1d9d3

akfish · 2017-04-14T18:38:09Z

Looking through the CommonMark spec, I found one false positive and one false negative in CoffeeScript 1.12.5.

Case 1

If there is any ambiguity between an interpretation of indentation as a code block and as indicating that material belongs to a list item, the list item interpretation takes precedence:

Example 77:

  - foo

    should_not_be_code

Example 78:

1.  foo

    - should_not_be_code_too

Compiles to:

// Generated by CoffeeScript 1.12.5
(function() {
  should_not_be_code;
  -should_not_be_code_too;

}).call(this);

Parsed by CommonMark Reference Parser:

<p>If there is any ambiguity between an interpretation of indentation as a code block and as indicating that material belongs to a list item, the list item interpretation takes precedence:</p>
<p>Example 77:</p>
<ul>
<li>
<p>foo</p>
<p>should_not_be_code</p>
</li>
</ul>
<p>Example 78:</p>
<ol>
<li>
<p>foo</p>
<ul>
<li>should_not_be_code_too</li>
</ul>
</li>
</ol>

GitHub Render Result:

If there is any ambiguity between an interpretation of indentation as a code block and as indicating that material belongs to a list item, the list item interpretation takes precedence:

Example 77:

foo

should_not_be_code

Example 78:

foo
- should_not_be_code_too

Case 2

And indented code can occur immediately before and after other kinds of blocks:

Example 84:

# Heading
    should_be_code
Heading
------
    should_be_code_too
----

Compiles to:

// Generated by CoffeeScript 1.12.5
(function() {


}).call(this);

CommonMark generates:

<p>And indented code can occur immediately before and after other kinds of blocks:</p>
<p>Example 84:</p>
<h1>Heading</h1>
<pre><code>should_be_code
</code></pre>
<h2>Heading</h2>
<pre><code>should_be_code_too
</code></pre>
<hr />

GitHub Render Result:

And indented code can occur immediately before and after other kinds of blocks:

Example 84:

Heading

should_be_code

Heading

should_be_code_too
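
Both results are consistent with an indentation-only heuristic. The JavaScript sketch below assumes the 1.x behaviour is roughly "a line is code only if it is indented by four spaces (or a tab) and follows a blank line or another code line". That rule is an assumption made for illustration, and `indentedCodeLines` is a hypothetical name, not the compiler's source.

```javascript
// Sketch of an indentation-only heuristic (an assumption, not compiler code).
// Returns the non-blank lines that would be treated as CoffeeScript.
function indentedCodeLines(markdown) {
  const codeLines = [];
  let maybeCode = true; // true at the start of input and after a blank line
  for (const line of markdown.split('\n')) {
    const isBlank = /^\s*$/.test(line);
    const isIndented = /^( {4}| {0,3}\t)/.test(line);
    if (maybeCode && isIndented && !isBlank) {
      codeLines.push(line);
    } else {
      // Prose switches code detection off until the next blank line.
      maybeCode = isBlank;
    }
  }
  return codeLines;
}

// Case 1: list content after a blank line is wrongly picked up as code.
const case1 = indentedCodeLines('  - foo\n\n    should_not_be_code');

// Case 2: code directly under a heading is wrongly skipped.
const case2 = indentedCodeLines('# Heading\n    should_be_code');
```

Under this rule `case1` contains the list paragraph and `case2` is empty. A heuristic with no knowledge of list items or headings cannot tell these cases apart, which is the argument for either a real Markdown parser or a dedicated set of test cases.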

edit: format

billymoon added 3 commits March 31, 2015 03:09

failing test case

bf1d9d3

add markdown parser for literate coffeescript

f7245b5

this should help to handle ligitimate markdown with indentation correctly

Merge branch 'feature/illiterate'

be8ef35

GeoffreyBooth mentioned this pull request Oct 21, 2016

[CS2] Failing test rejecting inconsistent whitespace in .litcoffee files #4336

Closed

GeoffreyBooth mentioned this pull request Oct 22, 2016

Fix tabbed Literate CoffeeScript #4345

Merged

lydell closed this Oct 26, 2016

lydell added enhancement wontfix labels Oct 26, 2016

GeoffreyBooth mentioned this pull request Apr 14, 2017

[CS2] Add webpack support #4501

Closed

6 tasks

GeoffreyBooth mentioned this pull request Apr 16, 2017

[CS2] Literate CoffeeScript without dependencies #4509

Merged
