-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
use markdown parser to identify code blocks in literate coffeescript #3924
Conversation
this should help to handle ligitimate markdown with indentation correctly
+1 if this stops indented code being classed as CoffeeScript when it's inside html tags. That's a pain point for CoffeeShop. |
@carlsmith this most definitely stops indented code within html from being treated as coffee script, but I had a look at how it would fit in your library, and realised I have missed something important. I used a require statement to include the marked library, assuming coffee script running in nodejs. There needs to be a way for this dependency to be included in the browser build. I don't know what is the best approach for this, and coffee script currently has no dependencies, but it used to, so there might be a clue as how best to do it in the source code history. |
It used to have mkdirp as a dependency, but it was only used in the CLI, not in the browser version of the compiler. |
It'd be nice if you could pass a function to the compiler, so people could pass adapters for Marked or whatever library has the features they need, and let it fall back to the current behaviour by default. Nice work by the way. |
@billymoon I don’t know if you’re still interested in this PR, but I figured out how to make it work in browsers too: diff --git a/Cakefile b/Cakefile
index cf538b8..1e822ce 100644
--- a/Cakefile
+++ b/Cakefile
@@ -140,6 +140,14 @@ task 'build:parser', 'rebuild the Jison parser (run build first)', ->
task 'build:browser', 'rebuild the merged script for inclusion in the browser', ->
code = ''
+ for {name, src} in [{name: 'marked', src: 'lib/marked.js'}]
+ code += """
+ require['#{name}'] = (function() {
+ var exports = {}, module = {exports: exports};
+ #{fs.readFileSync "node_modules/#{name}/#{src}"}
+ return module.exports;
+ })();
+ """
for name in ['helpers', 'rewriter', 'lexer', 'parser', 'scope', 'nodes', 'sourcemap', 'coffee-script', 'browser']
code += """
require['./#{name}'] = (function() { |
Also I improved diff --git a/src/helpers.coffee b/src/helpers.coffee
index 0ac49d7..0f9f4e8 100644
--- a/src/helpers.coffee
+++ b/src/helpers.coffee
@@ -82,16 +82,19 @@ exports.some = Array::some ? (fn) ->
# out all non-code blocks, producing a string of CoffeeScript code that can
# be compiled "normally".
exports.invertLiterate = (code) ->
- # don't know how to avoid this hack using token as placeholder for tabs, then
- # re-inserting tabs after code extraction. The token has been split in two
- # so that it does not end up getting parsed in this src code
- token = '9ddb1d26184bdaf8'+'d4de55835d82eb56'
+ # Create a placeholder for tabs, that isn’t used anywhere in `code`, and then
+ # re-insert the tabs after code extraction.
+ generateRandomToken = ->
+ "#{Math.random() * Date.now()}"
+ while token is undefined or code.indexOf(token) isnt -1
+ token = generateRandomToken()
+
code = code.replace "\t", token |
Since we merged this code via #4345 into |
Yes. |
@billymoon do you mind providing examples of the false positives you mentioned when you started this thread? So that I can verify that the current implementation does, in fact, fix them. Or if we want to try yet another implementation that only looks at indentation and doesn’t try to parse Markdown, it would be good to have test cases that need to pass. |
@GeoffreyBooth I think that was done in this commit: bf1d9d3 |
Looking through CommonMark spec, I found one false positive and one false negative in Coffee 1.12.5. Case 1If there is any ambiguity between an interpretation of indentation as a code block and as indicating that material belongs to a list item, the list item interpretation takes precedence:
Example 77:
- foo
should_not_be_code
Example 78:
1. foo
- should_not_be_Code_too
Compiles to: // Generated by CoffeeScript 1.12.5
(function() {
should_not_be_code;
-should_not_be_code_too;
}).call(this); Parsed by CommonMark Reference Parser: <p>If there is any ambiguity between an interpretation of indentation as a code block and as indicating that material belongs to a list item, the list item interpretation takes precedence:</p>
<p>Example 77:</p>
<ul>
<li>
<p>foo</p>
<p>should_not_be_code</p>
</li>
</ul>
<p>Example 78:</p>
<ol>
<li>
<p>foo</p>
<ul>
<li>should_not_be_code_too</li>
</ul>
</li>
</ol> GitHub Render Result: If there is any ambiguity between an interpretation of indentation as a code block and as indicating that material belongs to a list item, the list item interpretation takes precedence: Example 77:
Example 78:
Case 2And indented code can occur immediately before and after other kinds of blocks:
Example 84:
# Heading
should_be_code
Heading
------
should_be_code_too
----
Compiles to: // Generated by CoffeeScript 1.12.5
(function() {
}).call(this); CommonMark generates: <p>And indented code can occur immediately before and after other kinds of blocks:</p>
<p>Example 84:</p>
<h1>Heading</h1>
<pre><code>should_be_code
</code></pre>
<h2>Heading</h2>
<pre><code>should_be_code_too
</code></pre>
<hr /> GitHub Render Result: And indented code can occur immediately before and after other kinds of blocks: Example 84: Heading
Heading
edit: format |
There are many false positive matches for code blocks made by the current coffee-script parser which are not really code blocks in the markdown. This fixes it by using a markdown parser to identify code blocks.