lib/erb_lint/linters: Add a new CommentSyntax linter #279

issyl0 · 2022-10-20T11:14:01Z

This adds a new ERB comment syntax linter to catch ERB comments that could trip up the Ruby parser in latest versions of Ruby.
The ERB documentation[1] says that Ruby comments are not valid in ERB, and that comments in ERB should be of the form <%# comment here %> and not <% # comment here %>. That is, no space between the opening <% and the # character.
This documentation is only a guideline usually, because in most cases Ruby-style comments work just fine. However, using the Ruby 3.1 parser, RuboCop's Lint/Syntax cop started failing on ERB files that included comments with semicolons in them.
Take an example ERB file:

<%# This is the correct comment syntax. %>
<%# Recommending the "proper" ERB comment syntax seems like the safest way to go; even with semicolons it will parse. %>
<% # This is technically incorrect comment syntax but it generally parses. %>
<% # Until someone puts a ; in. %>

This led to output from RuboCop's Lint/Syntax cop (RuboCop v1.36, TargetRubyVersion: 3.1) where it failed to parse the file, but its code snippets and offense locations were not very useful. (Admittedly it's less visible in this contrived example, you'll have to trust me!)

app/views/proof_of_concept.html.erb:7:2: E: Lint/Syntax: unexpected token kIN
(Using Ruby 3.1 parser; configure using TargetRubyVersion parameter, under AllCops)
 in.
 ^^

1 file inspected, 1 offense detected

After lots of being confused, because the code looked like valid Ruby and the editor was syntax highlighting it if it were a valid comment, we realised that it was the semicolon in the comment in the ERB that was tripping the Ruby 3.1 parser up.
Doing some digging, it turns out that this could have all been avoided if our codebase had respected ERB's "Ruby comments are not valid" guideline[1]. I went hunting for a RuboCop rule for this, then remembered that ERBLint existed and is way better suited to linting ERB. There wasn't an existing rule for ERB comment syntax that I could find, hence this addition, because I feel like individual developers shouldn't have to know this quirk of ERB off the top of their heads!

[1] - https://github.com/ruby/erb/tree/a3492c4bd1061071814cca085544ce259a9d8d56#recognized-tags

Example ERB file:

<%# erb comment here %>
<%= # bad erb comment here %>
<%
  # apparently this comment syntax is valid?
%>
<% # very; bad erb comment here %>

Example behaviour of this new linter:

❯ exe/erblint --enable-linters=comment_syntax test_comment_syntax.html.erb
.erb-lint.yml not found: using default config
Linting 1 files with 1 linters...

Bad ERB comment syntax. Should be <%#= without a space between.
Leaving a space between ERB tags and the Ruby comment character can cause parser errors.
In file: test_comment_syntax.html.erb:2

Bad ERB comment syntax. Should be <%# without a space between.
Leaving a space between ERB tags and the Ruby comment character can cause parser errors.
In file: test_comment_syntax.html.erb:6

2 error(s) were found in ERB files

Thanks for building ERBLint, and thanks in advance for the review here! ✨

- This adds a new ERB comment syntax linter to catch ERB comments that could trip up the Ruby parser in latest versions of Ruby. - The ERB documentation[1] says that Ruby comments are not valid in ERB, and that comments in ERB should be of the form `<%# comment here %>` and not `<% # comment here %>`. That is, no space between the opening `<%` and the `#` character. - This documentation is only a guideline usually, because in most cases Ruby-style comments work just fine. However, using the Ruby 3.1 parser, RuboCop's `Lint/Syntax` cop started failing on ERB files that included comments with semicolons in them. - Take an example ERB file: ``` <%# This is the correct comment syntax. %> <%# Recommending the "proper" ERB comment syntax seems like the safest way to go; even with semicolons it will parse. %> <% # This is technically incorrect comment syntax but it generally parses. %> <% # Until someone puts a ; in. %> ``` - This led to output from RuboCop's `Lint/Syntax` cop (RuboCop v1.36, `TargetRubyVersion: 3.1`) where it failed to parse the file, but its code snippets and offense locations were not very useful. (Admittedly it's less visible in this contrived example, you'll have to trust me!) ``` app/views/proof_of_concept.html.erb:7:2: E: Lint/Syntax: unexpected token kIN (Using Ruby 3.1 parser; configure using TargetRubyVersion parameter, under AllCops) in. ^^ 1 file inspected, 1 offense detected ``` - After lots of being confused, because the code looked like valid Ruby and the editor was syntax highlighting it if it were a valid comment, we realised that it was the semicolon in the comment in the ERB that was tripping the Ruby 3.1 parser up. - Doing some digging, it turns out that this could have all been avoided if our codebase had respected ERB's "Ruby comments are not valid" guideline[1]. I went hunting for a RuboCop rule for this, then remembered that ERBLint existed and is way better suited to linting ERB. There wasn't an existing rule for ERB comment syntax that I could find, hence this addition, because I feel like individual developers shouldn't have to know this quirk of ERB off the top of their heads when it can cause such confusing errors! [1] - https://github.com/ruby/erb/tree/a3492c4bd1061071814cca085544ce259a9d8d56#recognized-tags

- It's better to have too many test cases than too few, and I realised forgot about the fact that multi-line ERB comments _can_ have spaces.

- Previously I erroneously did a `next if indicator_node.nil?`, which meant that for a file with multiple offenses where one of them was `<%= # comment %>` which _was_ getting correctly detected, it was working, but the last one `<% # comment %>` syntax was not being detected properly. - This bug manifested in having two offenses detected for the same line, because it was never moving past the first detected offense because in the `<%` ERB opening case, `indicator_node` is nil and was getting skipped. Not what I originally intended! There were two results, however, and the tests passed. This was because for `<%= # bad comment here %>` it was detecting `<%=` for the `indicator_node_str == "#"` condition and then adding another offense because that same line's `code_node_str` (` # bad comment here`) started with ` #` so also matched the second condition which added another offense. (See examples below.) - This commit also refactors a bit to make things simpler and less repetitive - since `add_offense` appends to an array of offenses internally, we can only call it once and get rid of the multiple ifs because they're confusing, tweaking the message depending on whether it's a `<%=` expression or a `<%` expression. Sample ERB file: ``` <%# erb comment here %> <%= # bad erb comment here %> <% # apparently this comment syntax is valid? %> <% # very; bad erb comment here %> ``` Example behaviour before: ``` Bad ERB comment syntax. Should be <%#= without a space between. Leaving a space between ERB tags and the Ruby comment character can cause parser errors. In file: test_comment_syntax.html.erb:2 Bad ERB comment syntax. Should be <%# without a space between. Leaving a space between ERB tags and the Ruby comment character can cause parser errors. In file: test_comment_syntax.html.erb:2 ``` Example behaviour now (fixed): ``` ❯ exe/erblint --enable-linters=comment_syntax test_comment_syntax.html.erb .erb-lint.yml not found: using default config Linting 1 files with 1 linters... Bad ERB comment syntax. Should be <%#= without a space between. Leaving a space between ERB tags and the Ruby comment character can cause parser errors. In file: test_comment_syntax.html.erb:2 Bad ERB comment syntax. Should be <%# without a space between. Leaving a space between ERB tags and the Ruby comment character can cause parser errors. In file: test_comment_syntax.html.erb:6 2 error(s) were found in ERB files ```

README.md

etiennebarrie

That's great thanks! Let's just fix the documentation a bit and enable it by default.

README.md

- Since this can easily trip people up and Ruby comments are invalid ERB.

issyl0 · 2022-10-26T10:48:48Z

Thanks for the review @etiennebarrie. 🙇🏻 I made the changes.

etiennebarrie

Thanks!

issyl0 added 3 commits October 19, 2022 15:43

linters/CommentSyntax: Improve docs & code style; add another test

911faf3

- It's better to have too many test cases than too few, and I realised forgot about the fact that multi-line ERB comments _can_ have spaces.

issyl0 commented Oct 20, 2022

View reviewed changes

README.md Outdated Show resolved Hide resolved

etiennebarrie reviewed Oct 26, 2022

View reviewed changes

README.md Outdated Show resolved Hide resolved

issyl0 added 2 commits October 26, 2022 11:44

README: Add good and bad examples for CommentSyntax

d0e72ef

runner_config,README: Enable CommentSyntax linter by default

af809c6

- Since this can easily trip people up and Ruby comments are invalid ERB.

etiennebarrie approved these changes Oct 26, 2022

View reviewed changes

etiennebarrie merged commit cc14978 into Shopify:main Oct 26, 2022

issyl0 deleted the lint-erb-comment-syntax branch October 26, 2022 11:57

shopify-shipit bot temporarily deployed to production November 1, 2022 16:01 Inactive

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

lib/erb_lint/linters: Add a new CommentSyntax linter #279

lib/erb_lint/linters: Add a new CommentSyntax linter #279

issyl0 commented Oct 20, 2022

etiennebarrie left a comment

issyl0 commented Oct 26, 2022

etiennebarrie left a comment

lib/erb_lint/linters: Add a new CommentSyntax linter #279

lib/erb_lint/linters: Add a new CommentSyntax linter #279

Conversation

issyl0 commented Oct 20, 2022

etiennebarrie left a comment

Choose a reason for hiding this comment

issyl0 commented Oct 26, 2022

etiennebarrie left a comment

Choose a reason for hiding this comment