Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib/erb_lint/linters: Add a new CommentSyntax linter #279

Merged
merged 5 commits into from
Oct 26, 2022

Conversation

issyl0
Copy link

@issyl0 issyl0 commented Oct 20, 2022

  • This adds a new ERB comment syntax linter to catch ERB comments that could trip up the Ruby parser in latest versions of Ruby.

  • The ERB documentation[1] says that Ruby comments are not valid in ERB, and that comments in ERB should be of the form <%# comment here %> and not <% # comment here %>. That is, no space between the opening <% and the # character.

  • This documentation is only a guideline usually, because in most cases Ruby-style comments work just fine. However, using the Ruby 3.1 parser, RuboCop's Lint/Syntax cop started failing on ERB files that included comments with semicolons in them.

  • Take an example ERB file:

<%# This is the correct comment syntax. %>
<%# Recommending the "proper" ERB comment syntax seems like the safest way to go; even with semicolons it will parse. %>
<% # This is technically incorrect comment syntax but it generally parses. %>
<% # Until someone puts a ; in. %>
  • This led to output from RuboCop's Lint/Syntax cop (RuboCop v1.36, TargetRubyVersion: 3.1) where it failed to parse the file, but its code snippets and offense locations were not very useful. (Admittedly it's less visible in this contrived example, you'll have to trust me!)
app/views/proof_of_concept.html.erb:7:2: E: Lint/Syntax: unexpected token kIN
(Using Ruby 3.1 parser; configure using TargetRubyVersion parameter, under AllCops)
 in.
 ^^

1 file inspected, 1 offense detected
  • After lots of being confused, because the code looked like valid Ruby and the editor was syntax highlighting it if it were a valid comment, we realised that it was the semicolon in the comment in the ERB that was tripping the Ruby 3.1 parser up.

  • Doing some digging, it turns out that this could have all been avoided if our codebase had respected ERB's "Ruby comments are not valid" guideline[1]. I went hunting for a RuboCop rule for this, then remembered that ERBLint existed and is way better suited to linting ERB. There wasn't an existing rule for ERB comment syntax that I could find, hence this addition, because I feel like individual developers shouldn't have to know this quirk of ERB off the top of their heads!

[1] - https://github.com/ruby/erb/tree/a3492c4bd1061071814cca085544ce259a9d8d56#recognized-tags


Example ERB file:

<%# erb comment here %>
<%= # bad erb comment here %>
<%
  # apparently this comment syntax is valid?
%>
<% # very; bad erb comment here %>

Example behaviour of this new linter:

❯ exe/erblint --enable-linters=comment_syntax test_comment_syntax.html.erb
.erb-lint.yml not found: using default config
Linting 1 files with 1 linters...

Bad ERB comment syntax. Should be <%#= without a space between.
Leaving a space between ERB tags and the Ruby comment character can cause parser errors.
In file: test_comment_syntax.html.erb:2

Bad ERB comment syntax. Should be <%# without a space between.
Leaving a space between ERB tags and the Ruby comment character can cause parser errors.
In file: test_comment_syntax.html.erb:6

2 error(s) were found in ERB files

Thanks for building ERBLint, and thanks in advance for the review here! ✨

- This adds a new ERB comment syntax linter to catch ERB comments that
  could trip up the Ruby parser in latest versions of Ruby.
- The ERB documentation[1] says that Ruby comments are not valid in ERB,
  and that comments in ERB should be of the form `<%# comment here %>`
  and not `<% # comment here %>`. That is, no space between the opening
  `<%` and the `#` character.
- This documentation is only a guideline usually, because in most cases
  Ruby-style comments work just fine. However, using the Ruby 3.1 parser,
  RuboCop's `Lint/Syntax` cop started failing on ERB files that included
  comments with semicolons in them.

- Take an example ERB file:

```
<%# This is the correct comment syntax. %>
<%# Recommending the "proper" ERB comment syntax seems like the safest way to go; even with semicolons it will parse. %>
<% # This is technically incorrect comment syntax but it generally parses. %>
<% # Until someone puts a ; in. %>
```

- This led to output from RuboCop's `Lint/Syntax` cop
  (RuboCop v1.36, `TargetRubyVersion: 3.1`) where it failed to parse the
  file, but its code snippets and offense locations were not very useful.
  (Admittedly it's less visible in this contrived example, you'll have
  to trust me!)

```
app/views/proof_of_concept.html.erb:7:2: E: Lint/Syntax: unexpected token kIN
(Using Ruby 3.1 parser; configure using TargetRubyVersion parameter, under AllCops)
 in.
 ^^

1 file inspected, 1 offense detected
```

- After lots of being confused, because the code looked like valid Ruby
  and the editor was syntax highlighting it if it were a valid comment,
  we realised that it was the semicolon in the comment in the ERB that
  was tripping the Ruby 3.1 parser up.
- Doing some digging, it turns out that this could have all been avoided
  if our codebase had respected ERB's "Ruby comments are not valid"
  guideline[1]. I went hunting for a RuboCop rule for this, then
  remembered that ERBLint existed and is way better suited to linting
  ERB. There wasn't an existing rule for ERB comment syntax that I could
  find, hence this addition, because I feel like individual developers
  shouldn't have to know this quirk of ERB off the top of their heads
  when it can cause such confusing errors!

[1] - https://github.com/ruby/erb/tree/a3492c4bd1061071814cca085544ce259a9d8d56#recognized-tags
- It's better to have too many test cases than too few, and I realised
  forgot about the fact that multi-line ERB comments _can_ have spaces.
- Previously I erroneously did a `next if indicator_node.nil?`, which
  meant that for a file with multiple offenses where one of them was
  `<%= # comment %>` which _was_ getting correctly detected, it was
  working, but the last one `<% # comment %>` syntax was not being
  detected properly.
- This bug manifested in having two offenses detected for the same line,
  because it was never moving past the first detected offense because in
  the `<%` ERB opening case, `indicator_node` is nil and was getting
  skipped. Not what I originally intended! There were two results,
  however, and the tests passed. This was because for `<%= # bad comment
  here %>` it was detecting `<%=` for the `indicator_node_str == "#"`
  condition and then adding another offense because that same line's
  `code_node_str` (` # bad comment here`) started with ` #` so also
  matched the second condition which added another offense. (See
  examples below.)
- This commit also refactors a bit to make things simpler and less
  repetitive - since `add_offense` appends to an array of offenses
  internally, we can only call it once and get rid of the multiple ifs
  because they're confusing, tweaking the message depending on whether
  it's a `<%=` expression or a `<%` expression.

Sample ERB file:

```
<%# erb comment here %>
<%= # bad erb comment here %>
<%
  # apparently this comment syntax is valid?
%>
<% # very; bad erb comment here %>
```

Example behaviour before:

```
Bad ERB comment syntax. Should be <%#= without a space between.
Leaving a space between ERB tags and the Ruby comment character can cause parser errors.
In file: test_comment_syntax.html.erb:2

Bad ERB comment syntax. Should be <%# without a space between.
Leaving a space between ERB tags and the Ruby comment character can cause parser errors.
In file: test_comment_syntax.html.erb:2
```

Example behaviour now (fixed):

```
❯ exe/erblint --enable-linters=comment_syntax test_comment_syntax.html.erb
.erb-lint.yml not found: using default config
Linting 1 files with 1 linters...

Bad ERB comment syntax. Should be <%#= without a space between.
Leaving a space between ERB tags and the Ruby comment character can cause parser errors.
In file: test_comment_syntax.html.erb:2

Bad ERB comment syntax. Should be <%# without a space between.
Leaving a space between ERB tags and the Ruby comment character can cause parser errors.
In file: test_comment_syntax.html.erb:6

2 error(s) were found in ERB files
```
README.md Outdated Show resolved Hide resolved
Copy link
Member

@etiennebarrie etiennebarrie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's great thanks! Let's just fix the documentation a bit and enable it by default.

README.md Outdated Show resolved Hide resolved
@issyl0
Copy link
Author

issyl0 commented Oct 26, 2022

Thanks for the review @etiennebarrie. 🙇🏻 I made the changes.

Copy link
Member

@etiennebarrie etiennebarrie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@etiennebarrie etiennebarrie merged commit cc14978 into Shopify:main Oct 26, 2022
@issyl0 issyl0 deleted the lint-erb-comment-syntax branch October 26, 2022 11:57
@shopify-shipit shopify-shipit bot temporarily deployed to production November 1, 2022 16:01 Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants