-
-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add option AllowHeredoc to Metrics/LineLength #2479
Conversation
726a493
to
e10a903
Compare
This is nice! But I think the whitelist feature is not particularly useful. Personally, I would prefer if Metrics/LineLength simply allowed long lines in heredocs all the time, without the need for an extra config parameter. |
59f7b2b
to
1e21798
Compare
@alexdowad You can just set it to |
Yes, certainly; but in my opinion, it is better not to add gratuitous configuration "knobs". It just complicates things. But if the config parameter is there, my vote is that it should definitely be set to |
Usage-wise I love it, but it's ultimately up to @bbatsov. The reason I like this is because: a) it can be set to |
heredocs = [] | ||
iter = processed_source.tokens.each | ||
loop do | ||
token = iter.next |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be written as
processed_source.tokens.each_with_object([]) do |token, heredocs|
heredocs.concat(collect_heredoc(iter)) if token.type == :tSTRING_BEG
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind, I think this would run into issues with the collect_heredoc
method. It does feel like there may be an opportunity to refactor here. Passing iter
around feels a bit weird to me. These methods have a lot of similarities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rrosenblum I actually first tried a similar approach to what you're suggesting. Turns out that makes it really hard to parse. The advantage of the iterator is that you consume it, thus progressing in the token list. This is much easier with nested heredocs. See e.g. this snippet:
$ ruby-parse -E -e "
foo(<<ONE, <<TWO)
Hello #{<<NESTED.upcase}
World
NESTED
ONE
What's your name?
TWO
"
foo(<<ONE, <<TWO)
^~~ tIDENTIFIER "foo" expr_cmdarg [0 <= cond] [0 <= cmdarg]
foo(<<ONE, <<TWO)
^ tLPAREN2 "(" expr_beg [0 <= cond] [0 <= cmdarg]
foo(<<ONE, <<TWO)
^~~~~ tSTRING_BEG "<<\"" expr_end [0 <= cond] [0 <= cmdarg]
Hello #{<<NESTED.upcase}
^~~~~~ tSTRING_CONTENT "Hello " expr_end [0 <= cond] [0 <= cmdarg]
Hello #{<<NESTED.upcase}
^~ tSTRING_DBEG "\#{" expr_end [0 <= cond] [0 <= cmdarg]
Hello #{<<NESTED.upcase}
^~~~~~~~ tSTRING_BEG "<<\"" expr_end [0 <= cond] [0 <= cmdarg]
World
^ tSTRING_CONTENT "World\n" expr_end [0 <= cond] [0 <= cmdarg]
NESTED
^~~~~~ tSTRING_END "NESTED" expr_end [0 <= cond] [0 <= cmdarg]
Hello #{<<NESTED.upcase}
^ tDOT "." expr_dot [0 <= cond] [0 <= cmdarg]
Hello #{<<NESTED.upcase}
^~~~~~ tIDENTIFIER "upcase" expr_arg [0 <= cond] [0 <= cmdarg]
Hello #{<<NESTED.upcase}
^ tSTRING_DEND "}" interp_string [0 <= cond] [0 <= cmdarg]
Hello #{<<NESTED.upcase}
^ tSTRING_CONTENT "\n" expr_end [0 <= cond] [0 <= cmdarg]
World
^ tSTRING_CONTENT "World\n" expr_end [0 <= cond] [0 <= cmdarg]
NESTED
^ tSTRING_CONTENT "NESTED\n" expr_end [0 <= cond] [0 <= cmdarg]
ONE
^~~ tSTRING_END "ONE" expr_end [0 <= cond] [0 <= cmdarg]
foo(<<ONE, <<TWO)
^ tCOMMA "," expr_beg [0 <= cond] [0 <= cmdarg]
foo(<<ONE, <<TWO)
^~~~~ tSTRING_BEG "<<\"" expr_end [0 <= cond] [0 <= cmdarg]
What's your name?
^ tSTRING_CONTENT "What's your name?\n" expr_end [0 <= cond] [0 <= cmdarg]
TWO
^~~ tSTRING_END "TWO" expr_end [0 <= cond] [0 <= cmdarg]
foo(<<ONE, <<TWO)
^ tRPAREN ")" expr_end [0 <= cond] [0 <= cmdarg]
foo(<<ONE, <<TWO)
^ tNL nil line_begin [0 <= cond] [0 <= cmdarg]
^ false "$eof" line_begin [0 <= cond] [0 <= cmdarg]
(send nil :foo
(dstr
(str "Hello ")
(begin
(send
(str "World\n") :upcase))
(str "\n")
(str "World\n")
(str "NESTED\n"))
(str "What's your name?\n"))
Let's cut down the noise here and just look at the tokens:
order | token | text | line |
---|---|---|---|
1 | tIDENTIFIER |
foo |
1 |
2 | tLPAREN2 |
( |
1 |
3 | tSTRING_BEG |
<<\" |
1 |
4 | tSTRING_CONTENT |
Hello |
2 |
5 | tSTRING_DBEG |
\#{ |
2 |
6 | tSTRING_BEG |
<<\" |
2 |
7 | tSTRING_CONTENT |
World\n |
3 |
8 | tSTRING_END |
NESTED |
4 |
9 | tDOT |
. |
2 |
10 | tIDENTIFIER |
upcase |
2 |
11 | tSTRING_DEND |
} |
2 |
12 | tSTRING_CONTENT |
\n |
2 |
13 | tSTRING_CONTENT |
World\n |
3 |
14 | tSTRING_CONTENT |
NESTED\n |
4 |
15 | tSTRING_END |
ONE |
5 |
16 | tCOMMA |
, |
1 |
17 | tSTRING_BEG |
<<\" |
1 |
18 | tSTRING_CONTENT |
What's your name?\n |
6 |
19 | tSTRING_END |
TWO |
7 |
20 | tRPAREN |
) |
1 |
So every time you encounter a tSTRING_BEG
, the next token will be the first line of the heredoc body. You do that until you reach tSTRING_END
, which contains the delimiter of the heredoc. The line before that is the end of the heredoc's body. Inside the heredoc you might encounter a further tSTRING_BEG
(token 5), i.e. a nested heredoc. There you want to consume the tokens until that nested one reaches its end (token 8) before going back to the parent heredoc.
I feel like a solution using #each
and the like would be more complicated since you'd have to pass state around, e.g. how many lines to skip due to a nested heredoc. With the iterator the state is implicit in the iterator. But I'm happy to be proven wrong 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fphilipe the parser
gem already creates a Parser::Source::Range
object for each AST node which can tell you the beginning and ending lines/columns where the source code for that node appears. Why don't you just find the AST nodes for the heredocs, note the lines/columns where they begin and end, and use that to check whether the end of a long line falls within a heredoc or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alexdowad Thanks! I had looked into the AST, but I overlooked that it actually keeps track of the heredoc range. I've pushed another commit, which makes the code significantly easier to follow.
Thanks to @alexdowad's suggestion, I'm now extracting the information from the AST, which makes the code simpler. I've added a commit on top and haven't rebased, just so one can see the difference. Let me know if I should rebase before merging. One more thing I'm not sure about is where to store the extracted heredocs. Right now I just create this |
7dfb837
to
b1b341d
Compare
The new code is much cleaner. Good job. Thank you @alexdowad and @fphilipe. |
@fphilipe so happy to see this. 😄 Thank you! |
allowed_heredoc | ||
end | ||
|
||
def allowed_heredoc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need two methods? Isn't allow_heredoc?
enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you set the option to a list of allowed heredoc delimiters, you need to check whether the heredoc containing the long line is whitelisted. Thus this method is required. One can argue about #allow_heredoc?
above, which actually just returns whether the config is not false. I added #allow_heredoc?
so that in #investigate
it looks balanced with #allow_uri?
. Let me know what the preference is here.
You should add |
@jonas054 With what value? |
I'm leaning towards |
0f21667
to
0a9f9c5
Compare
@jonas054 There you go 😊 This caused a bunch of |
Looks good. Squash all commits together and rebase on top of the current |
0a9f9c5
to
2f9b896
Compare
This option can either be `true`, `false`, or a list of heredoc delimiters. When set to `true`, long lines in all heredocs are allowed. When set to a list, long lines are only allowed in heredocs delimited by the delimiters in that list. The default value of this option is true. Closes rubocop#1407
2f9b896
to
8918f9e
Compare
@bbatsov Done! |
Add option AllowHeredoc to Metrics/LineLength
👍 |
@fphilipe really happy about this. Thank you so much for looking at it and getting a solution knocked out. 😀 |
Thanks @fphilipe. |
You're welcome! Happy to help 😄 |
This option can either be
true
or a list of heredoc delimiters. When set totrue
, long lines in all heredocs are allowed. When set to a list, long lines are only allowed in heredocs delimited by the specified delimiters.Closes #1407
/cc @jfelchner