-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[flake8-simplify
] Implementation for split-of-static-string
(SIM905)
#14008
[flake8-simplify
] Implementation for split-of-static-string
(SIM905)
#14008
Conversation
721fd96
to
05e0a1a
Compare
|
code | total | + violation | - violation | + fix | - fix |
---|---|---|---|---|---|
SIM905 | 31 | 31 | 0 | 0 | 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! This is real good.
I think this mainly needs more tests .
The touch with maxsplit
is nice but I haven't seen a single example in the ecosystem results. That's why I'm not sure if its worth the complexity or if we should just bail on providing a fix in this case.
""" | ||
itemA | ||
itemB | ||
itemC | ||
""".split() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more examples would be great:
- Where the left is an empty string:
"".split()
- An all whitespace string:
" ".split()
ideally where the whitespace is a mixture of spaces, tabs, etc - Where the left contains no split points:
"/abc/".split()
(found in the ecosystem results) - Multiline with different indents: see https://github.com/scikit-build/scikit-build-core/blob/f6e60f41e46ce13ddbd344542d1bf45743b74514/tests/test_skbuild_settings.py#L440-L443
- Strings with unicode flag
- Examples with comments in various positions
("a,b,c" # comment .split() )
We should also add examples for the following strings where I think it's okay if we don't support them
- Raw strings
- Implicit concatenated strings
- Implicit concatenated strings with commented parts (where there are comments between the parts)
- Implicit concatenated strings with different prefixes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added these cases, as well as maxsize=-1
that I found people also use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implicit concatenated strings are supported, but the fix is unsafe because of the comments. I could also exclude them from fixing and mark the fix as safe if you prefer.
crates/ruff_linter/src/rules/ruff/rules/split_of_static_string.rs
Outdated
Show resolved
Hide resolved
crates/ruff_linter/src/rules/ruff/rules/split_of_static_string.rs
Outdated
Show resolved
Hide resolved
crates/ruff_linter/src/rules/ruff/rules/split_of_static_string.rs
Outdated
Show resolved
Hide resolved
if let Some(ref replacement_expr) = split_replacement { | ||
// Construct replacement list | ||
let replacement = checker.generator().expr(replacement_expr); | ||
diagnostic.set_fix(Fix::unsafe_edit(Edit::range_replacement( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's your reasoning for making this an unsafe edit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had not yet exhausted all edge cases (comments, implicit string concatenation, etc.), so I marked it unsafe to be on the safe side. Comments within ISC are not preserved, so in that case it's unsafe.
// Autofix for maxsplit without separator not yet implemented | ||
None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What makes maxsplit
with the default separator difficult to implement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no equivalent to split
for split_whitespace
. Implementing it is not that much work, but it would be simplified greatly when this experimental API is stabilised (https://doc.rust-lang.org/std/str/struct.SplitWhitespace.html#method.remainder). I don't think it's worth to implement the logic now. I'll add a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This rule exists in the Flake8-simplify linter as SIM905: MartinThoma/flake8-simplify#86
So I think it would be good for us to put it in our Flake8-simplify category with code SIM905
Thanks for the review @MichaReiser, will make another pass. Good catch @AlexWaygood, I'll remap to that category. |
ruff
] Implementation for split-of-static-string
(RUF035)flake8-simplify
] Implementation for split-of-static-string
(SIM905)
} | ||
|
||
fn split_sep(str_value: &str, sep_value: &str, max_split: usize, direction_left: bool) -> Expr { | ||
let list_items: Vec<&str> = if direction_left && max_split > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The touch with maxsplit is nice but I haven't seen a single example in the ecosystem results. That's why I'm not sure if it's worth the complexity or if we should just bail on providing a fix in this case.
Maxsplit=1 is quite common (arguably people should use str.partition
).
Another reason to keep the logic is that it can be useful to generalise for this pylint rule:
https://pylint.readthedocs.io/en/stable/user_guide/messages/convention/use-maxsplit-arg.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome. Thank you. It's up to you if you want to address any of the two comments. I merge whenever you tell me the PR's good to go :)
op: UnaryOp::USub, | ||
operand, | ||
.. | ||
}) if matches!(**operand, Expr::NumberLiteral { .. }) => 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should bail here if maxplit
is a negative number other than -1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's rare, but maxsplit=-2
behaves as if maxsplit is omitted.
From that perspective, any negative number could be flagged as non-idiomatic default, but that would be a different rule.
Working code in the wild:
(This is valid code, although likely a mistake)
@MichaReiser processed the comments, thanks for the pointers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a test for maxsplit=0
. I suspect that we're handling that incorrectly to be the same as maxsplit=-1
where it isn't.
>>> "a,b,c".split(',', maxsplit=0)
['a,b,c']
>>> "a,b,c".split(',', maxsplit=-1)
['a', 'b', 'c']
>>> "a,b,c".split(',', maxsplit=-2)
- negative values => Same as
maxsplit
None - 0 => No split
To make things more interesting:
|
You're right, the |
2901dff
to
56047ff
Compare
I decided to make it safe unless there are comments in the range (in which case, we still fix, but as unsafe). |
Summary
Closes #13944
Test Plan
Standard snapshot testing
flake8-simplify surprisingly only has a single test case