-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new lint hashset_insert_after_contains
#12873
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking on this lint. This is a good start. I left a few notes. We'd want a lint check before merging it though.
I'm currently not at my desk, will look into running the check some time next week when I get around to it.
value: &'tcx Expr<'tcx>, | ||
span: Span, | ||
} | ||
fn try_parse_contains<'tcx>(cx: &LateContext<'_>, expr: &'tcx Expr<'_>) -> Option<ContainsExpr<'tcx>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try_parse_insert
and try_parse_contains
are equal on all but two things: The UnOp
(Not vs. Deref) and the call sym (insert
vs. contains
). Please factor out a try_parse_op_call
method to use in both cases (oh, and the contains
result also containing the span, but we can live with ignoring that for the insert
case later on).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@llogiq thanks for your comment. I was trying to take a look at it and I just have one doubt that I wanted to validate with you.
There is another difference betwheen try_parse_insert
and try_parse_contains
: on try_parse_insert
I am peeling the expr
while it is a Not
, at the beginning, and at the try_parse_contains
I am peeling the value
of the expr
-if it is a method call- while it is a Deref
. So the UnOp
is applied on different things (expr
vs value of method call expr
). To make a generic function I think that I would need to always do both peelings. I don't think it would be a problem, in terms of correctness, but probably it is not necessary, so I just wanted to double check with you.
Does it make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, did it! 2b0cad6
span_lint_and_then( | ||
cx, | ||
HASHSET_INSERT_AFTER_CONTAINS, | ||
expr.span, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be a good idea to use a MultiSpan
here containing the !contains
and the insert
call each (simply call span_lint
with a vec![contains_span, insert_span]
). That will reduce the visual clutter especially if there is more code within the if
expression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I think I did it as you mentioned, here 2b0cad6
{ | ||
span_lint_and_then( | ||
cx, | ||
HASHSET_INSERT_AFTER_CONTAINS, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about SET_CONTAINS_OR_INSERT
? Yes, for now this only works on HashSet
s, but we can easily extend it to also work on BTreeSet
s or the respective Map
types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, sounds more generic! Renamed it here 2b0cad6
let borrow_set = &mut set; | ||
if !borrow_set.contains(&value) { | ||
borrow_set.insert(value); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about
if set.contains(&value) {
println!("value is already in set");
} else {
set.insert(value);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Right now the code is only searching for the insert
in the then
part of the IF. I think it would make sense to search as well in the else
part. I am just not sure if we would not warn if we find it in the else
and the then
has "a lot of things" (meaning that probably it was a stylist choice of the developer) or if we keep it simple and just warn if we find the insert
in the else as well... I think that keeping it simple would be the best choice, but, for sure, I am open to suggestions.
/// | ||
/// ### Why is this bad? | ||
/// Using just `insert` and checking the returned `bool` is more efficient. | ||
/// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know of one possible false positive: If the value is only borrowed & expensive to clone or impossible to clone twice, we may opt to check with contains
before inserting to avoid the clone. There should be a "known problems" section mentioning this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, did it here 2b0cad6
#[clippy::version = "1.80.0"] | ||
pub HASHSET_INSERT_AFTER_CONTAINS, | ||
nursery, | ||
"unnecessary call to `HashSet::contains` followed by `HashSet::insert`" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"unnecessary call to `HashSet::contains` followed by `HashSet::insert`" | |
"call to `HashSet::contains` followed by `HashSet::insert`" |
As stated, we cannot know whether the call is necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed on 2b0cad6
☔ The latest upstream changes (presumably #12849) made this pull request unmergeable. Please resolve the merge conflicts. |
Sorry this is taking so long, I am unfortunately quite busy at the moment. I'll ping you once I've run the check so you don't need to rebase more than once. |
/// | ||
/// ### Known problems | ||
/// In case the value that wants to be inserted is borrowed and also expensive or impossible | ||
/// to clone. In such scenario, the developer might want to check with `contain` before inserting, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// to clone. In such scenario, the developer might want to check with `contain` before inserting, | |
/// to clone. In such a scenario, the developer might want to check with `contains` before inserting, |
I finally came around to do a lintcheck run, and it looked OK. So r=me after a rebase and the docs fix @bitfield suggested. |
@bors r=llogiq |
@lochetti: 🔑 Insufficient privileges: Not in reviewers |
Oops. It looks like I can't |
No problem. @bors r+ |
☀️ Test successful - checks-action_dev_test, checks-action_remark_test, checks-action_test |
Awesome work! One question -- would it make sense to generalize the name of this lint to |
@nyurik: Currently the lint only catches HashMaps, but I'd welcome a PR to change that along with the name. |
I could do a quick PR to rename the lint, but implementing it for other types might take a bit longer, and might not even make the cutoff for the next release (at which point the lint name would become permanent) |
Oh, my apologies, this was already renamed in the code to
|
Renamed in #13053 |
This PR closes #11103.
This is my first PR creating a new lint (and the second attempt of creating this PR, the first one I was not able to continue because of personal reasons). Thanks for the patience :)
The idea of the lint is to find insert in hashmanps inside if staments that are checking if the hashmap contains the same value that is being inserted. This is not necessary since you could simply call the insert and check for the bool returned if you still need the if statement.
changelog: new lint: [hashset_insert_after_contains]