Add new lint `hashset_insert_after_contains` #12873

lochetti · 2024-05-31T16:01:53Z

This PR closes #11103.

This is my first PR creating a new lint (and my second attempt at this PR; I was not able to continue the first one for personal reasons). Thanks for the patience :)

The idea of the lint is to find `insert` calls on a `HashSet` inside `if` statements that check whether the set contains the same value that is being inserted. The check is not necessary, since you can simply call `insert` and check the returned `bool` if you still need the `if` statement.
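As a rough illustration of the pattern described above, the following sketch puts the checked form next to the form that relies on the `bool` returned by `HashSet::insert`. The function names `add_with_check` and `add_with_insert` are made up for this example and do not come from the PR.

```rust
use std::collections::HashSet;

/// The shape the lint looks for: `contains` guarding an `insert` of the same value.
/// Returns `true` when the value was newly added.
/// (Hypothetical helper, for illustration only.)
fn add_with_check(set: &mut HashSet<u32>, value: u32) -> bool {
    if !set.contains(&value) {
        set.insert(value);
        true
    } else {
        false
    }
}

/// Equivalent version using only `insert`, which returns `true`
/// exactly when the value was not already present in the set.
/// (Hypothetical helper, for illustration only.)
fn add_with_insert(set: &mut HashSet<u32>, value: u32) -> bool {
    set.insert(value)
}
```

Both functions leave the set in the same state and return the same result; the second one looks the value up only once.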

changelog: new lint: [hashset_insert_after_contains]

rustbot · 2024-05-31T16:01:58Z

r? @llogiq

rustbot has assigned @llogiq.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

llogiq

Thanks for taking on this lint. This is a good start. I left a few notes. We'd want a lintcheck run before merging it, though.

I'm currently not at my desk, will look into running the check some time next week when I get around to it.

llogiq · 2024-05-31T16:17:35Z

clippy_lints/src/hashset_insert_after_contains.rs

+    value: &'tcx Expr<'tcx>,
+    span: Span,
+}
+fn try_parse_contains<'tcx>(cx: &LateContext<'_>, expr: &'tcx Expr<'_>) -> Option<ContainsExpr<'tcx>> {


try_parse_insert and try_parse_contains are equal on all but two things: The UnOp (Not vs. Deref) and the call sym (insert vs. contains). Please factor out a try_parse_op_call method to use in both cases (oh, and the contains result also containing the span, but we can live with ignoring that for the insert case later on).

@llogiq thanks for your comment. I was trying to take a look at it and I just have one doubt that I wanted to validate with you.

There is another difference between try_parse_insert and try_parse_contains: in try_parse_insert I peel the expr while it is a Not, at the beginning, and in try_parse_contains I peel the value of the expr (if it is a method call) while it is a Deref. So the UnOp is applied to different things (the expr vs. the value of the method call expr). To make a generic function, I think I would need to always do both peelings. I don't think that would be a problem in terms of correctness, but it is probably not necessary, so I just wanted to double-check with you.

Does it make sense?

Great, did it! 2b0cad6
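To make the "always do both peelings" idea from this thread concrete, here is a minimal standalone sketch. It deliberately does not use the real `rustc_hir` types: `Expr`, `OpCall`, `peel_not`, and `peel_value` are toy stand-ins invented for this example, and stripping borrows together with derefs on the argument is an assumption, not something taken from the actual commit.

```rust
/// Toy expression tree; NOT the real `rustc_hir::Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Path(String),
    Not(Box<Expr>),
    Deref(Box<Expr>),
    AddrOf(Box<Expr>),
    MethodCall { receiver: Box<Expr>, name: String, arg: Box<Expr> },
}

/// Result of parsing `receiver.method(value)` (hypothetical type).
#[derive(Debug, PartialEq)]
struct OpCall<'a> {
    receiver: &'a Expr,
    value: &'a Expr,
}

/// Strip any number of leading `!` operators from the whole expression.
fn peel_not(mut expr: &Expr) -> &Expr {
    while let Expr::Not(inner) = expr {
        expr = &**inner;
    }
    expr
}

/// Strip `&` and `*` wrappers from the argument (an assumption of this sketch).
fn peel_value(mut expr: &Expr) -> &Expr {
    while let Expr::AddrOf(inner) | Expr::Deref(inner) = expr {
        expr = &**inner;
    }
    expr
}

/// One helper for both cases: always peel `Not` on the expression and
/// always peel the argument, then match on the method name.
fn try_parse_op_call<'a>(expr: &'a Expr, method: &str) -> Option<OpCall<'a>> {
    match peel_not(expr) {
        Expr::MethodCall { receiver, name, arg } if name.as_str() == method => Some(OpCall {
            receiver: &**receiver,
            value: peel_value(arg),
        }),
        _ => None,
    }
}
```

In this toy model, parsing `!set.contains(&value)` with `"contains"` and `set.insert(value)` with `"insert"` yields the same receiver and value, which is the comparison the lint needs to make.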

llogiq · 2024-05-31T16:24:12Z

clippy_lints/src/hashset_insert_after_contains.rs

+            span_lint_and_then(
+                cx,
+                HASHSET_INSERT_AFTER_CONTAINS,
+                expr.span,


It might be a good idea to use a MultiSpan here containing the !contains and the insert call each (simply call span_lint with a vec![contains_span, insert_span]). That will reduce the visual clutter especially if there is more code within the if expression.

Nice! I think I did it as you mentioned, here 2b0cad6

llogiq · 2024-05-31T16:26:22Z

clippy_lints/src/hashset_insert_after_contains.rs

+        {
+            span_lint_and_then(
+                cx,
+                HASHSET_INSERT_AFTER_CONTAINS,


How about SET_CONTAINS_OR_INSERT? Yes, for now this only works on HashSets, but we can easily extend it to also work on BTreeSets or the respective Map types.

Ok, sounds more generic! Renamed it here 2b0cad6

llogiq · 2024-05-31T16:28:27Z

tests/ui/hashset_insert_after_contains.rs

+    let borrow_set = &mut set;
+    if !borrow_set.contains(&value) {
+        borrow_set.insert(value);
+    }


How about

    if set.contains(&value) {
        println!("value is already in set");
    } else {
        set.insert(value);
    }

Good point. Right now the code only searches for the insert in the then branch of the if. I think it would make sense to search the else branch as well. I am just not sure whether we should skip the warning when the insert is in the else branch and the then branch has "a lot of things" (which probably means it was a stylistic choice by the developer), or keep it simple and warn whenever we find the insert in the else branch as well. I think keeping it simple would be the best choice, but I am certainly open to suggestions.
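For the else-branch shape discussed here, one possible rewrite is sketched below, under the assumption that the then branch only reports the duplicate. The function names and the returned messages are invented for the example.

```rust
use std::collections::HashSet;

/// The shape from the review comment: `contains` in the condition,
/// `insert` in the else branch. (Hypothetical helper.)
fn add_or_report_checked(set: &mut HashSet<String>, value: String) -> &'static str {
    if set.contains(&value) {
        "value is already in set"
    } else {
        set.insert(value);
        "inserted"
    }
}

/// Same behavior with a single lookup: branch on the `bool`
/// returned by `insert` instead. (Hypothetical helper.)
fn add_or_report(set: &mut HashSet<String>, value: String) -> &'static str {
    if set.insert(value) {
        "inserted"
    } else {
        "value is already in set"
    }
}
```

When the then branch does much more than report, the rewritten form can read worse than the original, which is the stylistic concern raised in the reply.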

llogiq · 2024-05-31T16:34:24Z

clippy_lints/src/hashset_insert_after_contains.rs

+    ///
+    /// ### Why is this bad?
+    /// Using just `insert` and checking the returned `bool` is more efficient.
+    ///


I know of one possible false positive: If the value is only borrowed & expensive to clone or impossible to clone twice, we may opt to check with contains before inserting to avoid the clone. There should be a "known problems" section mentioning this.

Ok, did it here 2b0cad6
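The scenario behind this known problem might look like the sketch below: the caller only has a borrowed `&str`, so inserting unconditionally would allocate a `String` on every call. The function name `remember` is hypothetical, and the thread does not say whether the lint as implemented flags this exact shape.

```rust
use std::collections::HashSet;

/// Records `name` if it has not been seen yet.
/// The `contains` check avoids allocating a `String` for names
/// that are already in the set. (Hypothetical helper.)
fn remember(seen: &mut HashSet<String>, name: &str) -> bool {
    if !seen.contains(name) {
        // Only pay for the owned copy when the value is actually new.
        seen.insert(name.to_owned());
        true
    } else {
        false
    }
}
```

Here replacing the check with a bare `seen.insert(name.to_owned())` would be correct but would clone on every call, so the `contains` call is doing useful work.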

llogiq · 2024-05-31T16:35:06Z

clippy_lints/src/hashset_insert_after_contains.rs

+    #[clippy::version = "1.80.0"]
+    pub HASHSET_INSERT_AFTER_CONTAINS,
+    nursery,
+    "unnecessary call to `HashSet::contains` followed by `HashSet::insert`"


Suggested change

"unnecessary call to `HashSet::contains` followed by `HashSet::insert`"

"call to `HashSet::contains` followed by `HashSet::insert`"

As stated, we cannot know whether the call is necessary.

Fixed on 2b0cad6

bors · 2024-06-11T23:22:42Z

☔ The latest upstream changes (presumably #12849) made this pull request unmergeable. Please resolve the merge conflicts.

llogiq · 2024-06-15T11:03:00Z

Sorry this is taking so long, I am unfortunately quite busy at the moment. I'll ping you once I've run the check so you don't need to rebase more than once.

bitfield · 2024-06-23T12:09:26Z

clippy_lints/src/set_contains_or_insert.rs

+    ///
+    /// ### Known problems
+    /// In case the value that wants to be inserted is borrowed and also expensive or impossible
+    /// to clone. In such scenario, the developer might want to check with `contain` before inserting,


Suggested change

/// to clone. In such scenario, the developer might want to check with `contain` before inserting,

/// to clone. In such a scenario, the developer might want to check with `contains` before inserting,

llogiq · 2024-07-03T09:52:50Z

I finally came around to do a lintcheck run, and it looked OK. So r=me after a rebase and the docs fix @bitfield suggested.

lochetti · 2024-07-03T19:03:09Z

@bors r=llogiq

bors · 2024-07-03T19:03:12Z

@lochetti: 🔑 Insufficient privileges: Not in reviewers

lochetti · 2024-07-03T19:03:45Z

I finally came around to do a lintcheck run, and it looked OK. So r=me after a rebase and the docs fix @bitfield suggested.

Oops. It looks like I can't r=you... :)

llogiq · 2024-07-04T05:38:21Z

No problem.

@bors r+

bors · 2024-07-04T05:38:24Z

📌 Commit 4e71fc4 has been approved by llogiq

It is now in the queue for this repository.

bors · 2024-07-04T05:39:30Z

⌛ Testing commit 4e71fc4 with merge d2400a4...

bors · 2024-07-04T05:46:41Z

☀️ Test successful - checks-action_dev_test, checks-action_remark_test, checks-action_test
Approved by: llogiq
Pushing d2400a4 to master...

nyurik · 2024-07-04T07:38:09Z

Awesome work! One question -- would it make sense to generalize the name of this lint to insert_after_contains ? This way it can also cover HashMap and possibly other dictionary-like patterns?

llogiq · 2024-07-04T19:22:43Z

@nyurik: Currently the lint only catches HashSets, but I'd welcome a PR to change that along with the name.
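As background for the generalization being discussed: `BTreeSet::insert` in the standard library also returns a `bool` indicating whether the value was newly added, so the same rewrite applies there. The sketch below uses a made-up function name and says nothing about how the lint itself would detect the type.

```rust
use std::collections::BTreeSet;

/// Adds `value` and reports whether it was new, relying on the
/// `bool` returned by `BTreeSet::insert`. (Hypothetical helper.)
fn add_to_btree(set: &mut BTreeSet<u32>, value: u32) -> bool {
    set.insert(value)
}
```

Map types differ: `HashMap::insert` returns the previous value as an `Option` rather than a `bool`, so a map variant of the pattern would not be a drop-in copy of the set case.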

nyurik · 2024-07-05T16:29:55Z

I could do a quick PR to rename the lint, but implementing it for other types might take a bit longer, and might not even make the cutoff for the next release (at which point the lint name would become permanent)

nyurik · 2024-07-05T16:52:45Z

Oh, my apologies, this was already renamed in the code to set_contains_or_insert. Some options of a generalized name:

contains_or_insert (short and to the point?)
collection_contains_or_insert
???

nyurik · 2024-07-05T17:12:29Z

Renamed in #13053

rustbot assigned llogiq May 31, 2024

rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label May 31, 2024

lochetti mentioned this pull request May 31, 2024

New lint hashset_insert_after_contains #11296

Closed

llogiq reviewed May 31, 2024

lochetti force-pushed the issue_11103 branch from e48d9e5 to 2b0cad6 Compare June 3, 2024 18:05

bitfield reviewed Jun 23, 2024

lochetti added 3 commits July 3, 2024 19:41

Add new lint hashset_insert_after_contains

0f915f6

Rename lint, generalize function, add known issues, use multispan

6661e83

Fix typos

eff6f68

lochetti force-pushed the issue_11103 branch from 2b0cad6 to eff6f68 Compare July 3, 2024 18:42

Small fix after rebase

4e71fc4

bors merged commit d2400a4 into rust-lang:master Jul 4, 2024
11 checks passed

nyurik mentioned this pull request Jul 5, 2024

Add BTreeSet detection to the set_contains_or_insert lint #13053

Merged
