Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A metadata collection monster #6887

Merged

Conversation

xFrednet
Copy link
Member

@xFrednet xFrednet commented Mar 11, 2021

This PR introduces a metadata collection lint as discussed in #4310. It currently collects:

This data has a slightly different structure than the current lints.json and doesn't include depreciated lints yet. I plan to adapt the website to the new format and include depreciated lints in a follow-up PR :). The current collected json looks like this: metadata_collection.json

The entire implementation is guarded behind the metadata-collector-lint feature and the ENABLE_METADATA_COLLECTION environment value to prevent default collection. You can test the implementation via:

$ ENABLE_METADATA_COLLECTION=1 cargo test --test dogfood --all-features

changelog: none


The size of this PR sadly also grew into a small monster, sorry! I definitely plan to improve on this! And it's totally okay if you take your time with this :)

r? @phansch
cc: @flip1995

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label Mar 11, 2021
Copy link
Member

@flip1995 flip1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another 600++ PR. 😄 I'll take a closer look at the implementation and try it out this weekend! Infra PR reviews are way more exciting than FP fixes/new lint impls. 😉

To minimize the changes we have to do to the other infra, I suggest to save the .json file to the same path and name, as the export.py script currently does. I don't have a problem with changing/improving the format of the file though.

Skimming over it, I only found 2 NITs.

clippy_lints/Cargo.toml Show resolved Hide resolved
@xFrednet
Copy link
Member Author

xFrednet commented Mar 12, 2021

Another 600++ PR. 😄

Yes 😅 This code is luckily way more straight forward in my opinion it has two execution paths

  1. for collecting the lint information
  2. for resolving the applicability

It was sadly hard to split it up as each chapter would have been incomplete, but the rest will be split into smaller PRs. Mainly collecting depreciated lints, lint configuration and adapting the website.

To minimize the changes we have to do to the other infra, I suggest to save the .json file to the same path and name, as the export.py script currently does. I don't have a problem with changing/improving the format of the file though.

I would definitely change the export location, but wait a bit on the name until it can actually be used by the website :)

We could also think about moving the implementation into the clippy_dev tool with the new clippy_util refactorings that have been going on. This would just require a dependency to clippy_util and some wrapping. I wouldn't do it in this PR, but it's definitely a possibility to keep in mind :) (On second thought this would require some adjustments to collect the lint group as it's currently extracted from the LintStore. Still nothing that can't be done.)


Off-topic: About the website, I thought about rewriting it to either use a different framework or maybe be statically generated as it seems slow to display the first list. Also in comparison to https://rust-lang.github.io/rustfmt for instance. @phansch mentioned in #6877 the possibility of reusing an existing technology like mdbook. These are just some thoughts so far. I'll do some testing and then start a thread on Zulip to get some general feedback from the group :)

@xFrednet xFrednet force-pushed the 4310-internal-metadata-extraction-lint branch 2 times, most recently from 5c790af to bb26194 Compare March 12, 2021 20:52
@xFrednet
Copy link
Member Author

I've added a new environment value called ENABLE_METADATA_COLLECTION to prevent the accidental metadata collection while using dogfood. (Dogfood runs Clippy with all features enabled). This allows the dynamic choice of running the collection (if the lint was compiled).

The new comment to collect the metadata is:

$ ENABLE_METADATA_COLLECTION=1 cargo test --test dogfood --all-features

cargo test will no longer collect it by default

@bors
Copy link
Contributor

bors commented Mar 13, 2021

☔ The latest upstream changes (presumably #6865) made this pull request unmergeable. Please resolve the merge conflicts.

@xFrednet xFrednet force-pushed the 4310-internal-metadata-extraction-lint branch 3 times, most recently from 8f13e60 to c8e9d15 Compare March 13, 2021 18:21
@bors
Copy link
Contributor

bors commented Mar 16, 2021

☔ The latest upstream changes (presumably #6912) made this pull request unmergeable. Please resolve the merge conflicts.

@xFrednet
Copy link
Member Author

Hey a question before I resolve the merge conflicts. I think it would be possible to split this PR into two separate ones a ~300 lines (Still a lot 😅 ). Should I do that or have you already started to review this?

@flip1995
Copy link
Member

I started reviewing this over the weekend, but haven't got any comments about it, yet. Don't worry about rebasing this, since the main part of this is in its own module anyway. So I would suggest to just keep it like this until it is finished and then rebase it once.

@xFrednet
Copy link
Member Author

Awesome, sounds like a plan 🙃

Copy link
Member

@phansch phansch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good to me overall! I can't comment too much on the resolvers, though; my Rust is getting a bit rusty 😭


impl Drop for MetadataCollector {
/// You might ask: How hacky is this?
/// My answer: YES
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be possible to avoid the drop code if we append the lints to a file while the collector is running?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is sadly not possible since we collect data from several sources (lint definition, emission and later configuration). The order of this collection could differ between lints and rustc versions. It would therefor be hard to say when the last data was collected about a lint and when it should be written to a file.

The best solution for this hack would probably be to move it into clippy_dev with a linked requirement to clippy_util. Then we could just call a write function after the whole compilation. Moving this would however take extra work and code and might be something for the future :)

///
/// We use a Heap here to have the lints added in alphabetic order in the export
lints: BinaryHeap<LintMetadata>,
applicability_into: FxHashMap<String, ApplicabilityInfo>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
applicability_into: FxHashMap<String, ApplicabilityInfo>,
applicability_info: FxHashMap<String, ApplicabilityInfo>,

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A spelling mistake as usual ^^. I'll wait on flips review/approval and then fix this as well 🙃

Thank you very much for the review! <3

Copy link
Member

@flip1995 flip1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found another typo. Otherwise I think this is really well written and documented. I have yet to test it myself. I hope I can get to this and other things on my Clippy todo list over Easter.

With the NITs fixed I would suggest to just merge this as is and do fine tuning / CI integration / ... in follow up PRs.

}

/// Add a new single expression suggestion to the counter
fn add_singe_span_suggestion(&mut self) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn add_singe_span_suggestion(&mut self) {
fn add_single_span_suggestion(&mut self) {

cx,
INTERNAL_METADATA_COLLECTOR,
item.ident.span,
&format!("Metadata collection error for `{}`: {}", item.ident.name, message),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
&format!("Metadata collection error for `{}`: {}", item.ident.name, message),
&format!("metadata collection error for `{}`: {}", item.ident.name, message),

I really like that you use a lint message for error handling :D

Comment on lines +117 to +119
// The metadata collector gets dropped twice, this makes sure that we only write
// when the list is full
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it dropped twice? Is this something totally obvious I'm missing here?

I'm ok with using drop to write out the sugggestion. This is a quite good solution IMO.

Copy link
Member Author

@xFrednet xFrednet Mar 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that drop is called when the value gets out of scope (This is used by reference counting smart pointers to my knowledge) and that the compiler maybe constructs it twice? I'm not sure though, and I haven't investigated this further as I haven't quite figured out how to properly set breakpoints in Clippy 😅.

This is a quite good solution IMO

Thank you 🙃

@xFrednet xFrednet force-pushed the 4310-internal-metadata-extraction-lint branch from b4d8d1d to 72ed0b0 Compare March 31, 2021 18:34
@xFrednet
Copy link
Member Author

xFrednet commented Mar 31, 2021

Thank you two for the reviews! I've rebased from master and the branch is ready to be merged from my side.


I would suggest to just merge this as is and do fine tuning / CI integration / ... in follow up PRs.

Big thumps up from my side. The current version also doesn't collect depreciated lints and configurations which should both be added before the python script can retire. (The implementations of those features are ready in other branches xD )

My plan would be to get them merged and then adapt our current website to use the new format along with some other improvements (like a dark theme). @flip1995 you mentioned in #6992 that you would prefer the use of MdBook and @phansch has also suggested this in #6877. Do you think it's worth it to adapt the website to the new format, or would you prefer me to write a tool to generate a complete MD file for such a book? 🙃

@xFrednet xFrednet force-pushed the 4310-internal-metadata-extraction-lint branch from 72ed0b0 to 9e076d9 Compare March 31, 2021 18:38
@flip1995
Copy link
Member

Do you think it's worth it to adapt the website to the new format, or would you prefer me to write a tool to generate a complete MD file for such a book?

I think we would need a decision where we want to go with the Clippy documentation. I really like the filtering on the website, which won't be possible in a mdbook. But on the other hand a mdbook is more commonly used across Rust projects.

Let's first try to adapt the website to the new json format. cargo dev serve will be really helpful while doing this.

I'll do a final review of this PR over the weekend.

@bors
Copy link
Contributor

bors commented Apr 16, 2021

☔ The latest upstream changes (presumably #7049) made this pull request unmergeable. Please resolve the merge conflicts.

@flip1995
Copy link
Member

flip1995 commented Apr 30, 2021

I'll do a final review of this PR over the weekend.

I never said which weekend, and now its only Friday and I already did my final review! 😛 (Sorry about that, I try to put more time into Clippy again, but the ongoing lockdown really hurts my productivity and motivation for pretty much anything...)


Anyway, a few small changes, I'd like to see in this PR:

  1. add "internal-lints" to the list of implied deps in the Cargo.toml, so that you don't have to run this tool with --all-features, but can just enable this one feature.¹
  2. rebase
  3. @bors delegate+ Go ahead and r=me once the 2 points above are addressed, so that you don't need to keep rebasing the changes, if I forget about this PR again.

--

A. IIUC it doesn't parse the config of lints yet?
B. Does this tool depend on the lints being registered? Otherwise, you could just not register the lints, if the feature is enabled, so that dogfood runs way faster. This could look something like this: flip1995@fa539c9#diff-3767332344fc72c5464f444b995d7a5a7c7f6afa2559ba3207a5605a6e8c0dbf


¹

diff --git a/Cargo.toml b/Cargo.toml
index bdee8e408..23a035f97 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -52,7 +52,7 @@ rustc_tools_util = { version = "0.2.0", path = "rustc_tools_util" }
 deny-warnings = []
 integration = ["tempfile"]
 internal-lints = ["clippy_lints/internal-lints"]
-metadata-collector-lint = ["clippy_lints/metadata-collector-lint"]
+metadata-collector-lint = ["internal-lints", "clippy_lints/metadata-collector-lint"]

 [package.metadata.rust-analyzer]
 # This package uses #[feature(rustc_private)]

@bors
Copy link
Contributor

bors commented Apr 30, 2021

✌️ @xFrednet can now approve this pull request

@xFrednet
Copy link
Member Author

I never said which weekend, and now its only Friday and I already did my final review! stuck_out_tongue (Sorry about that, I try to put more time into Clippy again, but the ongoing lockdown really hurts my productivity and motivation for pretty much anything...)

True, I can't complain as this is still quite the monster PR ^^. I definitely know what you mean, we can at least slowly see a some light on the horizon!


Anyways. The two points seem to be simple enough. I have an assignment for my university that will continue to eat up most of my time until next week, but I'll address them afterwards. Thank you for the delegate! 🙃


A. IIUC it doesn't parse the config of lints yet?

  • Correct that implementation is in another branch to keep this PR a manageable size. Depreciated lints are also missing in this version. Both additions are around 60 to 120 lines. So. I'll kick that of when this is merged :)

B. Does this tool depend on the lints being registered? Otherwise, you could just not register the lints, if the feature is enabled, so that dogfood runs way faster. This could look something like this: flip1995@fa539c9#diff-3767332344fc72c5464f444b995d7a5a7c7f6afa2559ba3207a5605a6e8c0dbf

  • This is already implemented by these lines or do you mean something else?
    #[cfg(feature = "metadata-collector-lint")]
    {
        if std::env::var("ENABLE_METADATA_COLLECTION").eq(&Ok("1".to_string())) {
            store.register_late_pass(|| box utils::internal_lints::metadata_collector::MetadataCollector::default());
        }
    }

@flip1995
Copy link
Member

flip1995 commented Apr 30, 2021

This is already implemented by these lines or do you mean something else?

I meant to only register the metadata collection pass and no other lint pass. That would speed up dogfood a bit and therefore the metadata generation.

I think this tool also only runs on clippy_lints, right? In that case, you could filter out everything else from dogfood, if this feature is enabled. (I'm already thinking about CI/CD integration here, so this doesn't have to be addressed in this PR).

@xFrednet
Copy link
Member Author

I meant to only register the metadata collection pass and no other lint pass. That would speed up dogfood a bit and therefore the metadata generation.

Ahh, that makes sense. I actually believe that it doesn't require that no. Consider it added to the To-do list 🙃

@xFrednet xFrednet force-pushed the 4310-internal-metadata-extraction-lint branch from 9e076d9 to e0eb29c Compare May 5, 2021 16:59
@xFrednet
Copy link
Member Author

xFrednet commented May 5, 2021

@bors r+

@bors
Copy link
Contributor

bors commented May 5, 2021

📌 Commit e0eb29c has been approved by xFrednet

@bors
Copy link
Contributor

bors commented May 5, 2021

⌛ Testing commit e0eb29c with merge f35345d...

@xFrednet
Copy link
Member Author

xFrednet commented May 5, 2021

Bors apparently ignored your r=me @flip1995. I hope this still okay 😅

@flip1995
Copy link
Member

flip1995 commented May 5, 2021

Oh, you have to write r=<name>. r=me is nothing official for bors, just saying "I'm ok with merging this but it is waiting for something and I don't want to keep tracking this PR or do the work of writing r+ myself" 😄

But yeah, this is also fine.

@bors
Copy link
Contributor

bors commented May 5, 2021

☀️ Test successful - checks-action_dev_test, checks-action_remark_test, checks-action_test
Approved by: xFrednet
Pushing f35345d to master...

@bors bors merged commit f35345d into rust-lang:master May 5, 2021
@xFrednet
Copy link
Member Author

xFrednet commented May 5, 2021

Ahh okay, I know that for next time. Thank you ❤️

@xFrednet xFrednet deleted the 4310-internal-metadata-extraction-lint branch July 28, 2021 16:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants