Improve performance of git status check in `cargo package`. #9478
r? @Eh2406 (rust-highfive has picked a reviewer for you, use r? to override)
@bors: r+ Seems like a nice win to me!
📌 Commit a200640 has been approved by
☀️ Test successful - checks-actions
Update cargo
8 commits in e51522ab3db23b0d8f1de54eb1f0113924896331..070e459c2d8b79c5b2ac5218064e7603329c92ae
2021-05-07 21:29:52 +0000 to 2021-05-11 18:12:23 +0000
- Fix rustdoc warnings (rust-lang/cargo#9468)
- Improve performance of git status check in `cargo package`. (rust-lang/cargo#9478)
- Link to the new rustc tests chapter. (rust-lang/cargo#9477)
- Bump index cache version to deal with semver metadata version mismatch. (rust-lang/cargo#9476)
- Fix Url::into_string deprecation warning (rust-lang/cargo#9475)
- Fix rust-lang/cargo#4482 and rust-lang/cargo#9449: set Fossil ignore and clean settings locally (rust-lang/cargo#9469)
- Improve two error messages (rust-lang/cargo#9472)
- Fix `cargo install` with a semver metadata version. (rust-lang/cargo#9467)
Thanks for the performance improvement, I wasn't even aware :).
@ehuss, I think I have found a probably not too uncommon case where this isn't necessarily the case. The culprit, I believe, is this line:
    status_opts
        .exclude_submodules(true)
        .include_ignored(true) ←
        .include_untracked(true);
…which can cause the publish to fail because it's alarmed by untracked files which are ignored and wouldn't be published anyway. As a concrete example, here is the publish attempt of a
(note that The top-level Now I wonder if a solution would be to pass all supposedly dirty files through the
I can't reproduce the problem. If you'd like, I recommend opening an issue with a reproduction (minimal if possible). Ignored files that aren't published should be excluded from the intersection here. Perhaps it is possible there is some kind of path normalization issue or something.
Thanks for the swift reply! If I read the exclusion check correctly, then There is something peculiar about these paths though:
So it refers to itself and does so in a case-insensitive manner. Maybe this is why these files have not been excluded in the first place. In any case, it's probably enough for me to cook up a test that reproduces the issue. Thanks for your help thus far!
The check for a dirty repository during packaging/publishing is quite slow. It was calling `status_file` for every packaged file, which is very expensive. I have a directory that had about 10,000 untracked files. Previously, cargo would hang for over 2 minutes without any output. With this PR, it finishes in 0.3 seconds.

The solution here is to collect the status information once, and then compare the package list against it.

One subtle point is that it does not use `recurse_untracked_dirs`, and instead relies on a primitive `starts_with` comparison, which I believe should be equivalent.

This still includes an inefficient n^2 algorithm, but I am too lazy to make a better approach.

I'm moderately confident this is pretty much the same as before (at least, all the scenarios I could think of).
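
For readers who want a picture of the approach, here is a minimal sketch of the "collect once, then compare" idea using only the standard library. It is not the code from this PR: the function name `dirty_package_files` and both parameters are hypothetical, and it assumes the dirty paths have already been gathered from a single repository-wide status query. Because untracked directories are not recursed into, a dirty entry may be a directory, so each packaged file is matched with a component-wise `Path::starts_with`.

```rust
use std::path::PathBuf;

/// Hypothetical helper, not cargo's actual implementation.
///
/// `package_files`: files that would go into the package (repo-relative).
/// `dirty_paths`: paths reported dirty by ONE status query; an untracked
/// directory shows up as a single entry rather than one entry per file.
fn dirty_package_files(package_files: &[PathBuf], dirty_paths: &[PathBuf]) -> Vec<PathBuf> {
    package_files
        .iter()
        // This is the n^2 part: every packaged file is checked against
        // every dirty path.
        .filter(|file| {
            dirty_paths
                .iter()
                // `Path::starts_with` compares whole path components, so
                // "src2/x.rs" does not match the dirty entry "src".
                .any(|dirty| file.starts_with(dirty))
        })
        .cloned()
        .collect()
}
```

Calling it with a package list and a status list returns only the packaged files that sit at or under a dirty entry; dirty paths that are not packaged (for example a build directory) never appear in the result.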