-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: create ephemeral workspace for git source #13689
base: master
Are you sure you want to change the base?
fix: create ephemeral workspace for git source #13689
Conversation
r? @weihanglo rustbot has assigned @weihanglo. Use |
Debugging note:
|
f74eabe
to
ce67365
Compare
ce67365
to
beeac46
Compare
Tested locally:
[build]
target-dir = "/Volumes/t7/code/cargo/target/tmp1"
./target/debug/cargo install --git https://github.com/hi-rustin/cargo-information --rev 5fd64bb6afe37092ee1dd7c518f4fc4865175c22
Compiling bytesize v1.3.0
Compiling cargo-information v0.4.2 (https://github.com/hi-rustin/cargo-information?rev=5fd64bb6afe37092ee1dd7c518f4fc4865175c22#5fd64bb6)
Finished `release` profile [optimized] target(s) in 1m 26s
Installing /Users/hi-rustin/.cargo/bin/cargo-info
Installed package `cargo-information v0.4.2 (https://github.com/hi-rustin/cargo-information?rev=5fd64bb6afe37092ee1dd7c518f4fc4865175c22#5fd64bb6)` (executable `cargo-info`)
md5 ~/.cargo/bin/cargo-info
MD5 (/Users/hi-rustin/.cargo/bin/cargo-info) = 3a7ccae20527098730fda436790ddf72
./target/debug/cargo install --git https://github.com/hi-rustin/cargo-information --rev e8b3419bc033d3252ad19081f516946c94e0c02e
Updating git repository `https://github.com/hi-rustin/cargo-information`
Updating crates.io index
Compiling cargo-information v0.4.2 (https://github.com/hi-rustin/cargo-information?rev=e8b3419bc033d3252ad19081f516946c94e0c02e#e8b3419b)
Finished `release` profile [optimized] target(s) in 7.85s
Replacing /Users/hi-rustin/.cargo/bin/cargo-info
Replaced package `cargo-information v0.4.2 (https://github.com/hi-rustin/cargo-information?rev=5fd64bb6afe37092ee1dd7c518f4fc4865175c22#5fd64bb6)` with `cargo-information v0.4.2 (https://github.com/hi-rustin/cargo-information?rev=e8b3419bc033d3252ad19081f516946c94e0c02e#e8b3419b)` (executable `cargo-info`)
md5 ~/.cargo/bin/cargo-info
MD5 (/Users/hi-rustin/.cargo/bin/cargo-info) = abd82eed7d6f3f88cf7a5371bf75d469 |
@weihanglo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking back into the history!
@@ -283,6 +283,10 @@ impl<'gctx> Workspace<'gctx> { | |||
ws.member_ids.insert(id); | |||
ws.default_members.push(ws.current_manifest.clone()); | |||
ws.set_resolve_behavior(); | |||
// The find_root function is used here to traverse the directory tree and locate the root of the workspace. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit worried about this change. The in-memory property isn't held for ephemeral workspaces after this patch. It is also a bit fragile because the workspace discovery logic is only guarded by the other place, which is not immediately clear and hard to track.
I wonder if we really need this line here. git_install_reads_workspace_manifest
seems to fail if we don't do find_root
, but does cargo ever use profile
or anything from there? Things we might want to figure out:
- What kind of workspace support
cargo install --git
current provide? - Can we skip this and still find all members and inheritable fields?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but does cargo ever use
profile
or anything from there?
After this PR we can only check the manifest for errors. Before that, I belive that we could use any information defined by the workspace. For example lints etc.
- What kind of workspace support
cargo install --git
current provide?
Before this PR, we just load the whole workspace and create a workspace based on the root manifest.
After this PR, we create the workspace based on the specific package and we don't use the information from the root manifest. Only searched the root to see if there is an error in the manifest.
- Can we skip this and still find all members and inheritable fields?
After this PR, we only checked the manifest, we don't respect workspace information anymore.
For example:
If we try cargo install --git https://github.com/hi-rustin/test-cargo-git-install bin-tool;
Before this change:
test-cargo-git-install on master via 🦀 v1.77.2
❯ cargo install --git https://github.com/hi-rustin/test-cargo-git-install bin-tool
Updating git repository `https://github.com/hi-rustin/test-cargo-git-install`
warning: virtual workspace defaulting to `resolver = "1"` despite one or more workspace members being on edition 2021 which implies `resolver = "2"`
note: to keep the current resolver, specify `workspace.resolver = "1"` in the workspace root's manifest
note: to use the edition 2021 resolver, specify `workspace.resolver = "2"` in the workspace root's manifest
note: for more details see https://doc.rust-lang.org/cargo/reference/resolver.html#resolver-versions
Installing bin-tool v0.1.0 (https://github.com/hi-rustin/test-cargo-git-install#d7a594e5)
Compiling lib1 v0.1.0 (/Users/hi-rustin/.cargo/git/checkouts/test-cargo-git-install-fae8728302b9b2da/d7a594e/lib1)
Compiling bin-tool v0.1.0 (/Users/hi-rustin/.cargo/git/checkouts/test-cargo-git-install-fae8728302b9b2da/d7a594e/bin-tool)
error: usage of an `unsafe` block
--> bin-tool/src/main.rs:2:5
|
2 | / unsafe {
3 | | println!("Hello, world!");
4 | | }
| |_____^
|
= note: requested on the command line with `-F unsafe-code`
warning: unnecessary `unsafe` block
--> bin-tool/src/main.rs:2:5
|
2 | unsafe {
| ^^^^^^ unnecessary `unsafe` block
|
= note: `#[warn(unused_unsafe)]` on by default
warning: `bin-tool` (bin "bin-tool") generated 1 warning
error: could not compile `bin-tool` (bin "bin-tool") due to 1 previous error; 1 warning emitted
error: failed to compile `bin-tool v0.1.0 (https://github.com/hi-rustin/test-cargo-git-install#d7a594e5)`, intermediate artifacts can be found at `/var/folders/j1/7l6snwpx6svgqxh79bz_d27m0000gn/T/cargo-installbO7SHc`.
To reuse those artifacts with a future compilation, set the environment variable `CARGO_TARGET_DIR` to that path.
After this change:
test-cargo-git-install on master via 🦀 v1.77.2
❯ ../cargo/target/debug/cargo install --git https://github.com/hi-rustin/test-cargo-git-install bin-tool
Updating git repository `https://github.com/hi-rustin/test-cargo-git-install`
Installing bin-tool v0.1.0 (https://github.com/hi-rustin/test-cargo-git-install#d7a594e5)
Compiling lib1 v0.1.0 (https://github.com/hi-rustin/test-cargo-git-install#d7a594e5)
Compiling bin-tool v0.1.0 (https://github.com/hi-rustin/test-cargo-git-install#d7a594e5)
Finished `release` profile [optimized] target(s) in 2.57s
Installing /Users/hi-rustin/.cargo/bin/bin-tool
Installed package `bin-tool v0.1.0 (https://github.com/hi-rustin/test-cargo-git-install#d7a594e5)` (executable `bin-tool`)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a workspace information perspective, I think this PR introduces some regression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a workspace information perspective, I think this PR introduces some regression.
workspace is a feature which should not be there as it complicates all optimizations? it is only there because CARGO_TARGET_DIR does not work very well, isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
workspace is already there so to move forward we need to first figure out the feature set cargo install --git
offers. However, I agree with soloturn, we may need to step back a bit to re-evaludate the bug.
Both workspace initialization and CARGO_TARGET_DIR
are symptoms of the bug. From my previous investigation, the bug is in workspace, members are collected as from PathSource
, hence cargo install
recognizes the as path dependencies. In contrast, when using in git
dependency, workspace members will be marked as from GitSource
. I wonder if we should mark the entire workspace with the same source as the root one for cargo install
. Thinking of nested packages rust-lang/rfcs#3452, if we had that today, we might also need this, otherwise nested package will be marked as from local path and never get updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
workspace is a feature which should not be there as it complicates all optimizations? it is only there because CARGO_TARGET_DIR does not work very well, isn't it?
I think it's hard to say the workspace feature complicates all optimizations. As my example in https://github.com/hi-rustin/test-cargo-git-install, respect workspace information will help us avoid inconsistency.
The problem is not caused by the CARGO_TARGET_DIR. If users set a CARGO_TARGET_DIR, then all compilation would happen in the CARGO_TARGET_DIR.
The problem here is as @weihanglo said, we cannot rebuild the binary after the git repository is updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should mark the entire workspace with the same source as the root one for
cargo install
. Thinking of nested packages rust-lang/rfcs#3452, if we had that today, we might also need this, otherwise nested package will be marked as from local path and never get updated.
That sounds like a reasonable solution, I will take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should mark the entire workspace with the same source as the root one for
cargo install
I've looked at how we calculate the fingerprint of the path member.
pub fn path_args(ws: &Workspace<'_>, unit: &Unit) -> (PathBuf, PathBuf) {
let ws_root = ws.root();
let src = match unit. target.src_path() {
TargetSourcePath::Path(path) => path.to_path_buf(),
TargetSourcePath::Metabuild => unit.pkg.manifest().metabuild_path(ws.target_dir()),
};
assert!(src.is_absolute());
if unit.pkg.package_id().source_id().is_path() {
if let Ok(path) = src.strip_prefix(ws_root) {
return (path.to_path_buf(), ws_root.to_path_buf());
}
}
(src, unit.pkg.root().to_path_buf())
}
It seems we always try to track the workspace as much as possible.
So I am not sure what you mean by mark the entire workspace with the same source
. Can you explain it a little bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we read a dependency from a Git source, no matter it is a single package or workspace member, it always returns a PackageId
with SourceKind::Git
. Under the hood it is actually a PathSource
assoicated with a Git SourceId
. This helps Cargo track the actual source.
cargo/src/cargo/sources/git/source.rs
Lines 360 to 365 in b9d913e
let source_id = self | |
.source_id | |
.with_git_precise(Some(actual_rev.to_string())); | |
let path_source = PathSource::new_recursive(&checkout_path, source_id, self.gctx); | |
self.path_source = Some(path_source); |
However, in cargo install
we use workspace to find members so we lost that sticky fashion.
) | ||
.run(); | ||
|
||
cargo_process(&format!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should have a comment or two pointing out we want to observe a rebuild?
Or we can actually do a shasum comparsion like the original did?
☔ The latest upstream changes (presumably #13769) made this pull request unmergeable. Please resolve the merge conflicts. |
What does this PR try to resolve?
close #13259
Root cause
The root cause of this problem is that cargo install creates a non-ephemeral workspace for git sources, that would try finding all member packages in that git repo, and add them as path dependencies.
However, when computing fingerprints to determine if a rebuild is needed, Cargo doesn't hash git revision into fingerprint directly. Instead, it checks if source file path has changed. For git source, it is an absolute path to ~/.cargo/git/checkouts//, so commit SHA is always there. For path dependency, it is relative to workspace root, so nothing change and Cargo doesn't rebuild even it is from a different git revision.
Sloution
If any time we create an ephemeral workspace, then we will update the git source every time and that would help us use the right code and rebuild the binary correctly.
There are some historical PRs related to this change:
cargo install
using a workspace target dir #5685.Workspace::new
, but it also caused thecargo install
not working correctly when using git repo #13259.So in my fix, I just rollback to use ephemera workspace but using
find_root
to achieve the same goal of searching the root manifest. You can find more in the test:git_install_reads_workspace_manifest
How should we test and review this PR?
Additional information
None