-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add safe.directories
config
#10736
base: master
Are you sure you want to change the base?
Add safe.directories
config
#10736
Conversation
@ehuss: no appropriate reviewer found, use r? to override |
// FIXME(ehuss): Loading and parsing manifests just to find the root seems | ||
// very inefficient. I think this should be reconsidered. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The performance of this was tested before merging in this PR. There is some hit to the performance but not enough that it was a concern at the time. There was hope that any concerns about performance would be found by people testing the feature but none have come up so far.
If it is something that needs to be worked on the easiest fix would be to record workspace root paths once they have been found. Then each time a workspace root is needed you check if an already existing workspace root is the correct one for the current manifest. The idea is similar to what I described in this comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just did a quick test of using workspace inheritance in the rust workspace. It increased the startup time from 358.8 ms to 449.1 ms. That's a pretty substantial increase. I didn't do any profiling to see if this particular thing is the culprit, but I suspect it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you don't mind sharing what you did for the test, I would like to take a look at it. That is much higher than I have seen but I think that is because of the nesting in the rust repo.
I think caching paths is the best way to eliminate most of that but I don't know for certain. I will take a look at it and see what I can do. I don't want to clutter this PR so if a separate issue is better for further discussion I would be happy to open one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
Extract this file in
benches/workspaces
rust2.zipThis is just a quick hack that adds inheritance to a bunch of packages, but I didn't really put much thought into setting it up.
-
Extract
rust.tgz
in that directory, too. -
Prime the target directory with a cache of
rustc
info. Inrust
andrust2
, run:cargo c -p linkchecker
. Otherwise it would be measuringrustc
overhead. -
Test out some command that spends most of its time loading the workspace. There aren't very many good candidates, but
cargo metadata
orcargo tree
can work. I use hyperfine for quick measurements. For example:hyperfine "cd rust; $(rustup which cargo) metadata" "cd rust2; $(rustup which cargo) metadata"
We really should add workspace initialization benchmarks. If you want to work on that, that would be wonderful. I think that would just require adding a new benchmark to benches/benchsuite/benches
that essentially just calls cargo::core::Workspace::new
. Might have to factor out the common logic from resolve.rs
that sets up the workspaces.
This centralizes the workspace discovery code into one location `find_workspace_root_with_loader` so that it isn't duplicated.
This reworks the walk_tree so that the home config can be loaded first in order to check safe_directories.
This adds a function that will convert an OwnershipError to a message explaining to the user what happened and how to fix it. This will be shared by both the config validation and the manifest validation.
This loads safe.directories from the home config, and then checks the ownership of all other directories.
This adds a check for file ownership when locating Cargo.toml files.
// Note: Unlike other Cargo environment variables, this does not | ||
// assume paths relative to the current directory (only absolute paths | ||
// are supported). This is intended as an extra layer of safety. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I didn't see this in the RFC
- For my use of git's equivalent, I did use a
.
though I did it through-c
rather than an env variable - We are only enforcing this contract by the fact that the a
.
won't match anything, correct? Should we more explicitly tell the user, either via a warning or error, that their safe directory setting will do nothing?
* Tracking Issue: TODO | ||
* RFC: [#3279](https://github.com/rust-lang/rfcs/pull/3279) | ||
|
||
The `CARGO_UNSTABLE_SAFE_DIRECTORIES=true` environment variable enables a mode where Cargo will check the ownership of `Cargo.toml` and `config.toml` files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once this is merged, should we do a Call for Testing in TWiR?
☔ The latest upstream changes (presumably #10764) made this pull request unmergeable. Please resolve the merge conflicts. |
Add a benchmark for workspace initialization It [was suggested](#10736 (comment)) that a benchmark for workspace initialization should be added. This was suggested because there were issues with the performance of [workspace inheritance](#10747) as well as a general way to track the workspace initialization time across cargo changes ### Changes - Moved common functions out of `resolve.rs` to a shared `lib.rs` - Added a new struct to be used when creating a new benchmark - This was done because `env!("CARGO_TARGET_TMPDIR")` would fail to compile when put inside of the new `lib.rs` - Added a new workspace test for workspace inheritance - This new workspace does not have a repo that it was built from and if one needs to be made I can change that
add a cache for discovered workspace roots ## History `@ehuss` [noticed that](#10736 (comment)) workspace inheritance caused a significant increase in startup times when using workspace inheritance. This brought up the creation of #10747. When using a similar test setup [to the original](#10736 (comment)) I got ``` Benchmark 1: cd rust; ../../../target/release/cargo metadata Time (mean ± σ): 149.4 ms ± 3.8 ms [User: 105.9 ms, System: 31.7 ms] Range (min … max): 144.2 ms … 162.2 ms 19 runs Benchmark 2: cd rust-ws-inherit; ../../../target/release/cargo metadata Time (mean ± σ): 191.6 ms ± 1.4 ms [User: 145.9 ms, System: 34.2 ms] Range (min … max): 188.8 ms … 193.9 ms 15 runs ``` This showed a large increase in time per cargo command when using workspace inheritance. During the investigation of this issue, other [performance concerns were found and addressed](#10761). This resulted in a drop in time across the board but heavily favored workspace inheritance. ``` Benchmark 1: cd rust; ../../../target/release/cargo metadata Time (mean ± σ): 139.3 ms ± 1.7 ms [User: 99.8 ms, System: 29.4 ms] Range (min … max): 137.1 ms … 144.5 ms 20 runs Benchmark 2: cd rust-ws-inherit; ../../../target/release/cargo metadata Time (mean ± σ): 161.7 ms ± 1.4 ms [User: 120.4 ms, System: 31.2 ms] Range (min … max): 158.0 ms … 164.6 ms 18 runs ``` ## Performance after changes `hyperfine --warmup 10 "cd rust; ../../../target/release/cargo metadata" "cd rust-ws-inherit; ../../../target/release/cargo metadata" --runs 40` ``` Benchmark 1: cd rust; ../../../target/release/cargo metadata Time (mean ± σ): 140.1 ms ± 1.5 ms [User: 99.5 ms, System: 30.7 ms] Range (min … max): 137.4 ms … 144.0 ms 40 runs Benchmark 2: cd rust-ws-inherit; ../../../target/release/cargo metadata Time (mean ± σ): 141.8 ms ± 1.6 ms [User: 100.9 ms, System: 30.9 ms] Range (min … max): 138.4 ms … 145.4 ms 40 runs ``` [New Benchmark](#10754) `cargo bench -- workspace_initialization/rust` ``` workspace_initialization/rust time: [14.779 ms 14.880 ms 14.997 ms] workspace_initialization/rust-ws-inherit time: [16.235 ms 16.293 ms 16.359 ms] ``` ## Changes Made - [Pulled a commit](bbd41a4) from `@ehuss` that deduplicated finding a workspace root to make the changes easier - Added a cache in `Config` to hold found `WorkspaceRootConfig`s - This makes it so manifests should only be parsed once - Made `WorkspaceRootConfig` get added to the cache when parsing a manifest ## Testing Steps To check the new benchmark: 1. `cd benches/benchsuite` 2. `cargo bench -- workspace_initialization/rust` Using `hyperfine`: 1. run `cargo build --release` 2. extract `rust` and `rust-ws-inherit` in `benches/workspaces` 3. cd `benches/workspaces` 4. Prime the target directory with a cache of `rustc` info. In `rust` and `rust-ws-inherit`, run: `cargo +nightly c -p linkchecker`. Otherwise it would be measuring `rustc` overhead. 4. run `hyperfine --warmup 10 "cd rust; ../../../target/release/cargo metadata" "cd rust-ws-inherit; ../../../target/release/cargo metadata" --runs 40` closes #10747
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This patch is really comprehensive and covers the current RFC well. Thank you!
One thing I noticed: If both .cargo/config
and .cargo/config.toml
exist, the current implementation seems check .cargo/config
only for backward compatible reason. This behaviour is not new for this PR and looks good to me, though not written in the RFC. OTOH, the RFC does state that rustup would check rust-toolchain
and rust-toolchain.toml
if both exist. Is this deviation intentional?
Edited: I was wrong. The current PR against rustup doesn't check both of those files.
config: &Config, | ||
) -> anyhow::Error { | ||
let to_approve = if std::env::var_os("RUSTUP_TOOLCHAIN").is_some() { | ||
// FIXME: shell_escape doesn't handle powershell (and possibly other shells) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be overthinking this. Will there be a case that a malicious attacker creates a path that fails to escape in a specific shell, and then a user runs that without scrutiny?
let mut result = Vec::new(); | ||
let mut seen = HashSet::new(); | ||
let home = self.home_path.clone().into_path_unlocked(); | ||
self.walk_tree(&self.cwd, &home, |path| { | ||
let mut cv = self._load_file(path, &mut seen, false)?; | ||
for mut cv in cvs { | ||
if self.cli_unstable().config_include { | ||
self.load_unmerged_include(&mut cv, &mut seen, &mut result)?; | ||
} | ||
result.push(cv); | ||
Ok(()) | ||
}) | ||
.with_context(|| "could not load Cargo configuration")?; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(optional) could we avoid unnecessary vec allocation here, as config-include is still unstable?
let mut result = Vec::new(); | |
let mut seen = HashSet::new(); | |
let home = self.home_path.clone().into_path_unlocked(); | |
self.walk_tree(&self.cwd, &home, |path| { | |
let mut cv = self._load_file(path, &mut seen, false)?; | |
for mut cv in cvs { | |
if self.cli_unstable().config_include { | |
self.load_unmerged_include(&mut cv, &mut seen, &mut result)?; | |
} | |
result.push(cv); | |
Ok(()) | |
}) | |
.with_context(|| "could not load Cargo configuration")?; | |
} | |
if self.cli_unstable().config_include { | |
let mut result = Vec::new(); | |
for mut cv in cvs { | |
self.load_unmerged_include(&mut cv, &mut seen, &mut result)?; | |
result.push(cv); | |
} | |
Ok(result) | |
} else { | |
Ok(cvs) | |
} |
} | ||
} | ||
// This is used for testing to simulate a failure. | ||
let simulate = match env::var_os("__CARGO_TEST_OWNERSHIP") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to ship this in production? I was thinking that maybe putting this behind debug_assertion
.
fn get_username(uid: u32) -> String { | ||
unsafe { | ||
let amt = match libc::sysconf(libc::_SC_GETPW_R_SIZE_MAX) { | ||
n if n < 0 => 512 as usize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry a stupid question. Where does 512 come from?
let result = GetNamedSecurityInfoW( | ||
path_w.as_ptr(), | ||
SE_FILE_OBJECT, | ||
OWNER_SECURITY_INFORMATION | DACL_SECURITY_INFORMATION, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need DACL_SECURITY_INFORMATION
here?
} | ||
} | ||
|
||
#[cfg(windows)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @ehuss for building this up.
To whom that is not familiar with winapi like me, here are some important links to API used in this PR.
Each API handles errors and allocations a bit differently. For me, I usually heed these things:
- Do we need to call a special finalizer for a certain value, e.g.
CloseHandle
andLocalFree
? - Does the API return an error or do we need to retrieve from
GetLastError
? - Mind the encoding difference. UTF-16 is prevalent on Windows.
Just my personal note. Hope it helps other reviewers.
.file("src/lib.rs", "") | ||
.file("myconfig.toml", "build.jobs = 1") | ||
.build(); | ||
p.cargo("check --config myconfig.toml -Zunstable-options") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--config
is about to stabilize in 1.63.0
(and ditto throughout)
p.cargo("check --config myconfig.toml -Zunstable-options") | |
p.cargo("check --config myconfig.toml") |
This is an implementation of RFC 3279.
This only uses the
CARGO_UNSTABLE_SAFE_DIRECTORIES=true
environment variable instead of a-Z
flag because-Z
flags are loaded after the config files have been loaded.I didn't feel comfortable trying to allow that to work (config is loaded early to handle aliases).
The config code needing some changing around so that the checks could be placed in one location.
There are two uses here. The normal config loading, and the config loading for the
cargo config
command.This also changed because it needs to load the home directory config first instead of last, in order to fetch the
safe.directories
settings.