Add `safe.directories` config #10736

ehuss · 2022-06-08T20:58:00Z

This is an implementation of RFC 3279.

This only uses the CARGO_UNSTABLE_SAFE_DIRECTORIES=true environment variable instead of a -Z flag because -Z flags are loaded after the config files have been loaded.
I didn't feel comfortable trying to allow that to work (config is loaded early to handle aliases).

The config code needing some changing around so that the checks could be placed in one location.
There are two uses here. The normal config loading, and the config loading for the cargo config command.
This also changed because it needs to load the home directory config first instead of last, in order to fetch the safe.directories settings.

rust-highfive · 2022-06-08T20:58:03Z

@ehuss: no appropriate reviewer found, use r? to override

src/cargo/util/config/mod.rs

Muscraft · 2022-06-08T21:45:03Z

src/cargo/core/workspace.rs

+    // FIXME(ehuss): Loading and parsing manifests just to find the root seems
+    // very inefficient. I think this should be reconsidered.


The performance of this was tested before merging in this PR. There is some hit to the performance but not enough that it was a concern at the time. There was hope that any concerns about performance would be found by people testing the feature but none have come up so far.

If it is something that needs to be worked on the easiest fix would be to record workspace root paths once they have been found. Then each time a workspace root is needed you check if an already existing workspace root is the correct one for the current manifest. The idea is similar to what I described in this comment.

I just did a quick test of using workspace inheritance in the rust workspace. It increased the startup time from 358.8 ms to 449.1 ms. That's a pretty substantial increase. I didn't do any profiling to see if this particular thing is the culprit, but I suspect it is.

If you don't mind sharing what you did for the test, I would like to take a look at it. That is much higher than I have seen but I think that is because of the nesting in the rust repo.

I think caching paths is the best way to eliminate most of that but I don't know for certain. I will take a look at it and see what I can do. I don't want to clutter this PR so if a separate issue is better for further discussion I would be happy to open one.

Extract this file in benches/workspaces
rust2.zip

This is just a quick hack that adds inheritance to a bunch of packages, but I didn't really put much thought into setting it up.

Extract rust.tgz in that directory, too.

Prime the target directory with a cache of rustc info. In rust and rust2, run: cargo c -p linkchecker. Otherwise it would be measuring rustc overhead.

Test out some command that spends most of its time loading the workspace. There aren't very many good candidates, but cargo metadata or cargo tree can work. I use hyperfine for quick measurements. For example:

hyperfine "cd rust; $(rustup which cargo) metadata" "cd rust2; $(rustup which cargo) metadata"

We really should add workspace initialization benchmarks. If you want to work on that, that would be wonderful. I think that would just require adding a new benchmark to benches/benchsuite/benches that essentially just calls cargo::core::Workspace::new. Might have to factor out the common logic from resolve.rs that sets up the workspaces.

This centralizes the workspace discovery code into one location `find_workspace_root_with_loader` so that it isn't duplicated.

This reworks the walk_tree so that the home config can be loaded first in order to check safe_directories.

This adds a function that will convert an OwnershipError to a message explaining to the user what happened and how to fix it. This will be shared by both the config validation and the manifest validation.

This loads safe.directories from the home config, and then checks the ownership of all other directories.

This adds a check for file ownership when locating Cargo.toml files.

epage · 2022-06-10T16:36:42Z

src/cargo/util/config/mod.rs

+        // Note: Unlike other Cargo environment variables, this does not
+        // assume paths relative to the current directory (only absolute paths
+        // are supported). This is intended as an extra layer of safety.


I didn't see this in the RFC

For my use of git's equivalent, I did use a . though I did it through -c rather than an env variable

We are only enforcing this contract by the fact that the a . won't match anything, correct? Should we more explicitly tell the user, either via a warning or error, that their safe directory setting will do nothing?

epage · 2022-06-10T16:38:56Z

src/doc/src/reference/unstable.md

+* Tracking Issue: TODO
+* RFC: [#3279](https://github.com/rust-lang/rfcs/pull/3279)
+
+The `CARGO_UNSTABLE_SAFE_DIRECTORIES=true` environment variable enables a mode where Cargo will check the ownership of `Cargo.toml` and `config.toml` files.


Once this is merged, should we do a Call for Testing in TWiR?

bors · 2022-06-17T17:29:02Z

☔ The latest upstream changes (presumably #10764) made this pull request unmergeable. Please resolve the merge conflicts.

Add a benchmark for workspace initialization It [was suggested](#10736 (comment)) that a benchmark for workspace initialization should be added. This was suggested because there were issues with the performance of [workspace inheritance](#10747) as well as a general way to track the workspace initialization time across cargo changes ### Changes - Moved common functions out of `resolve.rs` to a shared `lib.rs` - Added a new struct to be used when creating a new benchmark - This was done because `env!("CARGO_TARGET_TMPDIR")` would fail to compile when put inside of the new `lib.rs` - Added a new workspace test for workspace inheritance - This new workspace does not have a repo that it was built from and if one needs to be made I can change that

add a cache for discovered workspace roots ## History `@ehuss` [noticed that](#10736 (comment)) workspace inheritance caused a significant increase in startup times when using workspace inheritance. This brought up the creation of #10747. When using a similar test setup [to the original](#10736 (comment)) I got ``` Benchmark 1: cd rust; ../../../target/release/cargo metadata Time (mean ± σ): 149.4 ms ± 3.8 ms [User: 105.9 ms, System: 31.7 ms] Range (min … max): 144.2 ms … 162.2 ms 19 runs Benchmark 2: cd rust-ws-inherit; ../../../target/release/cargo metadata Time (mean ± σ): 191.6 ms ± 1.4 ms [User: 145.9 ms, System: 34.2 ms] Range (min … max): 188.8 ms … 193.9 ms 15 runs ``` This showed a large increase in time per cargo command when using workspace inheritance. During the investigation of this issue, other [performance concerns were found and addressed](#10761). This resulted in a drop in time across the board but heavily favored workspace inheritance. ``` Benchmark 1: cd rust; ../../../target/release/cargo metadata Time (mean ± σ): 139.3 ms ± 1.7 ms [User: 99.8 ms, System: 29.4 ms] Range (min … max): 137.1 ms … 144.5 ms 20 runs Benchmark 2: cd rust-ws-inherit; ../../../target/release/cargo metadata Time (mean ± σ): 161.7 ms ± 1.4 ms [User: 120.4 ms, System: 31.2 ms] Range (min … max): 158.0 ms … 164.6 ms 18 runs ``` ## Performance after changes `hyperfine --warmup 10 "cd rust; ../../../target/release/cargo metadata" "cd rust-ws-inherit; ../../../target/release/cargo metadata" --runs 40` ``` Benchmark 1: cd rust; ../../../target/release/cargo metadata Time (mean ± σ): 140.1 ms ± 1.5 ms [User: 99.5 ms, System: 30.7 ms] Range (min … max): 137.4 ms … 144.0 ms 40 runs Benchmark 2: cd rust-ws-inherit; ../../../target/release/cargo metadata Time (mean ± σ): 141.8 ms ± 1.6 ms [User: 100.9 ms, System: 30.9 ms] Range (min … max): 138.4 ms … 145.4 ms 40 runs ``` [New Benchmark](#10754) `cargo bench -- workspace_initialization/rust` ``` workspace_initialization/rust time: [14.779 ms 14.880 ms 14.997 ms] workspace_initialization/rust-ws-inherit time: [16.235 ms 16.293 ms 16.359 ms] ``` ## Changes Made - [Pulled a commit](bbd41a4) from `@ehuss` that deduplicated finding a workspace root to make the changes easier - Added a cache in `Config` to hold found `WorkspaceRootConfig`s - This makes it so manifests should only be parsed once - Made `WorkspaceRootConfig` get added to the cache when parsing a manifest ## Testing Steps To check the new benchmark: 1. `cd benches/benchsuite` 2. `cargo bench -- workspace_initialization/rust` Using `hyperfine`: 1. run `cargo build --release` 2. extract `rust` and `rust-ws-inherit` in `benches/workspaces` 3. cd `benches/workspaces` 4. Prime the target directory with a cache of `rustc` info. In `rust` and `rust-ws-inherit`, run: `cargo +nightly c -p linkchecker`. Otherwise it would be measuring `rustc` overhead. 4. run `hyperfine --warmup 10 "cd rust; ../../../target/release/cargo metadata" "cd rust-ws-inherit; ../../../target/release/cargo metadata" --runs 40` closes #10747

weihanglo

This patch is really comprehensive and covers the current RFC well. Thank you!

One thing I noticed: If both .cargo/config and .cargo/config.toml exist, the current implementation seems check .cargo/config only for backward compatible reason. This behaviour is not new for this PR and looks good to me, though not written in the RFC. OTOH, the RFC does state that rustup would check rust-toolchain and rust-toolchain.toml if both exist. Is this deviation intentional?

Edited: I was wrong. The current PR against rustup doesn't check both of those files.

weihanglo · 2022-07-21T00:48:11Z

src/cargo/util/errors.rs

+    config: &Config,
+) -> anyhow::Error {
+    let to_approve = if std::env::var_os("RUSTUP_TOOLCHAIN").is_some() {
+        // FIXME: shell_escape doesn't handle powershell (and possibly other shells)


I might be overthinking this. Will there be a case that a malicious attacker creates a path that fails to escape in a specific shell, and then a user runs that without scrutiny?

weihanglo · 2022-07-21T08:20:47Z

src/cargo/util/config/mod.rs

        let mut result = Vec::new();
-        let mut seen = HashSet::new();
-        let home = self.home_path.clone().into_path_unlocked();
-        self.walk_tree(&self.cwd, &home, |path| {
-            let mut cv = self._load_file(path, &mut seen, false)?;
+        for mut cv in cvs {
            if self.cli_unstable().config_include {
                self.load_unmerged_include(&mut cv, &mut seen, &mut result)?;
            }
            result.push(cv);
-            Ok(())
-        })
-        .with_context(|| "could not load Cargo configuration")?;
+        }


(optional) could we avoid unnecessary vec allocation here, as config-include is still unstable?

Suggested change

let mut result = Vec::new();

let mut seen = HashSet::new();

let home = self.home_path.clone().into_path_unlocked();

self.walk_tree(&self.cwd, &home, |path| {

let mut cv = self._load_file(path, &mut seen, false)?;

for mut cv in cvs {

if self.cli_unstable().config_include {

self.load_unmerged_include(&mut cv, &mut seen, &mut result)?;

}

result.push(cv);

Ok(())

})

.with_context(|| "could not load Cargo configuration")?;

}

if self.cli_unstable().config_include {

let mut result = Vec::new();

for mut cv in cvs {

self.load_unmerged_include(&mut cv, &mut seen, &mut result)?;

result.push(cv);

}

Ok(result)

} else {

Ok(cvs)

}

weihanglo · 2022-07-22T21:13:48Z

crates/cargo-util/src/paths/ownership.rs

+        }
+    }
+    // This is used for testing to simulate a failure.
+    let simulate = match env::var_os("__CARGO_TEST_OWNERSHIP") {


Do we want to ship this in production? I was thinking that maybe putting this behind debug_assertion.

weihanglo · 2022-07-22T22:08:59Z

crates/cargo-util/src/paths/ownership.rs

+    fn get_username(uid: u32) -> String {
+        unsafe {
+            let amt = match libc::sysconf(libc::_SC_GETPW_R_SIZE_MAX) {
+                n if n < 0 => 512 as usize,


Sorry a stupid question. Where does 512 come from?

weihanglo · 2022-07-23T11:52:46Z

crates/cargo-util/src/paths/ownership.rs

+        let result = GetNamedSecurityInfoW(
+            path_w.as_ptr(),
+            SE_FILE_OBJECT,
+            OWNER_SECURITY_INFORMATION | DACL_SECURITY_INFORMATION,


Why do we need DACL_SECURITY_INFORMATION here?

weihanglo · 2022-07-23T12:22:51Z

crates/cargo-util/src/paths/ownership.rs

+    }
+}
+
+#[cfg(windows)]


Thank you @ehuss for building this up.

To whom that is not familiar with winapi like me, here are some important links to API used in this PR.

OpenProcessToken

GetTokenInformation

GetNamedSecurityInfoW

LookupAccountSidW

Each API handles errors and allocations a bit differently. For me, I usually heed these things:

Do we need to call a special finalizer for a certain value, e.g. CloseHandle and LocalFree?

Does the API return an error or do we need to retrieve from GetLastError?

Mind the encoding difference. UTF-16 is prevalent on Windows.

Just my personal note. Hope it helps other reviewers.

weihanglo · 2022-07-23T14:17:44Z

tests/testsuite/safe_directories.rs

+        .file("src/lib.rs", "")
+        .file("myconfig.toml", "build.jobs = 1")
+        .build();
+    p.cargo("check --config myconfig.toml -Zunstable-options")


--config is about to stabilize in 1.63.0

(and ditto throughout)

Suggested change

p.cargo("check --config myconfig.toml -Zunstable-options")

p.cargo("check --config myconfig.toml")

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 8, 2022

ehuss force-pushed the safe-directories branch from c11b02a to ef0ee72 Compare June 8, 2022 21:15

epage reviewed Jun 8, 2022

View reviewed changes

src/cargo/util/config/mod.rs Show resolved Hide resolved

ehuss force-pushed the safe-directories branch from ef0ee72 to d9eb57b Compare June 8, 2022 21:28

Muscraft reviewed Jun 8, 2022

View reviewed changes

ehuss added 9 commits June 9, 2022 17:04

Add metadata and symlink_metadata functions to cargo_util::paths

fb68ae4

Add ownership validation to cargo_util.

2189462

Rework how workspace roots are discovered.

bbd41a4

This centralizes the workspace discovery code into one location `find_workspace_root_with_loader` so that it isn't duplicated.

Rework config walk_tree in preparation for safe_directories

b2d8a44

This reworks the walk_tree so that the home config can be loaded first in order to check safe_directories.

Add OwnershipError converter.

fe72b8b

This adds a function that will convert an OwnershipError to a message explaining to the user what happened and how to fix it. This will be shared by both the config validation and the manifest validation.

Load safe.directories config and check it for config.

a31c16f

This loads safe.directories from the home config, and then checks the ownership of all other directories.

Check ownership during Cargo.toml discovery.

a9e8556

This adds a check for file ownership when locating Cargo.toml files.

Add safe.directories testsuite.

25a21df

Add documentation for safe.directories.

e966aa7

ehuss force-pushed the safe-directories branch from d9eb57b to e966aa7 Compare June 10, 2022 00:36

epage reviewed Jun 10, 2022

View reviewed changes

This was referenced Jun 10, 2022

workspace inheritiance slowing down cargo startup times #10747

Closed

Add a benchmark for workspace initialization #10754

Merged

Add preloading for workspace packages in resolve_with_previous #10761

Merged

Muscraft mentioned this pull request Jun 20, 2022

add a cache for discovered workspace roots #10776

Merged

weihanglo reviewed Jul 23, 2022

View reviewed changes

weihanglo added S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 15, 2022

weihanglo mentioned this pull request Sep 13, 2022

Aliases should not be able to shadow external subcommands #10049

Open

weihanglo mentioned this pull request Sep 18, 2022

Cargo uses configuration from $HOME/.cargo if $PWD is under $HOME even if $CARGO_HOME is set #11045

Closed

epage mentioned this pull request Oct 12, 2022

Cargo build tries to read files that it has no permissions for (Permission denied (os error 13)) #11220

Closed

weihanglo added S-blocked and removed S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author. labels Dec 20, 2022

weihanglo added S-blocked-external Status: ❌ blocked on something out of the direct control of the Cargo project, e.g., upstream fix and removed S-blocked labels Apr 18, 2023

weihanglo added A-configuration Area: cargo config files and env vars A-rustup Area: rustup interaction labels May 3, 2023

weihanglo mentioned this pull request May 3, 2023

A Cargo.toml file in parent directory which you cannot delete prevents cargo from building anything #12080

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `safe.directories` config #10736

Add `safe.directories` config #10736

ehuss commented Jun 8, 2022

rust-highfive commented Jun 8, 2022

Muscraft Jun 8, 2022

ehuss Jun 10, 2022

Muscraft Jun 10, 2022

ehuss Jun 10, 2022

epage Jun 10, 2022

epage Jun 10, 2022

bors commented Jun 17, 2022

weihanglo left a comment •

edited

Loading

weihanglo Jul 21, 2022

weihanglo Jul 21, 2022

weihanglo Jul 22, 2022

weihanglo Jul 22, 2022

weihanglo Jul 23, 2022

weihanglo Jul 23, 2022

weihanglo Jul 23, 2022 •

edited

Loading

		// FIXME(ehuss): Loading and parsing manifests just to find the root seems
		// very inefficient. I think this should be reconsidered.

	p.cargo("check --config myconfig.toml -Zunstable-options")
	p.cargo("check --config myconfig.toml")

Add safe.directories config #10736

Are you sure you want to change the base?

Add safe.directories config #10736

Conversation

ehuss commented Jun 8, 2022

rust-highfive commented Jun 8, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bors commented Jun 17, 2022

weihanglo left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

weihanglo Jul 23, 2022 • edited Loading

Choose a reason for hiding this comment

Add `safe.directories` config #10736

Add `safe.directories` config #10736

weihanglo left a comment •

edited

Loading

weihanglo Jul 23, 2022 •

edited

Loading