Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cargo in Rust 1.66 changed location for source cache #11487

Closed
jonhoo opened this issue Dec 15, 2022 · 8 comments
Closed

Cargo in Rust 1.66 changed location for source cache #11487

jonhoo opened this issue Dec 15, 2022 · 8 comments
Labels
C-bug Category: bug regression-from-stable-to-stable Regression in stable that worked in a previous stable release.

Comments

@jonhoo
Copy link
Contributor

jonhoo commented Dec 15, 2022

Problem

In Rust 1.66.0, Cargo appears to be using a different hash for computing directory names in $CARGO_HOME/src/. This means that the cache from <=1.65.0 is no longer re-used. Furthermore, for anyone using Cargo as a library, if the version used as a library doesn't match the one used by the binary, they'll use different cache directories, causing everything to be extracted (and downloaded?) twice.

It doesn't appear to affect crates.io — not quite sure why. Do we hard-code its registry name somewhere perhaps?

Steps

$ rustup toolchain install 1.65.0
$ rustup toolchain install 1.66.0
$ cargo new cargo-version-hashing
$ cd cargo-version-hashing
$ echo 'serde = "1"' >> Cargo.toml
$ cargo generate-lockfile
$ mkdir .cargo
$ cargo local-registry -s Cargo.lock local-registry > .cargo/config.toml
$ env CARGO_HOME=$PWD/ch cargo +1.65.0 check
   Unpacking serde v1.0.150 (registry `/local/home/jongje/dev/tmp/cargo-version-hashing/local-registry`)
   Compiling serde v1.0.150
    Checking cargo-version-hashing v0.1.0 (/local/home/jongje/dev/tmp/cargo-version-hashing)
    Finished dev [unoptimized + debuginfo] target(s) in 4.18s
$ env CARGO_HOME=$PWD/ch cargo +1.66.0 check
   Unpacking serde v1.0.150 (registry `/local/home/jongje/dev/tmp/cargo-version-hashing/local-registry`)
   Compiling serde v1.0.150
    Checking cargo-version-hashing v0.1.0 (/local/home/jongje/dev/tmp/cargo-version-hashing)
    Finished dev [unoptimized + debuginfo] target(s) in 4.42s
$ exa -la ch/registry/src/
drwxr-xr-x - jongje 15 Dec 21:08 -168cb03ff00d48fc
drwxr-xr-x - jongje 15 Dec 21:08 -a4f4e04374b9ac0b
$ rm -rf ch
$ env CARGO_HOME=$PWD/ch cargo +1.65.0 check
   Unpacking serde v1.0.150 (registry `/local/home/jongje/dev/tmp/cargo-version-hashing/local-registry`)
    Finished dev [unoptimized + debuginfo] target(s) in 0.23s
$ env CARGO_HOME=$PWD/ch cargo +1.66.0 check
   Unpacking serde v1.0.150 (registry `/local/home/jongje/dev/tmp/cargo-version-hashing/local-registry`)
    Finished dev [unoptimized + debuginfo] target(s) in 0.24s
$ exa -la ch/registry/src/
drwxr-xr-x - jongje 15 Dec 21:09 -168cb03ff00d48fc
drwxr-xr-x - jongje 15 Dec 21:09 -a4f4e04374b9ac0b

Possible Solution(s)

The cause of this was #11209, which added a new variant to SourceKind, which in turn affects the derived impl Hash for SourceKind. That Hash impl is taken into account in impl Hash for SourceId here:

self.inner.kind.hash(into);

which in turn is used to generate a SourceId's "short name" here:

fn short_name(id: SourceId) -> String {
let hash = hex::short_hash(&id);
let ident = id.url().host_str().unwrap_or("").to_string();
format!("{}-{}", ident, hash)
}

I'm not 100% sure where crates.io diverges so it doesn't go through the same path, but it's definitely intentional given this test. I suspect it's because the crates.io SourceId is constructed specially:

pub fn crates_io(config: &Config) -> CargoResult<SourceId> {

Notes

No response

Version

cargo 1.66.0 (d65d197ad 2022-11-15)
release: 1.66.0
commit-hash: d65d197ad5c6c09234369f219f943e291d4f04b9
commit-date: 2022-11-15
host: aarch64-apple-darwin
libgit2: 1.5.0 (sys:0.15.0 vendored)
libcurl: 7.79.1 (sys:0.4.55+curl-7.83.1 system ssl:(SecureTransport) LibreSSL/3.3.6)
os: Mac OS 12.6.1 [64-bit]
@jonhoo jonhoo added the C-bug Category: bug label Dec 15, 2022
@weihanglo
Copy link
Member

I believe the order of enum variants matters. By moving SpareRegistry to the last, everything works fine.

Maybe we should release at least a new patch version of cargo crate.

@jonhoo
Copy link
Contributor Author

jonhoo commented Dec 15, 2022

I don't think that's true. At least from a basic test on the playground, the hash of the first variant changes if a second variant is added.

@weihanglo weihanglo added the regression-from-stable-to-stable Regression in stable that worked in a previous stable release. label Dec 15, 2022
@weihanglo
Copy link
Member

weihanglo commented Dec 16, 2022

It seems that single-variant enum has got a special treatment.

The result of cargo +nightly rustc -- -Zunpretty=expanded for enum { A }:

#![feature(prelude_import)]
#[prelude_import]
use std::prelude::rust_2021::*;
#[macro_use]
extern crate std;
enum E { A, }
#[automatically_derived]
impl ::core::hash::Hash for E {
    fn hash<__H: ::core::hash::Hasher>(&self, state: &mut __H) -> () {}
}

For enum { A, B }:

#![feature(prelude_import)]
#[prelude_import]
use std::prelude::rust_2021::*;
#[macro_use]
extern crate std;
enum E { A, B, }
#[automatically_derived]
impl ::core::hash::Hash for E {
    fn hash<__H: ::core::hash::Hasher>(&self, state: &mut __H) -> () {
        let __self_tag = ::core::intrinsics::discriminant_value(self);
        ::core::hash::Hash::hash(&__self_tag, state)
    }
}

It looks like, for enum containing 2 or more variants, the hash value of each variant is computed from discriminant_value.

@ehuss
Copy link
Contributor

ehuss commented Dec 19, 2022

@rust-lang/cargo Does anyone feel like this regression is worth pushing for a point release? Or maybe just backporting a fix to 1.67 and leave 1.66 as an outlier? Or just leave it as-is and let newer versions use the new hash.

To summarize: If someone uses cargo-local-registry, when cargo unpacks the .crate files in $CARGO_HOME/registry/src, it is placing them in a different directory than versions prior to 1.66. The main consequence is wasting disk space after upgrading to 1.66.

@jonhoo
Copy link
Contributor Author

jonhoo commented Dec 19, 2022

I believe this also applies to any registry that isn't crates.io for what it's worth.

@ehuss
Copy link
Contributor

ehuss commented Dec 19, 2022

I believe this also applies to any registry that isn't crates.io for what it's worth.

Can you provide an example for that? In my testing, it doesn't seem to be affected. AFAIK, the hash only changed for SourceKind::LocalRegistry and SourceKind::Directory.

@jonhoo
Copy link
Contributor Author

jonhoo commented Dec 20, 2022

Ohh, you're right, forgot about those being earlier variants in the enum!

In that case I'm in favor of closing this as a wontfix. Only a point release would actually help the misalignment between binary and library, and that doesn't feel worth it.

@ehuss
Copy link
Contributor

ehuss commented Jan 3, 2023

OK, I'm going to go ahead and close this as wontfix. #11501 added a test to check for regressions in the future. In the future it might be worthwhile to investigate a more reliable and stable approach to hashing.

@ehuss ehuss closed this as not planned Won't fix, can't repro, duplicate, stale Jan 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: bug regression-from-stable-to-stable Regression in stable that worked in a previous stable release.
Projects
None yet
Development

No branches or pull requests

3 participants