Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add hash-checking support to install and sync #2945

Merged
merged 3 commits into from
Apr 10, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 19 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -94,6 +94,7 @@ indoc = { version = "2.0.4" }
itertools = { version = "0.12.1" }
junction = { version = "1.0.0" }
mailparse = { version = "0.14.0" }
md-5 = { version = "0.10.6" }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to support md5? It's cryptographically broken

miette = { version = "7.2.0" }
nanoid = { version = "0.4.0" }
once_cell = { version = "1.19.0" }
Expand Down
8 changes: 0 additions & 8 deletions PIP_COMPATIBILITY.md
Original file line number Diff line number Diff line change
Expand Up @@ -259,14 +259,6 @@ When uv resolutions differ from `pip` in undesirable ways, it's often a sign tha
are too loose, and that the user should consider tightening them. For example, in the case of
`starlette` and `fastapi`, the user could require `fastapi>=0.110.0`.

## Hash-checking mode

While uv will include hashes via `uv pip compile --generate-hashes`, it does not support
hash-checking mode, which is a feature of `pip` that allows users to verify the integrity of
downloaded packages by checking their hashes against those provided in the `requirements.txt` file.

In the future, uv will support hash-checking mode. For more, see [#474](https://github.com/astral-sh/uv/issues/474).

## `pip check`

At present, `uv pip check` will surface the following diagnostics:
Expand Down
48 changes: 40 additions & 8 deletions crates/distribution-types/src/cached.rs
Original file line number Diff line number Diff line change
Expand Up @@ -4,9 +4,11 @@ use anyhow::Result;

use distribution_filename::WheelFilename;
use pep508_rs::VerbatimUrl;
use pypi_types::HashDigest;
use uv_normalize::PackageName;

use crate::direct_url::{DirectUrl, LocalFileUrl};
use crate::hashed::Hashed;
use crate::{
BuiltDist, Dist, DistributionMetadata, InstalledMetadata, InstalledVersion, Name, SourceDist,
VersionOrUrl,
Expand All @@ -25,6 +27,7 @@ pub enum CachedDist {
pub struct CachedRegistryDist {
pub filename: WheelFilename,
pub path: PathBuf,
pub hashes: Vec<HashDigest>,
}

#[derive(Debug, Clone)]
Expand All @@ -33,45 +36,60 @@ pub struct CachedDirectUrlDist {
pub url: VerbatimUrl,
pub path: PathBuf,
pub editable: bool,
pub hashes: Vec<HashDigest>,
}

impl CachedDist {
/// Initialize a [`CachedDist`] from a [`Dist`].
pub fn from_remote(remote: Dist, filename: WheelFilename, path: PathBuf) -> Self {
pub fn from_remote(
remote: Dist,
filename: WheelFilename,
hashes: Vec<HashDigest>,
path: PathBuf,
) -> Self {
match remote {
Dist::Built(BuiltDist::Registry(_dist)) => {
Self::Registry(CachedRegistryDist { filename, path })
}
Dist::Built(BuiltDist::Registry(_dist)) => Self::Registry(CachedRegistryDist {
filename,
path,
hashes,
}),
Dist::Built(BuiltDist::DirectUrl(dist)) => Self::Url(CachedDirectUrlDist {
filename,
url: dist.url,
hashes,
path,
editable: false,
}),
Dist::Built(BuiltDist::Path(dist)) => Self::Url(CachedDirectUrlDist {
filename,
url: dist.url,
hashes,
path,
editable: false,
}),
Dist::Source(SourceDist::Registry(_dist)) => {
Self::Registry(CachedRegistryDist { filename, path })
}
Dist::Source(SourceDist::Registry(_dist)) => Self::Registry(CachedRegistryDist {
filename,
path,
hashes,
}),
Dist::Source(SourceDist::DirectUrl(dist)) => Self::Url(CachedDirectUrlDist {
filename,
url: dist.url,
hashes,
path,
editable: false,
}),
Dist::Source(SourceDist::Git(dist)) => Self::Url(CachedDirectUrlDist {
filename,
url: dist.url,
hashes,
path,
editable: false,
}),
Dist::Source(SourceDist::Path(dist)) => Self::Url(CachedDirectUrlDist {
filename,
url: dist.url,
hashes,
path,
editable: dist.editable,
}),
Expand Down Expand Up @@ -104,13 +122,15 @@ impl CachedDist {
}
}

/// Returns `true` if the distribution is editable.
pub fn editable(&self) -> bool {
match self {
Self::Registry(_) => false,
Self::Url(dist) => dist.editable,
}
}

/// Returns the [`WheelFilename`] of the distribution.
pub fn filename(&self) -> &WheelFilename {
match self {
Self::Registry(dist) => &dist.filename,
Expand All @@ -119,12 +139,24 @@ impl CachedDist {
}
}

impl Hashed for CachedRegistryDist {
fn hashes(&self) -> &[HashDigest] {
&self.hashes
}
}

impl CachedDirectUrlDist {
/// Initialize a [`CachedDirectUrlDist`] from a [`WheelFilename`], [`url::Url`], and [`Path`].
pub fn from_url(filename: WheelFilename, url: VerbatimUrl, path: PathBuf) -> Self {
pub fn from_url(
filename: WheelFilename,
url: VerbatimUrl,
hashes: Vec<HashDigest>,
path: PathBuf,
) -> Self {
Self {
filename,
url,
hashes,
path,
editable: false,
}
Expand Down
27 changes: 27 additions & 0 deletions crates/distribution-types/src/hashed.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
use pypi_types::HashDigest;

pub trait Hashed {
/// Return the [`HashDigest`]s for the archive.
fn hashes(&self) -> &[HashDigest];

/// Returns `true` if the archive satisfies the given hashes.
fn satisfies(&self, hashes: &[HashDigest]) -> bool {
if hashes.is_empty() {
true
} else {
self.hashes().iter().any(|hash| hashes.contains(hash))
}
}

/// Returns `true` if the archive includes a hash for at least one of the given algorithms.
fn has_digests(&self, hashes: &[HashDigest]) -> bool {
if hashes.is_empty() {
true
} else {
hashes
.iter()
.map(HashDigest::algorithm)
.any(|algorithm| self.hashes().iter().any(|hash| hash.algorithm == algorithm))
}
}
}
2 changes: 2 additions & 0 deletions crates/distribution-types/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,7 @@ pub use crate::direct_url::*;
pub use crate::editable::*;
pub use crate::error::*;
pub use crate::file::*;
pub use crate::hashed::*;
pub use crate::id::*;
pub use crate::index_url::*;
pub use crate::installed::*;
Expand All @@ -66,6 +67,7 @@ mod direct_url;
mod editable;
mod error;
mod file;
mod hashed;
mod id;
mod index_url;
mod installed;
Expand Down
39 changes: 32 additions & 7 deletions crates/distribution-types/src/prioritized_distribution.rs
Original file line number Diff line number Diff line change
Expand Up @@ -84,6 +84,8 @@ impl Display for IncompatibleDist {
IncompatibleWheel::RequiresPython(python) => {
write!(f, "it requires at python {python}")
}
IncompatibleWheel::MissingHash => f.write_str("it has no hash"),
IncompatibleWheel::MismatchedHash => f.write_str("the hash does not match"),
},
Self::Source(incompatibility) => match incompatibility {
IncompatibleSource::NoBuild => {
Expand All @@ -104,6 +106,8 @@ impl Display for IncompatibleDist {
IncompatibleSource::RequiresPython(python) => {
write!(f, "it requires python {python}")
}
IncompatibleSource::MissingHash => f.write_str("it has no hash"),
IncompatibleSource::MismatchedHash => f.write_str("the hash does not match"),
},
Self::Unavailable => f.write_str("no distributions are available"),
}
Expand All @@ -122,6 +126,8 @@ pub enum IncompatibleWheel {
Tag(IncompatibleTag),
RequiresPython(VersionSpecifiers),
Yanked(Yanked),
MissingHash,
MismatchedHash,
NoBinary,
}

Expand All @@ -136,6 +142,8 @@ pub enum IncompatibleSource {
ExcludeNewer(Option<i64>),
RequiresPython(VersionSpecifiers),
Yanked(Yanked),
MissingHash,
MismatchedHash,
NoBuild,
}

Expand Down Expand Up @@ -381,20 +389,26 @@ impl IncompatibleSource {
Self::ExcludeNewer(timestamp_self) => match other {
// Smaller timestamps are closer to the cut-off time
Self::ExcludeNewer(timestamp_other) => timestamp_other < timestamp_self,
Self::NoBuild | Self::RequiresPython(_) | Self::Yanked(_) => true,
Self::NoBuild
| Self::RequiresPython(_)
| Self::Yanked(_)
| Self::MissingHash
| Self::MismatchedHash => true,
},
Self::RequiresPython(_) => match other {
Self::ExcludeNewer(_) => false,
// Version specifiers cannot be reasonably compared
Self::RequiresPython(_) => false,
Self::NoBuild | Self::Yanked(_) => true,
Self::NoBuild | Self::Yanked(_) | Self::MissingHash | Self::MismatchedHash => true,
},
Self::Yanked(_) => match other {
Self::ExcludeNewer(_) | Self::RequiresPython(_) => false,
// Yanks with a reason are more helpful for errors
Self::Yanked(yanked_other) => matches!(yanked_other, Yanked::Reason(_)),
Self::NoBuild => true,
Self::NoBuild | Self::MissingHash | Self::MismatchedHash => true,
},
Self::MissingHash => false,
Self::MismatchedHash => false,
Self::NoBuild => false,
}
}
Expand All @@ -412,26 +426,37 @@ impl IncompatibleWheel {
timestamp_other < timestamp_self
}
},
Self::NoBinary | Self::RequiresPython(_) | Self::Tag(_) | Self::Yanked(_) => true,
Self::NoBinary
| Self::RequiresPython(_)
| Self::Tag(_)
| Self::Yanked(_)
| Self::MissingHash
| Self::MismatchedHash => true,
},
Self::Tag(tag_self) => match other {
Self::ExcludeNewer(_) => false,
Self::Tag(tag_other) => tag_other > tag_self,
Self::NoBinary | Self::RequiresPython(_) | Self::Yanked(_) => true,
Self::NoBinary
| Self::RequiresPython(_)
| Self::Yanked(_)
| Self::MissingHash
| Self::MismatchedHash => true,
},
Self::RequiresPython(_) => match other {
Self::ExcludeNewer(_) | Self::Tag(_) => false,
// Version specifiers cannot be reasonably compared
Self::RequiresPython(_) => false,
Self::NoBinary | Self::Yanked(_) => true,
Self::NoBinary | Self::Yanked(_) | Self::MissingHash | Self::MismatchedHash => true,
},
Self::Yanked(_) => match other {
Self::ExcludeNewer(_) | Self::Tag(_) | Self::RequiresPython(_) => false,
// Yanks with a reason are more helpful for errors
Self::Yanked(yanked_other) => matches!(yanked_other, Yanked::Reason(_)),
Self::NoBinary => true,
Self::NoBinary | Self::MissingHash | Self::MismatchedHash => true,
},
Self::NoBinary => false,
Self::MismatchedHash => false,
Self::MissingHash => false,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zanieb - I could use your help with this part. I'm not sure if I did the comparisons correctly here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is actually wrong, although I'm not sure if it matters. You're supposed to be enforcing an ordering but here you're saying that MissingHash and MismatchedHash are never more compatible than the other value. This means that if we see both MissingHash and MismatchedHash we would arbitrarily display the first one we saw instead of preferring to present one of them to the user.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to change up the strategy in #2949, such that we treat distributions without hashes as compatible (but lower-priority). I can merge that into this PR if you agree with the change.

}
}
}
2 changes: 1 addition & 1 deletion crates/pep508-rs/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -72,7 +72,7 @@ pub enum Pep508ErrorSource {
String(String),
/// A URL parsing error.
#[error(transparent)]
UrlError(#[from] verbatim_url::VerbatimUrlError),
UrlError(#[from] VerbatimUrlError),
/// The version requirement is not supported.
#[error("{0}")]
UnsupportedRequirement(String),
Expand Down
4 changes: 2 additions & 2 deletions crates/uv-cache/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -594,12 +594,12 @@ pub enum CacheBucket {
impl CacheBucket {
fn to_str(self) -> &'static str {
match self {
Self::BuiltWheels => "built-wheels-v2",
Self::BuiltWheels => "built-wheels-v3",
Self::FlatIndex => "flat-index-v0",
Self::Git => "git-v0",
Self::Interpreter => "interpreter-v0",
Self::Simple => "simple-v7",
Self::Wheels => "wheels-v0",
Self::Wheels => "wheels-v1",
Self::Archive => "archive-v0",
}
}
Expand Down
Loading
Loading