tarball: Use `cargo_toml` to parse `Cargo.toml` file #6914
Merged
@@ -1,44 +1,89 @@
-use cargo_toml::OptionalFile;
-use derive_deref::Deref;
-use serde::{de, Deserialize, Deserializer};
-
-#[derive(Debug, Deserialize)]
-pub struct Manifest {
-    #[serde(alias = "project")]
-    pub package: Package,
-}
-
-#[derive(Debug, Deserialize)]
-#[serde(rename_all = "kebab-case")]
-pub struct Package {
-    pub name: String,
-    pub version: String,
-    #[serde(default)]
-    pub readme: OptionalFile,
-    pub repository: Option<String>,
-    pub rust_version: Option<RustVersion>,
-}
-
-#[derive(Debug, Deref)]
-pub struct RustVersion(String);
-
-impl PartialEq<&str> for RustVersion {
-    fn eq(&self, other: &&str) -> bool {
-        self.0.eq(other)
-    }
-}
-
-impl<'de> Deserialize<'de> for RustVersion {
-    fn deserialize<D: Deserializer<'de>>(d: D) -> Result<RustVersion, D::Error> {
-        let s = String::deserialize(d)?;
-        match semver::VersionReq::parse(&s) {
-            // Exclude semver operators like `^` and pre-release identifiers
-            Ok(_) if s.chars().all(|c| c.is_ascii_digit() || c == '.') => Ok(RustVersion(s)),
-            Ok(_) | Err(..) => {
-                let value = de::Unexpected::Str(&s);
-                let expected = "a valid rust_version";
-                Err(de::Error::invalid_value(value, &expected))
-            }
-        }
+use cargo_toml::{Dependency, DepsSet, Error, Inheritable, Manifest, Package};
+
+pub fn validate_manifest(manifest: &Manifest) -> Result<(), Error> {
+    let package = manifest.package.as_ref();
+
+    // Check that a `[package]` table exists in the manifest, since crates.io
+    // does not accept workspace manifests.
+    let package = package.ok_or(Error::Other("missing field `package`"))?;
+
+    validate_package(package)?;
+
+    // These checks ensure that dependency workspace inheritance has been
+    // normalized by cargo before publishing.
+    if manifest.dependencies.is_inherited()
+        || manifest.dev_dependencies.is_inherited()
+        || manifest.build_dependencies.is_inherited()
+    {
+        return Err(Error::InheritedUnknownValue);
+    }
+
+    Ok(())
+}
+
+pub fn validate_package(package: &Package) -> Result<(), Error> {
+    // These checks ensure that package field workspace inheritance has been
+    // normalized by cargo before publishing.
+    if package.edition.is_inherited()
+        || package.rust_version.is_inherited()
+        || package.version.is_inherited()
+        || package.authors.is_inherited()
+        || package.description.is_inherited()
+        || package.homepage.is_inherited()
+        || package.documentation.is_inherited()
+        || package.readme.is_inherited()
+        || package.keywords.is_inherited()
+        || package.categories.is_inherited()
+        || package.exclude.is_inherited()
+        || package.include.is_inherited()
+        || package.license.is_inherited()
+        || package.license_file.is_inherited()
+        || package.repository.is_inherited()
+        || package.publish.is_inherited()
+    {
+        return Err(Error::InheritedUnknownValue);
+    }
+
+    // Check that the `rust-version` field has a valid value, if it exists.
+    if let Some(rust_version) = package.rust_version() {
+        validate_rust_version(rust_version)?;
+    }
+
+    Ok(())
+}
+
+trait IsInherited {
+    fn is_inherited(&self) -> bool;
+}
+
+impl<T> IsInherited for Inheritable<T> {
+    fn is_inherited(&self) -> bool {
+        !self.is_set()
+    }
+}
+
+impl<T: IsInherited> IsInherited for Option<T> {
+    fn is_inherited(&self) -> bool {
+        self.as_ref().map(|it| it.is_inherited()).unwrap_or(false)
+    }
+}
+
+impl IsInherited for Dependency {
+    fn is_inherited(&self) -> bool {
+        matches!(self, Dependency::Inherited(_))
+    }
+}
+
+impl IsInherited for DepsSet {
+    fn is_inherited(&self) -> bool {
+        self.iter().any(|(_key, dep)| dep.is_inherited())
+    }
+}
+
+pub fn validate_rust_version(value: &str) -> Result<(), Error> {
+    match semver::VersionReq::parse(value) {
+        // Exclude semver operators like `^` and pre-release identifiers
+        Ok(_) if value.chars().all(|c| c.is_ascii_digit() || c == '.') => Ok(()),
+        Ok(_) | Err(..) => Err(Error::Other("invalid `rust-version` value")),
     }
 }
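To make the character guard in `validate_rust_version` concrete, here is a minimal sketch that uses only the standard library. It reproduces just the `all(|c| c.is_ascii_digit() || c == '.')` condition from the diff; the `semver::VersionReq::parse` step is deliberately left out, so this is not equivalent to the real function. The helper name `has_only_digits_and_dots` is hypothetical.

```rust
/// Hypothetical helper: mirrors only the character guard used in
/// `validate_rust_version`. The real function additionally requires
/// `semver::VersionReq::parse` to succeed, which is omitted here.
fn has_only_digits_and_dots(value: &str) -> bool {
    value.chars().all(|c| c.is_ascii_digit() || c == '.')
}

fn main() {
    // Plain dotted versions pass the guard.
    assert!(has_only_digits_and_dots("1.70"));
    assert!(has_only_digits_and_dots("1.70.0"));

    // Semver operators and pre-release identifiers are rejected.
    assert!(!has_only_digits_and_dots("^1.70"));
    assert!(!has_only_digits_and_dots(">=1.70"));
    assert!(!has_only_digits_and_dots("1.70.0-beta.1"));

    println!("character guard behaves as expected");
}
```

The guard alone would also let through strings that are not versions at all, which is presumably why the diff pairs it with the semver parse.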
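The `IsInherited` impl for `Option<T>` in the diff treats a missing field as "not inherited" and otherwise delegates to the inner value. The sketch below shows that delegation in isolation. The `Inheritable` enum here is a hypothetical stand-in defined locally so the example compiles without dependencies; the real `cargo_toml::Inheritable<T>` may be shaped differently.

```rust
/// Hypothetical stand-in for `cargo_toml::Inheritable<T>`.
#[allow(dead_code)]
enum Inheritable<T> {
    Set(T),
    Inherited,
}

trait IsInherited {
    fn is_inherited(&self) -> bool;
}

impl<T> IsInherited for Inheritable<T> {
    fn is_inherited(&self) -> bool {
        matches!(self, Inheritable::Inherited)
    }
}

// An absent field (`None`) counts as "not inherited"; a present field
// delegates to the inner value's implementation.
impl<T: IsInherited> IsInherited for Option<T> {
    fn is_inherited(&self) -> bool {
        self.as_ref().map(|it| it.is_inherited()).unwrap_or(false)
    }
}

fn main() {
    let set: Option<Inheritable<String>> = Some(Inheritable::Set("1.0.0".to_string()));
    let inherited: Option<Inheritable<String>> = Some(Inheritable::Inherited);
    let missing: Option<Inheritable<String>> = None;

    assert!(!set.is_inherited());
    assert!(inherited.is_inherited());
    assert!(!missing.is_inherited());
}
```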
Is the extra validation worth it?
I'm concerned about us (very easily) adding new things that will then fail on publish. The person publishing would then need to figure out that it is a bug in crates.io, report it, wait for someone to respond and create a PR upstream with `cargo_toml`, wait for a fix and a published release, and then wait for crates.io to be updated and deployed.
our long term goal is to reduce the reliance on the metadata JSON blob that is being sent together with the tarball.
an attacker could potentially publish something where the metadata says foo and the tarball says bar, and we would prefer to treat the tarball as the primary source of truth.
this however requires us to parse the `Cargo.toml` file and if parsing fails we can't accept the upload. we could keep rolling our own structure definitions, but as we've already seen with the `readme` field, this is quite error-prone.
as the PR description says `cargo_toml` works for almost all existing crates, so there isn't really a reason to write everything ourselves.
the extra validation is technically not needed if we assume that only `cargo` is used to publish new crates, but only relying on client-side validation seems quite risky given scenarios like https://blog.vlt.sh/blog/the-massive-hole-in-the-npm-ecosystem.
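The "metadata says foo and the tarball says bar" scenario described above can be illustrated with a small consistency check. This is only a sketch under stated assumptions: the struct names, field names, and `check_consistency` function are hypothetical and do not reflect how crates.io actually represents or compares these values.

```rust
/// Hypothetical, simplified view of the JSON metadata sent with an upload.
struct UploadMetadata {
    name: String,
    vers: String,
}

/// Hypothetical, simplified view of the `[package]` table parsed from the tarball.
struct TarballPackage {
    name: String,
    version: String,
}

/// Returns an error naming the first field on which the two sources disagree.
fn check_consistency(meta: &UploadMetadata, pkg: &TarballPackage) -> Result<(), String> {
    if meta.name != pkg.name {
        return Err(format!(
            "name mismatch: metadata says `{}`, tarball says `{}`",
            meta.name, pkg.name
        ));
    }
    if meta.vers != pkg.version {
        return Err(format!(
            "version mismatch: metadata says `{}`, tarball says `{}`",
            meta.vers, pkg.version
        ));
    }
    Ok(())
}

fn main() {
    let meta = UploadMetadata { name: "foo".to_string(), vers: "1.0.0".to_string() };
    let forged = TarballPackage { name: "bar".to_string(), version: "1.0.0".to_string() };
    let honest = TarballPackage { name: "foo".to_string(), version: "1.0.0".to_string() };

    assert!(check_consistency(&meta, &forged).is_err());
    assert!(check_consistency(&meta, &honest).is_ok());
}
```

Treating the tarball as the primary source of truth, as the comment proposes, would make such a cross-check unnecessary in the long run, since the metadata blob would no longer be trusted at all.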
Would it be reasonable to add logging / alerting such that we could detect an increase in failed toml parsing?
yep, that seems like a reasonable idea :)
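As a rough illustration of the logging/alerting idea, the sketch below counts manifest parse attempts and failures and flags when the failure ratio crosses a threshold. Everything here is hypothetical (`ParseStats`, `record`, `should_alert`, and the thresholds); a real service would more likely export such counters through its existing metrics system.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical in-process counters for manifest parsing outcomes.
struct ParseStats {
    total: AtomicU64,
    failed: AtomicU64,
}

impl ParseStats {
    fn new() -> Self {
        ParseStats {
            total: AtomicU64::new(0),
            failed: AtomicU64::new(0),
        }
    }

    /// Record one parse attempt and whether it succeeded.
    fn record(&self, parse_ok: bool) {
        self.total.fetch_add(1, Ordering::Relaxed);
        if !parse_ok {
            self.failed.fetch_add(1, Ordering::Relaxed);
        }
    }

    /// True when the failure ratio exceeds `threshold` (between 0.0 and 1.0).
    fn should_alert(&self, threshold: f64) -> bool {
        let total = self.total.load(Ordering::Relaxed);
        if total == 0 {
            return false;
        }
        let failed = self.failed.load(Ordering::Relaxed);
        (failed as f64) / (total as f64) > threshold
    }
}

fn main() {
    let stats = ParseStats::new();
    // Nine successful parses and one failure: a 10% failure ratio.
    for _ in 0..9 {
        stats.record(true);
    }
    stats.record(false);

    assert!(stats.should_alert(0.05));
    assert!(!stats.should_alert(0.5));
}
```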
Wouldn't a breaking change like that cause bigger problems anyway, since crates.io is by no means the only downstream parser and consumer of cargo manifests?
I'm not even talking about breaking backwards compatibility but breaking forward compatibility.
Think of cases like:
Cargo needs to have a strict parser for these because an unknown value could have a major impact on behavior. If another tool is acting off of the field, it should hard error as well. However, we need to be able to add new values or else cargo is stuck as-is. This is possible to support in serde using a catch-all variant. `cargo_toml` instead errors on unknown values.
A trickier one is adding new types, which we've done, but `cargo_metadata` also has hard errors for that.
So, assuming we only maintain backward compatibility, we'll most likely be playing whack-a-mole with `cargo_toml` and `cargo_metadata` for them to properly handle new values being emitted.
We then have to deal with making sure they are updated when the new values are out (at minimum, a strong communication path).
Tying this back to this PR, making most of `cargo_toml` loosey-goosey for compatibility removes most of the validation that crates-io can do.
Independent of all of that, in my opinion, is that pushing responsibility for this stuff to third-parties has made it so we are unaware of the problems their users face and it is easier to make innocuous changes that break them.
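The catch-all variant idea mentioned in this comment can be sketched without serde, so that it runs with the standard library alone. The `Setting` enum and its `alpha`/`beta` values are placeholders, not a real manifest field; the point is only that an unrecognized value is preserved in a fallback variant instead of failing the parse, leaving each consumer to decide whether to hard error on it.

```rust
/// Hypothetical field with a closed set of known values plus a catch-all.
#[derive(Debug, PartialEq)]
enum Setting {
    Alpha,
    Beta,
    /// Any value this parser does not know about yet.
    Unknown(String),
}

impl From<&str> for Setting {
    fn from(s: &str) -> Self {
        match s {
            "alpha" => Setting::Alpha,
            "beta" => Setting::Beta,
            // Keep the raw value so callers can report or reject it later.
            other => Setting::Unknown(other.to_string()),
        }
    }
}

fn main() {
    assert_eq!(Setting::from("alpha"), Setting::Alpha);
    assert_eq!(Setting::from("beta"), Setting::Beta);

    // A value added by a newer tool still parses, but is clearly flagged.
    assert_eq!(Setting::from("gamma"), Setting::Unknown("gamma".to_string()));
}
```

A strict consumer can match on `Unknown` and reject the upload, while a lenient one can ignore it; either way, the parser itself does not need a new release each time a value is added.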