Make the authors field optional #3052
text/0000-deprecate-authors-field.md
Outdated
`cargo init` will stop pre-populating the field when running the command, and
it will not include the field at all in the default `Cargo.toml`. Crate authors
will still be able to manually include the field before publishing if they so
choose, even though Cargo will warn when trying to publish those crates.
I'm not fully sold that a warning is necessary. If it's not populated by default, what's the issue with someone adding it if they want to?
The main purpose I see for a warning is telling people who ran `cargo init`/`cargo new` before this RFC that they can actually remove the field if they so choose. Also, if the goal is to deprecate the field we should eventually have people stop using it.
I like the authors field and would not appreciate it going away.
It's fair to make it not required, but it shouldn't actually be removed entirely.
Repeating from the Zulip. I'm not sure I see the need for the deprecation. This field has valid uses in many areas as people have pointed out, if crates.io doesn't want to display it they don't have to, but I don't see why that requires deprecation on Cargo's side, rather than solely making it optional. crates.io could stop displaying the information already even without this RFC.
The `[package]` table of `Cargo.toml` only contains metadata used by Cargo or registries, and I'd like to see it remain that way (by deprecating and eventually removing through editions). Ultimately that's not my call to make though, and I'm curious what the Cargo team thinks about it.
I am a fan of deprecation. The fields of a manifest file in a package management system are precious, and their numbers tend to trend towards infinite if there's not a clear effort to keep them tidy and minimal (see also package.json). The harm here may not be obvious; many fields can all have many useful reasons to exist. But for new and existing users alike, reading existing manifest files and creating new ones, an excess of optional fields is at best overwhelming and at worst actively confusing. The optionality of a field is not local to the individual manifest file and a name like "author" sounds... authoritative. In my opinion, the benefit of keeping this field doesn't outweigh the cost of the longterm maintenance (which will be costly on several axes). I also think this is a great, low-risk opportunity to develop a workflow and practice for deprecating manifest fields, which will be critical for the future ergonomics of crates.io, cargo, and devtools that leverage Cargo.toml.
> the benefit of keeping this field doesn't outweigh the cost of the longterm maintenance (which will be costly on several axes).

The action with the least maintenance is to entirely remove all code involving the `authors` field and treat it as an unknown field (impossible due to `$CARGO_PKG_AUTHORS`, but let's put this aside).
Currently, both cargo and crates.io ignore these unknown fields in `[package]`. So, at least on the axis of code maintenance, actively throwing a deprecation warning is more costly than doing nothing. Unless rust-lang/cargo#3576 is implemented.
What's the reason why authors of a crate can't be renamed or removed after the crate was published? Is the reason technical (e.g. because of changing hashes) or social?
The reason is technical: that metadata is stored in the |
You're taking two steps here. Making the `authors` field unnecessary and deprecating it are totally different things. Is it possible to do this step by step?
I understand that you want to remove `authors` completely to avoid crates.io maintenance issues in the future, which this problem cannot be replaced by merely removing the field. But this change is too aggressive without sufficient compensating benefit, at least in my opinion.
text/0000-deprecate-authors-field.md
Outdated
their name from the Internet, and the crates.io team doesn't have any way to
address that at the moment except for deleting the affected crates or versions
altogether. We don't do that lightly, but there were a few cases where we were
forced to do so.
Is it really justified that we conduct a major change just for a minor use case that happens very rarely?
This also only removes their name from the internet, but not the contents they created. Is this really meaningful in that sense? In particular, what if, for example, their names for some reason got into the code section of another person's crate?
> Is it really justified that we conduct a major change just for a minor use case that happens very rarely?

One of the things I value the most is the personal safety of every Rust user. I strongly believe changes like this are justified if they can prevent people from being harmed.

> This also only removes their name from the internet, but not the contents they created. Is this really meaningful in that sense? In particular, what if, for example, their names for some reason got into the code section of another person's crate?

This is anecdotal evidence, but I have had access to [email protected] for almost two years, and all of the cases where personal information needed to be deleted were related to `package.authors`, not the source code of the crates. Of course we can't prevent people from intentionally adding their name in the source code, but not forcing them to do so will address most of the issues.
text/0000-deprecate-authors-field.md
Outdated
The contents of the field also tend to scale poorly as the size of a project
grows, with projects either making the field useless by just stating "The
$PROJECT developers" or only naming the original authors without mentioning
other major contributors.
What if we look at it another way? Authors is not for accreditation, but for contacting a maintainer. In that case what if we just rename `authors` to `maintainer`/`contact`?
That's not the main reason why I'd like for this RFC to land. It's another effect that I personally think is positive, but it's more of a collateral benefit.
If the authors want to be contactable they can still provide contact details in the description/readme/homepage (I assume most maintainers will want to be contacted via their project's issue tracker, not random emails).
If `authors` is for contact information, you've got the same problem again – people's contact info changes in a manner completely unrelated to crate versions. Do you make a minor release when you change your email address? Things like this shouldn't even need to be in the version control, imo, because they're conceptually unlinked from the software.
text/0000-deprecate-authors-field.md
Outdated
published versions: this is highly desirable to ensure working builds don't
break in the future, but it also has the unfortunate side-effect of preventing
people from updating the list of crate authors defined in `Cargo.toml`'s
`package.authors` field.
Is it really not possible to redact their names from existing packages? The only real use case for `package.authors` is `env!("CARGO_PKG_AUTHORS")`. Could anyone conduct research to study how often this is actually used? Even if it is used, redacting a field from an existing package is unlikely to cause any issues unless, for some reason, a certain crate fails to compile without having a `:` in `$CARGO_PKG_AUTHORS`, or unless the crate tries to encode some logic inside the authors field. (This is hilarious, but I have actually seen the latter done in another community by someone who doesn't want his software to be "stolen" by forking + changing author name)
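To make the `CARGO_PKG_AUTHORS` use case concrete, here is a minimal sketch of how a crate might consume the value. It assumes the colon-separated format that Cargo documents for this variable; `parse_authors` is a hypothetical helper, not part of Cargo or any crate, and `option_env!` is used instead of `env!` so the snippet also compiles when the variable is not set.

```rust
/// Hypothetical helper: split the colon-separated `CARGO_PKG_AUTHORS` value
/// into individual entries, treating a missing or empty value as "no authors".
fn parse_authors(raw: Option<&str>) -> Vec<&str> {
    raw.unwrap_or("")
        .split(':')
        .map(str::trim)
        .filter(|entry| !entry.is_empty())
        .collect()
}

fn main() {
    // `option_env!` yields `None` when the variable is not set at compile time.
    let authors = parse_authors(option_env!("CARGO_PKG_AUTHORS"));
    if authors.is_empty() {
        println!("no authors listed");
    } else {
        println!("authors: {}", authors.join(", "));
    }
}
```

A crate written this way keeps working whether the field is present, empty, or absent. A name that itself contains `:` would be split incorrectly by this sketch.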
Changing the contents of a crate will invalidate its hash, which will prevent any person depending on the crate from building their code.
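As a rough illustration of why editing a published crate breaks dependents, the sketch below compares a recorded checksum against the bytes that are later downloaded. It is only a model: Cargo records a SHA-256 digest of the `.crate` file, while this example substitutes the standard library's `DefaultHasher` to stay dependency-free, and `checksum` and `verify` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Stand-in checksum (assumption: the real digest is SHA-256, not this hasher).
fn checksum(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(bytes);
    hasher.finish()
}

/// Hypothetical verification step: reject content whose checksum differs
/// from the one recorded when the dependency was first locked.
fn verify(recorded: u64, downloaded: &[u8]) -> Result<(), String> {
    let actual = checksum(downloaded);
    if actual == recorded {
        Ok(())
    } else {
        Err(format!("checksum mismatch: expected {recorded:x}, got {actual:x}"))
    }
}

fn main() {
    let published = b"authors = [\"Old Name\"]\nfn main() {}";
    let redacted = b"authors = []\nfn main() {}";

    // The checksum is recorded once, at lock time.
    let recorded = checksum(published);

    // Unmodified content verifies; content with the field redacted does not.
    println!("original: {:?}", verify(recorded, published).is_ok());
    println!("redacted: {:?}", verify(recorded, redacted).is_ok());
}
```

Even a metadata-only change alters the bytes, so every project that already locked the old checksum would fail verification.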
The only other use-case I know is for listing maintainers of separate crates in large internal workspaces, but that can be easily achieved in some other way.
text/0000-deprecate-authors-field.md
Outdated
Cargo currently provides author information to the crate via
`CARGO_PKG_AUTHORS`, and some crates (such as `clap`) use this information.
Deprecating the authors field will require crates currently using it to change,
such as by inlining the author data.
What is the expected impact in the long term? If it is eventually removed, will backward compatibility for current packages using `$CARGO_PKG_AUTHORS` be broken?
If we don't intend to remove it in the long term, why deprecate (instead of remove) it at all?
How to remove the field in the future is left as a future possibility. An approach I could see working is using the edition mechanism, but I think that's out of scope for this RFC.
text/0000-deprecate-authors-field.md
Outdated
The API will continue returning the `authors` field in every endpoint which
currently includes it, but the field will always be empty (even if the crate
author manually adds data to it). The database dumps will also stop including
the field.
This seems to cause a superset of the problems caused by redacting authors in existing versions upon author's explicit request. Are you sure this is justified?
As far as I'm aware there is no documented API endpoint that exposes the authorship information, and the database dumps are clearly marked as "experimental". Removing the information from there will mean we can delete it from the crates.io database.
text/0000-deprecate-authors-field.md
Outdated
`cargo init`, and it will not include the field in the default template for
`Cargo.toml`. Cargo will also treat the field as deprecated, eventually
displaying a deprecation warning when someone tries to publish a crate with the
field set.
Plus, this no longer requires the `$USER` variable to be set in `cargo new`. This is actually good news for docker image maintainers.
Where is the hash used? Is it guaranteed to be stable such that external tools may depend on it?
The hash of the crate is used by Cargo to ensure dependencies were not tampered with. If any hash in |
Have you considered integrating this with the edition system, such that we specify that the hash may be mutable for edition 2021 crates? Since edition 2021 crates cannot be compiled by older rust toolchains anyway, this is unlikely to cause issues. I heard we're not continuing with editions though, so this may not be a good idea. |
I'm very for this change and folks who really want it can add an But, one potential alternative could be to categorise certain metadata as being excluded from package versions entirely and updated separately, and I see that as a valid extension of this. For example, it might be nice to be able to update the maintenance status badge without pushing a new version, or fix a typo in the description. |
Allowing the registry to alter the contents of the published crates without Cargo preventing builds would remove the immutability guarantee we currently have, and it would make reproducible builds way harder if not impossible to achieve. To me it seems like that approach would cause much more fallout than this RFC.
That's a really interesting idea! I definitely see the appeal of storing the metadata somewhere else, but that is going to require a lot of design work to get it right. As an example, I could see the maintenance badge to be "versionless" metadata while the description to be still tied to each individual version. Even if we implement that, we'll need this RFC or an equivalent of it to land in order to remove that metadata from the |
Yes, please add that as a future possibility.
With my Cargo team hat on (though not speaking for the rest of the Cargo team), this seems reasonable to me, and thank you for clearly laying out the rationale. I suspect the most notable transition difficulty will be for crates that are currently relying on I'd like to see a note in the RFC proposing guidance for how such crates should proceed. Crates that currently read Otherwise, this looks good to me. Nominating for discussion in the next Cargo meeting. |
I think would be better to make Please note that most licenses expect the information to be present in a way or another. |
I agree with the problem it causes and would like to see a fix for this. There is another problem with this field: it has no clear connection with crate owners, so reconciliation of However, this field has some uses that don't have a replacement (yet?):
Of course in all these cases people should still have control over their personal info. So a solution that moves where this data is stored to make it mutable to me seems better than complete removal. |
I see no issue with making the authors field optional. I am strongly opposed to the rest of this proposal. I see several serious problems with it, the following being among them:
I am personally very sympathetic to the motivations for this proposal and am happy to discuss my POV on them out of channel. Despite my strong objections above, they should not in any way be construed as a judgment of the RFC author. Indeed, I thank the author for this valuable contribution and wish to encourage them to continue to participate. Likewise, I am grateful for the opportunity to express my opinion here and to be able to read the alternative points of view of others. |
@rljacobson You're missing that the crates-io website displays what is published and checksummed in the registry. While it could override what is in the registry, this wouldn't change the data for others. For example, https://lib.rs displays data from the registry, bypassing crates-io website wherever possible. The registry is promised to be immutable, and this is relied on by various tools, caches, and lockfiles, so the registry can't change. People's names are mutable, therefore they can't be in the immutable part of the registry. |
After reading the comments of others above, I want to point out that we already have a mechanism for dealing with errors or mistakes in software, including those that may be very sensitive or ethically significant. What do we do when, for example, a serious security flaw is discovered in a widely used piece of software? What do we do when someone accidentally publishes proprietary information in violation of copyright with their code? If someone publishes the private information of hundreds of their company's customers? I do not see a compelling reason to make authorship a distinguished special case of these kinds of bugs.
Currently the registry has yanking for hiding insecure software, but it doesn't delete or change it. The crates-io registry sometimes deletes crates that are spam or subject to legal complaints, but this is meant to be a rare occurrence.
This is a brute fact of reality regardless of the features we do or do not provide users. Once you decide you want to make a change to a crate, the original crate will not magically change on everyone's computer. And again, you are describing a problem with how crates.io functions, not a problem with cargo.
But the data they published in the past is NOT mutable. And there is nothing anyone can do about that regardless of their desire. What you CAN do is remove the affected crate. This is what we do with every other instance of content that has been published that we wish to no longer be available. That's what should happen here.
Well, no. This is under discussion. There are people who desire to change their names in already-published crates, and we're discussing how to accommodate that. I mean for crates that have already been published using existing metadata+checksum this is hard, but we can change Cargo/crates-io/registry so that names will be mutable going forward. Like many things in computer science, this could be fixed with a layer of indirection - the immutable packages could contain a numeric identifiers for each author (so that there's an immutable record of who was credited) and a separate mutable identifier->name mapping. |
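The layer of indirection described above could look roughly like the following sketch. Every type and method name here is hypothetical and exists only to illustrate the idea: the published package keeps an immutable list of numeric author identifiers, while a separate registry-side table maps identifiers to names and can be edited at any time.

```rust
use std::collections::HashMap;

/// Hypothetical immutable record stored inside a published package.
struct PublishedPackage {
    name: &'static str,
    author_ids: Vec<u64>,
}

/// Hypothetical mutable table, kept outside the package, mapping ids to names.
struct AuthorDirectory {
    names: HashMap<u64, String>,
}

impl AuthorDirectory {
    fn new() -> Self {
        AuthorDirectory { names: HashMap::new() }
    }

    /// Add or update the display name for an author id.
    fn set_name(&mut self, id: u64, name: &str) {
        self.names.insert(id, name.to_string());
    }

    /// Remove an author's name entirely; the id stays in published packages.
    fn remove(&mut self, id: u64) {
        self.names.remove(&id);
    }

    /// Resolve a package's author ids to their current names, skipping removed ones.
    fn display_authors(&self, package: &PublishedPackage) -> Vec<String> {
        package
            .author_ids
            .iter()
            .filter_map(|id| self.names.get(id).cloned())
            .collect()
    }
}

fn main() {
    let package = PublishedPackage { name: "demo", author_ids: vec![1, 2] };
    let mut directory = AuthorDirectory::new();
    directory.set_name(1, "Old Name");
    directory.set_name(2, "Second Author");

    // A name change touches only the directory, never the package bytes.
    directory.set_name(1, "New Name");
    println!("{}: {:?}", package.name, directory.display_authors(&package));
}
```

Because the package itself only stores identifiers, its checksum is unaffected by renames or removals in the directory.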
The fact that the crate was previously published is a fact of history. You can only change the present/future, not the past. Unless you invent a time machine, in which case I would like to invite you and your time machine over to dinner sometime soon. There are a few recent events of the last year I would like to address. :) What is under discussion is whether we should allow an existing crate to be modified and still be considered the same crate. I answer with a strong and passionate no. I think we can and should accommodate name changes in another way. |
If there was, say, a security flaw in an old version of a crate: you'd yank the bad version and publish a new version. In the case of a dire enough problem a crate could even be deleted entirely, as was mentioned. I'm sympathetic to a person wanting to change their name in the authors field, but I don't follow why they can't also use the "yank and publish an update" system we already have for other problems with published crate content. Combined with removing the requirements for an |
It occurs to me that the proposal does not do a good job of solving the problem it is meant to and may even make it worse. The goal is to enable, hypothetically, Robert Jacobson (me) to change his name in all of his crates to Robert Lawrence. We can imagine there is a reason of personal security for me to want to do so. The best case scenario is if "Robert Jacobson” is removed from public repositories hosting my code and from projects depending on my code, the number of which might be significant (hypothetically). Changing “Robert Jacobson” to “Robert Lawrence” on crates.io while not changing the identity of the crate will only provide the new name to future downloads of the crate. Existing projects will not see the change. On the other hand, pulling the crate and bumping the version number for the modified crate is likely to result in more projects depending on the crate to get the updated name, either because the crate isn’t cached or because the As for removing the authors field, this will undoubtedly have the effect of a nonstandard |
This is a lot of feedback, thanks y'all for commenting!
@lu-zero I don't think the authors field has any impact on licensing. Most of the licenses I'm aware of require the license text to be reproduced, so my non-lawyer understanding is that
@kornelski crates.io actually stores ownership changes in its database, and I could see that being useful to track who can publish new releases. If you have a use for that kind of data we can eventually discuss including it in the database dumps.
@rljacobson crates.io is part of the Rust project, and closely collaborates with the Cargo team. All of the features where Cargo interacts with a registry are developed in collaboration with the crates.io team, and this is one of those features!
I totally agree that the contents of published crates (and their checksum) should never change, that's out of the question. What I'm proposing in this RFC is that we deprecate including structured authorship information in the Changes to the authorship information don't affect reproducibility of builds, and don't change any security property of the crate. The contents of the field are basically freeform text, and there is nothing preventing me from publishing a crate under your name. The kind of authorship information that matter from a reliability and security perspective (who's allowed to publish new releases) is already tracked outside of
Deleting crates can have an enormous impact on the ecosystem (as everyone saw when packages were deleted in other ecosystems). Thankfully we never had to delete a popular crate yet, but what would happen if we had to remove a crate a big chunk of the ecosystem depends on? That's surely going to become a problem in the future and I want to do everything I can to minimize the disruption it's going to cause. Lowering the amount of cases where we have to do that is the best tool at our disposal.
In my experience the authors field is practical only if there is one or a couple maintainers for a crate. It's not feasible, for example, to set authorship information for the Rust compiler in its
We have yanking for security vulnerabilities, which prevents new uses of the crate. For the other cases unfortunately we have to delete the crates, but that doesn't mean we shouldn't strive to delete the least amount of crates possible.
What I'm proposing here is not to modify an existing crate, but to avoid including information that could need to be changed in the future.
We can't prevent people from including their name in the source code of their crate, but as Ashley pointed out in another comment having the field in the
@Lokathor unfortunately yanking can't prevent people from being harassed or doxxed. Both of those things happened multiple times already, and I want to avoid crates.io becoming a tool to harass as much as possible.
I feel like the vast majority of people publishing crates (and therefore consciously making their work public for all the world to see) either don't care about the name they put into the authors slot or actively want attribution for their work, so removing the authors field entirely seems to me a knee-jerk reaction to combat a niche desire. I have no problem with making the field optional for publishing to accommodate those who don't want their name on anything, but automatically filling it out is what I think most users desire and if it's simply made optional, those who care can remove the field and therefore not expose their information. This really seems like a genuinely good middle ground, those who want privacy can get it by simply deleting a line from their project while anyone who doesn't care has cargo keep functioning for them as normal (entering their names automatically, which is assumedly what they prefer). To further that Cargo could even offer an option in the |
@Kixiron the problem is that doesn't actually help with the main issue raised in the RFC. I think it would be useful to have a separate file to store data which is not part of the crate, ie. metadata. This could include authors, description, and other information that is not code, nor related to the functioning of that code. This information would be stored in the registry when you publish a crate, but could also be updated/removed at any time, via eg. |
That's not actually what the RFC is about though, that would be a tangential thing. The title of the RFC is |
Updated the RFC to remove mentions of deprecating the field. The field will still be optional, @rfcbot fcp merge |
Team member @pietroalbini has proposed to merge this. The next step is review by the rest of the tagged team members:
No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
🔔 This is now entering its final comment period, as per the review above. 🔔
The final comment period, with a disposition to merge, as per the review above, is now complete. As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed. The RFC will be merged soon.
Yay! The @rust-lang/cargo and @rust-lang/crates-io teams have decided to accept this RFC. To track further discussion, subscribe to the tracking issue here: rust-lang/rust#83227
Remove "Authors" section from crate details page see rust-lang/rfcs#3052 😉 r? `@pietroalbini`
That field doesn't convey anything meaningful and in any case, it displays incorrect information unless we maintain that field. Also see rust-lang/rfcs#3052 (comment)
This RFC proposes to make the `package.authors` field of `Cargo.toml` optional. This RFC also proposes preventing Cargo from auto-filling it, allowing crates to be published to crates.io without the field being present, and avoiding displaying its contents on the crates.io and docs.rs UI.