Make the authors field optional #3052

Merged
merged 5 commits into from
Mar 17, 2021

pietroalbini
Member

@pietroalbini pietroalbini commented Jan 7, 2021

This RFC proposes to make the package.authors field of Cargo.toml optional. This RFC also proposes preventing Cargo from auto-filling it, allowing crates to be published to crates.io without the field being present, and avoiding displaying its contents on the crates.io and docs.rs UI.

Rendered

@pietroalbini pietroalbini added T-cargo Relevant to the Cargo team, which will review and decide on the RFC. T-crates-io Relevant to the crates.io team, which will review and decide on the RFC. labels Jan 7, 2021
`cargo init` will stop pre-populating the field when running the command, and
it will not include the field at all in the default `Cargo.toml`. Crate authors
will still be able to manually include the field before publishing if they so
choose, even though Cargo will warn when trying to publish those crates.
Member

I'm not fully sold that a warning is necessary. If it's not populated by default, what's the issue with someone adding it if they want to?

Member Author

@pietroalbini pietroalbini Jan 7, 2021

The main purpose I see for a warning is telling people who ran cargo init/cargo new before this RFC that they can actually remove the field if they so choose. Also, if the goal is to deprecate the field we should eventually have people stop using it.

Contributor

I like the authors field and would not appreciate it going away.

It's fair to make it not required, but it shouldn't actually be removed entirely.

Member

Repeating from the Zulip. I'm not sure I see the need for the deprecation. This field has valid uses in many areas as people have pointed out, if crates.io doesn't want to display it they don't have to, but I don't see why that requires deprecation on Cargo's side, rather than solely making it optional. crates.io could stop displaying the information already even without this RFC.

Member Author

The [package] table of Cargo.toml only contains metadata used by Cargo or registries, and I'd like to see it remain that way (by deprecating and eventually removing through editions). Ultimately that's not my call to make though, and I'm curious what the Cargo team thinks about it.

Member

@ashleygwilliams ashleygwilliams Jan 7, 2021

I am a fan of deprecation. The fields of a manifest file in a package management system are precious, and their numbers tend to trend towards infinite if there's not a clear effort to keep them tidy and minimal (see also package.json). The harm here may not be obvious; many fields can all have many useful reasons to exist. But for new and existing users alike, reading existing manifest files and creating new ones, an excess of optional fields is at best overwhelming and at worst actively confusing. The optionality of a field is not local to the individual manifest file and a name like "author" sounds... authoritative. In my opinion, the benefit of keeping this field doesn't outweigh the cost of the longterm maintenance (which will be costly on several axes). I also think this is a great, low-risk opportunity to develop a workflow and practice for deprecating manifest fields, which will be critical for the future ergonomics of crates.io, cargo, and devtools that leverage Cargo.toml.

Member

the benefit of keeping this field doesn't outweigh the cost of the longterm maintenance (which will be costly on several axes).

The action with the least maintenance is to entirely remove all code involving the authors field and treat it as an unknown field (impossible due to $CARGO_PKG_AUTHORS, but let's put this aside).

Currently, both cargo and crates.io ignore these unknown fields in [package]. So, at least on the axis of code maintenance, actively throwing a deprecation warning is more costly than doing nothing, unless rust-lang/cargo#3576 is implemented.

@Aloso

Aloso commented Jan 7, 2021

What's the reason why authors of a crate can't be renamed or removed after the crate was published? Is the reason technical (e.g. because of changing hashes) or social?

@pietroalbini
Member Author

What's the reason why authors of a crate can't be renamed or removed after the crate was published? Is the reason technical (e.g. because of changing hashes) or social?

The reason is technical: that metadata is stored in the Cargo.toml, which is part of the crate tarball. Updating it would change the hash, breaking immutability and more importantly breaking all the projects with a Cargo.lock.

@SOF3 SOF3 left a comment

You're taking two steps here. Making the authors field unnecessary and deprecating it are totally different things. Is it possible to do this step by step?

I understand that you want to remove authors completely to avoid crates.io maintenance issues in the future, a problem that cannot be solved by merely making the field optional. But this change is too aggressive without sufficient compensating benefit, at least in my opinion.

their name from the Internet, and the crates.io team doesn't have any way to
address that at the moment except for deleting the affected crates or versions
altogether. We don't do that lightly, but there were a few cases where we were
forced to do so.

Is it really justified that we conduct a major change just for a minor use case that happens very rarely?

This also only removes their name from the internet, but not the contents they created. Is this really meaningful in that sense? In particular, what if, for example, their names for some reason got into the code section of another person's crate?

Member Author

Is it really justified that we conduct a major change just for a minor use case that happens very rarely?

One of the things I value the most is the personal safety of every Rust user. I strongly believe changes like this are justified if they can prevent people from being harmed.

This also only removes their name from the internet, but not the contents they created. Is this really meaningful in that sense? In particular, what if, for example, their names for some reason got into the code section of another person's crate?

This is anecdotal evidence, but I have had access to [email protected] for almost two years, and all of the cases where personal information needed to be deleted were related to package.authors, not the source code of the crates. Of course we can't prevent people from intentionally adding their name in the source code, but not forcing them to do so will address most of the issues.

The contents of the field also tend to scale poorly as the size of a project
grows, with projects either making the field useless by just stating "The
$PROJECT developers" or only naming the original authors without mentioning
other major contributors.

What if we look at it from another way? Authors is not for accreditation, but for contacting a maintainer. In that case what if we just rename authors to maintainer/contact?

Member Author

That's not the main reason why I'd like for this RFC to land. It's another effect that I personally think is positive, but it's more of a collateral benefit.

Member

If the authors want to be contactable they can still provide contact details in the description/readme/homepage (I assume most maintainers will want to be contacted via their project's issue tracker, not random emails).

If authors is for contact information, you've got the same problem again – people's contact info changes in a manner completely unrelated to crate versions. Do you make a minor release when you change your email address? Things like this shouldn't even need to be in the version control, imo, because they're conceptually unlinked from the software.

published versions: this is highly desirable to ensure working builds don't
break in the future, but it also has the unfortunate side-effect of preventing
people from updating the list of crate authors defined in `Cargo.toml`'s
`package.authors` field.

Is it really not possible to redact their names from existing packages? The only real use case for package.authors is env!("CARGO_PKG_AUTHORS"). Could anyone research how often this is actually used? Even if it is used, redacting a field from an existing package is unlikely to cause any issues unless, for some reason, a certain crate fails to compile without having a : in $CARGO_PKG_AUTHORS, or unless the crate tries to encode some logic inside the authors field. (This is hilarious, but I have actually seen the latter done in another community by someone who doesn't want his software to be "stolen" by forking + changing author name)

Member Author

Changing the contents of a crate will invalidate its hash, which will prevent any person depending on the crate from building their code.

Member

The only other use-case I know is for listing maintainers of separate crates in large internal workspaces, but that can be easily achieved in some other way.

Cargo currently provides author information to the crate via
`CARGO_PKG_AUTHORS`, and some crates (such as `clap`) use this information.
Deprecating the authors field will require crates currently using it to change,
such as by inlining the author data.

What is the expected impact in the long term? If it is eventually removed, will the BC for current packages using $CARGO_PKG_AUTHORS be broken?

If we don't intend to remove it in the long term, why deprecate (instead of remove) it at all?

Member Author

How to remove the field in the future is left as a future possibility. An approach I could see working is using the edition mechanism, but I think that's out of scope for this RFC.

The API will continue returning the `authors` field in every endpoint which
currently includes it, but the field will always be empty (even if the crate
author manually adds data to it). The database dumps will also stop including
the field.

This seems to cause a superset of the problems caused by redacting authors in existing versions upon author's explicit request. Are you sure this is justified?

Member Author

As far as I'm aware there is no documented API endpoint that exposes the authorship information, and the database dumps are clearly marked as "experimental". Removing the information from there will mean we can delete it from the crates.io database.

`cargo init`, and it will not include the field in the default template for
`Cargo.toml`. Cargo will also treat the field as deprecated, eventually
displaying a deprecation warning when someone tries to publish a crate with the
field set.

Plus, this no longer requires the $USER variable to be set in cargo new. This is actually good news for docker image maintainers.

@SOF3

SOF3 commented Jan 7, 2021

What's the reason why authors of a crate can't be renamed or removed after the crate was published? Is the reason technical (e.g. because of changing hashes) or social?

The reason is technical: that metadata is stored in the Cargo.toml, which is part of the crate tarball. Updating it would change the hash, breaking immutability and more importantly breaking all the projects with a Cargo.lock.

Where is the hash used? Is it guaranteed to be stable such that external tools may depend on it?

@pietroalbini
Member Author

Where is the hash used? Is it guaranteed to be stable such that external tools may depend on it?

The hash of the crate is used by Cargo to ensure dependencies were not tampered with. If any hash in Cargo.lock does not match, Cargo will refuse to start any build.
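
To make the point above concrete, here is a minimal sketch of why editing `package.authors` in an already-published tarball breaks lockfiles. It is not Cargo's implementation: Cargo records a SHA-256 digest of the `.crate` file in `Cargo.lock`, whereas this sketch substitutes the standard library's `DefaultHasher` to stay dependency-free, and the `checksum` and `verify` names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in checksum over the raw bytes of a crate tarball.
/// Assumption: the real digest is SHA-256; `DefaultHasher` is used here
/// only so the sketch needs nothing outside the standard library.
fn checksum(tarball: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    tarball.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical lockfile check: accept the download only when its digest
/// equals the one recorded at lock time, otherwise refuse to build.
fn verify(locked: u64, downloaded: &[u8]) -> Result<(), String> {
    let actual = checksum(downloaded);
    if actual == locked {
        Ok(())
    } else {
        Err(format!(
            "checksum mismatch: locked {:x}, downloaded {:x}",
            locked, actual
        ))
    }
}
```

Because the manifest is part of the hashed bytes, calling `verify` with the digest of the originally published tarball and a copy whose `authors` line was redacted returns `Err`, which is the failure every project with a `Cargo.lock` would hit.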

@SOF3

SOF3 commented Jan 7, 2021

Have you considered integrating this with the edition system, such that we specify that the hash may be mutable for edition 2021 crates? Since edition 2021 crates cannot be compiled by older rust toolchains anyway, this is unlikely to cause issues.

I heard we're not continuing with editions though, so this may not be a good idea.

@clarfonthey
Contributor

I'm very for this change and folks who really want it can add an AUTHORS.md file.

But, one potential alternative could be to categorise certain metadata as being excluded from package versions entirely and updated separately, and I see that as a valid extension of this. For example, it might be nice to be able to update the maintenance status badge without pushing a new version, or fix a typo in the description.

@pietroalbini
Member Author

Have you considered integrating this with the edition system, such that we specify that the hash may be mutable for edition 2021 crates? Since edition 2021 crates cannot be compiled by older rust toolchains anyway, this is unlikely to cause issues.

Allowing the registry to alter the contents of the published crates without Cargo preventing builds would remove the immutability guarantee we currently have, and it would make reproducible builds way harder if not impossible to achieve. To me it seems like that approach would cause much more fallout than this RFC.

But, one potential alternative could be to categorise certain metadata as being excluded from package versions entirely and updated separately, and I see that as a valid extension of this. For example, it might be nice to be able to update the maintenance status badge without pushing a new version, or fix a typo in the description.

That's a really interesting idea! I definitely see the appeal of storing the metadata somewhere else, but that is going to require a lot of design work to get it right. As an example, I could see the maintenance badge being "versionless" metadata while the description stays tied to each individual version.

Even if we implement that, we'll need this RFC or an equivalent of it to land in order to remove that metadata from the Cargo.toml, so I see it more like a future possibility. I can add it there if you want!

@Lokathor
Contributor

Lokathor commented Jan 7, 2021

Yes, please add that as a future possibility.

@joshtriplett
Member

With my Cargo team hat on (though not speaking for the rest of the Cargo team), this seems reasonable to me, and thank you for clearly laying out the rationale.

I suspect the most notable transition difficulty will be for crates that are currently relying on CARGO_PKG_AUTHORS, such as those using clap or similar.

I'd like to see a note in the RFC proposing guidance for how such crates should proceed. Crates that currently read CARGO_PKG_AUTHORS will need to handle it not being present. Crates relying on dependencies to read CARGO_PKG_AUTHORS will need to make use of some other mechanism to specify the authors (such as directly in the source), assuming they still wish to do so.
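
One way a crate might handle the field being absent is sketched below. It assumes the documented behaviour that `CARGO_PKG_AUTHORS` is a colon-separated list, and it treats an unset variable and an empty string identically. The helper names (`parse_authors`, `crate_authors`, `about_line`) are hypothetical and not part of Cargo or clap.

```rust
/// Splits the colon-separated author list that Cargo exposes at compile time.
/// A missing variable and an empty string both yield an empty list.
fn parse_authors(raw: Option<&str>) -> Vec<String> {
    raw.unwrap_or("")
        .split(':')
        .map(str::trim)
        .filter(|name| !name.is_empty())
        .map(String::from)
        .collect()
}

/// Authors of the crate being compiled, or an empty list when none are set.
/// `option_env!` is used instead of `env!` so that compilation does not fail
/// if the variable is not defined at all.
fn crate_authors() -> Vec<String> {
    parse_authors(option_env!("CARGO_PKG_AUTHORS"))
}

/// Hypothetical "about" line that drops the author part entirely when there
/// is nothing to show, instead of printing an empty "by".
fn about_line(name: &str, authors: &[String]) -> String {
    if authors.is_empty() {
        name.to_string()
    } else {
        format!("{} by {}", name, authors.join(", "))
    }
}
```

A binary could then print `about_line(env!("CARGO_PKG_NAME"), &crate_authors())` in its help output and keep working whether or not the manifest lists authors.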

Otherwise, this looks good to me. Nominating for discussion in the next Cargo meeting.

@lu-zero

lu-zero commented Jan 7, 2021

I think it would be better to make author default to "Project Authors" and defer to the version control or other means to identify the authors.

Please note that most licenses expect the information to be present in one way or another.

@kornelski
Contributor

kornelski commented Jan 7, 2021

I agree with the problem it causes and would like to see a fix for this.

There is another problem with this field: it has no clear connection with crate owners, so reconciliation of authors and owners (to display both as a single list) is difficult and error-prone. It requires having a database of email-GitHub mappings, and for team-owned crates it's not even possible to cross-check the two sets.

However, this field has some uses that don't have a replacement (yet?):

  1. It allows giving credit to collaborators and previous authors without giving them ownership of the crate and publishing rights. It allows crediting people and orgs that aren't GitHub entities. The field is ordered and filtered, unlike the set of crate owners (e.g. a CI publishing bot or the rust-bus backup account is not an author). npm has a collaborators field that is ideal for this. Cargo's authors field is the closest equivalent.

  2. It can outlive GitHub accounts, so it can work as a backup contact information.

  3. Crates have author's name even when their GitHub account doesn't.

  4. It works as a historical record. crates-io database dumps contain ownership information, but it's only the latest state, not a changelog.

Of course in all these cases people should still have control over their personal info. So a solution that moves where this data is stored to make it mutable to me seems better than complete removal.

@rljacobson

I see no issue with making the authors field optional. I am strongly opposed to the rest of this proposal. I see several serious problems with it, the following being among them:

  1. Making changes to cargo because of the limitations of a website somewhere makes zero sense to me. The functioning of the website has nothing to do with cargo itself. The appropriate place to propose a change is with the maintainers of the website.
  2. The checksum SHOULD change if the content changes. This is by design. Violating this design has serious security and compatibility implications, as infrastructure can no longer trust in the current guarantees assumed of the checksum. Changing the content of the list of authors is by definition a change to the crate, and changing these semantics has significant technical consequences.
  3. If you change the crate, you should bump the version number. Retroactively changing a crate for any reason and leaving the version number and/or the checksum the same is a violation of the social contract that has long been established in the software community. I feel that there is a serious ethical problem in violating this social contract.
  4. If there is a problem with how the checksum is used, the appropriate fix is to change how you use the checksum. Maybe you should be using a different identification mechanism. An RFC proposing to incorporate such a mechanism should be welcome, in my view, if it is helpful to the community.
  5. The concerns of an author regarding the content of a crate are not and should not be the purview of the cargo project to address. There may be a wide range of reasons someone would object to the content of a crate. Accounting for them all by modifying the features and requirements of cargo serves to complicate the manifest, not simplify it as is suggested by another poster. If an author no longer wants content to be available somewhere, the correct course of action is to request the content to be removed. Yes, this means the older version of the crate will no longer be available. That is precisely what the author is requesting. If crates.io wants to implement a mechanism that redirects requests to an alternative crate (or any other mechanism to accommodate the content change), then they have the freedom to do so. They should not have the freedom to make changes to the content of a crate and tell people it is the same crate. It is not the same crate.
  6. The author field serves the very important role of attribution. Providing proper attribution is a strongly respected value within the scientific community which serves a variety of purposes both ethical and utilitarian. The challenges of accurately recognizing authorship are not unique to software. They exist in other intellectual domains as well. I see no barrier to implementing additional ways of serving attribution (under the constraints of my comments above), but deprecating the author field is not the right way forward.

I am personally very sympathetic to the motivations for this proposal and am happy to discuss my POV on them out of channel. Despite my strong objections above, they should not in any way be construed as a judgment of the RFC author. Indeed, I thank the author for this valuable contribution and wish to encourage them to continue to participate. Likewise, I am grateful for the opportunity to express my opinion here and to be able to read the alternative points of view of others.

@kornelski
Contributor

kornelski commented Jan 7, 2021

@rljacobson You're missing that the crates-io website displays what is published and checksummed in the registry. While it could override what is in the registry, this wouldn't change the data for others. For example, https://lib.rs displays data from the registry, bypassing crates-io website wherever possible. The registry is promised to be immutable, and this is relied on by various tools, caches, and lockfiles, so the registry can't change. People's names are mutable, therefore they can't be in the immutable part of the registry.

@rljacobson

After reading the comments of others above, I want to point out that we already have a mechanism for dealing with errors or mistakes in software, including those that may be very sensitive or ethically significant. What do we do when, for example, a serious security flaw is discovered in a widely used piece of software? What do we do when someone accidentally publishes proprietary information in violation of copyright with their code? If someone publishes the private information of hundreds of their company's customers? I do not see a compelling reason to make authorship a distinguished special case of these kinds of bugs.

@kornelski
Contributor

Currently the registry has yanking for hiding insecure software, but it doesn't delete or change it. The crates-io registry sometimes deletes crates that are spam or subject to legal complaints, but this is meant to be a rare occurrence.

@rljacobson

@rljacobson You're missing that the crates-io website displays what is published and checksummed in the registry. While it could override what is in the registry, this wouldn't change the data for others.

This is a brute fact of reality regardless of the features we do or do not provide users. Once you decide you want to make a change to a crate, the original crate will not magically change on everyone's computer. And again, you are describing a problem with how crates.io functions, not a problem with cargo.

People's names are mutable, therefore they can't be in the immutable part of the registry.

But the data they published in the past is NOT mutable. And there is nothing anyone can do about that regardless of their desire.

What you CAN do is remove the affected crate. This is what we do with every other instance of content that has been published that we wish to no longer be available. That's what should happen here.

@kornelski
Contributor

kornelski commented Jan 7, 2021

But the data they published in the past is NOT mutable. And there is nothing anyone can do about that regardless of their desire.

Well, no. This is under discussion. There are people who desire to change their names in already-published crates, and we're discussing how to accommodate that.

I mean for crates that have already been published using existing metadata+checksum this is hard, but we can change Cargo/crates-io/registry so that names will be mutable going forward.

Like many things in computer science, this could be fixed with a layer of indirection - the immutable packages could contain a numeric identifier for each author (so that there's an immutable record of who was credited) and a separate mutable identifier->name mapping.
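
The indirection described above could look roughly like the following sketch. The types and method names (`PublishedVersion`, `AuthorDirectory`, `set_name`, and so on) are hypothetical illustrations of the idea, not an existing crates.io or Cargo design.

```rust
use std::collections::HashMap;

/// Immutable part: what would be stored in the tarball and covered by the
/// checksum. It records only opaque author identifiers.
struct PublishedVersion {
    name: String,
    author_ids: Vec<u64>,
}

/// Mutable part: kept outside the tarball, so editing it never changes
/// the checksum of any published version.
#[derive(Default)]
struct AuthorDirectory {
    names: HashMap<u64, String>,
}

impl AuthorDirectory {
    /// Adds an author or changes the display name behind an identifier.
    fn set_name(&mut self, id: u64, name: &str) {
        self.names.insert(id, name.to_string());
    }

    /// Removes the display name; the identifier stays in published versions.
    fn remove(&mut self, id: u64) {
        self.names.remove(&id);
    }

    /// Resolves the identifiers of a version to the current display names,
    /// skipping identifiers whose name has been removed.
    fn display_authors(&self, pkg: &PublishedVersion) -> Vec<String> {
        pkg.author_ids
            .iter()
            .filter_map(|id| self.names.get(id).cloned())
            .collect()
    }

    /// One-line credit string for a published version.
    fn credit_line(&self, pkg: &PublishedVersion) -> String {
        format!("{}: {}", pkg.name, self.display_authors(pkg).join(", "))
    }
}
```

With this split, renaming or removing a person only touches `AuthorDirectory`, while every `PublishedVersion` keeps the same bytes and therefore the same checksum.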

@rljacobson

But the data they published in the past is NOT mutable. And there is nothing anyone can do about that regardless of their desire.

Well, no. This is under discussion.

The fact that the crate was previously published is a fact of history. You can only change the present/future, not the past. Unless you invent a time machine, in which case I would like to invite you and your time machine over to dinner sometime soon. There are a few recent events of the last year I would like to address. :)

What is under discussion is whether we should allow an existing crate to be modified and still be considered the same crate. I answer with a strong and passionate no. I think we can and should accommodate name changes in another way.

@Lokathor
Contributor

Lokathor commented Jan 7, 2021

If there was, say, a security flaw in an old version of a crate: you'd yank the bad version and publish a new version. In the case of a dire enough problem a crate could even be deleted entirely, as was mentioned.

I'm sympathetic to a person wanting to change their name in the authors field, but I don't follow why they can't also use the "yank and publish an update" system we already have for other problems with published crate content.

Combined with removing the requirements for an authors field (which it seems that everyone so far agrees with) that should be satisfactory.

@rljacobson

It occurs to me that the proposal does not do a good job of solving the problem it is meant to and may even make it worse.

The goal is to enable, hypothetically, Robert Jacobson (me) to change his name in all of his crates to Robert Lawrence. We can imagine there is a reason of personal security for me to want to do so. The best case scenario is if “Robert Jacobson” is removed from public repositories hosting my code and from projects depending on my code, the number of which might be significant (hypothetically). Changing “Robert Jacobson” to “Robert Lawrence” on crates.io while not changing the identity of the crate will only provide the new name to future downloads of the crate. Existing projects will not see the change.

On the other hand, pulling the crate and bumping the version number for the modified crate is likely to result in more projects depending on the crate getting the updated name, either because the crate isn’t cached or because the Cargo.toml always fetches the latest minor version (or whatever), forcing the download of the modified crate. This is the more desirable outcome.

As for removing the authors field, this will undoubtedly have the effect of a nonstandard AUTHORS.md file or something similar becoming common practice. Now my name is in some AUTHORS.md file, and I want to change it. What do I do? I am certainly not better off than if I was listed in the authors field in the Cargo.toml. In fact, I am arguably worse off, because there is not and cannot be a formal mechanism incorporated into the tooling to help me.

@pietroalbini
Member Author

pietroalbini commented Jan 7, 2021

This is a lot of feedback, thanks y'all for commenting!

Please note that most licenses expect the information to be present in one way or another.

@lu-zero I don't think the authors field has any impact on licensing. Most of the licenses I'm aware of require the license text to be reproduced, so my non-lawyer understanding is that license = "SPDX" or the authors list don't have legal meaning.

It works as a historical record. crates-io database dumps contain ownership information, but it's only the latest state, not a changelog.

@kornelski crates.io actually stores ownership changes in its database, and I could see that being useful to track who can publish new releases. If you have a use for that kind of data we can eventually discuss including it in the database dumps.

Making changes to cargo because of the limitations of a website somewhere makes zero sense to me. The functioning of the website has nothing to do with cargo itself. The appropriate place to propose a change is with the maintainers of the website.

@rljacobson crates.io is part of the Rust project, and closely collaborates with the Cargo team. All of the features where Cargo interacts with a registry are developed in collaboration with the crates.io team, and this is one of those features!

The checksum SHOULD change if the content changes. This is by design. Violating this design has serious security and compatibility implications, as infrastructure can no longer trust in the current guarantees assumed of the checksum. Changing the content of the list of authors is by definition a change to the crate, and changing these semantics has significant technical consequences.
If you change the crate, you should bump the version number. Retroactively changing a crate for any reason and leaving the version number and/or the checksum the same is a violation of the social contract that has long been established in the software community. I feel that there is a serious ethical problem in violating this social contract.

I totally agree that the contents of published crates (and their checksum) should never change, that's out of the question. What I'm proposing in this RFC is that we deprecate including structured authorship information in the Cargo.toml, so that when changes on this inevitably happen we don't have to delete the crate and break every single user of that crate.

Changes to the authorship information don't affect reproducibility of builds, and don't change any security property of the crate. The contents of the field are basically freeform text, and there is nothing preventing me from publishing a crate under your name. The kind of authorship information that matters from a reliability and security perspective (who's allowed to publish new releases) is already tracked outside of Cargo.toml (and thus outside the checksum).

The concerns of an author regarding the content of a crate are not and should not be the purview of the cargo project to address. There may be a wide range of reasons someone would object to the content of a crate. Accounting for them all by modifying the features and requirements of cargo serves to complicate the manifest, not simplify it as is suggested by another poster. If an author no longer wants content to be available somewhere, the correct course of action is to request the content to be removed. Yes, this means the older version of the crate will no longer be available. That is precisely what the author is requesting. If crates.io wants to implement a mechanism that redirects requests to an alternative crate (or any other mechanism to accommodate the content change) then they have the freedom to do so. They should not have the freedom to make changes to the content of a crate and tell people it is the same crate. It is not the same crate.

Deleting crates can have an enormous impact on the ecosystem (as everyone saw when packages were deleted in other ecosystems). Thankfully we have never had to delete a popular crate so far, but what would happen if we had to remove a crate a big chunk of the ecosystem depends on? That's surely going to become a problem in the future and I want to do everything I can to minimize the disruption it's going to cause. Reducing the number of cases where we have to do that is the best tool at our disposal.

The author field serves the very important role of attribution. Providing proper attribution is a strongly respected value within the scientific community which serves a variety of purposes both ethical and utilitarian. The challenges of accurately recognizing authorship are not unique to software. They exist in other intellectual domains as well. I see no barrier to implementing additional ways of serving attribution (under the constraints of my comments above), but deprecating the author field is not the right way forward.

In my experience the authors field is practical only if there is one or a couple maintainers for a crate. It's not feasible, for example, to set authorship information for the Rust compiler in its Cargo.toml, as there are way too many contributors. That results in the field either being outdated (only listing the original maintainer) or defaulting to something like "The Rust Project Developers". Both of those approaches render the field useless in my opinion.

After reading the comments of others above, I want to point out that we already have a mechanism for dealing with errors or mistakes in software, including those that may be very sensitive or ethically significant. What do we do when, for example, a serious security flaw is discovered in a widely used piece of software? What do we do when someone accidentally publishes proprietary information in violation of copyright with their code? If someone publishes the private information of hundreds of their company's customers? I do not see a compelling reason to make authorship a distinguished special case of these kinds of bugs.

We have yanking for security vulnerabilities, which prevents new uses of the crate. For the other cases unfortunately we have to delete the crates, but that doesn't mean we shouldn't strive to delete as few crates as possible.

What is under discussion is whether we should allow an existing crate to be modified and still be considered the same crate. I answer with a strong and passionate no. I think we can and should accommodate name changes in another way.

What I'm proposing here is not to modify an existing crate, but to avoid including information that could need to be changed in the future.

As for removing the authors field, this will undoubtedly have the effect of a nonstandard AUTHORS.md file or something similar becoming common practice. Now my name is in some AUTHORS.md file, and I want to change it. What do I do? I am certainly not better off than if I was listed in the authors field in the Cargo.toml. In fact, I am arguably worse off, because there is not and cannot be a formal mechanism incorporated into the tooling to help me.

We can't prevent people from including their name in the source code of their crate, but as Ashley pointed out in another comment having the field in the Cargo.toml encourages people to fill it in when they're publishing a new crate and they're looking at what fields are available.

I'm sympathetic to a person wanting to change their name in the authors field, but I don't follow why they can't also use the "yank and publish an update" system we already have for other problems with published crate content.

@Lokathor unfortunately yanking can't prevent people from being harassed or doxxed. Both of those things have happened multiple times already, and I want to avoid crates.io becoming a tool for harassment as much as possible.

@Kixiron
Member

Kixiron commented Jan 7, 2021

I feel like the vast majority of people publishing crates (and therefore consciously making their work public for all the world to see) either don't care about the name they put into the authors slot or actively want attribution for their work, so removing the authors field entirely seems to me a knee-jerk reaction to a niche desire. I have no problem with making the field optional for publishing to accommodate those who don't want their name on anything, but I think automatically filling it out is what most users want. If the field is simply made optional, those who care can remove it and therefore not expose their information. This seems like a genuinely good middle ground: those who want privacy can get it by deleting a line from their project, while for anyone who doesn't care Cargo keeps working as normal (entering their names automatically, which is presumably what they prefer). To take that further, Cargo could even offer an option in the config.toml file that toggles auto-authorship in the Cargo.toml: set to true (the default), it acts as it does now and fills in the authors field from the various sources; set to false, it omits that line from the template generated by cargo new.

@Diggsey
Contributor

Diggsey commented Jan 7, 2021

@Kixiron the problem is that doesn't actually help with the main issue raised in the RFC.

I think it would be useful to have a separate file to store data which is not part of the crate, i.e. metadata. This could include authors, description, and other information that is not code, nor related to the functioning of that code.

This information would be stored in the registry when you publish a crate, but could also be updated or removed at any time, e.g. via cargo publish --metadata-only. We can then populate this metadata file by default, as it can easily be amended later if needed.

@Kixiron
Member

Kixiron commented Jan 7, 2021

That's not actually what the RFC is about though; that would be a tangential thing. The title of the RFC is "Deprecate the authors field", and that seems to be too large a reaction to a niche need. That kind of sweeping change affects everyone (the vast majority of whom want their names on their crates) for a minority's (very valid) concern.

@pietroalbini
Member Author

Updated the RFC to remove mentions of deprecating the field. The field will still be optional, cargo new will not populate it and crates.io and docs.rs will not display its contents.

@rfcbot fcp merge
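
As a rough illustration of the behaviour described in the comment above, a freshly generated manifest would simply leave out the authors key. This is only a sketch: the package name, version, edition, and author string below are placeholder values, not taken from the RFC, and the commented-out line merely shows where an author list could still be added by hand.

```toml
# Hypothetical manifest for illustration; name, version and edition are placeholders.
[package]
name = "example-crate"
version = "0.1.0"
edition = "2018"
# `authors` is optional and is not pre-filled by `cargo new` under this RFC.
# It can still be set manually if desired (placeholder value):
# authors = ["Jane Doe <jane@example.com>"]

[dependencies]
```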

@rfcbot
Collaborator

rfcbot commented Mar 2, 2021

Team member @pietroalbini has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels Mar 2, 2021
@rfcbot
Collaborator

rfcbot commented Mar 6, 2021

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot removed the proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. label Mar 6, 2021
@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this RFC. and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels Mar 16, 2021
@rfcbot
Collaborator

rfcbot commented Mar 16, 2021

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC will be merged soon.

@pietroalbini pietroalbini merged commit 3c7c68a into rust-lang:master Mar 17, 2021
@pietroalbini pietroalbini deleted the deprecate-authors-field branch March 17, 2021 10:43
@pietroalbini
Member Author

Yay! The @rust-lang/cargo and @rust-lang/crates-io teams have decided to accept this RFC.

To track further discussion, subscribe to the tracking issue here: rust-lang/rust#83227

bors added a commit to rust-lang/crates.io that referenced this pull request Mar 21, 2021
Remove "Authors" section from crate details page

see rust-lang/rfcs#3052 😉

r? `@pietroalbini`
messense added a commit to messense/maturin that referenced this pull request Jun 21, 2021
ramnivas added a commit to exograph/exograph that referenced this pull request Jan 6, 2022
That field doesn't convey anything meaningful and in any case, it displays incorrect information unless we maintain that field.

Also see, rust-lang/rfcs#3052 (comment)
ramnivas added a commit to exograph/exograph that referenced this pull request Jan 7, 2022
That field doesn't convey anything meaningful and in any case, it displays incorrect information unless we maintain that field.

Also see, rust-lang/rfcs#3052 (comment)
shadaj pushed a commit to exograph/exograph that referenced this pull request Apr 20, 2023
That field doesn't convey anything meaningful and in any case, it displays incorrect information unless we maintain that field.

Also see, rust-lang/rfcs#3052 (comment)
Labels
disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this RFC. T-cargo Relevant to the Cargo team, which will review and decide on the RFC. T-crates-io Relevant to the crates.io team, which will review and decide on the RFC. to-announce
Projects
No open projects
Status: Done (Stabilized)