Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make -Z http-registry use index.crates.io when accessing crates-io #10725

Merged
merged 1 commit into from
Jun 7, 2022

Conversation

arlosi
Copy link
Contributor

@arlosi arlosi commented Jun 3, 2022

Use sparse+https://index.crates.io/ to access crates.io when -Z http-registry is set.

  • Cargo.lock files still emit the github URL https://github.com/rust-lang/crates.io-index.
  • Allows publishing to a source-replaced crates-io only for index.crates.io. Other source-replacements of crates-io are still blocked.

Fixes #10722
r? @Eh2406

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 3, 2022
@Eh2406
Copy link
Contributor

Eh2406 commented Jun 3, 2022

During publishing, when looking up which registry a crate comes from, the source replacement is not followed.

I think we need to think very carefully about this. What is the behavior? How do we communicate that behavior? Is it the behavior people would expect? Etc.

Most concretely, does this mean that cargo publish requires downloading the git index? (Even if it does, this may be a step in the right direction. But eventually Cargo should not need to interact with the git index for crates.io.)

For example let's say I have a corporate registry that is set up as a crates.io source replacement. It mirrors crates.io where license and security checks have been approved. Currently this system does not need to protect against publishing internal packages, they will get the error about publishing with source replacement. After this fix if someone runs cargo publish on internal package, where will that code be sent? To the API as specified in config.json on crates.io or the API specified by the corporate registry's config.json?

Should this be tied to -Z http-registry, or should we have another way to enable/disable this that is separate from enabling support for http-registries?

I think this is good for now. Before we stabilize we may need a way to say "cargo knows how to talk http-registries but I want to use the git version of crates.io anyway." However I personally think we should wait for real use case where that is needed before adding such functionality to our permanent public interface.

@arlosi
Copy link
Contributor Author

arlosi commented Jun 4, 2022

Agreed, we need to think carefully about the source-replacement portion of this change.

Q: Does this mean that cargo publish requires downloading the git index?
A: No. It will download the index via the replaced source.

Q: where will the package be sent on a cargo publish for a replaced source?
A: it will be sent to the API specified in the config.json in the replaced source. This should be the crates.io API, but the replaced source could change it.

The proposed solution in this change works well for 1:1 mirrors of crates.io. However, there could be many issues if the replaced registry modifies the api value in config.json. That said, we have to trust a replaced registry not to be evil in general, since it could serve malicious crates.

Other API operations (yank, owners, etc) already work the way this change suggests (using the replaced source's api value). Currently only publishing to crates.io is special - and the error message it gives is rather confusing since it's really intended to stop a separate case (publishing to crates.io with non-crates.io deps).

For example let's say I have a corporate registry that is set up as a crates.io source replacement. It mirrors crates.io where license and security checks have been approved.

If the corporate registry wants to prevent anyone publishing crates, they could not set the api in the mirror's config.json.

However, consider the scenario where I want to publish a crate to crates.io, but I have source replacement configured to use the approved corporate mirror described. Currently I would have to comment out the source replacement, do the publish, then uncomment. This is the issue described in #6722.

Options I can see:

  • Allow publishing with source replacement configured, use the replaced source to find the api (proposed by this PR).
  • Allow publishing with source replacement configured, but ignore the replacement and fetch the original source to find the api.
  • Disallow all API operations with source-replaced registries.
  • Disallow publishing to crates.io with source replacement, but allow other API operations (the current state).

Whatever we decide, we should make publish consistent with the other API operations.

If we want to defer the decision outside to outside this change, I can add more crates.io specific code that only allows publishing to crates.io's github or index.crates.io to keep the current behavior (but still make -Z http-registry use index.crates.io).

@Eh2406
Copy link
Contributor

Eh2406 commented Jun 6, 2022

This is an insta stable change to how cargo publish works with a source-replacement. Currently it errors, now it publishes to the replaced registry. I think we should merge, but want to check with the team.
@rfcbot fcp merge

@Eh2406 Eh2406 added the T-cargo Team: Cargo label Jun 6, 2022
@Eh2406
Copy link
Contributor

Eh2406 commented Jun 6, 2022

Trying again:
@rfcbot fcp merge

@rfcbot
Copy link
Collaborator

rfcbot commented Jun 6, 2022

Team member @Eh2406 has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period An FCP proposal has started, but not yet signed off. disposition-merge FCP with intent to merge labels Jun 6, 2022
@ehuss
Copy link
Contributor

ehuss commented Jun 6, 2022

the source replacement is not followed.

I'm wondering if this is the desired behavior. Per issues like #6722, users have a mirror of crates.io. They want to publish to the real crates.io, not the replacement. Or am I misunderstanding what this does?

And related to that, I am uncertain about how the management of tokens should work. I'd like to be very careful about considering leaking tokens to the wrong remote. For example, if I have a mirror of crates.io, and I run cargo publish, does it now send my crates.io token to the mirror?

I'm curious what interaction this will have with this. We should probably turn that to an error, but I can't recall if that would be a problem.

@joshtriplett
Copy link
Member

joshtriplett commented Jun 7, 2022 via email

@Eh2406
Copy link
Contributor

Eh2406 commented Jun 7, 2022

I'm wondering if this is the desired behavior. Per issues like #6722, users have a mirror of crates.io. They want to publish to the real crates.io, not the replacement. Or am I misunderstanding what this does?

You are understanding this correctly. Making sure this is the user's intended behavior is one of the reasons I wanted to have a full discussion of it. I would assume that mirrors that want to allow publishing through would not change there API in config.json. So that publishing to the mirror and publishing too the original contact the same endpoint.

It's important to keep in mind that according to the docs source-replacement mirrors are not allowed to have more crates then the original. https://doc.rust-lang.org/cargo/reference/source-replacement.html so it could be that the only meaningful options are "publish to the original" or "error". so we may not need to design for other alternatives.

I'd like to be very careful about considering leaking tokens to the wrong remote.

Yes, whatever we design we should have tests for.

For example, if I have a mirror of crates.io, and I run cargo publish, does it now send my crates.io token to the mirror?

I would hope that it sends the token you have configured for the mirror to the API from the mirror. And would not read the token you have configured for crates.io.

I'm curious what interaction this will have with this. We should probably turn that to an error, but I can't recall if that would be a problem.

I've never fully understood that warning. I probably need to reacquaint myself with its history.

Part of the goal of the asymmetric token work was to make sure that we don't have shared-secret tokens that can leak.

110% if we had only asymmetric tokens the security would be significantly simpler to think about! But we do not yet have that in common use. 🤷

@Eh2406
Copy link
Contributor

Eh2406 commented Jun 7, 2022

For example, if I have a mirror of crates.io, and I run cargo publish, does it now send my crates.io token to the mirror?

What does it currently do for yank and unyank? Whatever our decision is about the intended security envelope, we should make sure the different commands match.

@arlosi
Copy link
Contributor Author

arlosi commented Jun 7, 2022

Thanks for all the feedback. After considering this further, I think we should not make the change as it's currently proposed.

I'd like to be very careful about considering leaking tokens to the wrong remote.

The current situation (and the way this change proposes) makes it too easy to incorrectly configure things and send the token to the wrong place.

The problem is that source-replacement registries tend to keep the original api value. In the case of crates.io, this means that replacement registry tokens would be sent to crates.io.

If we try to fix this by using the token from the from the base registry, with the api from the replaced registry, then we have the opposite potential problem. If a source-replacement of crates.io changes the api value, then the crates.io token would be sent to the replacement registry.

I'm starting to prefer the "allow publishing with source replacement configured but ignore the replacement and fetch the original source" option. This avoids any potential token mix-ups. Source replacements are supposed to contain a strict subset of crates from the original, so publishing to a replaced source doesn't make sense.

Operation Follow Source Replacement
Index ✔️
Download ✔️
Login
Search
Yank
Unyank
Publish
Owners

I'm going modify this PR to remove the change to source replacement and start a different PR to handle that. This PR will go back to what the title says.

@Eh2406 can you cancel the FCP?

@Eh2406
Copy link
Contributor

Eh2406 commented Jun 7, 2022

@rfcbot fcp cancel

Whatever mechanism we put in place for errors should have hardcoded exceptions for this one "source replacement". Publishing to crates.io should not involve downloading the git index.

@rfcbot
Copy link
Collaborator

rfcbot commented Jun 7, 2022

@Eh2406 proposal cancelled.

@rfcbot rfcbot removed proposed-final-comment-period An FCP proposal has started, but not yet signed off. disposition-merge FCP with intent to merge labels Jun 7, 2022
@arlosi arlosi force-pushed the http-cratesio branch 3 times, most recently from 3089480 to bb9d908 Compare June 7, 2022 20:29
@arlosi
Copy link
Contributor Author

arlosi commented Jun 7, 2022

@Eh2406 This change is updated to only allow publishing to source-replaced crates.io when the replacement is sparse+https://index.crates.io/

@Eh2406
Copy link
Contributor

Eh2406 commented Jun 7, 2022

I tested by hand. I can't think of a way to get the test suite to test this directly, without actually interacting with the real crates.io indexes.

@bors r+

If someone has a suggestion for better testing, we can make a follow-up PR.

@bors
Copy link
Contributor

bors commented Jun 7, 2022

📌 Commit 445db94 has been approved by Eh2406

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 7, 2022
@bors
Copy link
Contributor

bors commented Jun 7, 2022

⌛ Testing commit 445db94 with merge 85e457e...

@bors
Copy link
Contributor

bors commented Jun 7, 2022

☀️ Test successful - checks-actions
Approved by: Eh2406
Pushing 85e457e to master...

@bors bors merged commit 85e457e into rust-lang:master Jun 7, 2022
bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 8, 2022
Update cargo

7 commits in 38472bc19f2f76e245eba54a6e97ee6821b3c1db..85e457e158db216a2938d51bc3b617a5a7fe6015
2022-05-31 02:03:24 +0000 to 2022-06-07 21:57:52 +0000
- Make -Z http-registry use index.crates.io when accessing crates-io (rust-lang/cargo#10725)
- Respect submodule update=none strategy in .gitmodules (rust-lang/cargo#10717)
- Expose rust-version through env var (rust-lang/cargo#10713)
- add validation for string "true"/"false" in lto profile (rust-lang/cargo#10676)
- Enhance documentation of testing (rust-lang/cargo#10726)
- Clear disk space on CI. (rust-lang/cargo#10724)
- Enforce to use tar v0.4.38 (rust-lang/cargo#10720)
@ehuss ehuss added this to the 1.63.0 milestone Jun 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-cargo Team: Cargo
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Teach cargo that crates.io has two equivalent indexes.
7 participants