
Upload index metadata to index/ when publishing new crates #4661

Merged (1 commit, May 19, 2022)

Conversation

@arlosi (Contributor) commented Mar 22, 2022

Cargo can access HTTP-based registries via rust-lang/cargo#10470.

This change causes crates.io to publish any changed index metadata files to index/ on S3, in addition to updating the git-based index. The S3 bucket is configured by the new S3_INDEX_* environment variables.

A new admin tool for bulk-uploading crates is also added.
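
Conceptually, the publish-time step is just writing one small metadata file per crate into the bucket. Below is a minimal sketch of that step, assuming the aws-sdk-s3 crate; the helper name, key layout, and cache-control value are illustrative, and the PR itself wires this through crates.io's existing uploader code:

```rust
use aws_sdk_s3::primitives::ByteStream;
use aws_sdk_s3::Client;

/// Hypothetical helper: write one crate's index metadata file to the
/// bucket. `index_path` follows the registry layout, e.g. "se/rd/serde".
async fn upload_index_file(
    client: &Client,
    bucket: &str,
    index_path: &str,
    contents: Vec<u8>,
) -> Result<(), aws_sdk_s3::Error> {
    client
        .put_object()
        .bucket(bucket)
        .key(format!("index/{index_path}"))
        .body(ByteStream::from(contents))
        .content_type("text/plain")
        // Cached copies go stale after this interval; see the cache
        // discussion in the review below.
        .cache_control("public,max-age=600")
        .send()
        .await?;
    Ok(())
}
```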

@Turbo87 added the C-enhancement ✨ and A-backend ⚙️ labels on Mar 23, 2022
@jtgeibel (Member) left a comment


Thanks for the PR @arlosi!

There are a few operational issues that we'll need to address before this can be merged.

  • We need to invalidate the CloudFront cache for a file after modifying it on S3; a rough sketch of that call follows this list. Even with a short cache-control interval, files will enter the cache at different times, and clients could end up seeing an inconsistent index state (such as a crate with a dependency that does not yet appear to be published). Once invalidation is in place, we could probably increase the max-age that CF is allowed to cache.
  • These files should probably be stored in their own S3 bucket. We should also serve them from something like index.crates.io instead of static.crates.io. I'll discuss with infra at our next meeting.
  • We need a way to populate the full repository, as these changes only upload an index file upon publish. A command could be added to src/admin that publishes all of the crate files. There will probably be some complexity around getting this right: if there is a gap between running this task and enabling the feature in production, updates will be missed, but if there is overlap, updates could be overwritten with older data.
  • Similarly, we should add logic to the delete_crate admin task to delete the index file from S3 and invalidate the CF cache. This happens pretty rarely and we don't automatically update the git index yet, but it would be helpful to update the HTTP index automatically so that we don't forget.
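
For the first point, this is roughly what the invalidation call could look like. A sketch assuming the aws-sdk-cloudfront crate; the helper name and path layout are made up, and builder details vary across SDK versions:

```rust
use aws_sdk_cloudfront::types::{InvalidationBatch, Paths};
use aws_sdk_cloudfront::Client;
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical helper: invalidate a single index file after rewriting
/// it, so CloudFront serves the new contents before max-age expires.
async fn invalidate_index_file(
    client: &Client,
    distribution_id: &str,
    index_path: &str,
) -> Result<(), Box<dyn std::error::Error>> {
    let paths = Paths::builder()
        .quantity(1)
        .items(format!("/index/{index_path}"))
        .build()?;
    // CloudFront requires a unique caller reference per request.
    let reference = format!(
        "{index_path}-{}",
        SystemTime::now().duration_since(UNIX_EPOCH)?.as_secs()
    );
    let batch = InvalidationBatch::builder()
        .paths(paths)
        .caller_reference(reference)
        .build()?;
    client
        .create_invalidation()
        .distribution_id(distribution_id)
        .invalidation_batch(batch)
        .send()
        .await?;
    Ok(())
}
```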

@Eh2406 (Contributor) commented Mar 29, 2022

On behalf of the Cargo team, I would love to be part of meetings about designing this. I'm also not sure whether it's best to hash things out in this PR, in the Zulip conversation, or in a meeting.

Having said that, I will try to respond to each point as briefly as I can. Happy to talk in depth about any of them once we have picked the right venue.

  • "We need to invalidate the CloudFront cache" invalidations only happened so quickly. If max-age is shorter then the invalidation roll out time, then there is no point. As the maintainer of Cargos dependency resolver, I don't think inconsistencies are going to be a big problem, but happy to talk it over.
  • The conversation about S3 buckets is happening in the Zulip conversation, and I will leave it to people with more skin in the game.
  • "We need a way to populate the full repository" If this PR has written a file then that file is up to date. S3 has a atomic upload a file if not exist command. Anytime after PR has been deployed, iterate over all files in the git repo and upload to S3 using "no overight". I don't know where this script should live.
  • "we should add logic to the delete_crate admin task" this makes sense.

One overarching thought: the Cargo side of this is not ready for stabilization. At this point it's premature for crates.io to be prepared for a fully operationalized HTTP index. As long as we have a plan in place, and each step gets us closer to it, we can take it one step at a time.

@jtgeibel (Member) commented

S3 has an atomic "upload if not exists" command.

That sounds like a great solution!

One overarching thought: the Cargo side of this is not ready for stabilization. [...] As long as we have a plan in place, and each step gets us closer to it, we can take it one step at a time.

Thanks for the additional context. I agree that at this point we just need to agree on a plan for long term operations, and can begin iterating towards that.

invalidations only happen so quickly. If max-age is shorter than the invalidation rollout time, then there is no point.

Okay, it looks like "[o]bject invalidations typically take from 60 to 300 seconds to complete." The PR currently sets the max-age to 600 seconds. Projects that release multiple crates expect each publish to become visible within seconds (so that crates depending on it can be published), not minutes, so I wonder if we would need to set a very short max-age and only get the benefits of caching for very popular crates. I'd love to have some estimates of expected traffic and the associated costs of various options to base these design decisions on.

I don't think inconsistencies are going to be a big problem, but happy to talk it over.

My main concern here is also related to new publishes (which will probably be solved by however we solve the above). If I publish new major versions of a batch of crates that depend on each other, then currently there would be a period of up to 10 minutes where some users may have a broken build if they attempt to upgrade. In contrast, the whole git index is updated atomically, and we can ensure that for every dependency in the index Cargo will find at least one crate version that satisfies it. (Currently I think the only exceptions are if a crate is deleted, or if versions are yanked without a compatible replacement.)

@Eh2406 (Contributor) commented Mar 30, 2022

S3 has an atomic "upload if not exists" command.

Well, it has an atomic "copy if not exists" command: x-amz-copy-source-if-unmodified-since on CopyObject. For some reason that header is not available on upload.
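
For reference, a sketch of what that conditional copy looks like with the aws-sdk-s3 crate; the staging-key scheme and helper name here are made up:

```rust
use aws_sdk_s3::primitives::DateTime;
use aws_sdk_s3::Client;

/// Hypothetical sketch of the conditional CopyObject call: copy a
/// staged upload to its final key only if the source object still has
/// its original timestamp, via x-amz-copy-source-if-unmodified-since.
async fn copy_if_unmodified(
    client: &Client,
    bucket: &str,
    staging_key: &str,
    final_key: &str,
    uploaded_at_epoch_secs: i64,
) -> Result<(), aws_sdk_s3::Error> {
    client
        .copy_object()
        // CopyObject's source is given as "bucket/key".
        .copy_source(format!("{bucket}/{staging_key}"))
        .bucket(bucket)
        .key(final_key)
        .copy_source_if_unmodified_since(DateTime::from_secs(uploaded_at_epoch_secs))
        .send()
        .await?;
    Ok(())
}
```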

The PR currently sets the max-age to 600 seconds.

That is sort of in the awkward middle. If we want to ensure updates take less than, say, 30 seconds, then a short max-age is the only way to do that. If we are comfortable with updates occasionally taking over 300 seconds, then an infinitely long max-age plus invalidations is the best way to do that. I'm not sure what's best, but my understanding is that we can change this pretty easily at any time.

I'd love to have some estimates of expected traffic and the associated costs of various options to base these design decisions on.

At the moment the traffic should be approximately zero, as the feature is not even available on nightly. As we stabilize it, it should eventually grow to match the current traffic for cloning the index. I don't know if the numbers I got from @pietroalbini can be shared publicly. I will start a DM on Zulip if that works for you.

If I publish new major versions of a batch of crates that depend on each other, then currently there would be a period of up to 10 minutes where some users may have a broken build if they attempt to upgrade.

Yes, all the oddities occur when trying to build a package within one max-age of a publish.
In general the resolver will pick an older version if it matches the requirements, or give an error message about the dependency not being available. Resolver error messages are not the best, sadly, but they should at least list which package is out of date. Perhaps an HTTP header can tell Cargo the max-age so it can be added to the error message, as sketched below: "but no versions of foo matched bar's requirement ^1.0.0 ... Note: versions published in the past 300 seconds may not yet be available." or something.
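
As a sketch of that idea (plain string handling, hypothetical helper names, not Cargo's actual error machinery):

```rust
/// Hypothetical: pull max-age out of a Cache-Control header so the
/// resolver error can mention how stale the index might be.
fn max_age_secs(cache_control: &str) -> Option<u64> {
    cache_control
        .split(',')
        .filter_map(|d| d.trim().strip_prefix("max-age="))
        .next()?
        .parse()
        .ok()
}

/// Hypothetical: the note appended to a "no matching version" error.
fn staleness_note(cache_control: &str) -> String {
    match max_age_secs(cache_control) {
        Some(secs) => format!(
            "note: versions published in the past {secs} seconds may not yet be available"
        ),
        None => String::new(),
    }
}
```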

Even with git's nice atomic properties, a version matching every requirement does not mean that every requirement can be satisfied: every version that matches could still conflict with some other requirement, or with the lockfile.

@pietroalbini (Member) commented

Created buckets and CDNs for both staging and production; the existing credentials have access to them.

@arlosi force-pushed the http-index branch 2 times, most recently from 20cf719 to 92698e3 on April 7, 2022
@arlosi (Contributor, Author) commented Apr 8, 2022

we should add logic to the delete_crate admin task to delete the index file from S3 and invalidate the CF cache.

The delete_crate admin task doesn't seem to modify the git index (only the database). Maybe I'm missing something?

@Eh2406 (Contributor) commented Apr 9, 2022

You are not missing anything. To quote:

Similarly, we should add logic to the delete_crate admin task to delete the index file from S3 and invalidate the CF cache. This happens pretty rarely and we don't automatically update the git index yet, but it would be helpful to update the HTTP index automatically so that we don't forget.

@arlosi (Contributor, Author) commented Apr 19, 2022

Oops, I misread the request. The delete_crate admin task now removes the http index metadata as well.
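
Roughly, the delete path mirrors the upload. A sketch under the same aws-sdk-s3 assumptions as above (helper name made up; cache invalidation would follow the same pattern as on publish):

```rust
use aws_sdk_s3::Client;

/// Hypothetical counterpart for delete_crate: remove the crate's
/// index metadata file from the bucket.
async fn delete_index_file(
    client: &Client,
    bucket: &str,
    index_path: &str,
) -> Result<(), aws_sdk_s3::Error> {
    client
        .delete_object()
        .bucket(bucket)
        .key(format!("index/{index_path}"))
        .send()
        .await?;
    Ok(())
}
```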

@arlosi requested a review from jtgeibel on April 19, 2022
@arlosi (Contributor, Author) commented May 11, 2022

Is someone available to review this? Let me know if there are changes needed.

@bors (Contributor) commented May 17, 2022

☔ The latest upstream changes (presumably 531b5c8) made this pull request unmergeable. Please resolve the merge conflicts.

Commit message excerpt: "Also provides a new admin tool to bulk upload existing index files."
@Turbo87 (Member) left a comment


I'm approving and merging this with the understanding that this is an experimental feature without any stability guarantees for now.

As @jtgeibel pointed out in #4661 (review), there are still a few issues to solve for this implementation, but we agree that we can solve these in an incremental way while the feature is still considered experimental.

@Turbo87 merged commit acb38cb into rust-lang:master on May 19, 2022