-
-
Notifications
You must be signed in to change notification settings - Fork 644
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactoring of remote stores after initial addition of OpenDAL #19902
Labels
Comments
This was referenced Sep 21, 2023
huonw
added a commit
that referenced
this issue
Sep 21, 2023
…9827) This is the final major step that unlocks #11149: - adding new byte store and action cache providers, implemented using https://opendal.apache.org - plumbing through to be able to be use them via `remote_store_address = "..."` - (and implementing an "example" one, that's backed by the local file system) This doesn't yet resolve that issue, as it doesn't implement any sort of useful "simple" truly-remote cache, but it sets up most of the infrastructure. ### User facing change This only adds support for one additional store for now: `experimental:file:///path/to/dir`, which writes the store to `/path/to/dir`. This is store is enough to validate that this functionality (mostly) works, and even seems like it would be useful for: - debugging/testing remote cache behaviour (e.g. can set up reduced reproducers for remote caching bugs using normal file manipulation commands) - potentially for real, for using an NFS mount as a remote cache, without forcing _all_ of Pants' caches to have the NFS overhead. The user facing functionality here is configured entirely through `remote_store_address`: that option now supports URIs that have schemes other than `grpc` or `grpcs`. To start with, just `file`. It also explicitly supports these schemes being experimental, communicated through the `experimental:` scheme-prefix. This is to, hopefully, allow landing the feature earlier and with less risk. For example, running `pants test ::` in the follow example writes various blobs to subdirectories of `/tmp/remote-cache-example` (e.g. `action-cache/e3/2f/e32f034f...` and `byte-store/50/ce/50ce4a68...`), and a second `pants --no-pantsd --no-local-cache test ::` command is able to service it from cache, reporting ` //:slow-test succeeded in 5.01s (cached remotely)` ```toml # pants.toml [GLOBAL] pants_version = "2.17.0" backend_packages = ["pants.backend.shell"] remote_store_address = "experimental:file:///tmp/remote-cache-example" remote_cache_read = true remote_cache_write = true ``` ```python # BUILD experimental_test_shell_command(name="slow-test", command="sleep 5", tools=["sleep"]) ``` ### Details of this change There's three main bits of work: 1. add the new generic OpenDAL byte store and action cache provider (in the two pairs of `base_opendal.rs` + tests files) 2. hook up the specific FS one (in `store/src/remote.rs` and `remote/src/remote_cache.rs`) 3. adjust the options parsing to support them, especially the `experimental:` scheme handling (in `global_options.py`) None of these are huge in isolation, but they add up to a fair chunk of code. I think each of those files can be reviewed somewhat in isolation, in that order. ### Why OpenDAL? It's used by sccache for similar remote-caching purposes (https://github.com/search?q=repo%3Amozilla%2Fsccache%20opendal&type=code), and supports a whole lot of services: - blob stores (e.g. S3, GCS) - key-value caches (e.g. Redis) - CI caches (e.g. GitHub Actions) - weirder ones (IPFS, Google Drive) Pants doesn't/shouldn't support all of them, but definitely blob stores and CI caches seem exciting! ### Next steps This only adds support for caching "remotely" to the local file system but I'll add support for GitHub Actions as follow up work (#19831), and maybe S3 too. Hopefully as part of doing that I'll work through any authentication (etc.) issues and so it becomes easier for others to add the backends they need too. I've suggested we use the URI scheme for deciding on the service, rather than an explicit `remote_store_service = "reapi"` option, or similar. In future, some services may use the same scheme, e.g. I imagine several services might conventionally use `http://` or `https://`. My current thinking is to solve this similar to pip's [VCS requirements](https://pip.pypa.io/en/stable/topics/vcs-support/) (`vcs+transport://...`) or `sqlalchemy` [database URLs](https://docs.sqlalchemy.org/en/20/core/engines.html#database-urls) (`dialect+driver://...`), e.g. `remote_store_address = "s3+http://..."` or `remote_store_address = "http+s3://..."`, and then the provider would strip off the `s3` part as appropriate. There's also a whole lot of TODOs that I don't think are critical to landing this code, and I can chip away at once it's in: by landing this sooner, I'm hoping we can start having the user-facing parts of #11149 being tested, validated and enjoyed sooner rather than later. I've already done a loooot of *pre*-factoring (see all the PRs referencing #11149), so it'd be nice to switch to having the feature in and just do some normal *re*-factoring. 😅 TODOs mostly tracked in #19902
huonw
added a commit
that referenced
this issue
Sep 27, 2023
This adds support for remote caching to the Github Actions Cache, by setting `remote_store_address = "experimental:github-actions-cache+https://..."`. This is implemented as just another OpenDAL (#19827) provider. As written, this requires the `ACTIONS_CACHE_URL` and `ACTIONS_RUNTIME_TOKEN` values to be massaged into the appropriate pants configuration. I've included basic docs for how to do this. This set-up can/should be packaged into the `init-pants` action in future, but for now, we're in experimental-land. There's various ways in which this can be improved, but I think it's helpful to have a "real world" example of a remote cache beyond the `file://` one to be able to guide the refactoring, plus good to start testing this earlier. Thus, I'd like to land this, and do them as follow-up: - better docs/help (always!), particularly on the `using-pants-in-ci` docs page, which I haven't updated here - now that I've implemented this, I don't like my `<provider>+<scheme>://...` proposal from #19827. I think it'd be better to have a separate `remote_store_provider = "experimental-github-actions-cache"` (or `"reapi"`, etc.) option and `remote_store_address` is just the raw URL (with the provider validating it can understand the URL, of course) - it'd be good to be able to pass the token in in a simpler way, I'm thinking there could be a `remote_oauth_bearer_token` option literally, which is expected (and suggested in docs) to be passed as the `PANTS_REMOTE_OAUTH_BEARER_TOKEN` env var, in addition to `remote_oauth_bearer_token_path` These are tracked in #19902 (and all of the others in that issue would be good too, but weren't as acutely surfaced by this change, I'll start chipping away at it once this is in). This is awkward to test. My plan, not implemented here, is to somehow tee up a dedicated set of tests to run within https://github.com/nektos/act somehow. For now, we'll have to go with manual testing, e.g. this specific PR is validated by https://github.com/huonw/pants-cache-github-actions-poc/actions/runs/6298908094/job/17119702612#step:11:983 This is the first example of something useful user-facing for #11149. This doesn't close it, I'll wait until we have a blob store like S3 too. The commits are individually reviewable. (Replaces #17840.)
huonw
added a commit
that referenced
this issue
Oct 5, 2023
This does a no-functionality-change refactoring of the remote _store_ providers (backed by REAPI and OpenDAL). This splits the concept of a remote provider out into their own internal "surface" crate `remote_provider`, plus three crates that are (mostly) implementation-details of that one: - `remote_provider/remote_provider_traits` for the traits (and structs!) that the various providers have to implement - `remote_provider/remote_provider_reapi` for the gRPC/REAPI-backed provider - `remote_provider/remote_provider_opendal` for the OpenDAL-backed provider They're arranged like this: ```mermaid graph BT remote("process_execution/remote (existing)") store("fs/store (existing)") traits("remote_provider_traits (new)") grpc("remote_provider_reapi (new)") opendal("remote_provider_opendal (new)") selector("remote_provider (new)") grpc --> traits opendal --> traits selector --> grpc selector --> opendal remote -- for remote execution --> grpc remote --> selector store --> selector ``` Theoretically the new crates other than `remote_provider` are an implementation detail... except there's a helper in `remote_provider_reapi` that's used for the remote _execution_ in `process_execution/remote`, in addition to the byte store and action cache, hence the dependency there. This is one point of #19902, following up on #19827 (comment). The commits are individually reviewable, although the overall PR view gives a slightly more useful view of the overall file renames for some files.
This was referenced Oct 29, 2023
huonw
added a commit
that referenced
this issue
Oct 31, 2023
This replaces uses of `RemoteCacheProviderOptions` with `RemoteStoreOptions` (previously `RemoteOptions`): they're very similar, and only separate because they were previously in very different parts of the codebase, but this has changed with #19958. This makes a few additional changes to make the merging smoother: - removes the "capability cell" sharing optimisation, which enabled sharing of the `GetCapabilitiesRequest` gRPC request result between the CAS and the remote execution providers, if both were being used with the same address. I thought that optimising one-time (on start-up) requests from 2 to 1 wasn't worth the complexity of plumbing it through these more generic options, but can restore it if it is. - renames various fields This does meant that there's a few options that are seemingly irrelevant to the action cache providers, e.g. "batch API size limit" is meaningless. I think this is fine: I think it's fundamental to supporting multiple providers that there'll be a grab-bag of options, which some providers support and others do not. The commits are individually reviewable. (This is one point of #19902.)
huonw
added a commit
that referenced
this issue
Nov 15, 2023
This takes the unusual step of _removing_ documentation. In particular, this removes the documentation for the still _very_ experimental new remote cache backends: GitHub Actions and file system, because they're useless and very unstable: - there's significant user-facing refactoring work that I'm not going to get to for 2.19 (#19902), particularly around changing how the options are configured - the GHA backend seems to be very... limited, and only works for small repos (#20133) I'm targeting 2.19, as it stabilises, and leaving the docs in `main` (for 2.20).
huonw
added a commit
that referenced
this issue
Nov 21, 2023
…rer_token_path` (#20116) This adds a new global option `remote_oauth_bearer_token`, that allows setting the remote store OAuth token directly. This is easier to work with in CI contexts (and similar), where setting an environment variable is often quite simple, and having a direct option allows setting this via `PANTS_REMOTE_OAUTH_BEARER_TOKEN`. For instance, it simplifies how one can configure the token for the GitHub Actions Cache remote cache provider. I believe this replaces `remote_oauth_bearer_token_path` fully: I think `remote_oauth_bearer_token_path = "/path/to/token.txt"` can become `remote_oauth_bearer_token = "@/path/to/token.txt"`. Thus, I have deprecated the `..._path` option, for removal in a couple of versions. (This is part of #19902.)
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
With #19827, there's a bunch of refactoring of the remote stores:
remote_provider_opendal
)grpc://
->http://
(ands
version) mapping within the REAPI provider, as an implementation detail (add support for mTLS in remote cache usage #19887 (comment))remote_store_provider = "..."
option rather than trying to shove it intoremote_store_address
via the schema (with a default to "reapi" or something) (Add remote_provider = "..." option, replacing scheme-look-ups #20240)remote_oauth_bearer_token
option to allow setting the token directly via an env var (Addremote_oauth_bearer_token
option, deprecatingremote_oauth_bearer_token_path
#20116)The text was updated successfully, but these errors were encountered: