Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a GitHub Actions Cache remote cache backend #19831

Merged
merged 7 commits into from
Sep 27, 2023
Merged

Add a GitHub Actions Cache remote cache backend #19831

merged 7 commits into from
Sep 27, 2023

Conversation

huonw
Copy link
Contributor

@huonw huonw commented Sep 14, 2023

This adds support for remote caching to the Github Actions Cache, by setting remote_store_address = "experimental:github-actions-cache+https://...". This is implemented as just another OpenDAL (#19827) provider.

As written, this requires the ACTIONS_CACHE_URL and ACTIONS_RUNTIME_TOKEN values to be massaged into the appropriate pants configuration. I've included basic docs for how to do this. This set-up can/should be packaged into the init-pants action in future, but for now, we're in experimental-land.

There's various ways in which this can be improved, but I think it's helpful to have a "real world" example of a remote cache beyond the file:// one to be able to guide the refactoring, plus good to start testing this earlier. Thus, I'd like to land this, and do them as follow-up:

  • better docs/help (always!), particularly on the using-pants-in-ci docs page, which I haven't updated here
  • now that I've implemented this, I don't like my <provider>+<scheme>://... proposal from Add support for OpenDAL-backed remote stores (file://, initially) #19827. I think it'd be better to have a separate remote_store_provider = "experimental-github-actions-cache" (or "reapi", etc.) option and remote_store_address is just the raw URL (with the provider validating it can understand the URL, of course)
  • it'd be good to be able to pass the token in in a simpler way, I'm thinking there could be a remote_oauth_bearer_token option literally, which is expected (and suggested in docs) to be passed as the PANTS_REMOTE_OAUTH_BEARER_TOKEN env var, in addition to remote_oauth_bearer_token_path

These are tracked in #19902 (and all of the others in that issue would be good too, but weren't as acutely surfaced by this change, I'll start chipping away at it once this is in).

This is awkward to test. My plan, not implemented here, is to somehow tee up a dedicated set of tests to run within https://github.com/nektos/act somehow. For now, we'll have to go with manual testing, e.g. this specific PR is validated by https://github.com/huonw/pants-cache-github-actions-poc/actions/runs/6298908094/job/17119702612#step:11:983

This is the first example of something useful user-facing for #11149. This doesn't close it, I'll wait until we have a blob store like S3 too.

The commits are individually reviewable.

(Replaces #17840.)

@huonw huonw changed the base branch from main to huonw/11149-dal September 14, 2023 08:28
@huonw

This comment was marked as outdated.

Base automatically changed from huonw/11149-dal to main September 21, 2023 23:33
huonw added a commit that referenced this pull request Sep 21, 2023
…9827)

This is the final major step that unlocks #11149: 

- adding new byte store and action cache providers, implemented using
https://opendal.apache.org
- plumbing through to be able to be use them via `remote_store_address =
"..."`
- (and implementing an "example" one, that's backed by the local file
system)

This doesn't yet resolve that issue, as it doesn't implement any sort of
useful "simple" truly-remote cache, but it sets up most of the
infrastructure.

### User facing change

This only adds support for one additional store for now:
`experimental:file:///path/to/dir`, which writes the store to
`/path/to/dir`.

This is store is enough to validate that this functionality (mostly)
works, and even seems like it would be useful for:
- debugging/testing remote cache behaviour (e.g. can set up reduced
reproducers for remote caching bugs using normal file manipulation
commands)
- potentially for real, for using an NFS mount as a remote cache,
without forcing _all_ of Pants' caches to have the NFS overhead.

The user facing functionality here is configured entirely through
`remote_store_address`: that option now supports URIs that have schemes
other than `grpc` or `grpcs`. To start with, just `file`. It also
explicitly supports these schemes being experimental, communicated
through the `experimental:` scheme-prefix. This is to, hopefully, allow
landing the feature earlier and with less risk.

For example, running `pants test ::` in the follow example writes
various blobs to subdirectories of `/tmp/remote-cache-example` (e.g.
`action-cache/e3/2f/e32f034f...` and `byte-store/50/ce/50ce4a68...`),
and a second `pants --no-pantsd --no-local-cache test ::` command is
able to service it from cache, reporting ` //:slow-test succeeded in
5.01s (cached remotely)`

```toml
# pants.toml
[GLOBAL]
pants_version = "2.17.0"

backend_packages = ["pants.backend.shell"]

remote_store_address = "experimental:file:///tmp/remote-cache-example"
remote_cache_read = true
remote_cache_write = true
```

```python
# BUILD
experimental_test_shell_command(name="slow-test", command="sleep 5", tools=["sleep"])
```

### Details of this change

There's three main bits of work:

1. add the new generic OpenDAL byte store and action cache provider (in
the two pairs of `base_opendal.rs` + tests files)
2. hook up the specific FS one (in `store/src/remote.rs` and
`remote/src/remote_cache.rs`)
3. adjust the options parsing to support them, especially the
`experimental:` scheme handling (in `global_options.py`)

None of these are huge in isolation, but they add up to a fair chunk of
code. I think each of those files can be reviewed somewhat in isolation,
in that order.

### Why OpenDAL?

It's used by sccache for similar remote-caching purposes
(https://github.com/search?q=repo%3Amozilla%2Fsccache%20opendal&type=code),
and supports a whole lot of services:

- blob stores (e.g. S3, GCS)
- key-value caches (e.g. Redis)
- CI caches (e.g. GitHub Actions)
- weirder ones (IPFS, Google Drive)

Pants doesn't/shouldn't support all of them, but definitely blob stores
and CI caches seem exciting!

### Next steps

This only adds support for caching "remotely" to the local file system
but I'll add support for GitHub Actions as follow up work (#19831), and
maybe S3 too. Hopefully as part of doing that I'll work through any
authentication (etc.) issues and so it becomes easier for others to add
the backends they need too.

I've suggested we use the URI scheme for deciding on the service, rather
than an explicit `remote_store_service = "reapi"` option, or similar. In
future, some services may use the same scheme, e.g. I imagine several
services might conventionally use `http://` or `https://`. My current
thinking is to solve this similar to pip's [VCS
requirements](https://pip.pypa.io/en/stable/topics/vcs-support/)
(`vcs+transport://...`) or `sqlalchemy` [database
URLs](https://docs.sqlalchemy.org/en/20/core/engines.html#database-urls)
(`dialect+driver://...`), e.g. `remote_store_address = "s3+http://..."`
or `remote_store_address = "http+s3://..."`, and then the provider would
strip off the `s3` part as appropriate.

There's also a whole lot of TODOs that I don't think are critical to
landing this code, and I can chip away at once it's in: by landing this
sooner, I'm hoping we can start having the user-facing parts of #11149
being tested, validated and enjoyed sooner rather than later. I've
already done a loooot of *pre*-factoring (see all the PRs referencing
#11149), so it'd be nice to switch to having the feature in and just do
some normal *re*-factoring. 😅

TODOs mostly tracked in #19902
@huonw

This comment was marked as resolved.

@huonw

This comment was marked as resolved.

@huonw huonw marked this pull request as ready for review September 26, 2023 00:18
@huonw huonw requested a review from stuhood September 26, 2023 00:19
Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thanks @huonw.

let mut writer = match self
.operator
.writer_with(&path)
.content_length(digest.size_bytes as u64)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this a workaround that is no longer necessary post apache/opendal#3163, or did this end up in the wrong commit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coincidence, unrelated to the fixes I made: the content_length method was removed in 0.40.0 (apache/opendal#3044), and the upgrade to a recent HEAD here goes past that.

src/rust/engine/fs/store/src/remote.rs Outdated Show resolved Hide resolved
@huonw huonw merged commit 9382fe2 into main Sep 27, 2023
@huonw huonw deleted the huonw/11149-ghac branch September 27, 2023 08:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants