Add a GitHub Actions Cache remote cache backend #19831

Merged: 7 commits, Sep 27, 2023
2 changes: 1 addition & 1 deletion docs/markdown/Using Pants/remote-caching-execution.md
@@ -13,7 +13,7 @@ By default, Pants executes processes in a local [environment](doc:environments)

2. "Remote execution" where Pants offloads execution of processes to a remote server (and consumes cached results from that remote server)

- Pants does this by using the "Remote Execution API" to converse with the remote cache or remote execution server.
+ Pants does this by using the "Remote Execution API" to converse with the remote cache or remote execution server. Pants also [supports some additional providers](doc:remote-caching) other than the Remote Execution API that provide only remote caching, without execution.

What is Remote Execution API?
-----------------------------
@@ -7,20 +7,26 @@ createdAt: "2021-03-19T21:40:24.451Z"
What is remote caching?
=======================

- Remote caching allows Pants to store and retrieve the results of process execution to and from a remote server that complies with the [Remote Execution API](https://github.com/bazelbuild/remote-apis) standard ("REAPI"), rather than only using your machine's local Pants cache. This allows Pants to share a cache across different runs and different machines, for example, all of your CI workers sharing the same fine-grained cache.
+ Remote caching allows Pants to store and retrieve the results of process execution to and from a remote server, rather than only using your machine's local Pants cache. This allows Pants to efficiently share a cache across different runs and different machines, for example, all of your CI workers sharing the same fine-grained cache.

Setup
=====
Pants supports several remote caching providers:

- [Remote Execution API](https://github.com/bazelbuild/remote-apis) ("REAPI"), which also supports [remote execution](doc:remote-execution)
- GitHub Actions Cache
- Local file system

Remote Execution API
====================

Server
------

- Remote caching requires the availability of a REAPI-compatible cache. See the [REAPI server compatibility guide](doc:remote-caching-execution#server-compatibility) for more information.
+ See the [REAPI server compatibility guide](doc:remote-caching-execution#server-compatibility) for more information about REAPI-compatible caches.

Pants Configuration
-------------------

After you have either set up a REAPI cache server or obtained access to one, the next step is to point Pants to it so that Pants will use it to read and write process results.

For the following examples, assume that the REAPI server is running on `cache.corp.example.com` at port 8980 and that it is on an internal network. Also assume that the name of the REAPI instance is "main." At a minimum, you will need to configure `pants.toml` as follows:
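The example configuration block is collapsed in this diff view; a minimal sketch of what such a `pants.toml` might contain, based on the options named elsewhere on this page (not necessarily the exact collapsed content):

```toml
[GLOBAL]
# The REAPI store server (internal network, no TLS, hence grpc://).
remote_store_address = "grpc://cache.corp.example.com:8980"
remote_instance_name = "main"
remote_cache_read = true
remote_cache_write = true
```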

@@ -34,6 +40,64 @@ remote_instance_name = "main"

If the endpoint is using TLS, then the `remote_store_address` option would be specified with the `grpcs://` scheme, i.e. `"grpcs://cache.corp.example.com:8980"`.

GitHub Actions Cache
====================

GitHub Actions provides a built-in caching service, which Pants can use to share caches across GitHub Actions runs (but not with machines outside of GitHub Actions). It is typically used via the `actions/cache` action to cache whole directories and files, but Pants can use the same functionality for fine-grained caching.

> 🚧 GitHub Actions Cache support is still experimental
>
> Support for this cache provider is still under development, with more refinement required. Please [let us know](doc:getting-help) if you use it and encounter errors or warnings.

Workflow
--------

The values of the `ACTIONS_CACHE_URL` and `ACTIONS_RUNTIME_TOKEN` environment variables need to be provided to Pants via the `[GLOBAL].remote_store_address` and `[GLOBAL].remote_store_headers` options respectively. GitHub exposes these values only to action invocations, not to shell steps that use `run: ...`. Include a step like the following in your jobs, before executing any Pants commands, to set those options via environment variables:

```yaml
- name: Configure Pants caching to GitHub Actions Cache
  uses: actions/github-script@v6
  with:
    script: |
      core.exportVariable('PANTS_REMOTE_STORE_ADDRESS', 'experimental:github-actions-cache+' + (process.env.ACTIONS_CACHE_URL || ''));
      core.exportVariable('PANTS_REMOTE_STORE_HEADERS', `+{'authorization':'Bearer ${process.env.ACTIONS_RUNTIME_TOKEN || ''}'}`);
```

Pants Configuration
-------------------

Once that step has run, Pants will read these environment variables. You will also need to configure Pants to read and write to the cache, only while in CI, such as [via a `pants.ci.toml` configuration file](doc:using-pants-in-ci#configuring-pants-for-ci-pantscitoml-optional):

```toml
[GLOBAL]
# GitHub Actions cache URL and token are set via environment variables
remote_cache_read = true
remote_cache_write = true
```

If desired, you can also set `remote_instance_name` to a string that's included as a prefix on each cache key, which will then be displayed in the 'Actions' > 'Caches' UI.
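For example (the prefix value here is arbitrary, chosen for illustration):

```toml
[GLOBAL]
# Appears as a prefix on each cache key in the 'Actions' > 'Caches' UI.
remote_instance_name = "pants-cache"
```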

Local file system
=================

Pants can cache "remotely" to a local file system path. This is useful for keeping a shared cache on a networked mount, without having to pay the cost of storing Pants' local cache on the network mount too. It can also be used for testing and validation.

> 🚧 Local file system caching support is still experimental
>
> Support for this cache provider is still under development, with more refinement required. Please [let us know](doc:getting-help) if you use it and encounter errors or warnings.

Pants Configuration
-------------------

To read and write the cache to `/path/to/cache`, you will need to configure `pants.toml` as follows:

```toml
[GLOBAL]
remote_store_address = "experimental:file:///path/to/cache"
remote_cache_read = true
remote_cache_write = true
```

Reference
=========

15 changes: 15 additions & 0 deletions src/python/pants/option/global_options.py
@@ -280,6 +280,21 @@ def renderer(_: object) -> str:
"""
),
),
_RemoteAddressScheme(
schemes=("github-actions-cache+http", "github-actions-cache+https"),
supports_execution=False,
experimental=True,
description=softwrap(
f"""
Use the GitHub Actions Cache for fine-grained caching. This requires extracting
`ACTIONS_CACHE_URL` (passing it in `[GLOBAL].remote_store_address`) and
`ACTIONS_RUNTIME_TOKEN` (storing it in a file and passing
`[GLOBAL].remote_oauth_bearer_token_path` or setting `[GLOBAL].remote_store_headers` to
include `authorization: Bearer {{token...}}`). See
{doc_url('remote-caching#github-actions-cache')} for more details.
"""
),
),
)


35 changes: 8 additions & 27 deletions src/rust/engine/Cargo.lock

(Generated file; diff not rendered.)

9 changes: 8 additions & 1 deletion src/rust/engine/Cargo.toml
@@ -254,7 +254,14 @@ notify = { git = "https://github.com/pantsbuild/notify", rev = "276af0f3c5f300bf
num_cpus = "1"
num_enum = "0.5"
once_cell = "1.18"
- opendal = { version = "0.39.0", default-features = false }
+ # TODO: this is waiting for several changes to be released (likely in 0.41):
+ # https://github.com/apache/incubator-opendal/pull/3163
+ # https://github.com/apache/incubator-opendal/pull/3177
+ opendal = { git = "https://github.com/apache/incubator-opendal", rev = "97bcef60eb0b515bd2442ab5b671080766fa35eb", default-features = false, features = [
+   "services-memory",
+   "services-fs",
+   "services-ghac",
+ ] }
os_pipe = "1.1"
parking_lot = "0.12"
peg = "0.8"
5 changes: 1 addition & 4 deletions src/rust/engine/fs/store/Cargo.toml
@@ -41,10 +41,7 @@ tower-service = { workspace = true }
tryfuture = { path = "../../tryfuture" }
uuid = { workspace = true, features = ["v4"] }
workunit_store = { path = "../../workunit_store" }
- opendal = { workspace = true, default-features = false, features = [
-   "services-memory",
-   "services-fs",
- ] }
+ opendal = { workspace = true }

[dev-dependencies]
criterion = { workspace = true }
8 changes: 8 additions & 0 deletions src/rust/engine/fs/store/src/remote.rs
@@ -81,6 +81,14 @@ async fn choose_provider(options: RemoteOptions) -> Result<Arc<dyn ByteStoreProv
"byte-store".to_owned(),
options,
)?))
} else if let Some(url) = address.strip_prefix("github-actions-cache+") {
// TODO: this is relying on python validating that it was set as
// `github-actions-cache+https://...`
Ok(Arc::new(base_opendal::Provider::github_actions_cache(
url,
"byte-store".to_owned(),
options,
)?))
} else {
Err(format!(
"Cannot initialise remote byte store provider with address {address}, as the scheme is not supported",
70 changes: 59 additions & 11 deletions src/rust/engine/fs/store/src/remote/base_opendal.rs
@@ -9,13 +9,16 @@ use async_trait::async_trait;
use bytes::Bytes;
use futures::future;
use hashing::{async_verified_copy, Digest, Fingerprint, EMPTY_DIGEST};
use http::header::AUTHORIZATION;
use opendal::layers::{ConcurrentLimitLayer, RetryLayer, TimeoutLayer};
use opendal::{Builder, Operator};
use tokio::fs::File;
use workunit_store::ObservationMetric;

use super::{ByteStoreProvider, LoadDestination, RemoteOptions};

const GITHUB_ACTIONS_CACHE_VERSION: &str = "pants-1";

#[derive(Debug, Clone, Copy)]
pub enum LoadMode {
Validate,
@@ -71,6 +74,44 @@ impl Provider {
Provider::new(builder, scope, options)
}

pub fn github_actions_cache(
url: &str,
scope: String,
options: RemoteOptions,
) -> Result<Provider, String> {
let mut builder = opendal::services::Ghac::default();

builder.version(GITHUB_ACTIONS_CACHE_VERSION);
builder.endpoint(url);

// extract the token from the `authorization: Bearer ...` header because OpenDAL's Ghac service
// reasons about it separately (although it does just stick it in its own
// `authorization: Bearer ...` header internally).
let header_help_blurb = "Using the GitHub Actions Cache as a remote cache requires a token set in an `authorization: Bearer ...` header, set via [GLOBAL].remote_store_headers or [GLOBAL].remote_oauth_bearer_token_path";
let Some(auth_header_value) = options.headers.get(AUTHORIZATION.as_str()) else {
let existing_headers = options.headers.keys().collect::<Vec<_>>();
return Err(format!(
"Expected to find '{}' header, but only found: {:?}. {}",
AUTHORIZATION, existing_headers, header_help_blurb,
));
};

let Some(token) = auth_header_value.strip_prefix("Bearer ") else {
return Err(format!(
"Expected '{}' header to start with `Bearer `, found value starting with {:?}. {}",
AUTHORIZATION,
// only show the first few characters to not accidentally leak (all of) a secret, but
// still give the user something to start debugging
&auth_header_value[..4],
header_help_blurb,
));
};

builder.runtime_token(token);

Provider::new(builder, scope, options)
}
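The header-parsing logic above can be sketched in Python; this is a simplified, hypothetical mirror of the Rust validation (`extract_bearer_token` is not a Pants API), useful for seeing the two failure modes:

```python
def extract_bearer_token(headers: dict) -> str:
    """Extract the token from an `authorization: Bearer ...` header,
    mirroring the validation in `Provider::github_actions_cache`."""
    value = headers.get("authorization")
    if value is None:
        raise ValueError(
            f"Expected an 'authorization' header, but only found: {sorted(headers)}"
        )
    if not value.startswith("Bearer "):
        # Show only the first few characters, to avoid leaking a whole secret
        # while still giving the user something to start debugging.
        raise ValueError(
            "Expected 'authorization' header to start with 'Bearer ', "
            f"found value starting with {value[:4]!r}"
        )
    return value[len("Bearer "):]
```

Note that the token is returned without the `Bearer ` prefix, since OpenDAL's Ghac service re-adds its own `authorization` header internally.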

fn path(&self, fingerprint: Fingerprint) -> String {
// We include the first two bytes as parent directories to make listings less wide.
format!(
@@ -158,11 +199,15 @@ impl ByteStoreProvider for Provider {

let path = self.path(digest.hash);

-     self
-       .operator
-       .write(&path, bytes)
-       .await
-       .map_err(|e| format!("failed to write bytes to {path}: {e}"))
+     match self.operator.write(&path, bytes).await {
+       Ok(()) => Ok(()),
+       // The item already exists, i.e. these bytes have already been stored. For example,
+       // concurrent executions that are caching the same bytes. This makes the assumption that
+       // whichever execution won the race to create the item successfully finishes the write, and
+       // so no wait + retry (or similar) here.
+       Err(e) if e.kind() == opendal::ErrorKind::AlreadyExists => Ok(()),
+       Err(e) => Err(format!("failed to write bytes to {path}: {e}")),
+     }
}
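The race-tolerant write semantics can be sketched in Python; this is a hypothetical toy model (not the OpenDAL API) showing why "already exists" is safe to treat as success in a content-addressed store:

```python
class AlreadyExists(Exception):
    """Raised when an entry with the same key already exists."""

def create_entry(store: dict, path: str, data: bytes) -> None:
    # Toy stand-in for a backend that rejects duplicate keys.
    if path in store:
        raise AlreadyExists(path)
    store[path] = data

def store_bytes(store: dict, path: str, data: bytes) -> None:
    """Treat AlreadyExists as success: entries are keyed by content digest,
    so a concurrent writer must have been storing identical bytes."""
    try:
        create_entry(store, path, data)
    except AlreadyExists:
        pass  # another execution won the race; nothing more to do
```

As the Rust comment notes, this assumes the execution that won the race finishes its write successfully, so no wait-and-retry is attempted.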

async fn store_file(&self, digest: Digest, mut file: File) -> Result<(), String> {
@@ -174,12 +219,15 @@

let path = self.path(digest.hash);

-     let mut writer = self
-       .operator
-       .writer_with(&path)
-       .content_length(digest.size_bytes as u64)
-       .await
-       .map_err(|e| format!("failed to start write to {path}: {e}"))?;
+     let mut writer = match self.operator.writer(&path).await {
+       Ok(writer) => writer,
+       // The item already exists, i.e. these bytes have already been stored. For example,
+       // concurrent executions that are caching the same bytes. This makes the assumption that
+       // whichever execution won the race to create the item successfully finishes the write, and
+       // so no wait + retry (or similar) here.
+       Err(e) if e.kind() == opendal::ErrorKind::AlreadyExists => return Ok(()),
+       Err(e) => return Err(format!("failed to start write to {path}: {e} {}", e.kind())),
+     };

> Review comment (Member): Was this a workaround that is no longer necessary post apache/opendal#3163, or did this end up in the wrong commit?
>
> Reply (Author): Coincidence, unrelated to the fixes I made: the `content_length` method was removed in 0.40.0 (apache/opendal#3044), and the upgrade to a recent HEAD here goes past that.

// TODO: it would be good to pass through options.chunk_size_bytes here
match tokio::io::copy(&mut file, &mut writer).await {
8 changes: 8 additions & 0 deletions src/rust/engine/process_execution/remote/src/remote_cache.rs
@@ -105,6 +105,14 @@ async fn choose_provider(
"action-cache".to_owned(),
remote_options,
)?))
} else if let Some(url) = address.strip_prefix("github-actions-cache+") {
// TODO: this is relying on python validating that it was set as
// `github-actions-cache+https://...`
Ok(Arc::new(base_opendal::Provider::github_actions_cache(
url,
"action-cache".to_owned(),
remote_options,
)?))
} else {
Err(format!(
"Cannot initialise remote action cache provider with address {address}, as the scheme is not supported",