
[feature] Support Remote Execution API for caching #1520

Open
rhuanbarreto opened this issue Jun 24, 2024 · 15 comments
Labels
enhancement New feature or request

Comments

@rhuanbarreto

Is your feature request related to a problem? Please describe.

Although moonbase has a caching service, for regulatory reasons we cannot store cached artifacts outside our own domain.

Many other monorepo tools, like Bazel, Pants, and Rush, let you use your own storage backend for caching artifacts.

On the other hand, caching the .moon/cache folder in GitHub Actions doesn't help much either, since GitHub's cache size limits are too low.

Describe the solution you'd like

I would like a config option so I can self-host my own cached artifacts, in Azure Blob Storage for example. If this requires running a separate container for the service, like https://github.com/buchgr/bazel-remote, that's fine.

Describe alternatives you've considered

For now, using moonbase is hard because it creates a dependency on a service outside our domain.
So the only alternative is GitHub / Azure DevOps pipeline caching.

@rhuanbarreto rhuanbarreto added the enhancement New feature or request label Jun 24, 2024
@milesj
Collaborator

milesj commented Jun 24, 2024

I've been working on making moonbase self-hostable, but while doing so, I've had thoughts of just reworking it into a generic remote caching server. I keep going back and forth on which approach would be better. Either way, it's a lot for me to maintain at the moment.

@rhuanbarreto
Author

No rush at all! Very important to have but also not the top priority right now.

One suggestion to cut some corners: you can leverage bazel-remote right away and avoid rebuilding the same abstraction. That way you don't need to reimplement something that has almost become an industry standard. This would also be a big plus for moonrepo on the monorepo.tools website.

The REAPI is a gRPC/Protobuf API where bazel-remote responds with the cached artifacts in a streaming fashion, which saves a lot of back and forth.

One Rust implementation is done by Pants in this file: https://github.com/pantsbuild/pants/blob/main/src/rust/engine/process_execution/src/cache.rs
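
For illustration, here is roughly how a REAPI client addresses a single blob when streaming it over the ByteStream API. The resource-name layout comes from the remote-apis spec; the helper function itself is hypothetical:

```rust
/// Builds the resource name passed to ByteStream.Read when streaming a
/// blob out of the CAS. Per the remote-apis spec, a blob is addressed by
/// its digest: the lowercase hex hash of its contents plus its exact
/// size in bytes. `instance_name` is a server-side namespace.
fn read_resource_name(instance_name: &str, hash: &str, size_bytes: u64) -> String {
    format!("{instance_name}/blobs/{hash}/{size_bytes}")
}

fn main() {
    // Hypothetical digest of a cached output tarball.
    let name = read_resource_name("main", "3a985da7...", 524_288);
    println!("{name}"); // main/blobs/3a985da7.../524288
}
```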

Hope you can find a way! It would be very beneficial to all the community.

@milesj
Collaborator

milesj commented Jun 25, 2024

Yeah agreed, I've also thought about piggybacking off of bazel's APIs. Might as well.

@dudicoco

Maybe the remote cache can be implemented on the client side instead of having to use a server?
That way the client can read/write the cache directly from blob storage.

@rhuanbarreto
Author

Using bazel-remote gives us this. But moon must support it as the source for finding cache hits and hydrating state.

@milesj
Collaborator

milesj commented Aug 15, 2024

I've briefly looked into this, and I will be moving to Bazel's APIs, since they also offer action caching, which I'll need in the future. Just need to find the time to integrate it. If anyone else wants to tackle it, let me know.

@dudicoco

> Using bazel-remote gives us this. But moon must support it as the source for finding cache hits and hydrating state.

Can you elaborate? Doesn't bazel-remote require a server?

@rhuanbarreto
Author

Yes. We run a bazel-remote container backed by Azure Blob Storage and connect to it over an mTLS connection. We use this today with Pants. If moon supported the same, we wouldn't need multiple places to manage this cache.

@dudicoco

@rhuanbarreto I still don't understand your point.

My suggestion was to have the client make direct API calls to the blob storage (S3 etc.) instead of communicating with a server that has to be deployed and maintained. In addition, a server would require its own authentication and authorization mechanism for the clients, whereas a client-based solution gets this out of the box through IAM permissions.

So I still don't see the advantage of a server-based solution, which adds extra complexity and overhead.
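
To make the client-side idea concrete, here is a rough sketch of what direct blob-storage hydration could look like, using the aws-sdk-s3 crate. The bucket name, key scheme, and surrounding plumbing are all assumptions for illustration, not anything moon ships:

```rust
use aws_sdk_s3::Client;

/// Hypothetical client-side cache lookup: the task hash doubles as the
/// object key, so no intermediary server is needed, and access control
/// falls out of ordinary IAM permissions on the bucket.
async fn fetch_cached_archive(client: &Client, hash: &str) -> Option<Vec<u8>> {
    let resp = client
        .get_object()
        .bucket("example-moon-cache") // assumed bucket name
        .key(format!("outputs/{hash}.tar.gz")) // assumed key scheme
        .send()
        .await
        .ok()?; // treat a missing object as a plain cache miss
    let bytes = resp.body.collect().await.ok()?;
    Some(bytes.into_bytes().to_vec())
}
```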

@milesj
Collaborator

milesj commented Sep 12, 2024

Good news: a new Rust crate recently popped up that does a lot of the heavy lifting for the Bazel remote APIs. https://github.com/amkartashov/bazel-remote-apis-rust

Will give this a shot for the next release.

@rhuanbarreto
Author

OMG! Great news! If you need an alpha tester, you know where to find me.

One small request: make sure moon can support mTLS connections. htpasswd is too unsafe.

@larsivi

larsivi commented Oct 9, 2024

The issue linked from nx above is what brought me here. I co-own a small startup that mostly runs in GCP. Having used nx for a while in our monorepo, I came to really like the plugins that provide caching via various backends, in my case GCS buckets. The plugin basically uses the GCS API and stores/fetches directly to/from the configured bucket. This works both from within GCP and from dev machines, given the proper credentials. My big beef with the changes nx is making is that they offer paid plugins that do the exact same thing while blocking the open-source ones. I don't necessarily mind using nx cloud or similar (moonbase), but I'd rather use infra we already pay for (and/or have a payment relationship with) than buy yet another service. (The new paid plugins don't support GCS at this point either.)

Setting up an additional VM for proxying sounds very unnecessary, unless it can provide some additional functionality.

Anyway, I hope this can come to a useful resolution, as I am now considering options other than nx, and moonrepo looks very interesting.

@milesj
Collaborator

milesj commented Nov 16, 2024

An update on this:

I've got a basic implementation working that communicates with https://github.com/buchgr/bazel-remote. PR here: #1651

Uploading to CAS was relatively easy.

However, downloading from CAS is currently blocked. The issue is that I don't know how to reference the cached item in CAS and download the correct blob. The Bazel APIs require a digest (hash + size), but we only have the hash. We can't calculate the size without archiving the build before running the task, which is far too much overhead.

The Bazel asset API would actually solve this, as you can associate metadata with an uploaded blob via tagging, but bazel-remote does not support the asset API... https://github.com/buchgr/bazel-remote/blob/master/server/grpc_asset.go#L218

I don't think this will land in the next release unless I can figure out how to calculate these digests.
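
For anyone following along, the blocker comes down to this message shape from remote_execution.proto; the Rust struct below just mirrors it for illustration:

```rust
/// Mirrors build.bazel.remote.execution.v2.Digest. Every CAS read is
/// keyed by BOTH fields, which is the problem described above: moon
/// knows a task's hash up front, but the byte size of the output
/// tarball only exists after the task has run and been archived.
struct Digest {
    hash: String,    // lowercase hex hash (SHA-256 by default) of the blob
    size_bytes: i64, // exact length of the blob in bytes
}
```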

@milesj
Collaborator

milesj commented Nov 16, 2024

I've been thinking about this even more, and I'm still quite confused.

I took a look at Pants, which uses these Bazel APIs, and it looks like they scan the outputs on the file system, read the bytes and size of each file, collect digests for all of these files, and then upload them to the remote cache as individual files.

In moon, we pack all the outputs into a single tarball archive, store that at .moon/cache/outputs, and upload the tarball to moonbase with the associated hash. This pattern doesn't look possible with the Bazel APIs, as we would need to generate the outputs and create the tarball before the task has run, which simply isn't possible (lol).

But even if we follow the pants/bazel way of doing things, it still doesn't make much sense. For example:

  • After a task has run, we can scan all the outputs, create digests, and upload them to the remote cache. Super easy.
  • Before a task runs, we need to check for a cache hit. But if no outputs exist locally, we can't create digests and download from the remote cache. How are we supposed to create these digests without knowing the actual size of the outputs? Which isn't possible without running the task?? And at that point it defeats having a cache, since we're still doing the work???

@milesj
Collaborator

milesj commented Nov 16, 2024

Ok, ok, I think I finally figured it all out, thanks to this article: https://bitrise.io/blog/post/bazel-remote-caching-api

I need to use the ActionResult as an intermediary cache, which maps the outputs to the task being run. https://github.com/bazelbuild/remote-apis/blob/main/build/bazel/remote/execution/v2/remote_execution.proto#L1056
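
In other words, the ActionCache lookup supplies the sizes. A minimal sketch of that flow, with message shapes trimmed down from remote_execution.proto and a hypothetical client trait standing in for the generated gRPC stubs:

```rust
// Message shapes trimmed from remote_execution.proto for illustration.
struct Digest { hash: String, size_bytes: i64 }
struct OutputFile { path: String, digest: Digest }
struct ActionResult { output_files: Vec<OutputFile> }

// Hypothetical stand-in for the generated ActionCache/CAS gRPC clients.
trait RemoteCache {
    fn get_action_result(&self, action_digest: &Digest) -> Option<ActionResult>;
    fn read_blob(&self, digest: &Digest) -> Option<Vec<u8>>;
}

/// Hydrate a task's outputs from the remote cache. The task hash keys
/// the ActionCache entry, and the returned ActionResult carries a full
/// digest (hash + size) for every output, so sizes are read back from
/// the cache rather than computed before the task has ever run.
fn hydrate(cache: &impl RemoteCache, task_hash: &str, action_size: i64) -> Option<Vec<Vec<u8>>> {
    // The Action message is built and serialized locally, so its own
    // size is always known up front, unlike the output tarball's.
    let action_digest = Digest { hash: task_hash.to_string(), size_bytes: action_size };
    let result = cache.get_action_result(&action_digest)?;
    result
        .output_files
        .iter()
        .map(|f| cache.read_blob(&f.digest))
        .collect() // None if any blob is missing, i.e. a partial miss
}
```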
