Reject/warn/monitor on large objects entering the cache #440
Comments
Hi. In cache-only mode (i.e. without remote execution) the client doesn't upload actions to the cache server, so the only information the server could report is the sizes of the blobs and their sha256 hashes. If it were to scan the entire cache lookup table it could also report the lookup keys for ActionResult messages that refer to large blobs, but that would be slow, and it would still be difficult to find the corresponding actions on the client side. I suppose the cache server could choose to reject blobs over a certain size limit, but I think most clients would just log a warning and continue, and then the cache hit rate would drop until someone notices, so this wouldn't be a great solution either. Perhaps this is best solved on the client side? I don't know if Bazel has a way to do this.
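For reference, a server-side rejection along the lines described above could look roughly like the following Go sketch. The package name, the `maxBlobSize` variable and the `checkBlobSize` helper are illustrative assumptions, not bazel-remote's actual code.

```go
package cache

import "fmt"

// maxBlobSize is a hypothetical configured upper bound in bytes;
// zero disables the check entirely.
var maxBlobSize int64 = 100 * 1024 * 1024 // e.g. 100 MiB

// checkBlobSize rejects a client-declared blob size that exceeds the
// configured limit, so an upload handler can fail fast before reading
// the blob's contents from the request.
func checkBlobSize(size int64) error {
	if maxBlobSize > 0 && size > maxBlobSize {
		return fmt.Errorf("blob size %d bytes exceeds configured maximum %d bytes", size, maxBlobSize)
	}
	return nil
}
```

Both the gRPC write path and the HTTP PUT path could call a helper like this with the size declared by the client before accepting the upload.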
Thanks for the reply, @mostynb. That makes sense; I agree that Bazel is the more likely place to add such a feature.
Based on the discussion on that linked Bazel issue, I think this is actually a desirable feature. @mostynb, would you accept a PR adding a flag to reject large WriteResource requests, basically in this spot: https://gist.github.com/alexeagle/71d39f470ba45c7f8a7fcc38a70bfd8c ?
Sure, PRs welcome. Please make this apply to the HTTP interface as well as gRPC. There's a special case to consider: what should we do if the client uploads an ActionCache blob with inlined CAS blobs whose total size is greater than the limit, but where each individual blob is below the size limit? gRPC UpdateActionResult calls are limited to 4M in practice by convention (and that's probably lower than any max_blob_size value that would be set), but HTTP AC uploads can be arbitrarily large.
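To illustrate that special case, a per-blob check could also be applied to the blobs inlined inside an ActionResult, on both the gRPC UpdateActionResult path and the HTTP AC PUT path. This is only a sketch: it assumes the generated Go bindings for the remote execution API, and the `checkActionResultSize` helper name is made up for illustration.

```go
package cache

import (
	"fmt"

	pb "github.com/bazelbuild/remote-apis/build/bazel/remote/execution/v2"
)

// checkActionResultSize applies a per-blob limit to each blob inlined in an
// ActionResult (output file contents and raw stdout/stderr), independent of
// the size of the ActionResult message itself.
func checkActionResultSize(ar *pb.ActionResult, maxBlobSize int64) error {
	if maxBlobSize <= 0 {
		return nil // limit disabled
	}
	for _, f := range ar.GetOutputFiles() {
		if int64(len(f.GetContents())) > maxBlobSize {
			return fmt.Errorf("inlined output %q exceeds the maximum blob size", f.GetPath())
		}
	}
	if int64(len(ar.GetStdoutRaw())) > maxBlobSize || int64(len(ar.GetStderrRaw())) > maxBlobSize {
		return fmt.Errorf("inlined stdout/stderr exceeds the maximum blob size")
	}
	return nil
}
```

Whether an over-limit ActionResult should be rejected outright, or only the message as a whole checked against the limit, is a policy question for the PR to settle.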
This causes both the gRPC and HTTP endpoints to reject blob writes that are larger than the configured size (in MB). Fixes buchgr#440
This flag specifies the maximum (logical) blob size that the cache will accept from clients. This limit is not applied to preexisting blobs in the cache. Implements buchgr#440
This feature is now available in the v2.1.0 release.
These were introduced to reduce load on a remote-cache instance and avoid network saturation. A month later, a feature was added in one remote-cache implementation which provides a different fix: buchgr/bazel-remote#440 rejects large input files on upload. In practice, while these actions do often produce huge outputs, they are also slow to re-execute. In many cases it's worth using a remote cache for RunAndCommitLayer in particular, to avoid a local rebuild even though it's a large network fetch. Currently users can't configure this because we've hardcoded the values. If they do want to keep the no-remote-cache execution requirement, they can do this via a tag (provided they opt in to experimental_allow_tags_propagation, see bazelbuild/bazel#8830). #1856 (comment) is an example of a user asking for these to be removed.
Our cache deployment has suffered badly from Docker images being added by users as inputs to actions. An object of around 600MB then needs to be fetched by CI agents, and it overloads the cache with requests.

I'm not exactly sure what the cache could do differently to help. At the very least, showing the largest object, with some hint about the action that generated it, would be useful (see the sketch below). The feedback loop would result in us making sure these targets are tagged no-remote-cache so the actions they produce don't upload.
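A rough sketch of that kind of monitoring in Go: log an easy-to-grep warning whenever an unusually large blob enters the cache, so the offending output (and from there the action) can be tracked down. The threshold, the function name and the logging wiring are assumptions for illustration, not bazel-remote's actual code.

```go
package cache

import "log"

// warnBlobSize is a hypothetical reporting threshold in bytes.
const warnBlobSize = 200 * 1024 * 1024 // 200 MiB

// noteLargeBlob would be called from the upload path with each blob's
// sha256 hash and size. Operators can grep the server log for
// "large blob" and then search their build outputs for that hash to
// identify the action that produced it.
func noteLargeBlob(sha256 string, size int64) {
	if size >= warnBlobSize {
		log.Printf("large blob entering cache: sha256=%s size=%d bytes", sha256, size)
	}
}
```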