Add support for compression on gRPC cache #14041
Conversation
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed (or fixed any issues), please reply here with `@googlebot I signed it!`. Instructions for individual and corporate signers who have already signed the CLA are linked there. ℹ️ Googlers: see the internal page for more info.
@googlebot I signed it!
Force-pushed a522fc5 to 57f86be.
Thanks for the PR!
For reasons, we can't merge PRs that contain both non-`third_party` and `third_party` changes. Can you split the `third_party` changes into another PR?
Resolved review threads (outdated):
- src/main/java/com/google/devtools/build/lib/remote/GrpcCacheClient.java
- src/main/java/com/google/devtools/build/lib/remote/Chunker.java
- src/main/java/com/google/devtools/build/lib/remote/options/RemoteOptions.java
Force-pushed cc27594 to 90c49c9.
Thank you @benjaminp for pointing that out. @philwo it seems like #12437 got reverted; what was the reason for the revert? From #11968, it looks like there was an issue with the version of Apache Commons Compress, which is not used here. Can zstd-jni be merged back?
Force-pushed 90c49c9 to 99d43c5.
From the rollback commit: "We're rolling back this feature before cutting Bazel 4.0 LTS, as we can't merge the remaining part in time and are worried that the newly added patched JNI dependency will make it hard for distributions like Debian to accept. Let's revisit this when we find a suitable pure Java version of Zstd."

The only pure Java version I could find was https://github.com/airlift/aircompressor, which still doesn't support big-endian platforms (and it doesn't look like they care about that). So, we'd be OK with adding zstdlib and zstd-jni now, probably.
I assume https://github.com/luben/zstd-jni is the repo of this library? It looks like it doesn't have many dependencies and already builds with Gradle, so it should be easy to package for Debian. But I'll leave it to @olekw to confirm.
Force-pushed 99d43c5 to d86b71e.
Force-pushed 1f82978 to 4250f0f.
Force-pushed e8369f5 to 6c13594.
This looks great to me, thank you! I'd like someone from @bazelbuild/remote-execution to give an LGTM, too, then we can import this.
In this case I think it should be possible to import the third-party change directly from this PR, so I can try that first once we're ready to merge.
```json
"archive": "v1.5.0-4.zip",
"sha256": "d320d59b89a163c5efccbe4915ae6a49883ce653cdc670643dfa21c6063108e4",
"urls": [
    "https://github.com/luben/zstd-jni/archive/v1.5.0-4.zip",
```
It might be good to mirror this in mirror.bazel.build. @philwo could you help with that as part of the merge?
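By convention, mirrored archives are listed ahead of the upstream URL in the `urls` array, so the mirror is tried first and upstream remains a fallback. A sketch of what the entry might look like, assuming the usual mirror.bazel.build path layout (the archive would need to be uploaded to the mirror first):

```json
"urls": [
    "https://mirror.bazel.build/github.com/luben/zstd-jni/archive/v1.5.0-4.zip",
    "https://github.com/luben/zstd-jni/archive/v1.5.0-4.zip"
]
```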
LGTM. @philwo we can start to merge this.
The import is almost done, except that we use an older zstd-jni version internally which doesn't contain the fix luben/zstd-jni@3d51bdc, so the test fails. @AlessandroPatti, while we are waiting for the internal zstd-jni upgrade (it's not an easy task and I'm afraid it won't be done soon), is there any workaround we could apply?
What version are you using internally? I can try to reproduce on my end and see if there is any workaround.
We are at luben/zstd-jni@190e4f2 with some custom patches.
SGTM.
Imported and waiting for the final review. Some zstd tests are not run internally but should be fine since they are always run on Bazel CI.
@coeuvre What's your feeling on the risk level of getting this into 5.0?
It should be fine to get this into 5.0 since it is guarded by an experimental flag.
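Because the feature is flag-guarded, users opt in explicitly. A minimal `.bazelrc` sketch showing how the flag would be enabled alongside a gRPC cache (the cache URL here is illustrative, not from this PR):

```
# Point Bazel at a gRPC remote cache (endpoint is an example).
build --remote_cache=grpcs://cache.example.com

# Opt in to compressed cache transfers (experimental in Bazel 5.0).
build --experimental_remote_cache_compression
```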
Add support for compressed transfers from/to gRPC remote caches with flag --experimental_remote_cache_compression. Fixes bazelbuild#13344. Closes bazelbuild#14041. PiperOrigin-RevId: 409328001
* Add patch files for zstd-jni

  Partial commit for third_party/*, see #14203. Closes #14203.
  Signed-off-by: Yun Peng <[email protected]>

* Remote: Add support for compression on gRPC cache

  Add support for compressed transfers from/to gRPC remote caches with flag --experimental_remote_cache_compression. Fixes #13344. Closes #14041.
  PiperOrigin-RevId: 409328001

Co-authored-by: Alessandro Patti <[email protected]>
So I got up close and personal with this implementation when I added zstd support to buildfarm, per the supplied streams in the zstd package, and I feel this needs to be reworked.

The overall stream is not being compressed for transport into a tidily shorter sequence of upload/download chunks. Instead, each chunk is compressed independently and sent as a shorter WriteRequest/ReadResponse. This results in a drastically poorer overall identity:compressed byte ratio compared to a single compression stream over the whole blob.

This was discovered when the upload went through buildfarm's rechunking mechanism for shards, where request chunks would be combined or split on 16k boundaries by default; the stream processor only worked on the current chunk, discarding appended bytes and eliciting decompression corruption.

Overall, this implementation is in conflict with the zstd compressor description in the remote APIs, which makes no mention of sequential compressor streams in compressed packed blob content, but instead expects the full content to be deliverable to a compressor engine in a single context. I'm going to try to adjust this to get a full blob's worth of compression going, with even block sizes and no per-request/response semantics for the stream.
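The size penalty described above can be demonstrated without zstd at all: any framed compressor pays header and dictionary-reset costs when each transport chunk gets its own context. The sketch below uses the JDK's `java.util.zip.Deflater` as a self-contained stand-in for zstd-jni; the class name, chunk size, and test data are illustrative, not taken from the PR.

```java
// Contrast one compression context over the whole blob with a fresh
// context per 16 KiB transport chunk (the per-WriteRequest scheme
// criticized above). Deflater stands in for zstd: both repeat frame
// headers and lose cross-chunk back-references when reset per chunk.
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.Deflater;

public class ChunkCompression {
  static final int CHUNK_SIZE = 16 * 1024; // example rechunking boundary

  // Compress one byte range with a brand-new compressor context.
  static byte[] compress(byte[] data, int off, int len) {
    Deflater deflater = new Deflater();
    deflater.setInput(data, off, len);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[8192];
    while (!deflater.finished()) {
      out.write(buf, 0, deflater.deflate(buf));
    }
    deflater.end();
    return out.toByteArray();
  }

  public static void main(String[] args) {
    byte[] blob = new byte[1 << 20]; // 1 MiB of highly compressible data
    Arrays.fill(blob, (byte) 'a');

    // Whole-blob scheme: one context; the compressed result can then be
    // split into arbitrary transport chunks and reassembled before
    // decompression, so rechunking is harmless.
    int whole = compress(blob, 0, blob.length).length;

    // Per-chunk scheme: a fresh context per 16 KiB chunk, so each chunk
    // carries its own framing and starts with an empty dictionary.
    int perChunk = 0;
    for (int off = 0; off < blob.length; off += CHUNK_SIZE) {
      perChunk += compress(blob, off, Math.min(CHUNK_SIZE, blob.length - off)).length;
    }

    System.out.println("whole-stream=" + whole + " per-chunk=" + perChunk);
    if (whole >= perChunk) {
      throw new AssertionError("expected whole-stream compression to be smaller");
    }
  }
}
```

Beyond the size ratio, the whole-blob form also fixes the rechunking corruption: a receiver can concatenate chunk payloads in order and feed them to one decompressor, regardless of where the transport split them.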
Add support for compressed transfers from/to gRPC remote caches. Tested with bazel-remote 2.13.
Relates to #13344