Deliver/Expect singly compressed content with a single ByteStream::Write/Read #16808
Labels
P2
We'll consider working on this in future. (Assignee optional)
team-Remote-Exec
Issues and PRs for the Execution (Remote) team
type: feature request
Description of the feature request:
So I got up close and personal with this implementation when I added zstd support to buildfarm, per the supplied streams in the zstd package, and I feel that #14041 needs to be reworked. This will also need the consensus of the rest of @bazelbuild/remote-execution, since the implications here are a compatibility break with existing implementations (but not a break with persistent data), perhaps mitigable through version/feature specification.
The overall stream is not being compressed for transport as a tidily shorter sequence of upload/download chunks. Instead, each chunk is compressed on its own and sent as a shorter WriteRequest/ReadResponse. This results in a drastically poorer overall identity:compressed byte ratio compared to a simple zstd invocation on the blob content, with a higher possibility of blob inflation due to the reduced context and the overhead of the smaller blocks.
This was discovered when an upload went through buildfarm's rechunking mechanism for shards, where request chunks are combined or split on 16k boundaries by default. The stream processor only worked on the current chunk, discarding appended bytes and eliciting decompression corruption.
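To make the two effects above concrete, here is a minimal sketch. It is not Bazel's or buildfarm's code: it uses Python's standard zlib module as a stand-in for zstd (an assumption, chosen only because it ships with Python), and the names split, compress_per_chunk, and naive_receive, as well as the sample blob, are hypothetical. It compares the total size of per-chunk compression with one whole-blob compression, then shows a receiver that decodes each chunk in isolation failing once the chunks are regrouped on 16k boundaries.

```python
import zlib
from typing import List


def split(data: bytes, size: int) -> List[bytes]:
    """Cut data into consecutive pieces of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def compress_per_chunk(data: bytes, size: int) -> List[bytes]:
    """Compress every chunk as its own independent stream."""
    return [zlib.compress(piece) for piece in split(data, size)]


def naive_receive(chunks: List[bytes]) -> bytes:
    """Decode each received chunk in isolation.

    Bytes that follow the first complete stream in a chunk end up in
    d.unused_data and are silently dropped here.
    """
    out = bytearray()
    for chunk in chunks:
        d = zlib.decompressobj()
        out += d.decompress(chunk)
        out += d.flush()
    return bytes(out)


# Hypothetical blob: repetitive text, so a large context helps compression.
blob = b"".join(
    b"bazel-out/k8-fastbuild/bin/pkg/out_%05d.o\n" % i for i in range(20000)
)

frames = compress_per_chunk(blob, 1024)
per_chunk_size = sum(len(f) for f in frames)
whole_size = len(zlib.compress(blob))
print("identity:", len(blob), "per-chunk:", per_chunk_size, "whole:", whole_size)

# With the sender's chunk boundaries preserved, per-chunk decoding works.
intact_ok = naive_receive(frames) == blob

# Regroup the same bytes on 16 KiB boundaries, as a rechunking proxy might.
rechunked = split(b"".join(frames), 16 * 1024)
try:
    rechunk_ok = naive_receive(rechunked) == blob
except zlib.error:
    rechunk_ok = False
print("intact boundaries ok:", intact_ok, "rechunked ok:", rechunk_ok)
```

The per-chunk scheme only round-trips while every intermediary preserves the sender's chunk boundaries, which ByteStream does not promise.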
Overall, this implementation is in conflict with the zstd compressor description in the remote apis, which makes no mention of sequential compressor streams in compressed packed blob content, but instead expects the full content to be deliverable to a compressor engine in a single context.
This issue tracks a correction to the compressor so that it creates a single zstd compressed blob, chunked into even block sizes for upload, with no per-request/response compression semantics for the stream.
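A rough sketch of the proposed behavior, under the same assumptions as before: zlib from the Python standard library stands in for zstd, and upload_chunks, receive, and the sample blob are hypothetical names, not part of Bazel or the remote apis. The blob is compressed once in a single context, the compressed bytes are cut into even chunks, and the receiver feeds whatever chunks arrive into one streaming decompressor, so chunk boundaries carry no meaning.

```python
import zlib
from typing import Iterable, List


def split(data: bytes, size: int) -> List[bytes]:
    """Cut data into consecutive pieces of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def upload_chunks(blob: bytes, chunk_size: int) -> List[bytes]:
    """Compress the whole blob in one context, then chunk the result."""
    compressed = zlib.compress(blob)
    return split(compressed, chunk_size)


def receive(chunks: Iterable[bytes]) -> bytes:
    """Feed all chunks into a single streaming decompressor."""
    d = zlib.decompressobj()
    out = bytearray()
    for chunk in chunks:
        out += d.decompress(chunk)
    out += d.flush()
    return bytes(out)


# Hypothetical blob used only for illustration.
blob = b"".join(
    b"bazel-out/k8-fastbuild/bin/pkg/out_%05d.o\n" % i for i in range(20000)
)

sent = upload_chunks(blob, 1000)

# An intermediary may regroup the bytes on any boundary, e.g. 16 KiB.
rechunked = split(b"".join(sent), 16 * 1024)

received = receive(rechunked)
print("round trip ok:", received == blob)
```

Because the receiver keeps one decompressor for the whole stream, combining or splitting chunks in transit does not affect the result.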
What underlying problem are you trying to solve with this feature?
Bazel does not produce a strictly remote-apis adherent client implementation with --experimental_remote_cache_compression, resulting in
Which operating system are you running Bazel on?
linux
What is the output of bazel info release?
5.3.2
If bazel info release returns development version or (@non-git), tell us how you built Bazel.
No response
What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD?
No response
Have you found anything relevant by searching the web?
https://github.com/bazelbuild/remote-apis/blob/7d1354eef67545561bf689acdaabb41a99d98584/build/bazel/remote/execution/v2/remote_execution.proto#L241-L248
(Note that this unfortunately does not include specific language about how a ByteStream's compressed content should represent a single expansion via the compressor specified. I will also seek to add this language)
buildfarm/buildfarm#1211
Any other information, logs, or outputs that you want to share?
No response