Replies: 2 comments 4 replies
-
You have the data, you have knowledge. Ship it! |
Beta Was this translation helpful? Give feedback.
4 replies
-
Relevant thread on BazelBuild slack: https://bazelbuild.slack.com/archives/C01E7TH8XK9/p1722517487095289 |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
tl;dr: We are considering replacing the Merkle tree cache (
--experimental_remote_merkle_tree_cache
) with a more focused implementation that only caches the Merkle tree nodes for tree artifacts and runfiles trees; we believe this provides most of the benefit without risking making things worse for other kinds of inputs. Please chime in if your personal observations disagree with the conclusions of this post.In Bazel, the inputs to an action are represented as a depset: a directed acyclic graph where each node contains an arbitrary set of inputs and references to other nodes. On the other hand, the remote execution/caching protocol (and, by extension, the Bazel disk cache format, which is adapted from it) represents the inputs to an action as a Merkle tree, where each node describes a self-contained, non-overlapping portion of the input hierarchy.
Ordinarily, the input Merkle tree for an action is computed by flattening the input depset into a list, iterating the list while sorting the inputs by parent directory, computing the Merkle tree node for each directory, and assembling the entire Merkle tree by linking the nodes. This means that two actions whose inputs overlap may end up doing repeated work.
The
--experimental_remote_merkle_tree_cache
flag (disabled by default) tries to avoid some of this work by maintaining an in-memory cache mapping each depset node into a Merkle tree describing only the inputs contained in that node. The full input Merkle tree for an action can then be obtained by converting each depset node into the respective Merkle tree and computing the union of those trees. If the same depset node is shared by multiple actions, work can in theory be saved.While this idea sounds somewhat reasonable in theory, in practice things are more complicated. The fundamental observation is that, except for tree artifacts and runfiles trees, the topological structure of depsets is at odds with the filesystem hierarchy: multiple depset nodes may contribute inputs to the same parent directory, and thus to the same node of the final Merkle tree. Therefore, merging the Merkle trees contributed by each node is not, in general, a cheap operation. (This is not a novel insight, and it was in fact noted in the discussion that led to the introduction of the cache; see #10875 (comment) and following comments.)
In fact, we have quite a bit of evidence (both from bug reports and our own measurements) that enabling the Merkle tree cache very often makes build times worse due to all of the time spent merging a very large number of very small Merkle trees. The only case where they seem to be a significant performance win is when the build is dominated by large tree artifacts. This can be explained by the fact that the Merkle tree nodes corresponding to a tree artifact (and, by extension, runfiles trees, which have the same properties) cannot intersect with other nodes, and are thereby cheap to merge.
(There's a separate problem with the current implementation, discussed at length in #18686 and #20862, whereby using a Merkle tree cache too small to accommodate the largest depset in the build can cause catastrophic performance degradation; while that implementation issue could be fixed, it still wouldn't address the problem I'm describing here.)
In light of these observations, and in the absence of concrete evidence (in the form of reproducible build instructions) that the current implementation is helpful outside of the use case of large tree artifacts and/or runfiles trees, we are considering scaling down the implementation to cover only that case, with the hope of turning it into a "sometimes useful, at worst neutral" toggle instead of a "sometimes useful, often detrimental" one.
We welcome your feedback if you object to these plans. Tagging some folks who are likely to have a stake: @lberki @meisterT @coeuvre @moroten @DavidANeil @lukaciko
Beta Was this translation helpful? Give feedback.
All reactions