[WIP] Track empty directories in tree artifacts #16015
base: master
@tjgq Could you take a quick glance and let me know whether you approve of the general approach (introducing
@alexjski Could you take a first pass since you're the expert on tree artifacts? Feel free to reassign back to me to shepherd it in if/once you're happy with the general approach.
Please note that there are also special cases I haven't dealt with yet outside of test code: For example, the Starlark
I have some questions and comments here:
There is also a workaround for declaring empty directories within trees today. It is weaker, since it only lets you declare them at analysis time (as opposed to whatever the action does), but you can declare that an action returns multiple, nested tree artifacts, e.g.
This makes Bazel aware of the inner directory, and you will get an empty tree artifact for it. Adding or removing such a directory clearly changes the action itself, so it is handled correctly in incremental builds. The same goes for the action cache. Let me add @aiuto for the potential API change here and @coeuvre for the remote execution discussion.
Let me first clarify the intention behind this PR: I view #15789 and #15901 as serious correctness issues in a supported feature that is enabled by default (directory outputs). The PR is meant to address these issues while preserving backwards compatibility, especially concerning API. I explicitly do not want to introduce new features.
My personal feeling is that you almost never want the empty directories in the Starlark-facing expansions - the directories are only there to ensure correct incrementality. Staying in line with this PR being a backwards compatible bug fix, I would vote for skipping over
With this PR we do (although I haven't added tests for special remote functionality such as injected artifacts yet) and this is one of the core motivations for this PR - after all, both remote and disk cache are covered by the
I only found two instances of concrete template actions in the Bazel source code, so I just put the logic for skipping
Given that the aim of the PR is to fix incorrect behavior while preserving backwards compatibility, is a flag necessary? I would very much prefer Bazel's stable features to behave correctly by default. I will happily add more tests and ignore In the worst case, couldn't we introduce a flag that defaults to true in Bazel but can be overridden internally at Google?
IMO what's fundamentally at stake here is the semantics of a tree artifact: is it a representation of the full directory structure including empty directories, or merely of regular files (and symlinks) found within? The current implementation doesn't provide a conclusive answer because it's inconsistent: a freshly executed action is allowed to produce empty subdirectories, but retrieving an action result from a cache won't recreate them. I should admit that, when I filed #15901 in response to #15789, I was a bit hasty to assume that we want the former interpretation for tree artifacts. It's possible that the latter interpretation is the one we prefer. If that were to be the case, then the way to fix the inconsistency would be to have Bazel delete empty subdirectories left behind by the execution of a tree-generating action.
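To make the inconsistency concrete, here is a minimal Python sketch. It is not Bazel code: `snapshot_files`, `restore_files`, `list_dirs`, and `prune_empty_dirs` are hypothetical helpers that only imitate a cache storing regular files and nothing else. It shows an empty subdirectory surviving a fresh "execution" but vanishing after a cache round trip, and how pruning empty subdirectories after execution (the second interpretation) would make the two agree.

```python
import os
import tempfile


def snapshot_files(root):
    """Return {relative_path: bytes} for regular files only (hypothetical file-only cache)."""
    entries = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as f:
                entries[os.path.relpath(full, root)] = f.read()
    return entries


def restore_files(entries, root):
    """Recreate files from a snapshot; only directories that contain files reappear."""
    for rel, data in entries.items():
        full = os.path.join(root, rel)
        os.makedirs(os.path.dirname(full), exist_ok=True)
        with open(full, "wb") as f:
            f.write(data)


def list_dirs(root):
    """Return the sorted relative paths of all subdirectories under root."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(root):
        for d in dirnames:
            found.append(os.path.relpath(os.path.join(dirpath, d), root))
    return sorted(found)


def prune_empty_dirs(root):
    """Delete empty subdirectories bottom-up; root itself is kept."""
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        if dirpath != root and not os.listdir(dirpath):
            os.rmdir(dirpath)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # "Fresh execution": the action leaves one file and one empty directory.
        fresh = os.path.join(tmp, "fresh")
        os.makedirs(os.path.join(fresh, "empty_dir"))
        os.makedirs(os.path.join(fresh, "sub"))
        with open(os.path.join(fresh, "sub", "a.txt"), "wb") as f:
            f.write(b"a")

        # "Cache hit": the same tree rebuilt from a file-only snapshot.
        restored = os.path.join(tmp, "restored")
        os.makedirs(restored)
        restore_files(snapshot_files(fresh), restored)
        before = (list_dirs(fresh), list_dirs(restored))

        # Second interpretation: prune empty directories after execution.
        prune_empty_dirs(fresh)
        after = (list_dirs(fresh), list_dirs(restored))
        return before, after
```

Calling `demo()` returns the directory listings of the fresh and restored trees before and after pruning. Before pruning they differ by `empty_dir`; after pruning they match. The first interpretation would instead require the snapshot to record directories as well as files.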
I agree that if we decide that I heavily lean towards not going down that route though for two reasons:
I wonder whether it would be productive at this point to draft a doc discussing the problem and the alternatives considered (https://bazel.build/contribute/design-documents). In particular, I would expect that most actions don't care about empty directories (thinking of compilers etc.), so it would be interesting to see a case study where they matter. Now, I understand the general idea here, but I am personally not convinced that this PR won't result in empty directories showing up in places unintentionally (e.g., I think it would make it to As I said before, this feature would require us to have the ability to at least disable it internally. I am not sure whether we would want to start with a feature switch for Bazel as well, but I am less invested in that, so I am indifferent and will defer to @tjgq. A couple of alternatives/ideas to consider (these are not necessarily good ideas, more like loose thoughts):
Fixes #15789
Fixes #15901