Teach the FilesystemValueChecker about remote outputs #7269
3af20a5 to a714e8d
for (Map.Entry<TreeFileArtifact, FileArtifactValue> e : childFileValues.entrySet()) {
  isRemote = isRemote && e.getValue().isRemote();
it seems like things get strange when some files within the tree are remote and others non-remote. can you add some comments about when that would happen and the expected semantics?
Can that happen? A TreeArtifact can only ever be created in one piece by an action.
Can it maybe happen that only parts of it are materialized?
Sure, I'll add some comments (and maybe checks). I think it should be either remote or local, not a mix of both.
done
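To make the "either remote or local, not a mix" expectation from this thread concrete, here is a minimal sketch. It is not the code from this PR: the class name TreeRemoteness, the State enum, and the use of a plain Map<String, Boolean> in place of Map<TreeFileArtifact, FileArtifactValue> are all assumptions made for illustration, as is treating an empty tree as local.

```java
import java.util.Map;

/** Hypothetical helper, not part of Bazel: classifies a tree artifact's children. */
final class TreeRemoteness {
  enum State { ALL_LOCAL, ALL_REMOTE, MIXED }

  /**
   * Each map value stands in for FileArtifactValue.isRemote() of one child.
   * Assumption: an empty tree is reported as ALL_LOCAL.
   */
  static State classify(Map<String, Boolean> childIsRemote) {
    boolean anyRemote = false;
    boolean anyLocal = false;
    for (Map.Entry<String, Boolean> e : childIsRemote.entrySet()) {
      if (e.getValue()) {
        anyRemote = true;
      } else {
        anyLocal = true;
      }
    }
    if (anyRemote && anyLocal) {
      // The state the reviewers consider unexpected; a caller could fail fast here.
      return State.MIXED;
    }
    return anyRemote ? State.ALL_REMOTE : State.ALL_LOCAL;
  }

  public static void main(String[] args) {
    System.out.println(classify(Map.of("a.o", true, "b.o", true)));
    System.out.println(classify(Map.of("a.o", true, "b.o", false)));
  }
}
```

Unlike the single `isRemote = isRemote && ...` accumulator, a three-way result lets a caller tell "all local" apart from "mixed" and add a check for the latter.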
Can you comment on the need for this? How often is the output tree modified inter-build? Inside google, it feels like we're moving towards a world where we implicitly assume the output tree is immutable.
this is unused code at this point, right? there's nothing in bazel codebase creating these artifact values with isRemote() == true.
We should make sure to add integration tests internally.
src/main/java/com/google/devtools/build/lib/skyframe/FilesystemValueChecker.java
@@ -417,6 +422,14 @@ private boolean actionValueIsDirtyWithDirectSystemCalls(ActionExecutionValue act
  try {
    ArtifactFileMetadata fileMetadata =
        ActionMetadataHandler.fileMetadataFromArtifact(file, null, tsgm);
    FileArtifactValue fileValue = actionValue.getArtifactValue(file);
    boolean lastSeenRemotely = fileValue != null && fileValue.isRemote();
When is fileValue null here? Is my understanding correct that this is for cases when we don't actually need to call stat() to get the metadata of the output of an action?
Yes. IIRC fileValue here is non-null only for injected metadata.
@@ -417,6 +422,14 @@ private boolean actionValueIsDirtyWithDirectSystemCalls(ActionExecutionValue act
  try {
    ArtifactFileMetadata fileMetadata =
        ActionMetadataHandler.fileMetadataFromArtifact(file, null, tsgm);
    FileArtifactValue fileValue = actionValue.getArtifactValue(file);
    boolean lastSeenRemotely = fileValue != null && fileValue.isRemote();
    if (!fileMetadata.exists() && lastSeenRemotely) {
What are the expected semantics when:
- A file is created remotely
- The file is later materialized
- The file is then deleted by rm
Is Bazel expected to notice the deletion in that case?
Is Bazel expected to notice the deletion in that case?
It wouldn't notice it, as an ActionExecutionValue is immutable and we can't and shouldn't modify it after action execution. It's a valid point you are raising though, and I think there are two ways forward:
- Output files that are materialized after action execution will remain in the output base and Bazel will think that they are actually stored remotely and won't notice the scenario you are describing.
- The remote module keeps track of all outputs it's materializing after action execution and deletes them again at the end of a build. That way Bazel's in memory state would match what's in the output base after the build.
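The second option could look roughly like the sketch below. This is only an illustration of the idea, not anything that exists in Bazel's remote module: the class MaterializedOutputTracker and its methods are hypothetical names, and the sketch ignores concurrency and directories.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical tracker for outputs materialized after action execution. */
final class MaterializedOutputTracker {
  private final Set<Path> materialized = new LinkedHashSet<>();

  /** Called whenever a remote output is downloaded into the output base. */
  void recordMaterialized(Path output) {
    materialized.add(output);
  }

  /** Deletes every recorded file that still exists; returns how many were deleted. */
  int cleanUpAfterBuild() throws IOException {
    int deleted = 0;
    for (Path p : materialized) {
      if (Files.deleteIfExists(p)) {
        deleted++;
      }
    }
    materialized.clear();
    return deleted;
  }

  /** Small self-check: materialize a temp file, clean up, confirm it is gone. */
  static boolean demo() {
    try {
      Path tmp = Files.createTempFile("materialized-output", ".bin");
      MaterializedOutputTracker tracker = new MaterializedOutputTracker();
      tracker.recordMaterialized(tmp);
      int deleted = tracker.cleanUpAfterBuild();
      return deleted == 1 && !Files.exists(tmp);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

After cleanup, the output base again contains no local copy of those files, so the in-memory "remote" metadata and the on-disk state agree before the next command.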
Apologies for the delay. I was out of office.
The need for this arises from not downloading all build outputs for remote builds, but instead injecting metadata about build outputs. In Bazel remote execution we'd like some build outputs (i.e. top level) to be downloaded locally, while others are not downloaded and only live remotely.
It's my understanding that this is not a supported scenario by Bazel aka undefined behavior. Why are you asking?
yep. I broke this change out from a larger patch set. I'll be injecting
Isn't this code you're modifying in
@benjaminp you are right. I shall learn to read better. In my head I mapped inter-build to mean during the build. @janakdr I wouldn't be able to quantify inter-build modifications to the output tree. Ideally these don't happen, but as Benjamin correctly wrote, it's a best-effort mechanism that isn't foolproof.
a714e8d to 481db5e
The FilesystemValueChecker is used by Bazel to detect modified outputs before a command. This change teaches it about remote outputs that don't exist in the output base. That is, if Skyframe has metadata about a remote output file that does not exist in the output base, it will not invalidate the output. However, if the file exists in the output base, it will be taken as the source of truth. Progress towards bazelbuild#6862
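As a rough illustration of the rule described in this commit message, the decision can be written as a small predicate. This is a sketch and not the FilesystemValueChecker code: the class RemoteOutputDirtiness, the method isDirty, and the three boolean parameters are hypothetical simplifications. Treating a missing output that was not last seen remotely as dirty is an assumption about the checker's existing behavior.

```java
/** Hypothetical, simplified model of the dirtiness rule for a single output file. */
final class RemoteOutputDirtiness {
  /**
   * @param existsLocally whether the file is present in the output base
   * @param lastSeenRemotely whether the recorded metadata marks the file as remote
   * @param localMatchesRecorded whether on-disk metadata equals the recorded
   *     metadata; only consulted when existsLocally is true
   */
  static boolean isDirty(
      boolean existsLocally, boolean lastSeenRemotely, boolean localMatchesRecorded) {
    if (!existsLocally) {
      // A remote output that was never downloaded is not a modification.
      // Assumption: a missing output that was local counts as modified.
      return !lastSeenRemotely;
    }
    // The file exists in the output base, so it is the source of truth.
    return !localMatchesRecorded;
  }

  public static void main(String[] args) {
    System.out.println(isDirty(false, true, false));
    System.out.println(isDirty(true, true, false));
  }
}
```

The ordering matters: the remote flag is only consulted when nothing is on disk, which is what makes a local copy win over injected metadata.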
481db5e to a38b6c2