-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move BES upload check earlier, to prevent calling FindMissingBlobs #16999
Conversation
Before this change, every file referenced in the BEP would have a `FindMissingBlobs` call, which can be very expensive with hundreds of outputs. This could make `--experimental_remote_build_event_upload=minimal` a performance regression compared to `--experimental_build_event_upload_strategy=local`.
Side comment: it would be nice if those |
If this gets in quickly, I think it should go in 6.0 as it only affects code that uses |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR!
You are right, if Bazel shouldn't upload these files, it shouldn't call FindMissingBlobs for them neither.
@@ -236,7 +237,9 @@ private Single<List<PathMetadata>> queryRemoteCache( | |||
List<PathMetadata> filesToQuery = new ArrayList<>(); | |||
Set<Digest> digestsToQuery = new HashSet<>(); | |||
for (PathMetadata path : paths) { | |||
if (shouldQuery(path)) { | |||
if (!shouldUploadBasedOnUploadMode(path)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe move this check into shouldQuery
. Tests are failing because we didn't add path into knownRemotePaths
below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I didn't put it into shouldQuery
, was because the path might not be remote. So not sure on the answer to this.
Ha, the reason we call |
So yeah, we want the paths to be converted, if we uploaded them in this build, not if they were uploaded ever. Do you recommend a different way to accomplish this (or is this the correct way to get that behavior)? This PR was mainly a bug report with a suggested fix. I'm fine with you opening a PR with a proper fix and closing out this PR. |
We can know if one file is uploaded in this build, but we can't know if the file is available in the remote cache (because it is uploaded before) without using If we solely relies on whether the file is uploaded in this build to determine whether to convert the path, there could be an issue that the paths for previously uploaded output are represented as local path in this build. We could instead always use |
OTOH, we can always set |
I like this one. |
Either way, @coeuvre can you submit a new PR with the changes/test fixes? This was more a bug report than a fix, so I don't need us to go back and forth on my PR. Thanks! |
yes, working on it. |
When using `--experimental_remote_build_event_upload=minimal`, instead of querying remote cache to decide whether convert local path of a file to `bytestream://`, it now always use `bytestream://` for BEP referenced files. Closes #16999. Closes #17025. PiperOrigin-RevId: 502351401 Change-Id: Ic858367ffaf3c2a2c20db88ada85fbbe1d93aae8 Co-authored-by: Chi Wang <[email protected]>
When using `--experimental_remote_build_event_upload=minimal`, instead of querying remote cache to decide whether convert local path of a file to `bytestream://`, it now always use `bytestream://` for BEP referenced files. Closes #16999. Closes #17025. PiperOrigin-RevId: 502351401 Change-Id: Ic858367ffaf3c2a2c20db88ada85fbbe1d93aae8
Before this change, every file referenced in the BEP would have a
FindMissingBlobs
call, which can be very expensive with hundreds of outputs. This could make--experimental_remote_build_event_upload=minimal
a performance regression compared to--experimental_build_event_upload_strategy=local
.