Windows, CI: getting "Failed to delete output files after incomplete download" #6890

Closed
laszlocsomor opened this issue Dec 11, 2018 · 10 comments
Labels: P0 (This is an emergency and more important than other current work. Assignee required), type: bug, untriaged

Comments

@laszlocsomor
Contributor

Description of the problem / feature request:

On Bazel CI, I'm repeatedly getting the following error:

ERROR: D:/b/bk-worker-windows-java8-w528/bazel/google-bazel-presubmit/third_party/protobuf/3.6.1/BUILD:123:1: Couldn't build file third_party/protobuf/3.6.1/_objs/protobuf_lite/stringpiece.obj: C++ compilation of rule '//third_party/protobuf/3.6.1:protobuf_lite' failed: Failed to delete output files after incomplete download. Cannot continue with local execution.: D:/b/f622aej4/execroot/io_bazel/bazel-out/host/bin/third_party/protobuf/3.6.1/_objs/protobuf_lite/stringpiece.obj (Permission denied)
--
  | Target //src:bazel failed to build

(https://buildkite.com/bazel/google-bazel-presubmit/builds/12883#61fdc327-7d64-4979-a3b4-9d35f2c729a2)

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

I don't know. I saw the error on CI for my change https://bazel-review.googlesource.com/c/bazel/+/84151.

Retrying the job doesn't seem to help.

Looks like this is where the error comes from:

// We don't propagate the downloadException, as this is a recoverable error and the cause
// of the build failure is really that we couldn't delete output files.
throw new EnvironmentalExecException(
"Failed to delete output files after incomplete "
+ "download. Cannot continue with local execution.",

And I suppose the reason is that outErr has open streams for stdout and stderr, and those streams should be closed before attempting to delete the files here:

outErr.getOutputPath().delete();
outErr.getErrorPath().delete();
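
To illustrate the hypothesis, here is a minimal sketch of the close-then-delete ordering. It does not use Bazel's own classes: `CloseBeforeDelete`, `closeThenDelete`, and `demo` are hypothetical names, and a plain `FileOutputStream` on a temporary file stands in for the streams held by outErr.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class CloseBeforeDelete {
  /**
   * Closes the stream that is still writing to {@code file}, then deletes the file.
   * Returns true if the file no longer exists afterwards.
   */
  public static boolean closeThenDelete(OutputStream stream, Path file) {
    try {
      // Release the handle first; Windows may refuse to delete a file that is still open.
      stream.close();
      Files.deleteIfExists(file);
      return Files.notExists(file);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Simulates a partially written stdout file left behind by an interrupted download. */
  public static boolean demo() {
    try {
      Path file = Files.createTempFile("stdout", ".txt");
      OutputStream stream = new FileOutputStream(file.toFile());
      stream.write("partial output".getBytes(StandardCharsets.UTF_8));
      return closeThenDelete(stream, file);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("deleted: " + demo());
  }
}
```

Deleting in the opposite order (delete first, close later) tends to work on Linux and macOS, which is presumably why the problem only shows up on the Windows workers.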

What operating system are you running Bazel on?

Windows (Bazel CI workers).

What's the output of bazel info release?

0.20.0

@laszlocsomor
Contributor Author

/cc @ola-rozenfeld @buchgr

@laszlocsomor laszlocsomor changed the title Windows: getting "Failed to delete output files after incomplete download" Windows, CI: getting "Failed to delete output files after incomplete download" Dec 11, 2018
@laurentlb laurentlb added the P0 This is an emergency and more important than other current work. (Assignee required) label Dec 14, 2018
@laurentlb
Contributor

This is blocking a Bazel release: https://buildkite.com/bazel/bazel-at-head-plus-downstream/builds/695#e741bbc1-151b-479b-8af0-72bf7388150d

I'm not even able to test downstream projects with the Bazel candidate, so this can hide unrelated bugs.

@laurentlb
Contributor

If it can be fixed or worked around on Monday, we still have a chance of releasing Bazel before the holiday.

@laszlocsomor
Contributor Author

If it can be fixed or worked around on Monday, we still have a chance of releasing Bazel before the holiday.

Is that what we want? Will we be around to push a patch release if the main release turns out to be bad?

laszlocsomor added a commit to laszlocsomor/bazel that referenced this issue Dec 17, 2018
If downloading a remotely cached action's outErr
failed, the AbstractRemoteActionCache deletes the
files underlying the outErr.

For this deletion to succeed on Windows, the files
must be closed. This commit implements that.

Fixes bazelbuild#6890

Change-Id: Ib15c4a255eb9029d5fd442617a9b7472f31e8f76
@laszlocsomor
Contributor Author

I have a repro: #6945 (comment)

@laszlocsomor
Contributor Author

I'll start bisecting. Apparently this is a regression in 0.20.0, because I cannot repro the bug with 0.19.0

@laszlocsomor
Contributor Author

The good news is, I found the culprit: d2920e3

The bad news is, it was a rollback of 1a95502, which caused problems.

So we can live neither with nor without this feature, i.e. opening files with deletion sharing on Windows.

The principled fix would be for the AbstractRemoteActionCache to close files before it tries to delete them.

@laszlocsomor
Contributor Author

FYI @philwo

@meteorcloudy
Member

Wow, thanks for debugging!

@philwo
Member

philwo commented Jan 2, 2019

Note to self: Mark Google bug b/121159713 as fixed once this is fixed, to let my colleagues know that it is working again.

Discussions and progress tracking should happen here.

aehlig pushed a commit that referenced this issue Jan 7, 2019
Previously, outerF.setException was called before closing the output stream of the download file when a download failed. This caused a permission error when trying to delete the file on Windows.

Fixes #6890

RELNOTES: None
PiperOrigin-RevId: 228138102
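
The ordering described in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the actual Bazel change: `CloseBeforeFail` and `failedDownloadCleansUp` are hypothetical names, and the JDK's `CompletableFuture` stands in for whatever future type `outerF` is. The point is only that the failure is published after the stream is closed, so a cleanup listener that deletes the file never runs against an open handle.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

public class CloseBeforeFail {
  /**
   * Simulates a failed download into a temporary file. Returns true if the cleanup
   * listener ran after the stream was closed and managed to delete the file.
   */
  public static boolean failedDownloadCleansUp() {
    try {
      Path file = Files.createTempFile("download", ".tmp");
      OutputStream out = new FileOutputStream(file.toFile());
      CompletableFuture<Void> download = new CompletableFuture<>();
      AtomicBoolean streamClosed = new AtomicBoolean(false);
      AtomicBoolean closedAtCleanup = new AtomicBoolean(false);
      AtomicBoolean deleted = new AtomicBoolean(false);

      // Cleanup listener: registered before completion, so in this single-threaded
      // sketch it runs inside completeExceptionally() below.
      download.whenComplete((ok, err) -> {
        closedAtCleanup.set(streamClosed.get());
        try {
          deleted.set(Files.deleteIfExists(file));
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });

      Throwable failure = new IOException("simulated download failure");
      try {
        out.close();            // 1. release the file handle
        streamClosed.set(true);
      } finally {
        download.completeExceptionally(failure); // 2. only then publish the failure
      }
      return closedAtCleanup.get() && deleted.get();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("cleanup succeeded: " + failedDownloadCleansUp());
  }
}
```

Swapping steps 1 and 2 would let the listener attempt the delete while the stream is still open, which is the situation the commit message describes.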
aehlig pushed a commit that referenced this issue Jan 18, 2019
aehlig pushed a commit that referenced this issue Jan 23, 2019
luca-digrazia pushed a commit to luca-digrazia/DatasetCommitsDiffSearch that referenced this issue Sep 4, 2022