Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bazel should accept ALREADY_EXISTS for byte stream uploads #12111

Closed
ulfjack opened this issue Sep 16, 2020 · 0 comments
Closed

Bazel should accept ALREADY_EXISTS for byte stream uploads #12111

ulfjack opened this issue Sep 16, 2020 · 0 comments
Assignees
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Remote-Exec Issues and PRs for the Execution (Remote) team type: bug

Comments

@ulfjack
Copy link
Contributor

ulfjack commented Sep 16, 2020

Description of the problem / feature request:

com.google.devtools.build.lib.remote.ByteStreamUploader only treats uploads as successful when they return an OK status. Naively, this seems fine - Bazel first checks if the file exists remotely and then uploads it. However, this is a check-then-act race, where either another thread in the same Bazel process or another Bazel process may finish an upload of the same file between the check and the upload (or even while the upload is ongoing). In that case, the CAS service might return ALREADY_EXISTS for this file, which Bazel currently treats as a non-retryable error.

While the documentation does not clearly indicate what the CAS service should do if a file already exists, the ALREADY_EXISTS return code seems appropriate.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Build a CAS service that returns ALREADY_EXISTS and run a build where multiple actions have the same input. Bazel currently does not deduplicate uploads.

ERROR: /home/ulfjack/Google/bazel2/foo/BUILD:15:9: Executing genrule //foo:bar-0 failed (Exit 34): com.google.devtools.build.lib.remote.BulkTransferException
	at com.google.devtools.build.lib.remote.RemoteCache.waitForBulkTransfer(RemoteCache.java:225)
...
	Suppressed: java.io.IOException: io.grpc.StatusRuntimeException: ALREADY_EXISTS: Digest ca6e692227cd35f862d56f0ef4cb0a3ba06cc86674cff63d8b34a5a35d4999a2/159 already exists
		at com.google.devtools.build.lib.remote.ByteStreamUploader.lambda$uploadBlobAsync$1(ByteStreamUploader.java:247)
		at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.doFallback(AbstractCatchingFuture.java:175)
		at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.doFallback(AbstractCatchingFuture.java:162)
		at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:107)
...
		at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:584)
		at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
		at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
		... 3 more
	Caused by: io.grpc.StatusRuntimeException: ALREADY_EXISTS: Digest ca6e692227cd35f862d56f0ef4cb0a3ba06cc86674cff63d8b34a5a35d4999a2/159 already exists
		at io.grpc.Status.asRuntimeException(Status.java:523)
		... 23 more

What operating system are you running Bazel on?

Linux.

ulfjack added a commit to ulfjack/bazel that referenced this issue Sep 16, 2020
If the service returns ALREADY_EXISTS, then assume that the file has
already been successfully uploaded to the ByteStream service.

Fixes bazelbuild#12111.

Change-Id: I7f5af0cca2e5efea32067b092f619c6593af0aac
ulfjack added a commit to ulfjack/bazel that referenced this issue Sep 16, 2020
If the service returns an ALREADY_EXISTS error, then we assume that the
proper file is present remotely.

Fixes bazelbuild#12111.

Change-Id: I7f5af0cca2e5efea32067b092f619c6593af0aac
@oquenchil oquenchil added team-Remote-Exec Issues and PRs for the Execution (Remote) team type: bug untriaged labels Oct 5, 2020
@coeuvre coeuvre added P2 We'll consider working on this in future. (Assignee optional) and removed untriaged labels Dec 9, 2020
@coeuvre coeuvre self-assigned this Dec 9, 2020
Yannic added a commit to EngFlow/bazel that referenced this issue Mar 8, 2023
If the service returns an `ALREADY_EXISTS` error, then we assume that
the proper file is present remotely.

Prior art: bazelbuild#12112

Fixes bazelbuild#12111
ShreeM01 pushed a commit to ShreeM01/bazel that referenced this issue Mar 10, 2023
If the service returns an `ALREADY_EXISTS` error, then we assume that the proper file is present remotely.

Prior art: bazelbuild#12112

Fixes bazelbuild#12111

Closes bazelbuild#17692.

PiperOrigin-RevId: 515339566
Change-Id: Iafdd288148e47197cc047d39c9a5e5b6c95acee1
ShreeM01 added a commit that referenced this issue Mar 13, 2023
If the service returns an `ALREADY_EXISTS` error, then we assume that the proper file is present remotely.

Prior art: #12112

Fixes #12111

Closes #17692.

PiperOrigin-RevId: 515339566
Change-Id: Iafdd288148e47197cc047d39c9a5e5b6c95acee1

Co-authored-by: Yannic Bonenberger <[email protected]>
fweikert pushed a commit to fweikert/bazel that referenced this issue May 25, 2023
If the service returns an `ALREADY_EXISTS` error, then we assume that the proper file is present remotely.

Prior art: bazelbuild#12112

Fixes bazelbuild#12111

Closes bazelbuild#17692.

PiperOrigin-RevId: 515339566
Change-Id: Iafdd288148e47197cc047d39c9a5e5b6c95acee1
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Remote-Exec Issues and PRs for the Execution (Remote) team type: bug
Projects
None yet
3 participants