Storage - StorageWriter how to fail the entire write of InputStream #1232

Closed
fondberg opened this issue Sep 8, 2016 · 15 comments
Assignees
Labels
api: storage Issues related to the Cloud Storage API. priority: p2 Moderately-important priority. Fix may not be included in next release. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

fondberg commented Sep 8, 2016

All documentation points to the following as the best practice for uploading a large file:

String bucketName = "my_unique_bucket";
String blobName = "my_blob_name";
BlobId blobId = BlobId.of(bucketName, blobName);
InputStream inputStream = new FileInputStream(new File("largefile.zip"));
BlobInfo blobInfo = BlobInfo.builder(blobId).contentType("application/octet-stream").build();
byte[] buffer = new byte[64 * 1024];
int limit;
try (WriteChannel writer = storage.writer(blobInfo)) {
  try {
    while ((limit = inputStream.read(buffer)) >= 0) {
      writer.write(ByteBuffer.wrap(buffer, 0, limit));
    }
  } catch (Exception ex) {
    // handle exception
  }
}

The question is: how do we fail the upload if an exception is thrown midway?

@mziccard mziccard added the api: storage Issues related to the Cloud Storage API. label Sep 8, 2016
mziccard commented Sep 8, 2016

WriteChannel finalizes the upload when close() is called. To avoid finalizing the upload when an exception is thrown, you can call close() explicitly instead of using try-with-resources. Roughly, something like:

WriteChannel writer = storage.writer(blobInfo);
byte[] buffer = new byte[64 * 1024];
int limit;
boolean uploadFailed = false;
try {
  while ((limit = inputStream.read(buffer)) >= 0) {
    writer.write(ByteBuffer.wrap(buffer, 0, limit));
  }
} catch (Exception ex) {
  uploadFailed = true;
  // handle exception
}
if (!uploadFailed) {
  writer.close();
}

We might consider adding an abort() method to WriteChannel that could be called inside a catch block to prevent a subsequent close() from finalizing the stream. /cc @aozarov
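
For illustration, a minimal sketch of how such a hypothetical abort() might be used; note that abort() is only a proposal here and does not exist in the current WriteChannel API:

// Hypothetical API: WriteChannel.abort() is proposed above, not part of the
// current library. The idea is that the implicit close() at the end of the
// try-with-resources block would not finalize the upload after abort().
try (WriteChannel writer = storage.writer(blobInfo)) {
  try {
    while ((limit = inputStream.read(buffer)) >= 0) {
      writer.write(ByteBuffer.wrap(buffer, 0, limit));
    }
  } catch (Exception ex) {
    writer.abort(); // proposed: mark the channel so close() does not finalize
    // handle or propagate the exception
  }
}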

fondberg commented Sep 8, 2016

Thanks!

fondberg commented Sep 8, 2016

What about a GC finalizer that would close the writer if it is not explicitly cancelled?

@fondberg
Author

@mziccard After some internal discussion we are still not convinced this would fail the upload, since the writer might get closed and flushed anyway by a finalizer at some later point.

Does anybody know whether the above statement holds true?

@mziccard
Contributor

After some internal discussion we are still not convinced this would fail the upload, since the writer might get closed and flushed anyway by a finalizer at some later point.

I am not aware of any way to "close/abort" a resumable upload session without actually finalizing the upload. My understanding is that the resumable upload session will expire after a while if the upload is not finalized. @Capstan is surely more informed than I am (I have to ping you directly, as I see you are not yet part of the @GoogleCloudPlatform/cloud-storage team).

@mziccard
Contributor

FYI @stephenplusplus

Capstan commented Sep 15, 2016

Edited for a much simpler answer.

Use the DELETE verb on the upload URL. That request, and all further attempts to upload or query status, will yield:

499 Client Closed Request
Content-Type: application/json; charset=UTF-8

{
 "error": {
  "errors": [
   {
    "domain": "global",
    "reason": "clientClosedRequest",
    "message": "clientClosedRequest"
   }
  ],
  "code": 499,
  "message": "clientClosedRequest"
 }
}
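
For illustration, a minimal sketch (in Java, using java.net.HttpURLConnection) of issuing that DELETE; the sessionUri parameter is assumed to hold the resumable upload session URL returned when the upload was initiated:

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper: cancels a resumable upload session by issuing DELETE
// against its session URI.
static int cancelResumableUpload(String sessionUri) throws IOException {
  HttpURLConnection conn = (HttpURLConnection) new URL(sessionUri).openConnection();
  try {
    conn.setRequestMethod("DELETE");
    // A 499 "Client Closed Request" status indicates the session is cancelled;
    // further uploads and status queries on this URL will also return 499.
    return conn.getResponseCode();
  } finally {
    conn.disconnect();
  }
}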

@fondberg
Author

@Capstan (and @mziccard) Thanks for the information. Is the suggestion that gcloud-java would use this to implement an abort for a file write that fails midway?

@mziccard
Contributor

@fonzy2013 Yes, we could implement that, but it's not a priority at the moment.

Is there a particular use case for deleting a resumable upload session? Regardless of why a chunk upload failed, it seems like a better idea to keep retrying with the resumable session until it expires, rather than deleting the session and opening a new one.

Capstan commented Sep 19, 2016

@fonzy2013 You'd really only want to delete a session if you are somehow no longer in control of the session URL and know you need to abort it, e.g.:

  • you know that it is doomed to fail, e.g., you realize belatedly that you have corrupted input that you've already committed into the upload session, and the hash checking at the end will give you a client error.
  • you know that it's doomed to corrupt data, e.g., as before except you aren't using one of the hash mechanisms to ensure data integrity, and thus if the upload completes, it will succeed in putting bad data into a finalized Google Cloud Storage object.
  • you want to revoke permission to complete the upload, though this is at best a race condition, and there are better ways to manage this.

Capstan commented Sep 19, 2016

The protocol as described by the Google Cloud Storage JSON documentation is essentially the Google Data resumable protocol, aka gdata, albeit minus the GData-Version and Slug headers. I've updated & republished those docs to include instructions on cancelation.

@fondberg
Author

The use case is that we download assets from another system and upload them to GCS. The other system, which is a build server, deletes the assets after a time, which can lead to half-uploaded assets in GCS.
We do retry today, but we would rather have the retry start clean of previous errors.

I'm not entirely sure what is meant by a session in this context, as our uploader runs as a service and uploads all the time.

fondberg commented Sep 20, 2016

A bit more information that might be useful: we don't use resumable uploads.
What we would like to guarantee is that a half-written file is not committed to GCS.
The solution was not to close the writer, which led to the question about a finalizer or GC closing it automatically.

lesv commented Feb 10, 2017

We should add abort() following Capstan's instructions.

@garrettjonesgoogle garrettjonesgoogle added priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. and removed POST GA - P1 labels Apr 26, 2017
@garrettjonesgoogle garrettjonesgoogle added priority: p2 Moderately-important priority. Fix may not be included in next release. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. and removed priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. triaged for GA labels Jul 18, 2017
@danoscarmike
Contributor

This has been added to our feature backlog: https://github.com/GoogleCloudPlatform/google-cloud-java/wiki/Feature-backlog. This issue will be closed but is linked in the backlog and can continue to be used for comment and discussion.
