
Refactor Storage read and write streams #824

Merged

Conversation

stephenplusplus
Contributor

Related: #340, #362, #423, #435, #436 #809 (comment)

This PR will force a breaking change, but a breaking change for the better.

We will no longer have to tell users to hook on to the "complete" event. That event comes from the request module, and while request is quite popular, a user who creates a read or write stream from us doesn't expect it, since they don't know we're using request underneath. It's like we're forcing them to learn our dependency, which is no good.

In the case of read streams, the user always received "end", but now we've removed the "complete" event. The user can safely listen for "end" and "error", as those give them everything they need. "complete" used to emit the response from the API, but we update the file metadata with that object, so the same data is available from "file.metadata" inside their "end" callback*.

For write streams, we now hold off emitting "finish" until we know the data was uploaded successfully. This means we don't have to let the stream finish, then do post-processing, then emit "complete". That was messy.

Any questions, please ask! Streams, man.

* Mention in release notes.

Read Streams

Before

file.createReadStream()
  .on('data', function() {})
  .on('complete', function(metadata) {});

After

file.createReadStream()
  .on('data', function() {})
  .on('end', function() {
    // file.metadata === old "complete.metadata" argument
  });
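
Because the read stream is now a plain Readable, it also composes directly with any other Node stream. As a minimal sketch (the local path is a placeholder), downloading to disk only needs the native events:

var fs = require('fs');

file.createReadStream()
  .on('error', function(err) {
    // handle download errors from the API or network
  })
  .pipe(fs.createWriteStream('/tmp/local-copy'))
  .on('finish', function() {
    // the local fs write stream finished; file.metadata is also populated
  });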

Write Streams

Before

var writeStream = file.createWriteStream();
writeStream.on('complete', function(metadata) {});
writeStream.end('data');

After

var writeStream = file.createWriteStream();
writeStream.on('finish', function() {
  // file.metadata === old "complete.metadata" argument
});
writeStream.end('data');
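
For context, here is a minimal sketch of the general technique for holding off "finish" until an upload is confirmed. It assumes a modern Node Writable with the _final hook and a hypothetical commitUpload() helper; it is not this PR's actual implementation:

var stream = require('stream');
var util = require('util');

// Hypothetical stand-in for the request that commits the upload to the API.
function commitUpload(data, callback) {
  setImmediate(function() {
    callback(null, { size: data.length });
  });
}

function UploadStream() {
  stream.Writable.call(this);
  this.chunks = [];
}
util.inherits(UploadStream, stream.Writable);

UploadStream.prototype._write = function(chunk, encoding, done) {
  // buffer (or forward) the incoming data
  this.chunks.push(chunk);
  done();
};

UploadStream.prototype._final = function(done) {
  var self = this;
  commitUpload(Buffer.concat(this.chunks), function(err, metadata) {
    self.metadata = metadata;
    // "finish" is only emitted after the server confirms the upload
    done(err);
  });
};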

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Aug 26, 2015
@stephenplusplus stephenplusplus added the api: storage Issues related to the Cloud Storage API. label Aug 26, 2015
@jgeewax
Contributor

jgeewax commented Aug 26, 2015

Do you have an example of before / after code snippets? Just so we can clearly communicate to our users what needs to change?

@stephenplusplus
Contributor Author

Yep, updated.

@jgeewax
Contributor

jgeewax commented Aug 26, 2015

Hmm, some of this seems weird to me.

  1. Couldn't we pass the metadata along with the event to keep the callbacks backwards compatible?
  2. Read streams: couldn't we support listening on "end" as an alias for "complete"?
  3. Write streams: couldn't we support listening on "complete" as an alias for "finish"?

From my (outsider's) perspective, it seems like a bunch of renames that we could have potentially hidden from users. I'm also scared that things don't look super consistent:

  • Read: end -> complete
  • Write: complete -> finish

These all sound like synonyms to me... Is it crazy to suggest maybe we call everything the same? (Say, all complete ?)

@stephenplusplus
Contributor Author

Well, the thing I actually did was "make our streams not do anything special". "end" and "finish" are actual Node core stream events that all streams have. We were dancing around them, doing post-processing, which forced us to hide the native events and rename them as "complete".

Couldn't we pass the metadata along with the event to keep the callbacks backwards compatible?

Because these are no longer custom events ("end" for readable, "finish" for writable), we can't do anything special with them, such as passing metadata.

Read streams: couldn't we support listening on "end" as an alias for "complete"
Write streams: couldn't we support listening on "complete" as an alias for "finish" ?

We shouldn't do anything custom because:

A) a user will have to learn what we created, instead of using what they expect from every other stream they use in Node
B) compatibility in a pipeline will be broken. If we are stream A and they pipe to stream B, B expects A to follow stream conventions (see the sketch after this comment).

I understand backwards compatibility would be nice, but I would urge us not to do this for the same reason we don't keep "shim bugs" around after fixing them: our code gets weird and our users don't know what the heck's going on. We just need to make a note of it in the release notes and play by semver. Breaking API changes are expected in 0.x bumps.
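
To illustrate the pipeline point in (B), a minimal sketch: a plain fs read stream pipes into our write stream, and because the write stream follows stream conventions, the native events are all the consumer needs (the local path is a placeholder):

var fs = require('fs');

fs.createReadStream('/path/to/photo.png')
  .pipe(file.createWriteStream())
  .on('error', function(err) {
    // upload errors emitted by the write stream
  })
  .on('finish', function() {
    // upload confirmed; file.metadata has been updated
  });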

@stephenplusplus
Contributor Author

These all sound like synonyms to me... Is it crazy to suggest maybe we call everything the same? (Say, all complete ?)

Nope, and I wish they were. Streams are super weird, and they're working to improve them. I really hope it gets more intuitive. (See https://github.com/stephenplusplus/stream-faqs for tracking how difficult they can be 😕)

@callmehiphop
Contributor

huge 👍

a user will have to learn what we created, instead of using what they expect from all other streams they use in Node

This sums up why I love this pull request.

@stephenplusplus
Contributor Author

Updated the first post with several references to the confusion our event juggling has caused.

crc32c = true;
md5 = true;
if (rangeRequest) {
// Range requests can't receive data integrity checks.
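
As an illustrative aside on the diff context above, a minimal sketch of that logic (hypothetical helper name, not the PR's actual code): crc32c/md5 checksums cover the whole object, so a partial (ranged) download can't be validated against them and the integrity checks are skipped.

function shouldRunIntegrityChecks(options) {
  var rangeRequest = options.start !== undefined || options.end !== undefined;
  // checksums apply to the full object only, so skip them for ranged reads
  return !rangeRequest;
}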

@stephenplusplus
Contributor Author

PTAL

@callmehiphop
Contributor

Looks good to me! Rebase time?

@stephenplusplus
Contributor Author

Does it need to be?

@callmehiphop
Contributor

It doesn't, this can bask in PR limbo for the rest of time!

@stephenplusplus
Contributor Author

Oh, like squashing. Sorry, thought you meant it wouldn't merge without a rebase.

I can rebase, but the commits here are fine separated. It's easier to tear apart later and attribute regressions to if we ever need to.

@callmehiphop
Contributor

Fine by me!

callmehiphop added a commit that referenced this pull request Aug 27, 2015
@callmehiphop callmehiphop merged commit 4f62600 into googleapis:master Aug 27, 2015
@jseeley78

Is anyone else having problems with this change besides me? All my write streams (uploads) now come back with "The uploaded data did not match the data from the server. As a precaution, the file has been deleted. To be sure the content is the same, you should try uploading the file again."

@stephenplusplus
Contributor Author

We haven't heard of anything, and our tests seem to work properly. Are there any commonalities between the types of transfers you're doing that fail?

@jseeley78

All my cases are piping from imagemagick:

var wr = gfs.createWriteStream(dst, {bucket: gfs.Buckets.archiveBucket});
gm(src)
  .setFormat('png')
  .stream(function(err, stdout, stderr) {
    if (err) return cb(err);
    // pipe output to the write stream
    stdout.pipe(wr);
    // on write stream complete, make public and return
    wr.on("complete", function() {
      gfs.makePublic(dst, {bucket: gfs.Buckets.archiveBucket}, cb);
    });
  });

My createWriteStream function looks like this:

GFS.prototype.createWriteStream = function(filename, options) {
    options = options || {};
    return this.getBucket(options.bucket).file(filename).createWriteStream({
        metadata: {contentType: this.getContentType(filename)}
    });
};

@stephenplusplus
Contributor Author

Thanks! What happens if you listen for finish instead of complete? Also, listen for error on wr as well.
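
For example, adapting the snippet above (gfs, dst, and cb come from that code), the removed "complete" listener would become:

wr.on('error', function(err) {
  cb(err);
});

wr.on('finish', function() {
  gfs.makePublic(dst, {bucket: gfs.Buckets.archiveBucket}, cb);
});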

phw added a commit to phw/gcloud-node that referenced this pull request Sep 22, 2015
This properly documents the new events for read and write streams.
phw added a commit to phw/gcloud-node that referenced this pull request Sep 22, 2015
This properly documents the new events for read and write streams introduced in PR googleapis#824.
@jseeley78

I don't get the new "finish" event; I tried listening for finish, complete, end, data, and error; I'm only getting back "error":

{ [Error: The uploaded data did not match the data from the server. As a precaution, the file has been deleted. To be sure the content is the same, you should try uploading the file again.] code: 'FILE_NO_UPLOAD', errors: [ null ] }

For what it's worth, it does work just fine when I'm using this via a standard fs read stream:

var rd = fs.createReadStream(source),
    file = this.getBucket(bucket).file(target),
    wr = file.createWriteStream({ metadata: { contentType: contentType } });

wr.on("finish", function() {
  file.acl.add({
    entity: 'allUsers',
    role: gcloud.storage.acl.READER_ROLE
  }, cb);
});
rd.pipe(wr);

So it seems to be something with the way node graphicsmagick (imagemagick) is handling the stream. It works fine with release 20 and earlier, though...

@stephenplusplus
Contributor Author

Could you open a new issue so we can keep digging into this? Also please include a link to the library and I'll try to reproduce.

@jseeley78

ok, submitted as ticket #889

sofisl pushed a commit that referenced this pull request Jan 17, 2023
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [yargs](https://yargs.js.org/) ([source](https://togithub.com/yargs/yargs)) | dependencies | major | [`^15.0.0` -> `^16.0.0`](https://renovatebot.com/diffs/npm/yargs/15.4.1/16.0.1) |

---

### Release Notes

<details>
<summary>yargs/yargs</summary>

### [`v16.0.1`](https://togithub.com/yargs/yargs/blob/master/CHANGELOG.md#1601-httpswwwgithubcomyargsyargscomparev1600v1601-2020-09-09)

[Compare Source](https://togithub.com/yargs/yargs/compare/v16.0.0...v16.0.1)

### [`v16.0.0`](https://togithub.com/yargs/yargs/blob/master/CHANGELOG.md#1600-httpswwwgithubcomyargsyargscomparev1542v1600-2020-09-09)

[Compare Source](https://togithub.com/yargs/yargs/compare/v15.4.1...v16.0.0)

##### ⚠ BREAKING CHANGES

-   tweaks to ESM/Deno API surface: now exports yargs function by default; getProcessArgvWithoutBin becomes hideBin; types now exported for Deno (a usage sketch follows after the release notes).
-   find-up replaced with escalade; export map added (limits importable files in Node >= 12); [email protected] (new decamelize/camelcase implementation).
-   **usage:** single character aliases are now shown first in help output
-   rebase helper is no longer provided on yargs instance.
-   drop support for EOL Node 8 ([#1686](https://togithub.com/yargs/yargs/issues/1686))

##### Features

-   adds strictOptions() ([#1738](https://www.github.com/yargs/yargs/issues/1738)) ([b215fba](https://www.github.com/yargs/yargs/commit/b215fba0ed6e124e5aad6cf22c8d5875661c63a3))
-   **helpers:** rebase, Parser, applyExtends now blessed helpers ([#1733](https://www.github.com/yargs/yargs/issues/1733)) ([c7debe8](https://www.github.com/yargs/yargs/commit/c7debe8eb1e5bc6ea20b5ed68026c56e5ebec9e1))
-   adds support for ESM and Deno ([#1708](https://www.github.com/yargs/yargs/issues/1708)) ([ac6d5d1](https://www.github.com/yargs/yargs/commit/ac6d5d105a75711fe703f6a39dad5181b383d6c6))
-   drop support for EOL Node 8 ([#1686](https://www.github.com/yargs/yargs/issues/1686)) ([863937f](https://www.github.com/yargs/yargs/commit/863937f23c3102f804cdea78ee3097e28c7c289f))
-   i18n for ESM and Deno ([#1735](https://www.github.com/yargs/yargs/issues/1735)) ([c71783a](https://www.github.com/yargs/yargs/commit/c71783a5a898a0c0e92ac501c939a3ec411ac0c1))
-   tweaks to API surface based on user feedback ([#1726](https://www.github.com/yargs/yargs/issues/1726)) ([4151fee](https://www.github.com/yargs/yargs/commit/4151fee4c33a97d26bc40de7e623e5b0eb87e9bb))
-   **usage:** single char aliases first in help ([#1574](https://www.github.com/yargs/yargs/issues/1574)) ([a552990](https://www.github.com/yargs/yargs/commit/a552990c120646c2d85a5c9b628e1ce92a68e797))

##### Bug Fixes

-   **yargs:** add missing command(module) signature ([#1707](https://www.github.com/yargs/yargs/issues/1707)) ([0f81024](https://www.github.com/yargs/yargs/commit/0f810245494ccf13a35b7786d021b30fc95ecad5)), closes [#1704](https://www.github.com/yargs/yargs/issues/1704)

</details>
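
For reference, a minimal sketch of the new yargs v16 entry points mentioned in the breaking changes above (the package now exports the yargs function by default, and getProcessArgvWithoutBin has become hideBin in yargs/helpers); treat it as an illustration rather than a migration guide:

const yargs = require('yargs/yargs');
const { hideBin } = require('yargs/helpers');

yargs(hideBin(process.argv))
  .command('serve [port]', 'start the server') // hypothetical command for illustration
  .strictOptions() // new in v16 (see Features above)
  .parse();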

---

### Renovate configuration

:date: **Schedule**: "after 9am and before 3pm" (UTC).

:vertical_traffic_light: **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

:recycle: **Rebasing**: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

:no_bell: **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [WhiteSource Renovate](https://renovate.whitesourcesoftware.com). View repository job log [here](https://app.renovatebot.com/dashboard#github/googleapis/nodejs-vision).
sofisl pushed a commit that referenced this pull request Jan 24, 2023
- [ ] Regenerate this pull request now.

PiperOrigin-RevId: 413453425

Source-Link: googleapis/googleapis@2b47b24

Source-Link: googleapis/googleapis-gen@7ffe6e0
Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiN2ZmZTZlMGExYmY2M2Q4NTQwMDA5Y2U2OTg2NjBlYmI3MWM1NGZmMSJ9

feat: add WEBM_OPUS codec
feat: add SpeechAdaptation configuration
feat: add word confidence
feat: add spoken punctuation and spoken emojis
feat: add hint boost in SpeechContext
sofisl pushed a commit that referenced this pull request Jan 25, 2023