Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stream: convert string to Buffer when calling unshift(<string>) #27194

Merged
merged 1 commit into from
Jun 2, 2019
Merged

stream: convert string to Buffer when calling unshift(<string>) #27194

merged 1 commit into from
Jun 2, 2019

Conversation

marcosc90
Copy link
Contributor

@marcosc90 marcosc90 commented Apr 11, 2019

readable.unshift can take a string as an argument, but that
string wasn't being converted to a Buffer, which caused a
TypeError: Argument must be a buffer in some cases.
A second optional argument encoding was added to unshift to
prevent all strings being coerced to utf8.

Fixes: #27192

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

A documentation change will be needed, for the encoding argument, but wanted to submit the PR first to see if it will get approved. I will also make a few tests for this if everything is okay.

@nodejs-github-bot nodejs-github-bot added the stream Issues and PRs related to the stream subsystem. label Apr 11, 2019
@Trott
Copy link
Member

Trott commented Apr 11, 2019

Can you add a test for this?

@marcosc90
Copy link
Contributor Author

@Trott added multiple cases, if something is missing or needs to be changed let me know.

@Trott
Copy link
Member

Trott commented Apr 12, 2019

Thanks! Only other thing to add that I can see is the encoding argument to the documentation in doc/api/stream.md.

R: @nodejs/streams

@mcollina mcollina added the semver-minor PRs that contain new features and should be released in the next minor version. label Apr 12, 2019
@mcollina
Copy link
Member

I think this is semver-minor as it's adding a new parameter.

CI: https://ci.nodejs.org/job/node-test-pull-request/22398/
CITGM: https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/1821/

@nodejs-github-bot
Copy link
Collaborator

Copy link
Member

@addaleax addaleax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add a test for how this interacts with .setEncoding()?

@marcosc90
Copy link
Contributor Author

Can you also add a test for how this interacts with .setEncoding()?

I'll submit one in a few. Forgot about that scenario, thanks!

@marcosc90
Copy link
Contributor Author

marcosc90 commented Apr 13, 2019

@addaleax When doing that test, I remember that when state.decoder is set, the chunks are not converted to Buffer. The bug reported in #27192 won't happen, because when state.decoder is set, buffer.concat won't be used.

I don't think this PR should change that, I'm not even sure if it should be saved as a Buffer in that case, even though it's called BufferList, but I'm not very familiar with string decoder.

So I can test that the behaviour is unchanged, meaning that they're saved as string, and that data is emitted with the encoding set in .setEncoding.

Is that ok?

@marcosc90
Copy link
Contributor Author

marcosc90 commented Apr 13, 2019

There is one change left, but would like some feedback.

https://github.com/nodejs/node/blob/13492ba631a0a4aac0e34adcf0f6cfa3cf4e19bd/lib/_stream_readable.js#L229-L237

When .setEncoding is set and a chunk is .pushed, the convertion to that encoding is handled by the state.decoder, of course I can't use that same state.decoder for .unshifting otherwise it will mess with the internal buffer.

But in order to keep the same behaviour, I have to convert the pushed string, to the encoding defined by setEncoding, so the solution that I have right now working is the following:

if (addToFront && state.encoding && state.encoding !== encoding) {
  chunk = Buffer.from(chunk, encoding).toString(state.encoding);
} else if (encoding !== state.encoding) {
  chunk = Buffer.from(chunk, encoding);
  encoding = '';
}

Because the string must be saved in BufferList in the same encoding as setEncoding, so then is emitted to data that way.

Thoughts? Maybe someone has another alternative.

@marcosc90
Copy link
Contributor Author

Is there something else that should be added/changed?

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ve left some notes related to the tests. I think we should remove the usage of private API as much as possible - the majority of this feature should be testable without it.

assert.strictEqual(
readable._readableState.buffer.head.data,
chunk
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test can be rewritten without using internals, it should consume the stream and check that the chunks comes out in order.

test/parallel/test-stream-readable-unshift.js Outdated Show resolved Hide resolved
assert.strictEqual(
readable._readableState.buffer.head.data.toString('utf8'),
string
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be tested by consuming the stream and verifying that the data comes out in order.

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nodejs-github-bot

This comment has been minimized.

@Trott Trott added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Jun 2, 2019
@nodejs-github-bot
Copy link
Collaborator

`readable.unshift` can take a string as an argument, but that
string wasn't being converted to a Buffer, which caused a
<TypeError: Argument must be a buffer> in some cases. Also if a
string was passed, that string was coerced to utf8 encoding.

A second optional argument `encoding` was added to `unshift` to
fix the encoding issue.

Fixes: #27192
PR-URL: #27194
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
@Trott
Copy link
Member

Trott commented Jun 2, 2019

Landed in df339bc

@Trott Trott merged commit df339bc into nodejs:master Jun 2, 2019
targos pushed a commit that referenced this pull request Jun 3, 2019
`readable.unshift` can take a string as an argument, but that
string wasn't being converted to a Buffer, which caused a
<TypeError: Argument must be a buffer> in some cases. Also if a
string was passed, that string was coerced to utf8 encoding.

A second optional argument `encoding` was added to `unshift` to
fix the encoding issue.

Fixes: #27192
PR-URL: #27194
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
@targos targos mentioned this pull request Jun 3, 2019
targos added a commit that referenced this pull request Jun 3, 2019
Notable changes:

* doc:
  * The JSON variant of the API documentation is no longer experimental
    (Rich Trott) #27842.
* esm:
  * JSON module support is always enabled under
    `--experimental-modules`. The `--experimental-json-modules` flag
    has been removed (Myles Borins)
    #27752.
* http,http2:
  * A new flag has been added for overriding the default HTTP server
    socket timeout (which is two minutes). Pass
    `--http-server-default-timeout=milliseconds`
    or `--http-server-default-timeout=0` to respectively change or
    disable the timeout. Starting with Node.js 13.0.0, the timeout will
    be disabled by default
    (Ali Ijaz Sheikh) #27704.
* inspector:
  * Added an experimental `--heap-prof` flag to start the V8 heap
    profiler on startup and write the heap profile to disk before exit
    (Joyee Cheung) #27596.
* stream:
  * The `readable.unshift()` method now correctly converts strings to
    buffers. Additionally, a new optional argument is accepted to
    specify the string's encoding, such as `'utf8'` or `'ascii'`
    (Marcos Casagrande) #27194.
* v8:
  * The object returned by `v8.getHeapStatistics()` has two new
    properties: `number_of_native_contexts` and
    `number_of_detached_contexts` (Yuriy Vasiyarov)
    #27933.

PR-URL: #28040
pull bot pushed a commit to Rachelmorrell/node that referenced this pull request Jun 4, 2019
Notable changes:

* doc:
  * The JSON variant of the API documentation is no longer experimental
    (Rich Trott) nodejs#27842.
* esm:
  * JSON module support is always enabled under
    `--experimental-modules`. The `--experimental-json-modules` flag
    has been removed (Myles Borins)
    nodejs#27752.
* http,http2:
  * A new flag has been added for overriding the default HTTP server
    socket timeout (which is two minutes). Pass
    `--http-server-default-timeout=milliseconds`
    or `--http-server-default-timeout=0` to respectively change or
    disable the timeout. Starting with Node.js 13.0.0, the timeout will
    be disabled by default
    (Ali Ijaz Sheikh) nodejs#27704.
* inspector:
  * Added an experimental `--heap-prof` flag to start the V8 heap
    profiler on startup and write the heap profile to disk before exit
    (Joyee Cheung) nodejs#27596.
* stream:
  * The `readable.unshift()` method now correctly converts strings to
    buffers. Additionally, a new optional argument is accepted to
    specify the string's encoding, such as `'utf8'` or `'ascii'`
    (Marcos Casagrande) nodejs#27194.
* v8:
  * The object returned by `v8.getHeapStatistics()` has two new
    properties: `number_of_native_contexts` and
    `number_of_detached_contexts` (Yuriy Vasiyarov)
    nodejs#27933.

PR-URL: nodejs#28040
@BethGriggs
Copy link
Member

Should this land on v10.x? //cc @mcollina

@mcollina
Copy link
Member

mcollina commented Sep 4, 2019

Yes please!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. semver-minor PRs that contain new features and should be released in the next minor version. stream Issues and PRs related to the stream subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

stream.unshift - TypeError: Argument must be a buffer
8 participants