Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More useful error than toString failed when Buffer length greater than kMaxLength? #3175

Closed
mhart opened this issue Oct 4, 2015 · 46 comments
Labels
buffer Issues and PRs related to the buffer subsystem. help wanted Issues that need assistance from volunteers or PRs that need help to proceed.

Comments

@mhart
Copy link
Contributor

mhart commented Oct 4, 2015

I'm assuming there's some sort of memory constraint trying to convert buffers into strings?

$ uname -a
Darwin ReBuke-Pro.local 15.0.0 Darwin Kernel Version 15.0.0: Wed Aug 26 16:57:32 PDT 2015; root:xnu-3247.1.106~1/RELEASE_X86_64 x86_64
$ node --version
v4.1.1
$ node -p 'new Buffer(268435440).toString().length'
268435440

But:

$ node -p 'new Buffer(268435441).toString().length'
buffer.js:378
    throw new Error('toString failed');
    ^

Error: toString failed
    at Buffer.toString (buffer.js:378:11)
    at [eval]:1:23
    at Object.exports.runInThisContext (vm.js:54:17)
    at Object.<anonymous> ([eval]-wrapper:6:22)
    at Module._compile (module.js:434:26)
    at node.js:566:27
    at doNTCallback0 (node.js:407:9)
    at process._tickCallback (node.js:336:13)

Firstly, I'm not sure that this should even be an error (are we really only limited to 256MB? That's... not very much). This seems to have been addressed in the past, but I feel like a regression might have occurred: #1374 (comment)

But that aside, can we actually detect what the error is and report on it in a more detailed way? The current toString check provides little insight.

@bnoordhuis
Copy link
Member

256 MB is the limit for strings in V8 so no, it's not a regression. What would you report in more detail?

@mhart
Copy link
Contributor Author

mhart commented Oct 4, 2015

Anything would be better than that TBH.

@mhart
Copy link
Contributor Author

mhart commented Oct 4, 2015

It's not obvious to the user what's actually gone wrong.

How about "Cannot create string larger than 256MB"? Seems a lot clearer to me.

@bnoordhuis
Copy link
Member

I think a pull request to that effect would be acceptable.

@mhart
Copy link
Contributor Author

mhart commented Oct 4, 2015

But is that the only error that can occur? Looking at the code at the moment, the check is pretty basic – is there no way of getting more detail as to why result may be undefined?

@bnoordhuis
Copy link
Member

Not at the moment, no. The undefined value trickles down from StringBytes::Encode() in src/string_bytes.h. I don't think there are other code paths that return an empty Local<String> but I didn't check exhaustively.

@thefourtheye thefourtheye added the buffer Issues and PRs related to the buffer subsystem. label Oct 4, 2015
@mhart
Copy link
Contributor Author

mhart commented Oct 4, 2015

FWIW, v8 (I assume?) spits out a slightly better error when trying to concat strings:

$ node -p 'var a = "a"; for (var i = 0; i < 27; i++) a = a + a; a.length'
134217728
$ node -p 'var a = "a"; for (var i = 0; i < 28; i++) a = a + a; a.length'
[eval]:1
var a = "a"; for (var i = 0; i < 28; i++) a = a + a; a.length
                                                ^

RangeError: Invalid string length
    at [eval]:1:49
    at Object.exports.runInThisContext (vm.js:54:17)
    at Object.<anonymous> ([eval]-wrapper:6:22)
    at Module._compile (module.js:434:26)
    at node.js:566:27
    at doNTCallback0 (node.js:407:9)
    at process._tickCallback (node.js:336:13)

Is the size limit exposed at all? If it were I'd be happy to add a PR that checks, in the case of an undefined result, whether the Buffer length is greater than it.

@targos
Copy link
Member

targos commented Oct 4, 2015

I don't think it is exposed in JS. The exact limit is (1 << 28) - 16 (268435440) and is defined in

static const int kMaxLength = (1 << 28) - 16;

@mhart mhart changed the title More useful error than toString failed? More useful error than toString failed when Buffer length greater than kMaxLength? Oct 4, 2015
@MylesBorins
Copy link
Contributor

I would love to take a stab at this if people are open to it.

/cc @jasnell

@jasnell
Copy link
Member

jasnell commented Oct 6, 2015

Works for me!
On Oct 6, 2015 12:49 PM, "Myles Borins" [email protected] wrote:

I would love to take a stab at this if people are open to it.

/cc @jasnell https://github.com/jasnell


Reply to this email directly or view it on GitHub
#3175 (comment).

@trevnorris
Copy link
Contributor

trevnorris commented Oct 6, 2015

@MylesBorins Seems the discussion addresses more than just the error message. Mind identifying the specifics of what you'll be working on?

@MylesBorins
Copy link
Contributor

I specifically was interested in a better error message when toString is called on a Buffer larger than kStringMaxLength

That seemed like a pretty easy one to knock out (I've basically got it done already).

@mhart
Copy link
Contributor Author

mhart commented Oct 6, 2015

@MylesBorins I think the discussion was around how to detect the limit though – have you exposed kMaxLength?

@mhart
Copy link
Contributor Author

mhart commented Oct 6, 2015

I mean, the code comments in v8.h for NewFromUtf8, NewFromOneByte and NewFromTwoByte do say "Only returns an empty value when length > kMaxLength" so perhaps as @bnoordhuis suggested it's the only reason for this particular code path... Are the comments enough to go on?

@MylesBorins
Copy link
Contributor

I take back what I said about having this done. I was hard coding the size, which in retrospect is not robust enough.

@trevnorris
Copy link
Contributor

trevnorris commented Oct 6, 2015

@mhart I don't suggest attempting to pre-detect the limit. Let v8 try and it will return whether it could or not. It's both more future proof and allows limitations on system resources to also return undefined.

@MylesBorins Improving the error message is a good place to start.

@mhart
Copy link
Contributor Author

mhart commented Oct 6, 2015

@trevnorris not sure what you mean? I wasn't suggesting to pre-detect the limit.

@mhart
Copy link
Contributor Author

mhart commented Oct 6, 2015

Although I don't exactly know what pre-detect means 😸

@trevnorris
Copy link
Contributor

@mhart

I think the discussion was around how to detect the limit though

This made me believe that you intended on detecting the kMaxStringLength from the string stored in the buffer. In the utf-8 case this isn't trivial. Also, it's an optimization for the uncommon path. Allow the string to be written immediately then handle it on failure. Don't attempt to error early.

@mhart
Copy link
Contributor Author

mhart commented Oct 6, 2015

Yes, I was talking about handling it on failure: #3175 (comment)

@mhart
Copy link
Contributor Author

mhart commented Oct 6, 2015

detecting the kMaxStringLength from the string

No no – I was talking about kMaxLength that's defined in v8.h – so if the result comes back as undefined, then check if the reason it's undefined is because the buffer length is greater than kMaxLength – then we can give the user a much nicer error, including what the actual current limit is.

@trevnorris
Copy link
Contributor

@mhart v8::String::kMaxLength is process.binding('buffer').kStringMaxLength. It's been exposed that way for testing. Should be able to use that for your error message.

EDIT: Sorry. I've been typing kMaxStringLength instead of kStringMaxLength.

For reference, here's where it's exposed: https://github.com/nodejs/node/blob/v4.1.2/src/node_buffer.cc#L979-L981

@mhart
Copy link
Contributor Author

mhart commented Oct 6, 2015

Oh cool! So I guess the only trickiness comes with multi-byte strings? I mean if Buffer.length > kMaxStringLength then it's not necessarily the case that string.length > kMaxStringLength – but... we could just add that to the error message anyway – in a way that doesn't necessarily imply it was the definitive cause...?

"toString failed – maximum string length is 268435440" or similar?

@trevnorris
Copy link
Contributor

trevnorris commented Oct 6, 2015

@mhart Sorry. It's actually a bit deeper than that. If we want to do it properly then we'll have to detect if the return value from StringBytes::Encode(), as used in src/node_buffer.cc, is empty. If it is then we can throw a specific error message. Though that will require converting all of StringBytes::Encode() to use the MaybeLocal<> v8 API so we can catch if the return value is empty (the current API used will throw immediately).

@MylesBorins How are you approaching the problem?

@mhart
Copy link
Contributor Author

mhart commented Oct 6, 2015

@trevnorris Ah, I assumed "empty" in StringBytes::Encode -> "undefined" in JS. So I assumed we could just do the check in JS after an undefined return value.

@mhart
Copy link
Contributor Author

mhart commented Oct 6, 2015

ie:

if (result === undefined) {
  var kMaxStringLength = process.binding('buffer').kMaxStringLength;
  if (this.length > kMaxStringLength)
    throw new Error('toString failed – maximum string length is ' + kMaxStringLength);
  else
    throw new Error('toString failed');
}

@mhart
Copy link
Contributor Author

mhart commented Oct 6, 2015

(that said, it does look like the Local<String> versions have all been deprecated by the MaybeLocal<String> versions – so moving to MaybeLocal is something that'll need to be done at some point anyway – not sure if that's orthogonal to the discussion here though)

@trevnorris
Copy link
Contributor

Hah. Totally missed how Buffer#toString() checks result === undefined.

With the kStringMaxLength check you need to do the following:

if ((encoding === 'ucs2' || encoding === 'ucs-2' ||
    encoding === 'utf16le' || encoding === 'utf-16le') &&
    this.length > (kStringMaxLength / 2))
  throw new Error('toString failed - buffer larger than maximum string size');

if ((encoding === 'utf8' || encoding === 'utf-8') &&
    this.length > kStringMaxLength)
  throw new Error('we don\'t really know this string size...');

if (this.length > kStringMaxLength)
  throw new Error('assuming using a one byte encoding here');

EDIT: This is approximate and messy. Which is why simply checking if the C++ value is empty or not is the more sure way.

@Trott
Copy link
Member

Trott commented Mar 23, 2017

For what it's worth, the error message here has been marginally improved to Error: "toString()" failed (although I'm not sure why the quotation marks are used--that seems peculiar--but the addition of parentheses are a marginal improvement).

And, of course, it still doesn't indicate why it failed though.

addaleax added a commit to addaleax/node that referenced this issue May 1, 2017
- Return `MaybeLocal`s from `StringBytes::Encode`
- Add an `error` out parameter to pass JS exceptions to the callers
  (instead of directly throwing)
- Simplify some of the string generation methods in `string_bytes.cc`
  by unifying the `EXTERN_APEX` logic
- Reduce usage of deprecated V8 APIs.
- Remove error handling logic from JS, the `buffer.*Slice()` methods
  now throw errors themselves.
- Left TODO comments for future semver-major error message
  improvements.

This paves the way for better error messages coming out of the
StringBytes methods.

Ref: nodejs#3175
addaleax added a commit that referenced this issue May 3, 2017
- Return `MaybeLocal`s from `StringBytes::Encode`
- Add an `error` out parameter to pass JS exceptions to the callers
  (instead of directly throwing)
- Simplify some of the string generation methods in `string_bytes.cc`
  by unifying the `EXTERN_APEX` logic
- Reduce usage of deprecated V8 APIs.
- Remove error handling logic from JS, the `buffer.*Slice()` methods
  now throw errors themselves.
- Left TODO comments for future semver-major error message
  improvements.

This paves the way for better error messages coming out of the
StringBytes methods.

Ref: #3175
PR-URL: #12765
Reviewed-By: Benjamin Gruenbaum <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Tobias Nießen <[email protected]>
anchnk pushed a commit to anchnk/node that referenced this issue May 6, 2017
- Return `MaybeLocal`s from `StringBytes::Encode`
- Add an `error` out parameter to pass JS exceptions to the callers
  (instead of directly throwing)
- Simplify some of the string generation methods in `string_bytes.cc`
  by unifying the `EXTERN_APEX` logic
- Reduce usage of deprecated V8 APIs.
- Remove error handling logic from JS, the `buffer.*Slice()` methods
  now throw errors themselves.
- Left TODO comments for future semver-major error message
  improvements.

This paves the way for better error messages coming out of the
StringBytes methods.

Ref: nodejs#3175
PR-URL: nodejs#12765
Reviewed-By: Benjamin Gruenbaum <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Tobias Nießen <[email protected]>
@justinjdickow
Copy link

I... don't understand why the limit is 256 MB :(

@mhart
Copy link
Contributor Author

mhart commented Jul 9, 2017

There's an issue open on v8 too, if you'd like to star it: https://bugs.chromium.org/p/v8/issues/detail?id=6148

@justinjdickow
Copy link

Can we rename the issue from "String length limit is small-ish" to "String length limit is small-af"

@mhart
Copy link
Contributor Author

mhart commented Jul 24, 2017

Great news! v8 has upped the limit to ~1GB on 64-bit archs: https://chromium-review.googlesource.com/c/570047

Hopefully it comes with a nicer error message when it fails too (although that's possibly still a Node.js responsibility?)

@joyeecheung
Copy link
Member

After 63eb267 this now returns a message Cannot create a string larger than 0x3fffffe7 bytes with the error code set to ERR_STRING_TOO_LARGE.

@soichih
Copy link

soichih commented Jan 7, 2019

I am running into this issue when I try to download .tar file via request and piping it to tar.x(). Does this issue imply that the maximum body size that I can download via request.get() is 1G for 64-bit machine?

@soichih
Copy link

soichih commented Jan 7, 2019

nvm.. I just had to set "encoding: null" ... which prevents the body to be passed to toString()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem. help wanted Issues that need assistance from volunteers or PRs that need help to proceed.
Projects
None yet
Development

Successfully merging a pull request may close this issue.