Consider removing napi_get_value_string_length #226
Comments
You are probably right. It is better to err on the side of caution. Character encoding should be added to the list of 2 hard problems in CS, which would now be the 3 hard problems:
And in case anyone is wondering, the JavaScript
@jasongin Personally I think the byte length would be the most useful b/c that's what's needed if you want to copy out the string. For that you could just copy the method used by
@trevnorris There are other APIs to get byte length, that is why this API should just be removed.
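To illustrate why a byte length only makes sense once an encoding is chosen, here is a minimal sketch from the JavaScript side, assuming a Node.js runtime. The helper name `encodedLengths` is hypothetical and not part of N-API; it only uses `Buffer.byteLength`.

```javascript
// Hypothetical helper (not part of N-API): reports how many bytes a string
// occupies in several encodings, using Node.js's Buffer.byteLength.
function encodedLengths(str) {
  return {
    utf8: Buffer.byteLength(str, 'utf8'),
    utf16le: Buffer.byteLength(str, 'utf16le'),
    latin1: Buffer.byteLength(str, 'latin1'),
  };
}

// "héllo": five UTF-16 code units, but "é" takes two bytes in UTF-8.
const sizes = encodedLengths('h\u00e9llo');
console.log(sizes.utf8);    // 6
console.log(sizes.utf16le); // 10
console.log(sizes.latin1);  // 5
```

The same string yields three different sizes, which is why a single encoding-independent length is of little use when sizing a buffer to copy the string into.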
I'm +1 for removing. It's safer to start with a smaller API and add back than to be stuck with methods that cause confusion and are not really needed.
This API doesn't serve much purpose, and is only likely to cause confusion and bugs. The intention was that this would return the number of characters in a string independent of encoding, but that's not generally useful. In almost all cases, one of the encoding-specific napi_get_value_string_* APIs is more correct. (Pass a null buffer if only the encoded length is desired.) Anyway the current implementation of napi_get_value_string_length() is technically wrong: it returns the number of 2-byte code units of the UTF-16 encoding, but there are actually some characters that are encoded as two UTF-16 code units. Note the JavaScript String.prototype.length property returns the number of UTF-16 code units, which may be different from the number of characters. So, getting the true character count is not common with JavaScript, and is probably best left to specialized internationalization libraries. Fixes: nodejs/abi-stable-node#226
PR-URL: nodejs#12496
Backport-PR-URL: #19447
Reviewed-By: Michael Dawson <[email protected]>
Reviewed-By: Jeremiah Senkpiel <[email protected]>
Reviewed-By: James M Snell <[email protected]>
I regret adding this API now, because it doesn't serve much purpose and is only likely to cause confusion and bugs. The intention was that this would return the number of "characters" in a string independent of encoding, but that's not generally useful. In almost all cases, one of the encoding-specific napi_get_value_string_* APIs is more correct. (Pass a null buffer if only the encoded length is desired.)

Anyway, the current V8 implementation of napi_get_value_string_length() is technically wrong: it returns the number of 2-byte code units of the UTF-16 encoding, but some characters are encoded as two UTF-16 code units (a surrogate pair).
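The gap between UTF-16 code units and characters can be seen from JavaScript directly. The following is a rough sketch, assuming an ES2015+ runtime; `codePointCount` is a hypothetical helper written for illustration, not an existing API.

```javascript
// Hypothetical helper: counts Unicode code points by iterating the string.
// The string iterator yields a surrogate pair as a single element.
function codePointCount(str) {
  let count = 0;
  for (const _ of str) count += 1;
  return count;
}

// "a" followed by U+1F600, which UTF-16 encodes as a surrogate pair.
const s = 'a\u{1F600}';
console.log(s.length);          // 3 (UTF-16 code units)
console.log(codePointCount(s)); // 2 (code points)
```

Even the code point count is not always what a user perceives as the number of characters, since combining sequences span several code points. That is the part best left to internationalization libraries.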