-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed UTF8/16 converters to support empty string input #676
Conversation
Previously an exception was thrown since Win32 WideCharToMultiByte API returns error on zero-length input.
|
Sure, thanks! My apologies, I ported it from v4.1.0 based branch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just a few minor comments.
@@ -280,9 +280,16 @@ const uint64_t internal::basic_data<T>::POWERS_OF_10_64[] = { | |||
|
|||
FMT_FUNC internal::utf8_to_utf16::utf8_to_utf16(string_view s) { | |||
static const char ERROR_MSG[] = "cannot convert string from UTF-8 to UTF-16"; | |||
if (s.size() > INT_MAX) | |||
if (s.size() > INT_MAX || s.data() == nullptr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why add a nullptr check? Won't MultiByteToWideChar
return an error in this case already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but without this check the function will now succeed for invalid input of nullptr
and 0
length (because the size check precedes the call to MultiByteToWideChar
). Clearly this is not the best behavior. Moreover, it fails some tests (UTF16ToUTF8Error/UTF8ToUTF16Error/UTF16ToUTF8Convert).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is nullptr
and 0
invalid? If size is zero than the data can be anything including nullptr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, this is a conceptual discussion, and probably a matter of definition and taste... Personally, I think that nullptr
is a special case reserved for "invalid data" and this fact should not be silently discarded. I don't see significant advantages to the other approach, but if you prefer it, the code can be easily changed by reverting this check.
However, please note that the above mentioned tests assume that (nullptr, 0) is invalid input. So, if we decide that it becomes valid, we need to redefine the tests logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Empty string view has the null data pointer (https://godbolt.org/g/JtWcst), so I suggest removing the nullptr
check here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, good point :-)
Thanks!
include/fmt/format.cc
Outdated
buffer_.resize(1); | ||
buffer_[0] = 0; | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please follow the Google style guide (2-space indent and open brace on the same line:
if (s_size == 0) { // MultiByteToWideChar does not support zero length, handle separately
buffer_.resize(1);
...
Same below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Will fix.
Shall I submit a new (v2) pull-request, to reduce the fix commits clutter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, they can be squashed together before merging.
Merged with minor changes in acb469a, thanks! |
@vgalka-sl, could you please review the updated CONTIBUTING document, particularly the part about licensing, and let me know if you agree with it being applied to your contributions to {fmt}? The library is likely to be relicensed (#1073) so I'm collecting approval from all earlier contributors. Thanks! |
Previously an exception was thrown since Win32 WideCharToMultiByte() API returns error on zero-length input.