fix asfvideo unicode handling #2581
Yeah. This PR works great locally. Honestly, I'm not sure how to fix it.
It seems like it might be more expensive performance-wise to do the conversion twice?
Right, if using iconv. No difference on Windows.
src/helper_functions.cpp
@@ -53,8 +39,11 @@ std::string readStringWcharTag(const BasicIo::UniquePtr& io, size_t length) {
Internal::enforce(length <= io->size() - io->tell(), Exiv2::ErrorCode::kerCorruptedMetadata);
DataBuf FieldBuf(length + 1);
io->readOrThrow(FieldBuf.data(), length, ErrorCode::kerFailedToReadImageData);
std::u16string wst(FieldBuf.begin(), FieldBuf.end());
return utf16ToUtf8(wst);
std::string wst(FieldBuf.begin(), FieldBuf.end() - 1);
I can reproduce locally.
When opening the file exiv2/test/tmp/sample_960x540.asf.out, we can see a NUL (^@) character at the end of every converted wstring.
Perhaps we should not convert the last 2 bytes: std::string wst(FieldBuf.begin(), FieldBuf.end() - 2);
I made this change locally and the tests pass.
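The trailing NUL described above can be sketched in isolation. The following is a minimal, standalone illustration, not the code in this PR: `stripUtf16Terminator` is a hypothetical helper that removes one trailing 16-bit NUL code unit from a raw UTF-16LE byte string. It assumes the input holds whole 16-bit units and leaves anything else untouched, so that no code unit is cut in half.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Exiv2): drop one trailing UTF-16 NUL
// code unit (two zero bytes) from a raw UTF-16LE byte string, if present.
std::string stripUtf16Terminator(std::string bytes) {
  const std::size_t n = bytes.size();
  // Only touch input made of whole 16-bit units that ends in 00 00.
  if (n >= 2 && n % 2 == 0 && bytes[n - 2] == '\0' && bytes[n - 1] == '\0') {
    bytes.resize(n - 2);
  }
  return bytes;
}
```

For example, the six bytes 48 00 69 00 00 00 ("Hi" plus a terminator) would be trimmed to the four bytes 48 00 69 00 before conversion, while a string without a terminator passes through unchanged.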
AFAIK this will break UTF-8 conversion. Unfortunately, I'm away from my computer for ~2 months so I can't test properly. I don't think the current tests have UTF8 data.
Is this fixed now? I haven't been following what's been happening, but I see that all the checks are passing and the code changes look good to me, so I'm happy to approve this if everything's working.
Might as well merge. If it's not working properly with UTF8 data, it's at least better than before.
@@ -53,8 +39,11 @@ std::string readStringWcharTag(const BasicIo::UniquePtr& io, size_t length) {
Internal::enforce(length <= io->size() - io->tell(), Exiv2::ErrorCode::kerCorruptedMetadata);
DataBuf FieldBuf(length + 1);
We can take advantage of this pull request to delete the useless reserved extra byte: DataBuf FieldBuf(length);
Same thing in line 51.
Seems to cause memory errors.
b83a3f5 to 8d496d0
Codecov Report
@@            Coverage Diff             @@
##             main    #2581      +/-   ##
==========================================
- Coverage   63.92%   63.91%   -0.01%
==========================================
  Files         103      103
  Lines       22311    22309       -2
  Branches    10795    10795
==========================================
- Hits        14262    14259       -3
  Misses       5828     5828
- Partials     2221     2222       +1
Use convertStringCharset to convert instead of reimplementing.
Some data is UTF-32 and other is UTF-16. Instead of implementing another function for Windows, convert from UCS2-LE to UTF-8 twice.
Signed-off-by: Rosen Penev <[email protected]>
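For readers unfamiliar with the kind of conversion this commit delegates to `convertStringCharset`, here is a minimal, standalone sketch of decoding UCS-2LE bytes into UTF-8. It is an illustration only, not Exiv2's implementation: `ucs2leToUtf8` is a hypothetical name, it covers only the Basic Multilingual Plane (surrogate pairs are not combined), and it silently ignores a dangling odd byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of Exiv2): decode UCS-2LE bytes to UTF-8.
// Assumption: input is BMP-only; surrogate pairs are not handled.
std::string ucs2leToUtf8(const std::string& in) {
  std::string out;
  // Consume two bytes per code unit; a trailing odd byte is ignored.
  for (std::size_t i = 0; i + 1 < in.size(); i += 2) {
    const auto lo = static_cast<std::uint8_t>(in[i]);
    const auto hi = static_cast<std::uint8_t>(in[i + 1]);
    // Little-endian: low byte first, high byte second.
    const std::uint32_t cp = static_cast<std::uint32_t>(lo) | (static_cast<std::uint32_t>(hi) << 8);
    if (cp < 0x80) {
      // 1-byte UTF-8 sequence (ASCII, including NUL).
      out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
      // 2-byte UTF-8 sequence.
      out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
      out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
      // 3-byte UTF-8 sequence.
      out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
      out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
      out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
  }
  return out;
}
```

Note that a 16-bit NUL terminator in the input decodes to a single NUL byte in the output, which matches the stray ^@ reported in the review. Non-ASCII input such as U+00E9 or U+20AC is the kind of data a regression test for this change would want to include.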