Fix handling of broken UTF-8 data #180
Comments
robUx4 added the bug and api-break labels on Dec 23, 2023.
(api-break: breaks the API, e.g. programs using it will have to adjust their source code)
For example, these lines crash libebml with an unhandled exception: the utf8 library throws a …

Also, the internal storage should be UTF-8, so that Unicode EBML elements are not interpreted when they are copied: bogus data in, bogus data out, with the same size.
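To illustrate the "bogus data in, bogus data out" idea, here is a minimal sketch of an element that keeps the payload as raw bytes and never decodes it on copy. `RawUnicodeElement`, `ReadRaw`, `Raw` and `CopyElement` are hypothetical names made up for this example; they are not libebml's actual classes or API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a Unicode EBML element; NOT libebml's real class.
class RawUnicodeElement {
public:
    // Keep the bytes exactly as they were read; no decoding happens here.
    void ReadRaw(const char* data, std::size_t size) { bytes_.assign(data, size); }

    // Bytes to write back out: identical to what was read.
    const std::string& Raw() const { return bytes_; }
    std::size_t Size() const { return bytes_.size(); }

private:
    std::string bytes_;  // internal storage is the untouched UTF-8 buffer
};

// Copying uses the implicit copy constructor, so the payload is never interpreted.
inline RawUnicodeElement CopyElement(const RawUnicodeElement& src) { return src; }

// Small demonstration with a deliberately malformed buffer.
inline void RoundTripDemo() {
    const char bogus[] = {'\xC3', '\x28', '\xFF'};  // not valid UTF-8
    RawUnicodeElement in;
    in.ReadRaw(bogus, sizeof(bogus));
    const RawUnicodeElement out = CopyElement(in);
    assert(out.Size() == sizeof(bogus));  // same size
    assert(out.Raw() == in.Raw());        // same bytes
}
```

Decoding to a wide or UTF-32 string would then only happen when a caller explicitly asks for it, which is where a conversion error can be surfaced.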
robUx4 added a commit to robUx4/libebml that referenced this issue on Dec 26, 2023:
If the string buffer comes from an EBML file we should be able to copy the same buffer without interpreting it, even if it's bogus. It may be intentional. Fixes Matroska-Org#180
robUx4 added a commit that referenced this issue on Dec 27, 2023:
If the string buffer comes from an EBML file we should be able to copy the same buffer without interpreting it, even if it's bogus. It may be intentional. Fixes #180
Right now `UTFstring` doesn't keep or check the validity of its data. It discards `invalid_code_point` and `invalid_utf8` exceptions.

When reading, this means the string might not be the one that was expected, or the file is broken. We should be able to know whether we can trust the data read/converted.

When writing, it means the given Unicode string cannot be converted to UTF-8 for some reason. We won't be able to store it accurately in EBML, so we should at least know that when calling `UTFstring::SetUTF8()`.
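One way to let callers know whether the data can be trusted is to have the setter report validity instead of swallowing the error. The sketch below is only an illustration under assumptions: `IsValidUtf8` and `CheckedUtf8` are hypothetical names, the validator is hand-written in standard C++ rather than taken from the utf8 library libebml uses, and the `bool` return from `SetUTF8` is a proposed shape, not the current signature.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of libebml): true when `s` is well-formed
// UTF-8, i.e. no truncated sequences, overlong forms, surrogates, or code
// points above U+10FFFF.
inline bool IsValidUtf8(const std::string& s) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(s.data());
    const std::size_t n = s.size();
    static const unsigned long kMin[] = {0, 0, 0x80, 0x800, 0x10000};
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b = p[i];
        std::size_t len = 0;
        unsigned long cp = 0;
        if (b < 0x80)                { len = 1; cp = b; }
        else if ((b & 0xE0) == 0xC0) { len = 2; cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { len = 3; cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { len = 4; cp = b & 0x07; }
        else return false;                       // stray continuation or bad lead byte
        if (len > n - i) return false;           // sequence truncated at end of buffer
        for (std::size_t k = 1; k < len; ++k) {
            if ((p[i + k] & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (p[i + k] & 0x3F);
        }
        if (cp < kMin[len]) return false;        // overlong encoding
        if (cp > 0x10FFFF) return false;         // beyond the Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
        i += len;
    }
    return true;
}

// Hypothetical string wrapper: stores the bytes unchanged and remembers
// whether they were valid, so the caller can decide what to do.
class CheckedUtf8 {
public:
    bool SetUTF8(const std::string& bytes) {
        raw_ = bytes;                 // bogus input is kept as-is
        valid_ = IsValidUtf8(raw_);
        return valid_;
    }
    bool IsValid() const { return valid_; }
    const std::string& GetUTF8() const { return raw_; }

private:
    std::string raw_;
    bool valid_ = true;
};

// Small demonstration of the reported validity.
inline void ValidityDemo() {
    CheckedUtf8 s;
    assert(s.SetUTF8("caf\xC3\xA9"));   // well-formed: "café"
    assert(!s.SetUTF8("\xC3\x28"));     // lead byte followed by a non-continuation
    assert(s.GetUTF8().size() == 2);    // the bytes are still stored untouched
}
```

Returning a flag keeps the failure visible without throwing across the API boundary; whether libebml should return a `bool`, expose a separate validity query, or rethrow is a design decision for the maintainers.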