Struct.UTFString.get() fails for UTF-16 #30
Comments
This should probably be using Java's charset logic to decode. Will investigate. |
Ahh I see, it's just looking for the nulls to peel them off. Will see what I can do. |
Ok, I understand now. getZeroTerminatedByteArray returns the bytes of a string without the null terminator. It does this by calling strlen on the given string address, then using that length to allocate and populate a Java byte array. strlen stops at the first \0 byte, so the result is truncated whenever the data contains an embedded null byte, which is always the case for UTF-16 text in the ASCII range, where every character has one zero byte. This is going to be a much more difficult fix, since the actual strlen call happens inside native code, and whenever we change native code we need to rebuild the native stubs across platforms. I'm also not sure that just changing strlen is the right fix: these functions have no way of knowing what encoding the bytes are in. Here's what I think we should do:
|
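To illustrate the truncation described in the comment above, here is a minimal pure-Java sketch. It does not touch jnr-ffi or native memory: byteStrlen is a hypothetical stand-in for the native strlen call, and a Java byte array stands in for the native buffer.

```java
import java.nio.charset.StandardCharsets;

final class SingleNullScanDemo {
    // Hypothetical stand-in for native strlen: counts bytes before the first 0x00 byte.
    static int byteStrlen(byte[] memory) {
        int n = 0;
        while (n < memory.length && memory[n] != 0) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // The same null-terminated text, encoded two ways.
        byte[] utf8 = "hello\0".getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = "hello\0".getBytes(StandardCharsets.UTF_16LE);

        // UTF-8: the only zero byte is the terminator, so the length is correct.
        System.out.println("UTF-8 length:    " + byteStrlen(utf8));
        // UTF-16LE: 'h' is encoded as 0x68 0x00, so the scan stops after one byte.
        System.out.println("UTF-16LE length: " + byteStrlen(utf16));
    }
}
```

The scan reports 5 bytes for the UTF-8 buffer but only 1 byte for the UTF-16LE buffer, which is too short to decode even the first character.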
My fix was as follows:
} |
@blschatz Possible for you to turn that into a pull request we can integrate? I'm not sure how you're using that within jnr-ffi and your own code (i.e. I'd like to see some examples and ideally tests in a PR). |
This fails due to the underlying call to IO.getZeroTerminatedByteArray, which should really be looking for double nulls, not single nulls, for wide Charsets.
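A rough sketch of what "looking for double nulls" could mean, written in plain Java over a byte array rather than native memory. The names terminatorWidth and terminatedLength are hypothetical and are not part of jnr-ffi; the charset-name check is an assumption made for illustration. The scan steps in units of the terminator width so that a zero byte pair straddling two characters is not mistaken for the terminator.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class WideTerminatorScan {
    // Hypothetical helper: width in bytes of the null terminator for a charset.
    // Assumption: only UTF-16 and UTF-32 variants are treated as wide.
    static int terminatorWidth(Charset cs) {
        String name = cs.name();
        if (name.startsWith("UTF-16")) return 2;
        if (name.startsWith("UTF-32")) return 4;
        return 1;
    }

    // Returns the number of bytes before the first aligned run of `width` zero bytes.
    // Falls back to the buffer length if no terminator is found; real native memory
    // has no such bound, so a native version would need a maximum length instead.
    static int terminatedLength(byte[] memory, int width) {
        for (int i = 0; i + width <= memory.length; i += width) {
            boolean allZero = true;
            for (int j = 0; j < width; j++) {
                if (memory[i + j] != 0) {
                    allZero = false;
                    break;
                }
            }
            if (allZero) return i;
        }
        return memory.length;
    }

    // Decodes a null-terminated string from the buffer using the given charset.
    static String decode(byte[] memory, Charset cs) {
        int len = terminatedLength(memory, terminatorWidth(cs));
        return new String(memory, 0, len, cs);
    }

    public static void main(String[] args) {
        byte[] utf16 = "hello\0".getBytes(StandardCharsets.UTF_16LE);
        System.out.println(decode(utf16, StandardCharsets.UTF_16LE));
    }
}
```

The aligned step matters: "a\u0100" in UTF-16LE is 61 00 00 01, which contains two adjacent zero bytes at an odd offset that an unaligned double-null search would wrongly treat as the end of the string.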