Broken null-terminated multibyte string handling when JVM is non-US defaultCharset
#108
Comments
As I found out, the bug with incorrect handling of null-terminated multibyte strings (NTMBS) was introduced in a quite old commit. That commit introduced two things:
The second part is fragile for both NTMBS and null-terminated wide-character strings (NTWS), which can lead to several issues.
Since Also, passing string-like to So, it's totally broken for I think a proper fix requires splitting the handling of
Hello there! This is a good find. I'm not surprised to find some problems dealing with non-UTF-8 encodings when passing strings in and out of jnr-ffi. I also agree with your assessment; there needs to be a way to specify how strings are expected to be encoded when coming from a library. Do you perhaps have some idea of the change to make? You've managed to find the problem and diagnose it pretty well... perhaps you can put together a PR that starts making things better?
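To make the "specify how strings are expected to be encoded" idea concrete, here is a minimal sketch in plain Java. It is not jnr-ffi's API: `decodeNtmbs` is a hypothetical helper, and the byte array stands in for memory returned by a native function. It assumes a multibyte encoding such as UTF-8, where a zero byte never occurs inside an encoded character, so a single `NUL` byte is the terminator regardless of the JVM default charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class NtmbsDecode {
    /**
     * Hypothetical helper (not part of jnr-ffi): decodes the bytes up to the
     * first single NUL byte using a charset chosen by the caller instead of
     * the JVM default.
     */
    public static String decodeNtmbs(byte[] buf, Charset charset) {
        int end = 0;
        // Scan for the first zero byte; stop at the buffer end if none is found.
        while (end < buf.length && buf[end] != 0) {
            end++;
        }
        return new String(buf, 0, end, charset);
    }

    public static void main(String[] args) {
        // Stand-in for native memory: a version string followed by unrelated data.
        byte[] raw = "1.0.11\0chunk_size > (size_t) 0U\0"
                .getBytes(StandardCharsets.US_ASCII);
        System.out.println(decodeNtmbs(raw, StandardCharsets.UTF_8)); // prints 1.0.11
    }
}
```

Wide-character strings (NTWS) would need a different terminator scan, which is why this sketch only covers the NTMBS case.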
@headius, I'm not yet familiar with I'll have this issue in mind if I have several days to dig into it but can't promise anything. Also I have no systems with Windows running and this issue doesn't affect me, so it has quite low priority on my list. If someone steps up to fix it, I'll be happy with it.
This bug bit me today. I don't know how to do a general fix, but for this specific instance you can use the
When the JVM is started with a single-byte default charset, strings read from a native lib are handled incorrectly, which results in reading the `text` section (in the case of static strings in the binary) until the next double `NUL`-byte is met.
The bug was found while trying to show issues with the implicit use of `defaultEncoding` in `String#getBytes()` in kalium, in order to make a PR forcing the use of `str.getBytes(StandardCharsets.UTF_8)` or something similar.
And it broke almost all tests instead of one, because the `char *sodium_version_string()` return value was misinterpreted (`1.0.11�chunk_size > (size_t) 0U�stream.nonce != (uint64_t) 0U�/dev/random�ret == 0�/dev/urandom��randombytes/salsa20/randombytes_salsa20_random.c` instead of just `1.0.11`). And this string, built from a `byte[]`/`char[]` somewhere in `jnr-ffi`, does contain a `NUL` char at position `6`.
As can be seen from the hexdump, the resulting Java string ends at the first place in the stream where the binary contains two `NUL`-bytes in a row.
Env:
To reproduce, get the code from #107 and run
`mvn clean test -DargLine=-Dfile.encoding=windows-1251`
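The symptom can be imitated in plain Java without any native code. The sketch below is only an illustration, not jnr-ffi's actual logic: `indexOfTerminator` and `decode` are hypothetical names, and the byte array is a shortened stand-in for the static strings that sit next to each other in the binary. It shows how scanning for a two-byte terminator on single-byte data runs past the real end of the string.

```java
import java.nio.charset.StandardCharsets;

public class TerminatorWidthDemo {
    /** Index of the first run of `width` consecutive zero bytes, or buf.length if none. */
    public static int indexOfTerminator(byte[] buf, int width) {
        for (int i = 0; i + width <= buf.length; i++) {
            boolean allZero = true;
            for (int j = 0; j < width; j++) {
                if (buf[i + j] != 0) {
                    allZero = false;
                    break;
                }
            }
            if (allZero) {
                return i;
            }
        }
        return buf.length;
    }

    /** Decodes everything before the terminator as single-byte text. */
    public static String decode(byte[] buf, int width) {
        int end = indexOfTerminator(buf, width);
        return new String(buf, 0, end, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Adjacent NUL-terminated strings, with a double NUL only after the third one.
        byte[] memory = "1.0.11\0chunk_size > (size_t) 0U\0/dev/random\0\0rest"
                .getBytes(StandardCharsets.US_ASCII);

        String ok = decode(memory, 1);      // stops at the first NUL
        String broken = decode(memory, 2);  // runs on until the first double NUL

        System.out.println(ok);                   // prints 1.0.11
        System.out.println(broken.indexOf('\0')); // prints 6
        System.out.println(broken.length() > ok.length()); // prints true
    }
}
```

With a one-byte terminator the result is just `1.0.11`; with a two-byte terminator the result keeps an embedded `NUL` at index 6 and swallows the neighbouring strings, which matches the shape of the output quoted above.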