Check UTF-8 support #2024
Comments
Hexdump of the file shows that the
Should we set that property somewhere early in the launcher, so there's no need to provide it on the command line?
Maybe, if we think there are no side effects.
It's unclear to me where the difference takes effect. See:
phoebus/app/display/model/src/main/java/org/csstudio/display/builder/model/persist/ModelReader.java, line 90 in a707942
phoebus/core/framework/src/main/java/org/phoebus/framework/persistence/XMLUtil.java, line 33 in a707942
So the case of opening a display file should already be handled. But here we're receiving text from a PV, where we just fetch the String that we get from channel access:
Maybe that needs to be something like this:
The problem is that we don't get the raw bytes. We get the units already converted into a String, so how can you reliably go back to the original bytes? Maybe "file.encoding" is much broader than just reading from files and applies to any byte[]-to-String conversion, and thus the units string from the channel access client library is already decoded as UTF-8 when we set "file.encoding".
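To illustrate why going back from an already decoded String to the original bytes is not reliable in general, here is a minimal sketch. `RedecodeSketch` and `redecode` are hypothetical names, not part of Phoebus or the channel access client, and the charsets tried are assumptions about what a platform default might be. The round trip only works when the wrongly used charset maps every byte to a distinct character.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not Phoebus or jca code.
class RedecodeSketch {
    /** Re-encode with the charset assumed to have been used by mistake, then decode as UTF-8. */
    static String redecode(String garbled, Charset wronglyUsed) {
        byte[] raw = garbled.getBytes(wronglyUsed);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] egu = { (byte) 0xC2, (byte) 0xB0 }; // UTF-8 bytes of the degree sign

        // ISO-8859-1 maps each byte to exactly one char, so the bytes survive the round trip.
        String viaLatin1 = new String(egu, StandardCharsets.ISO_8859_1);
        System.out.println("ISO-8859-1 recovered: "
                + redecode(viaLatin1, StandardCharsets.ISO_8859_1).equals("\u00B0"));

        // US-ASCII cannot represent these bytes; they become replacement characters
        // and the original byte values are lost for good.
        String viaAscii = new String(egu, StandardCharsets.US_ASCII);
        System.out.println("US-ASCII recovered: "
                + redecode(viaAscii, StandardCharsets.US_ASCII).equals("\u00B0"));
    }
}
```

Because the result depends on which charset did the first conversion, decoding correctly at the source looks safer than repairing the String afterwards.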
Not sure I have the proper EPICS experience to comment at this point, but the metadata field referenced in DBHelper#L191 is jca magic, and I guess we'd like to avoid changing that.
https://www.baeldung.com/java-char-encoding suggests that file.encoding is indeed quite broad: it is "the name of the default charset" used by String, input streams, and so on. Some newer APIs that don't use file.encoding, like java.nio.file.Files, default to UTF-8. So setting file.encoding to UTF-8 just brings everything into alignment with the newer APIs, and since EPICS also uses UTF-8, my vote would be for setting file.encoding early in the launcher code.
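A small sketch of the two behaviours described here: `new String(byte[])` follows the JVM default charset, while `Files.readAllLines(Path)` decodes as UTF-8 regardless. The class and method names are made up for this illustration, and the temporary file is only a stand-in for real input.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; names are not taken from Phoebus.
class DefaultCharsetReport {
    static final byte[] DEGREE_UTF8 = { (byte) 0xC2, (byte) 0xB0 };

    /** new String(byte[]) without a charset argument uses the JVM default charset. */
    static boolean stringUsesDefaultCharset() {
        return new String(DEGREE_UTF8)
                .equals(new String(DEGREE_UTF8, Charset.defaultCharset()));
    }

    /** Files.readAllLines(Path) decodes as UTF-8, whatever the default charset is. */
    static String readViaFiles() {
        try {
            Path tmp = Files.createTempFile("egu", ".txt");
            try {
                Files.write(tmp, DEGREE_UTF8);
                return Files.readAllLines(tmp).get(0);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
        System.out.println("defaultCharset = " + Charset.defaultCharset());
        System.out.println("String follows default: " + stringUsesDefaultCharset());
        System.out.println("Files API yields degree sign: " + readViaFiles().equals("\u00B0"));
    }
}
```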
Sounds reasonable...
Turns out that calling System.setProperty("file.encoding", "UTF8") as the first statement in main does not always work. When I run this on a Mac, the default charset remains "UTF-8" even though I try selecting UTF-16 via file.encoding:
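A minimal sketch of that effect, not the original test code: once the JVM has resolved its default charset, changing the `file.encoding` property at runtime leaves `Charset.defaultCharset()` unchanged. This sketch calls `defaultCharset()` before setting the property, which guarantees the value is already cached; in a real launcher the JVM has usually resolved it before `main` runs. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;

// Hypothetical demo class, written for this illustration.
class SetPropertyTooLate {
    /** Returns true if setting file.encoding at runtime changed the default charset. */
    static boolean defaultCharsetChanged(String requested) {
        Charset before = Charset.defaultCharset();   // resolved and cached by now
        String old = System.getProperty("file.encoding");
        System.setProperty("file.encoding", requested);
        try {
            return !before.equals(Charset.defaultCharset());
        } finally {
            // Restore the property so the demo has no lasting side effect.
            if (old == null) {
                System.clearProperty("file.encoding");
            } else {
                System.setProperty("file.encoding", old);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("changed by setProperty: " + defaultCharsetChanged("UTF-16"));
    }
}
```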
Adding "-Dfile.encoding=..." to the java command line or JAVA_TOOL_OPTIONS does have an effect. I will create a PR that adds -Dfile.encoding=... to the example start scripts and warns in the Launcher if the default charset differs from UTF-8.
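The startup warning could look roughly like the following sketch. It is not the code from the PR; `CharsetCheck`, `warningFor` and the message text are assumptions made for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.logging.Logger;

// Hypothetical check; the actual Launcher code may differ.
class CharsetCheck {
    private static final Logger logger = Logger.getLogger(CharsetCheck.class.getName());

    /** Returns a warning message if the given charset is not UTF-8, else empty. */
    static Optional<String> warningFor(Charset actual) {
        if (StandardCharsets.UTF_8.equals(actual)) {
            return Optional.empty();
        }
        return Optional.of("Default charset is " + actual.name()
                + ", expected UTF-8. Add -Dfile.encoding=UTF-8 to the java command line.");
    }

    public static void main(String[] args) {
        // Log a warning only when the JVM default differs from UTF-8.
        warningFor(Charset.defaultCharset()).ifPresent(logger::warning);
    }
}
```

Passing the charset in as a parameter keeps the check testable without restarting the JVM under a different `-Dfile.encoding` setting.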
EPICS database files are supposedly UTF-8.
In the attached database file, the EGU are set to the UTF-8 bytes 0xC2 0xB0, which encode the 'degree' symbol.
With CSS on Mac, it shows up as such:
x.txt
Does that not work on Windows?
Is something missing to force UTF-8 interpretation of the units, so when Windows has some other default, the units are not properly represented?
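For reference, a minimal sketch of what the two EGU bytes turn into under different charsets. ISO-8859-1 is used here as a stand-in for a non-UTF-8 platform default; that choice is an assumption, and the actual Windows default depends on the locale. The class name is made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, written for this illustration.
class DegreeDecodeDemo {
    /** Decode the raw bytes with an explicit charset instead of the platform default. */
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    /** Render each char of the text as U+XXXX for unambiguous display. */
    static String codePoints(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            sb.append(String.format("U+%04X ", (int) c));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        byte[] egu = { (byte) 0xC2, (byte) 0xB0 };
        // As UTF-8 the two bytes form one character, the degree sign.
        System.out.println("UTF-8:      " + codePoints(decode(egu, StandardCharsets.UTF_8)));
        // As ISO-8859-1 each byte becomes its own character, so an extra one shows up.
        System.out.println("ISO-8859-1: " + codePoints(decode(egu, StandardCharsets.ISO_8859_1)));
    }
}
```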