Force the JVM to use UTF-8 on Windows #24172

Closed
wants to merge 4 commits into from

fmeum
Collaborator

@fmeum fmeum commented Nov 1, 2024

This change patches the app manifest of the `java.exe` launcher in the embedded JDK so that it always uses the UTF-8 code page on Windows 10 version 1903 and later.

This is necessary because the launcher sets `sun.jnu.encoding` to the system code page, which on Windows defaults to a legacy code page such as Cp1252. As a result, the JVM cannot interact with files whose paths contain Unicode characters that are not representable in the system code page, or with command-line arguments and environment variables containing such characters.

The Windows VMs in CI are not yet running Windows 1903 or later, so for now this change can only be tested locally by running `bazel info character-encoding` and verifying that it prints `sun.jnu.encoding = UTF-8`.
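For anyone who wants to see the effect outside of Bazel, the following is a minimal Java sketch and not part of this change. It prints the `sun.jnu.encoding` property of the running JVM and checks whether a sample file name is representable in a given charset. The class name `JnuEncodingCheck`, its methods, and the sample file name are all hypothetical and chosen for illustration only. Note that `sun.jnu.encoding` is an internal JDK property, so the sketch assumes it may be missing on some JVMs.

```java
import java.nio.charset.Charset;

// Hypothetical helper for illustration; not part of Bazel or the JDK.
final class JnuEncodingCheck {

  /** Returns true if every character of {@code text} can be encoded in the named charset. */
  static boolean isRepresentable(String text, String charsetName) {
    return Charset.forName(charsetName).newEncoder().canEncode(text);
  }

  /** Returns the value of the internal sun.jnu.encoding property, or null if it is not set. */
  static String jnuEncoding() {
    return System.getProperty("sun.jnu.encoding");
  }

  public static void main(String[] args) {
    // Illustrative file name, written with Unicode escapes so that the
    // source file itself does not depend on the compiler's source encoding.
    String sample = "\u65e5\u672c\u8a9e.txt";

    String encoding = jnuEncoding();
    System.out.println("sun.jnu.encoding = " + encoding);

    if (encoding != null && Charset.isSupported(encoding)) {
      System.out.println(
          "sample name representable in " + encoding + ": "
              + isRepresentable(sample, encoding));
    }

    // For comparison: a typical legacy Windows code page versus UTF-8.
    System.out.println(
        "representable in windows-1252: " + isRepresentable(sample, "windows-1252"));
    System.out.println("representable in UTF-8: " + isRepresentable(sample, "UTF-8"));
  }
}
```

Running this with a plain `java.exe` on a Windows machine that uses a legacy system code page should report that code page for `sun.jnu.encoding`, while a launcher whose manifest selects UTF-8 is expected to report `UTF-8`.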

Work towards #374
Work towards #18293
Work towards #23859

@fmeum fmeum force-pushed the 23859-unicode-windows branch 10 times, most recently from 7976d91 to d56b510 Compare November 3, 2024 14:14
@fmeum fmeum force-pushed the 23859-unicode-windows branch 2 times, most recently from 3fec426 to 5961d10 Compare November 4, 2024 09:22
@fmeum fmeum force-pushed the 23859-unicode-windows branch from 5961d10 to 6557f73 Compare November 4, 2024 09:24
@fmeum fmeum requested a review from meteorcloudy November 4, 2024 09:27
@fmeum fmeum marked this pull request as ready for review November 4, 2024 09:27
@github-actions github-actions bot added the awaiting-review PR is awaiting review from an assigned reviewer label Nov 4, 2024
@fmeum
Collaborator Author

fmeum commented Nov 4, 2024

@meteorcloudy I started a thread in core-libs-dev to potentially improve the situation in the JDK itself, but I currently don't see any other way of doing this (short of asking users to switch to a UTF-8 system locale, but that may break other software).

@meteorcloudy meteorcloudy added awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally and removed awaiting-review PR is awaiting review from an assigned reviewer labels Nov 4, 2024
@sgowroji sgowroji added the team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website label Nov 4, 2024
src/BUILD (review thread resolved)
@fmeum
Collaborator Author

fmeum commented Nov 4, 2024

@bazel-io fork 8.0.0

@copybara-service copybara-service bot closed this in 7bb8d2b Nov 5, 2024
@github-actions github-actions bot removed the awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally label Nov 5, 2024
@fmeum fmeum deleted the 23859-unicode-windows branch November 5, 2024 22:14
bazel-io pushed a commit to bazel-io/bazel that referenced this pull request Nov 6, 2024
This change patches the app manifest of the `java.exe` launcher in the embedded JDK to always use the UTF-8 codepage on Windows 1903 and later.

This is necessary because the launcher sets sun.jnu.encoding to the system code page, which by default is a legacy code page such as Cp1252 on Windows. This causes the JVM to be unable to interact with files whose paths contain Unicode characters not representable in the system code page, as well as command-line arguments and environment variables containing such characters.

The Windows VMs in CI are not running Windows 1903 or later yet, so this change can currently only be tested locally by running `bazel info character-encoding` and verifying that it prints `sun.jnu.encoding = UTF-8`.

Work towards bazelbuild#374
Work towards bazelbuild#18293
Work towards bazelbuild#23859

Closes bazelbuild#24172.

PiperOrigin-RevId: 693466466
Change-Id: I4914c21e846493a8880ac8c6f5e1afa9fae87366
github-merge-queue bot pushed a commit that referenced this pull request Nov 7, 2024

Commit
7bb8d2b

Co-authored-by: Fabian Meumertzheim <[email protected]>
ramil-bitrise pushed a commit to bitrise-io/bazel that referenced this pull request Dec 18, 2024
Labels
team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website
3 participants