Fix more Unicode bugs #24182
@fmeum Mind rebasing before I review?
@tjgq Rebased onto master.
@bazel-io fork 8.0.0
@fmeum can you please resolve the conflicts? Thanks!
@iancha1992 Rebased
* Use Latin-1 in many native file write rules for consistency with the internal encoding.
* Use Latin-1 for the resolved repository file and the JSON profile.
* Fix `unused_input_list` handling of non-ASCII characters in file names.
* Flip the `legacy_utf8` parameter of `repository_ctx.file` to `False` and make it a no-op. With the previous default, any non-ASCII characters would be written out as double encoded UTF-8, which is not a useful choice.
* Change `repository_ctx.template` to operate on raw bytes for consistency with `repository_ctx.read` and to fix substitution with non-ASCII keys/values.
* Move some usages of `UTF_8` closer to their usage site to clarify why they are correct.
* Fixes parsing of dependency files with Unicode character contents (`/showIncludes` and `.d` files)

Closes bazelbuild#24182.

PiperOrigin-RevId: 698111811
Change-Id: Ie43bab9eb5963bf81690dd8985d358f544a711c9

(cherry picked from commit 3fdec93)
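To illustrate the "double encoded UTF-8" problem mentioned for `legacy_utf8`, here is a minimal Python sketch. It assumes, as the commit message suggests, that strings are held internally as raw UTF-8 bytes reinterpreted as Latin-1 characters. The helper names `to_internal`, `write_latin1`, and `write_utf8` are hypothetical and are not part of Bazel.

```python
def to_internal(text: str) -> str:
    """Model the assumed internal form: UTF-8 bytes viewed as Latin-1 chars."""
    return text.encode("utf-8").decode("latin-1")


def write_latin1(internal: str) -> bytes:
    """Encoding the internal string as Latin-1 recovers the original bytes."""
    return internal.encode("latin-1")


def write_utf8(internal: str) -> bytes:
    """Encoding the internal string as UTF-8 encodes each byte a second time."""
    return internal.encode("utf-8")


original = "h\u00e9llo"  # "héllo"
internal = to_internal(original)

good = write_latin1(internal)  # identical to the UTF-8 encoding of the original
bad = write_utf8(internal)     # double encoded: no longer decodes to "héllo"
```

Under this assumption, writing with Latin-1 is a lossless round trip, while writing with UTF-8 turns the two bytes of `é` into four.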
* Use Latin-1 in many native file write rules for consistency with the internal encoding.
* Use Latin-1 for the resolved repository file and the JSON profile.
* Fix `unused_input_list` handling of non-ASCII characters in file names.
* Flip the `legacy_utf8` parameter of `repository_ctx.file` to `False` and make it a no-op. With the previous default, any non-ASCII characters would be written out as double encoded UTF-8, which is not a useful choice.
* Change `repository_ctx.template` to operate on raw bytes for consistency with `repository_ctx.read` and to fix substitution with non-ASCII keys/values.
* Move some usages of `UTF_8` closer to their usage site to clarify why they are correct.
* Fixes parsing of dependency files with Unicode character contents (`/showIncludes` and `.d` files)

Closes #24182.

PiperOrigin-RevId: 698111811
Change-Id: Ie43bab9eb5963bf81690dd8985d358f544a711c9

(cherry picked from commit 3fdec93)

Fixes #24242
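The `repository_ctx.template` item can be sketched in the same spirit. The Python below is only a model of one plausible way a non-ASCII substitution key could fail to match, assuming that keys and values arrive in the internal "UTF-8 bytes as Latin-1" form while the template file is decoded differently. The functions `internal`, `substitute_mixed`, and `substitute_raw` are hypothetical and do not reflect Bazel's actual implementation.

```python
def internal(text: str) -> str:
    """Assumed internal form: UTF-8 bytes reinterpreted as Latin-1 chars."""
    return text.encode("utf-8").decode("latin-1")


def substitute_mixed(file_bytes: bytes, subs: dict[str, str]) -> bytes:
    """Decode the file as real Unicode while keys stay in internal form."""
    text = file_bytes.decode("utf-8")
    for key, value in subs.items():
        text = text.replace(key, value)
    return text.encode("utf-8")


def substitute_raw(file_bytes: bytes, subs: dict[str, str]) -> bytes:
    """Treat the file as raw bytes, so keys and content share one form."""
    text = file_bytes.decode("latin-1")
    for key, value in subs.items():
        text = text.replace(key, value)
    return text.encode("latin-1")


template = "name = %\u00e9%\n".encode("utf-8")   # "name = %é%"
subs = {internal("%\u00e9%"): internal("\u00fc")}  # replace %é% with ü

mixed_result = substitute_mixed(template, subs)  # key never matches
raw_result = substitute_raw(template, subs)      # key matches, bytes preserved
```

In this model the mixed approach silently leaves the template unchanged, whereas working on raw bytes applies the substitution and still yields valid UTF-8 output.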
The changes in this PR have been included in Bazel 8.0.0 RC3. Please test out the release candidate and report any issues as soon as possible.
@tjgq has a better understanding of the cause than I do (I just noticed the failures and initiated the rollback).
See b/381060195 for some details. I'm fairly confident that the root cause was in Google code, not Bazel code, and I expect the fix to be very localized. So, absent evidence that it breaks anything externally, I'd recommend sticking with the cherry-pick for now.