Python cannot run in the ja_JP.sjis locale, which uses the Windows-31J encoding. #102388
Comments
The charset name "Windows-31J" is registered in the IANA Charset Registry[1] and is implemented in Python as the cp932 codec. This commit adds windows_31j to the aliases of cp932. [1] https://www.iana.org/assignments/charset-reg/windows-31J Signed-off-by: Masayuki Moriyama <[email protected]>
I will give a supplementary explanation. IANA Windows-31J: Python cp932: https://github.com/python/cpython/blob/main/Tools/unicode/genmap_japanese.py#L26
Python's conversion tables between cp932 and Unicode are generated from the same CP932.TXT[2] as IANA's Windows-31J, so I think adding windows-31j as an alias of cp932 should cause no problems. [1] https://www.iana.org/assignments/charset-reg/windows-31J
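As a rough illustration of the point above, the sketch below spot-checks two byte sequences against Python's cp932 codec. The sample mappings (0x8160 to U+FF5E and 0x8740 to U+2460) are my own reading of CP932.TXT and are not quoted from this thread, and `check_cp932` is a hypothetical helper name.

```python
# Spot-check a few double-byte sequences against Python's cp932 codec.
# Assumption: the expected code points below follow CP932.TXT.
SAMPLES = {
    b"\x81\x60": "\uff5e",  # FULLWIDTH TILDE
    b"\x87\x40": "\u2460",  # CIRCLED DIGIT ONE (NEC special characters row)
}


def check_cp932(samples):
    """Return {raw_bytes: True/False} for decode plus round-trip agreement."""
    results = {}
    for raw, expected in samples.items():
        decoded = raw.decode("cp932")
        results[raw] = decoded == expected and decoded.encode("cp932") == raw
    return results


results = check_cp932(SAMPLES)
print(results)
```

This only samples the table; it does not prove that the codec and the registry entry agree on every code point.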
Closing as the PR has been merged. Thanks!
Bug report
On Linux with glibc, Python cannot run when the ja_JP.sjis locale is set as follows.
The charset name "Windows-31J" is registered in the IANA Charset Registry[1].
Windows-31J is supported by Perl[2], PHP[3], Ruby[4], Java[5], etc.
Python's cp932 is equivalent to Windows-31J, so I propose adding windows_31j to the aliases for cp932.
[1] https://www.iana.org/assignments/charset-reg/windows-31J
[2] https://perldoc.perl.org/Encode::JP
[3] https://www.php.net/manual/en/mbstring.encodings.php
[4] https://docs.ruby-lang.org/ja/latest/class/Encoding.html
[5] https://docs.oracle.com/en/java/javase/19/intl/supported-encodings.html
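To make the proposal concrete, here is a minimal sketch of how the alias would be seen from Python code. `codecs.lookup` normalizes the name (so `windows-31j` is looked up as `windows_31j`) and raises `LookupError` when no codec or alias matches. The `resolve_codec` helper and its `fallback` parameter are hypothetical names used only for illustration. On interpreters without the proposed alias, the helper falls back to `cp932` explicitly.

```python
import codecs


def resolve_codec(name, fallback="cp932"):
    """Return the canonical codec name for `name`, or for `fallback`
    if the interpreter does not know `name` (hypothetical helper)."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        # Interpreters without the windows_31j alias end up here.
        return codecs.lookup(fallback).name


codec_name = resolve_codec("windows-31j")
print(codec_name)

# Round-trip some Japanese text through the resolved codec.
text = "日本語"
assert text.encode(codec_name).decode(codec_name) == text
```

The fallback is only a workaround for application code; it does not help with interpreter startup, where the locale encoding is resolved before user code runs.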
Your environment
Linked PRs