Unreadable io::Error debug strings #34318
Comments
I think printing escapes is very useful for debugging strings, since consoles are often either incapable of displaying the character (the Windows console is notorious for this), or the problem is something like the wrong kind of whitespace slipping in, which can be really hard to notice otherwise. I feel any solution for this issue should be targeted at
I'd prefer to output printable characters as they are; escaping is inconvenient for everything that's not in English.
Which modern console doesn't support printing Unicode strings? Do you mean a very old one? (The Windows 10 console does its job perfectly.)
@liigo While the Windows console is capable of storing arbitrary Unicode text, its Unicode text rendering abilities are far more limited. It only displays Unicode glyphs when the font supports them (it doesn't use font fallback), it doesn't work for anything outside the BMP, and it doesn't support combining characters. It's a simple wchar -> glyph mapping. Here is your Chinese error string in the Windows 10 console:
Yes. It's the same when you
Escape fewer Unicode codepoints in `Debug` impl of `str`

Use the same procedure as Python to determine whether a character is printable, described in [PEP 3138]. In particular, this means that the following character classes are escaped:

- Cc (Other, Control)
- Cf (Other, Format)
- Cs (Other, Surrogate), even though they can't appear in Rust strings
- Co (Other, Private Use)
- Cn (Other, Not Assigned)
- Zl (Separator, Line)
- Zp (Separator, Paragraph)
- Zs (Separator, Space), except for the ASCII space `' '` (`0x20`)

This allows for user-friendly inspection of strings that are not English (e.g. compare `"\u{e9}\u{e8}\u{ea}"` to `"éèê"`).

Fixes #34318. CC #34422.

[PEP 3138]: https://www.python.org/dev/peps/pep-3138/
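To make the effect of the rule concrete, here is a small sketch of how `{:?}` behaves on a toolchain that includes this change. The helper names `debug_repr` and `fully_escaped` are made up for illustration; only `format!` and `str::escape_unicode` come from the standard library.

```rust
/// Debug-formats a string, i.e. what `{:?}` prints (hypothetical helper).
fn debug_repr(s: &str) -> String {
    format!("{:?}", s)
}

/// Escapes every character as `\u{...}`, regardless of printability
/// (hypothetical helper, shown only for comparison).
fn fully_escaped(s: &str) -> String {
    s.escape_unicode().to_string()
}

fn main() {
    // Printable non-ASCII letters pass through unchanged.
    println!("{}", debug_repr("éèê"));
    // The ASCII space is the one Zs character that is not escaped.
    println!("{}", debug_repr("a b"));
    // A no-break space (category Zs) is escaped as \u{a0}.
    println!("{}", debug_repr("a\u{a0}b"));
    // A zero-width space (category Cf) is escaped as \u{200b}.
    println!("{}", debug_repr("a\u{200b}b"));
    // Escaping everything, as the old output effectively did for non-ASCII text.
    println!("{}", fully_escaped("éèê"));
}
```

The invisible characters stay visible in debug output, which addresses the "wrong kind of whitespace" concern raised earlier in the thread, while ordinary letters remain readable.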
This issue still exists in nightly. Could you reopen it? @brson @alexcrichton @tbu- (the author of #34485)

pub fn main() {
    println!("{:?}", "In Chinese: 文件不存在");
    // still prints: "In Chinese: \u{6587}\u{4ef6}\u{4e0d}\u{5b58}\u{5728}"
    // expected: "In Chinese: 文件不存在"
}
Fix `fmt::Debug` for strings, e.g. for Chinese characters

The problem occurred due to lines like

```
3400;<CJK Ideograph Extension A, First>;Lo;0;L;;;;;N;;;;;
4DB5;<CJK Ideograph Extension A, Last>;Lo;0;L;;;;;N;;;;;
```

in `UnicodeData.txt`, which the script previously interpreted as two characters, although they represent the whole range.

Fixes #34318.
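The range convention described above can be sketched as follows. This is an illustrative Rust reimplementation, not the actual generator script from the fix; the names `parse_ranges` and `is_covered` are hypothetical. It assumes the `UnicodeData.txt` layout shown above: semicolon-separated fields with the code point first, the name second, and the general category third.

```rust
/// Parses `UnicodeData.txt`-style lines into inclusive
/// (start, end, category) ranges. A `<..., First>` / `<..., Last>` pair
/// becomes one range instead of two isolated code points.
fn parse_ranges(data: &str) -> Vec<(u32, u32, String)> {
    let mut ranges = Vec::new();
    let mut pending_start: Option<u32> = None;
    for line in data.lines() {
        let fields: Vec<&str> = line.split(';').collect();
        if fields.len() < 3 {
            continue; // skip blank or malformed lines
        }
        let code = match u32::from_str_radix(fields[0], 16) {
            Ok(code) => code,
            Err(_) => continue,
        };
        let name = fields[1];
        let category = fields[2].to_string();
        if name.ends_with(", First>") {
            // Remember where the range begins; emit nothing yet.
            pending_start = Some(code);
        } else if name.ends_with(", Last>") {
            // Close the range opened by the preceding `First` line.
            if let Some(start) = pending_start.take() {
                ranges.push((start, code, category));
            }
        } else {
            ranges.push((code, code, category));
        }
    }
    ranges
}

/// Returns true if `cp` falls inside any parsed range.
fn is_covered(ranges: &[(u32, u32, String)], cp: u32) -> bool {
    ranges.iter().any(|(start, end, _)| *start <= cp && cp <= *end)
}

fn main() {
    let data = "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;\n\
                3400;<CJK Ideograph Extension A, First>;Lo;0;L;;;;;N;;;;;\n\
                4DB5;<CJK Ideograph Extension A, Last>;Lo;0;L;;;;;N;;;;;";
    let ranges = parse_ranges(data);
    // U+3401 lies strictly inside the First/Last pair, so it counts as assigned.
    println!("U+3401 covered: {}", is_covered(&ranges, 0x3401));
}
```

Treating the two lines as isolated characters would leave every code point between them looking unassigned (Cn), which is one of the escaped categories.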
While trying to open a non-existing file (on Windows 10, Chinese):
It panicked with a message:
The OS native error message here is unreadable to a human being.

The message (`"\u{7cfb}\u{7edf}\u{627e}\u{4e0d}\u{5230}..."`) is the escaped form of "No such file or directory" in Chinese ("系统找不到指定的文件"). `impl fmt::Debug for str` performs this escaping. I can read Chinese, which is my mother tongue, but I can't read `\u{7cfb}...`.

Possible solutions:

- Change `impl fmt::Debug for str` to not escape most Unicode chars. (breaking-change?)

Is `impl fmt::Debug for str` really friendly enough for debugging purposes? Is it possible to change its implementation to be similar to `impl fmt::Display for str`?
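For context, `Result::unwrap` formats the error with `Debug` when it panics, which is why the escaped string ends up in the panic message. One way to sidestep the escaping, independent of any change to `Debug`, is to handle the error and print it with `Display` (`{}`). The sketch below assumes a hypothetical path `missing.txt` and a hypothetical helper `human_message`.

```rust
use std::fs::File;
use std::io;

/// Renders an I/O error with `Display` (`{}`), which does not escape
/// non-ASCII text (hypothetical helper).
fn human_message(err: &io::Error) -> String {
    format!("{}", err)
}

fn main() {
    // `missing.txt` is a placeholder; any non-existent path will do.
    match File::open("missing.txt") {
        Ok(_) => println!("opened"),
        // Prints the OS message as-is instead of panicking with `{:?}`.
        Err(err) => eprintln!("could not open file: {}", human_message(&err)),
    }
}
```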