UTF-8 rendering woes #75
Comments
Thanks, I also added DejaVu Sans Mono to cmd.exe and Cygwin64 Terminal. Now also |
The example below works better in WSL Terminal (i.e. WSLtty) than in Ubuntu from Store for WSL (using Cmd.exe). One can run it in two windows and compare. Some characters are missing in Cmd.exe.
|
I too am seeing a problem with Unicode. I have a similar problem with Unicode with SSH but when I use Putty everything works fine. There is one strange thing in this example. The string 方思腾.香港 rendered fine until I cursored over it, and then it didn't recover. This image shows the original version and the version after the cursor moved over it. Running Emacs in Putty on an Ubuntu system does not have the problem. I tried two different fonts and had the same problem. (This also raises the question of why the command prompt doesn't do Unicode by default, but that's a different topic.) |
@BobFrGit that's because those characters aren't monospaced and cmd doesn't really support them (by the way, a font can be flagged as monospace without all of its glyphs actually being the same width; it's just a font property). I have similar problems with glyphs. Try to find a font with monospace glyphs for those Unicode chars, or you could try processing a font with some Python script; there is a project called powerline patched fonts (if I recall correctly) that has scripts that could help you. |
The monospace assumption is interesting. I use Epsilon on Windows and when I go down it goes to the same nth character on a line. But using Emacs in Ubuntu, when I go down it goes to the characters visually below. The question then is why Emacs in Putty does it "right" or, at least, doesn't get confused, while using it in bash fails. If I use Emacs with SSH I get a different result -- substitution characters. (Same for DigitalOcean's own access tool.) In exploring Unicode I found that it can be far more complicated so I'm not trying to solve the general case -- just observing that PUTTY is an existence proof of a better approach. |
Using ConEmu terminal can also help with this |
Thanks. For now epsilon and Putty work sufficiently well for me. Just wanted to flag the problem for now. |
@BobFrGit I mean the kanjis or whatever they are. You can see that they are double width, so when you move the cursor over them you can see that they are split in half and you see the part that would match if they were 1 width each, like... |
(Actually they are hanzi 汉字 but don't worry about it.) As I mentioned above both Emacs on Ubuntu via Putty and Epsilon on the PC don't have the problem though they take different approaches in dealing with the fact that those characters are not monospace. This is not a big deal for me now -- just wanted to flag it. One feature is that I found that if I change the CMD font the dir listing will show 汉字 file names properly. |
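The double-width behavior discussed in the comments above can be checked outside any terminal. The following is a minimal sketch, assuming the common convention that characters whose Unicode East Asian Width property is W or F occupy two cells and everything else occupies one; the helper name `display_width` is hypothetical, and real terminals may apply additional rules.

```python
import unicodedata

def display_width(text: str) -> int:
    """Estimate how many terminal cells `text` occupies (assumed convention)."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks are drawn on top of the previous character
        # 'W' (wide) and 'F' (fullwidth) characters are counted as two cells
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width

sample = "方思腾.香港"
# Six characters, but eleven cells: five hanzi at two cells plus one period.
print(len(sample), display_width(sample))  # prints "6 11"
```

A renderer that advances the cursor by `len(sample)` instead of the cell count ends up in the middle of a glyph, which matches the "split in half" symptom described above.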
I'm not sure if this is related, but it seems to be - I'm finding a lot of characters/symbols aren't rendering as expected in Bash on Windows 10. Although not necessary, a lot of build tools use special characters to show status of tests, etc. I understand emoji is a whole different issue... so not raising that here. |
Yeah -- been playing with the 32 bit Unicode and that's a challenge in its own right. As an FYI Word seems to do a pretty good job on Emojis and I discovered that Alt-x lets me enter them. (At least some -- when I tried to enter ancient Chinese rod numbers I didn't find a font that had them). |
Hey @mrmckeb same here, unable to use unicode emoji's / symbols... |
@antlatrille Similar to what I've seen. Hopefully we can get more support for this in future releases! |
Still encountering this issue using Bash on Windows over 1 year since the initial report. Is there any fix? |
Hey all. It's important to note that Console not being able to display a given symbol or set of symbols is a many-sided-blade! ;) Alas, because the Console's text renderer is GDI-based, we're unable to support features like font-fallback which would allow us to support fonts that contain a specific set of symbols (e.g. Emoji, Klingon), but gradually fall-back on a more expansive font sets for other chars. We have a goal to replace our renderer with a more modern DirectWrite renderer at some point in the (increasingly near) future. When we do, we'll be able to do A LOT of very cool, modern, fancy things with text that we're simply unable to do right now. Bear with us ;) |
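To make the font-fallback idea concrete, here is a toy sketch. It is not how the Console, GDI, or DirectWrite work internally; the font names and coverage sets are invented for illustration, and a real renderer would consult each font's character map rather than a Python set.

```python
from typing import List, Optional, Tuple

# Hypothetical glyph coverage per font (illustration only).
FONTS = {
    "PrimaryMono": set("abcdefghijklmnopqrstuvwxyz ."),
    "CJKFallback": set("汉字方思腾香港"),
    "SymbolFallback": set("➪"),
}

def pick_font(ch: str, chain: List[str]) -> Optional[str]:
    """Return the first font in `chain` that covers `ch`, or None."""
    for name in chain:
        if ch in FONTS[name]:
            return name
    return None  # no font covers it: a renderer would draw a placeholder box

def render_plan(text: str, chain: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Pair every character with the font that would draw it."""
    return [(ch, pick_font(ch, chain)) for ch in text]

# Without fallback only the primary font is consulted.
print([font for _, font in render_plan("a汉", ["PrimaryMono"])])
# With a fallback chain the hanzi is picked up by the second font.
print([font for _, font in render_plan("a汉", ["PrimaryMono", "CJKFallback"])])
```

With a single-font chain, anything outside that font's coverage comes back as `None`, which corresponds to the missing-glyph boxes reported in this thread.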
Thanks for the update on this. |
@bitcrazed Use Direct2D rewrite Console ? |
A side effect of revisiting this thread is that I realize i can use escape sequences in NodeJS console.log. I presume the new capabilities will be available via escape sequences so they can be used without needing to update libraries to take advantage of the new features. |
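For readers who have not used escape sequences from a program, here is a minimal sketch in Python rather than NodeJS. It assumes a terminal that interprets the standard SGR color sequences (ESC, `[`, a numeric code, `m`); the helper `colorize` and the `COLORS` table are hypothetical names, not part of any library.

```python
ESC = "\x1b"
RESET = f"{ESC}[0m"  # SGR 0 resets all attributes

# Standard SGR foreground color codes.
COLORS = {"red": 31, "green": 32, "yellow": 33}

def colorize(text: str, color: str) -> str:
    """Wrap `text` in an SGR color sequence followed by a reset."""
    return f"{ESC}[{COLORS[color]}m{text}{RESET}"

print(colorize("tests passed", "green"))
print(colorize("tests failed", "red"))
```

On a console that does not process these sequences, the raw `[32m` text shows up in the output instead of color.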
It's not just bash, it's a windows problem it seems.....just tried the same fonts with notepad or wordpad. |
It appears we've recently passed nine months since the last collaborator update, and this issue is still unresolved. Or, at least, I'm experiencing the same issues described above. Is there any news of progress on fixing this, or a more clear definition of what "(increasingly near) future" means? Any update would be greatly appreciated. |
I'm in agreement with @hwaldstein, but I have seen that unicode characters work using Hyper as the terminal for WSL instead of the default. I'm not as happy with its ANSI colors, but that's on Hyper, not WSL. Is there a better repo for this issue than WSL? |
There are rumors of a new command processor and/or shell. If so, does it moot this and instead shift the focus to feature requests and betas? |
@fcharlie - you can count on that :) |
@BobFrGit Yes, our guidance (we'll be publishing some in the coming weeks) is to SetConsoleMode enabling |
@dernyn - as I pointed out above, GDI based display tech struggles with several mechanisms (esp. font-fallback) that are essential for displaying complex modern glyphs, including ninjacat emoji 🐱👤. In the future, we plan on replacing the Console's current GDI based renderer with a renderer that uses DirectWrite (directly or indirectly) which will eliminate almost all our rendering, and many of our internationalization issues in one fell swoop! |
Hey @hwaldstein - thanks for your continued patience. While it may appear that we've been rather quiet over the last year or so, we've actually been cranking away, modernizing and overhauling much of the Console's internals, paving the way for us to start delivering user-visible improvements in future releases. The 18H2 (2018, 2nd half) release that we're currently working on will deliver some pretty cool improvements, esp. for anyone building 3rd party terminals, and command-line shells, tools, and apps. We have a long list of Console features queued up for subsequent OS releases too. |
@jacoby - thanks for your patience; I also refer you to my reply to @hwaldstein above. Re. repo choices: We'll be moving many of these Console related issues over to the new Console issues repo in the coming months - feel free to post new issues over there from now onwards though.. |
@BobFrGit - I am not aware of any new shell being created at Microsoft. We already have Cmd and PowerShell, and of course bash/zsh/fish/etc. in your favorite Linux distro(s) running atop WSL. |
@bitcrazed I'm glad to see your decision, and I'm looking forward to the new console. |
@bitcrazed And of course the Cygwin-based Bash that is used in Git4Win, etc. Can hardly wait for summer and the new Console. I like all about Hyper except the lag. |
@jacoby - yes, but Cygwin isn't a Microsoft shell. And to be clear, we're not shipping a "new" Console this summer - it's the same Console, with significantly improved internals, and several bug fixes and improvements. |
Gotcha. |
@bardware VERY likely that code-point isn't included in your console's currently selected font. As mentioned above/elsewhere, Console renders using GDI which cannot perform font-fallback, so if your font doesn't contain the glyph for ➪ then we can only display the unprintable char glyph. |
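When checking whether a font covers a character, it helps to know exactly which code point is involved. The sketch below uses only Python's standard `unicodedata` module; the helper name `describe` is hypothetical. It does not inspect any font, it only reports what the character is.

```python
import unicodedata

def describe(ch: str) -> str:
    """Return the code point, Unicode name, and UTF-8 bytes of one character."""
    name = unicodedata.name(ch, "<unnamed>")
    utf8 = " ".join(f"{b:02x}" for b in ch.encode("utf-8"))
    return f"U+{ord(ch):04X} {name} (UTF-8: {utf8})"

for ch in "A➪汉":
    # ascii() keeps the output printable even on a console that cannot show the glyph
    print(ascii(ch), describe(ch))
```

The reported code point can then be looked up in a font viewer or character map to see whether the selected console font has a glyph for it.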
I played around a bit and tried some fonts already, but I'll keep looking. |
Quoth from the top:
The ➪ glyph is in the "kinda not" category. So, don't burn too much time downloading every fixed width font you can find on the Interwebs. It isn't going to help (or call me 😮 if it does). Like Rich alludes a bunch of posts back, getting from a given unicode sequence to a particular glyph is "a process". I'm sure all will be golden with the new engine. But in this instance, not likely with a different font; which one could reasonably misinterpret "currently selected" in the previous post as implying. Bonne chance. |
Can you please share with us - because you were rather vague 3 weeks ago - will "the same Console, with significantly improved internals, and several bug fixes and improvements" support UTF-8? Or will you only start working on it after "18H2 (2018, 2nd half)", meaning we should gather more patience? Thank you very much, tons of kudos for your work! |
All I can share right now is that we're working hard to make all the changes necessary to support UTF-8 which then enables us to work on adding rendering support for emoji, complex scripts, etc. Not going to put dates on things until we're confident that a) things are working, b) we understand which releases our stuff lines up for. It's a complex process, but bear with us - we're on it. |
My sympathy -- Unicode can get amazingly complex. |
Thanks a lot @bitcrazed |
@BobFrGit .. and people wonder why I've got so much more gray hair these days ;) @stereokai Thanks 😀 |
Hey all. Thanks for the discussion re. this issue. We're right in the middle of a ton of Console internals re-engineering that'll allow the Console to accurately support Unicode & UTF-8 text. Closing this issue since:
If you have further asks/issues, please file new issues on our Console GitHub Repo. |
vanilla windows terminal (cmd.exe) does not have robust unicode support beyond the old ibm codepage 437. the console team is aware and working on it.
- microsoft/WSL#75
- microsoft/terminal#104
- microsoft/terminal#190
- microsoft/terminal#226
- microsoft/terminal#306
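As a rough illustration of what "code page 437" covers, the sketch below uses Python's built-in `cp437` codec to test whether a string can be represented in that code page at all. The helper name `fits_cp437` is hypothetical, and encodability says nothing about whether a given console font can draw the result.

```python
def fits_cp437(text: str) -> bool:
    """Return True if every character in `text` exists in code page 437."""
    try:
        text.encode("cp437")
        return True
    except UnicodeEncodeError:
        return False

samples = [
    "dir",           # plain ASCII
    "\u00e9",        # e with acute accent
    "\u2500\u2510",  # two box drawing characters
    "\u6c49\u5b57",  # the hanzi from this thread
    "\u27aa",        # an arrow from the Dingbats block
]
for sample in samples:
    print(ascii(sample), fits_cp437(sample))
```

The first three samples fit; the hanzi and the arrow do not, which is why they need real Unicode support rather than a legacy code page.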
Examples below use the UTF-8 demo file.
Some of the rendering issues could be attributed to the font (Consolas), but some cannot.
Here's Consolas with MinTTY (Cygwin):
And here's Consolas with "Bash on Windows":
Consolas simply doesn't do well on the box drawing tests.
One of the best monospace fonts I've found is DejaVu Sans Mono. But cmd.exe's properties page doesn't allow me to select that font when it's installed. It has a static list of fonts that appear in the Windows Registry. In order to use fonts other than Lucida Console, Consolas, or raster fonts, I need to replace one of the fonts listed in the registry under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Console\TrueTypeFont. In my case, I replaced Consolas with DejaVu Sans Mono for another test:
DejaVu Sans Mono with MinTTY (Cygwin):
DejaVu Sans Mono with "Bash on Windows":
Now the box drawing tests are fine, but there are numerous UTF-8 glyphs that are unavailable for use.
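The registry key named above can be inspected without editing it. This is a read-only sketch using Python's standard `winreg` module, so it only does real work on Windows; the function name `list_console_fonts` is hypothetical, and it deliberately does not perform the font replacement described above.

```python
import sys

# Key path quoted in the issue text, relative to HKEY_LOCAL_MACHINE.
CONSOLE_FONT_KEY = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\Console\TrueTypeFont"

def list_console_fonts():
    """Return (value name, font name) pairs under the console font key."""
    if sys.platform != "win32":
        raise OSError("the Windows registry is only available on Windows")
    import winreg  # imported lazily because the module exists only on Windows
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CONSOLE_FONT_KEY) as key:
        value_count = winreg.QueryInfoKey(key)[1]  # second field: number of values
        # EnumValue returns (name, data, type); keep name and data
        return [winreg.EnumValue(key, i)[:2] for i in range(value_count)]

if sys.platform == "win32":
    for name, font in list_console_fonts():
        print(name, "=", font)
```

Comparing the output with the font list shown in the properties page is a quick way to confirm which registry entries it is reading.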
So the problems are: