Default all G-sets to ASCII unless ISO-2022 is requested (was: Terminal breaks on \x1b O) #10408
Comments
ESC 06/15 is LS3 - LOCKING-SHIFT THREE. That was implemented in #4496. Have you tried the same command in a different terminal, such as xterm or in the Linux console? I expect you'd get the same result with those. |
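For readers unfamiliar with the "06/15" notation: ECMA-35 and ECMA-48 name a byte by its column and row in the code table, so the byte value is column × 16 + row. A minimal sketch of that conversion follows; the helper name `position_to_byte` is made up for illustration.

```python
def position_to_byte(position: str) -> int:
    """Convert column/row notation such as '06/15' to a byte value."""
    column, row = (int(part) for part in position.split("/"))
    if not (0 <= column <= 15 and 0 <= row <= 15):
        raise ValueError(f"column and row must each be in 0..15: {position!r}")
    return column * 16 + row


# ESC is 01/11 and the final byte of LS3 is 06/15.
ls3 = bytes([position_to_byte("01/11"), position_to_byte("06/15")])
print(ls3)  # b'\x1bo'
```

This is the same two-byte sequence as the `\x1bo` in the steps to reproduce.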
I have no problem running the mentioned command on the same server when using PuTTY as my GUI SSH client, WinSSHTerm (which is PuTTY), or PenguiNet, and no problems with Debian's default console. |
From https://www.man7.org/linux/man-pages/man4/console_codes.4.html, I get the impression that the Linux console does not implement LS3, while xterm does. |
Is there already a feature request for an action that resets the character sets to the defaults? Similar to #1882 for the scrollback. |
FYI, typing |
Yes, thank you, typing reset in bash did indeed work, but unfortunately it won't work in environments like the Python console, which is where I initially discovered the issue. |
If it happens again in a python console, you can use |
Can you bump your Win32-OpenSSH to 8.6? https://github.com/PowerShell/Win32-OpenSSH/releases Before 8.1, it handled VT sequences itself instead of allowing conhost to do so. |
Sadly, it didn't make a difference:
|
I'm fairly certain OpenSSH has nothing to do with this. As KalleOlaviNiemitalo pointed out at the start, LS3 is a standard control sequence that is intended to change the character set, so it's working as designed. You'll get the exact same behaviour in a local shell. What we're missing is a way for users to easily reset the terminal when it gets into a state that they weren't expecting. |
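Until such a command exists, a program that still controls its own output can undo a stray locking shift itself. The sketch below assumes the terminal honours LS0 (the SI control, 0x0F), which invokes G0 back into the left half of the code table, and RIS (`ESC c`), the heavier full reset that typically also clears the screen. The helper name `restore_left_half` is hypothetical.

```python
import sys

LS0 = "\x0f"   # SI: invoke G0 into GL, undoing LS1/LS2/LS3 for the left half
RIS = "\x1bc"  # ESC c: reset to initial state


def restore_left_half(stream=sys.stdout, full_reset: bool = False) -> str:
    """Write LS0 (or RIS when full_reset is True) and return what was sent."""
    sequence = RIS if full_reset else LS0
    stream.write(sequence)
    stream.flush()
    return sequence
```

Calling `restore_left_half()` from the affected Python session writes a single SI byte to standard output; passing `full_reset=True` sends RIS instead.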
It seems totally reasonable to me for us to have a "Reset Terminal" command. It can't just be a sendInput command, though: it'll need to trigger immediately, as if it had been sent as client output. But yeah, it's a good idea. |
I'm not sure LS3 is really designed to work here, because it is defined in ECMA-35 (which claims to be identical to ISO/IEC 2022:1994), and UTF-8 is registered as a "Coding system different from that of ISO 2022". Perhaps a terminal that starts in UTF-8 mode could ignore LS3 and other ECMA-35 features until it receives DOCS (DESIGNATE OTHER CODING SYSTEM) for switching to ECMA-35 mode. On the other hand, if Windows Terminal impersonates xterm, and xterm supports LS3, then I suppose Windows Terminal has to likewise support LS3. |
I had considered that, but the problem is that a lot of existing software expects a certain amount of ISO-2022 functionality even in UTF-8 mode - the most common case being to select the DEC line drawing character set. |
Note that this has similar problems to #1882, in that we need the reset to occur on the conhost side of the conpty pipe. So we either need something like #1193, or in the long term #1173. |
It is stated in the roadmap that WT VT (Virtual Terminal) relies on ECMA-48, which does appeal to ECMA-35. If UTF-8 is usable with VT, then some ECMA-35 features might not be directly possible, since UTF-8 commandeers the high-order bit of octets and the client/shell might need to cope with that. Is the line drawing different from the pixel graphics that have also been asked for? There are line-drawing code points in Unicode, though. [I have not dug deep into this; I'm just pondering that ECMA-48, UTF-8, and Virtual Terminal need to be examined for exactly how they apply to Windows Terminal.] |
I'm of two minds on this. The one that's currently winning the debate for me right now is that these are control sequences. They're not supposed to be emitted unless an attempt is being made to control a thing.
Yep, line drawing (emitting full-cell glyphs from the Unicode range starting at U+2500) is a tried-and-true method of drawing TUI-like interfaces. One of the character sets, the DEC Special Graphics set, allows you to access them in the sub-0x7F range. I'm inclined to follow what other terminal emulators do here. What do gnome-terminal, Konsole and Alacritty do here? |
Gnome-terminal uses VTE, from which most ISO 2022 support was removed in 2014 after discussion at https://bugzilla.gnome.org/show_bug.cgi?id=732586 vte/src/parser.cc has a lot of code for parsing the escape sequences that designate or invoke character sets, but vte/src/vteseq.cc now supports only the default charset and line drawing. It uses SI=LS0 and SO=LS1 for this purpose and ignores all other shifts and all character set designations. Also, vte/src/vte.cc decodes UTF-8 and passes Unicode scalar values to the parser, so the line-drawing charset does not disable UTF-8 decoding. Instead, if the line-drawing charset is in use, then vte/src/vte.cc replaces some ASCII characters with line-drawing characters after they come from the parser. |
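The ordering described above (decode UTF-8 first, substitute line-drawing glyphs afterwards) can be sketched roughly as follows. This is not VTE's code: it is a toy model that assumes G0 is ASCII and G1 is line drawing, handles only SI and SO, and carries just a handful of entries from the DEC Special Graphics table.

```python
SO = "\x0e"  # LS1: invoke G1 (assumed here to hold line drawing)
SI = "\x0f"  # LS0: invoke G0 (ASCII)

# Partial DEC Special Graphics mapping; only a few box-drawing entries are shown.
LINE_DRAWING = {
    "j": "\u2518",  # lower-right corner
    "k": "\u2510",  # upper-right corner
    "l": "\u250c",  # upper-left corner
    "m": "\u2514",  # lower-left corner
    "n": "\u253c",  # crossing lines
    "q": "\u2500",  # horizontal line
    "x": "\u2502",  # vertical line
}


def render(data: bytes) -> str:
    """Decode UTF-8 first, then apply the line-drawing substitution."""
    text = data.decode("utf-8")
    line_drawing_active = False
    out = []
    for ch in text:
        if ch == SO:
            line_drawing_active = True
        elif ch == SI:
            line_drawing_active = False
        elif line_drawing_active:
            # Characters without a mapping (including non-ASCII) pass through.
            out.append(LINE_DRAWING.get(ch, ch))
        else:
            out.append(ch)
    return "".join(out)


sample = render(b"\x0elqk\x0f caf\xc3\xa9")  # "┌─┐ café"
```

Because the substitution happens on already-decoded characters, multi-byte UTF-8 text keeps working while the line-drawing set is active.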
I thought of a couple of possible solutions to this.
This gives us the best of both worlds. Users that don't care about character set designations will be much less likely to accidentally switch them, but that functionality is still there for anyone that actually needs it. |
This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 4 days. It will be closed if no further activity occurs within 3 days of this comment. |
It seems like if we do 1, we can later do 2 with no perceived change to the user experience. Should we be storing the user-preferred supplemental set somewhere when we do 2? |
Yeah, that makes sense to me. Although we may want to skip the bit about setting the G-sets correctly when switching to ISO-2022 mode if we want to remain forward-compatible with
If you mean in a setting, then ideally yes. It was certainly a user-controlled setting in the original hardware terminals, and I suspect most real terminal emulators would have a setting for it. If combined with a way to start up in ISO-2022 mode, it would enable users to easily run legacy 8-bit applications designed for a particular code page. That said, I don't think there's a huge demand for that sort of thing anymore, so it's not essential if you don't want WT to go that route. Also, it would have to wait until we have a proper conpty pass-through implementation, since all the character set mapping is currently handled on the conhost side. |
## Summary of the Pull Request

There is a non-zero subset of applications that randomly output _Locking Shift_ escape sequences which will invoke a character set from G2 or G3 into the left half of the code table. If those G-sets are mapped to Latin1, that can result in the terminal producing output that appears to be broken. This PR now defaults all G-sets to ASCII, to prevent an unintentional _Locking Shift_ from having any effect.

## PR Checklist

* [x] Closes #10408
* [x] CLA signed.
* [ ] Tests added/passed
* [ ] Documentation updated.
* [ ] Schema updated.
* [x] I've discussed this with core contributors already. Issue number where discussion took place: #10408

## Detailed Description of the Pull Request / Additional comments

Most other modern terminals also default to ASCII in all G-sets, so this shouldn't break any modern applications. Legacy 8-bit applications may still expect the G2 and G3 sets mapped to Latin1, but they would also need to have the ISO-2022 encoding enabled, so we can keep them happy by setting G2 and G3 correctly when the ISO-2022 encoding is requested.

## Validation Steps Performed

I've manually confirmed that `echo -e "\en"` and `echo -e "\eo"` no longer have any visible effect on the output (at least without first invoking another character set into G2 or G3). I've also confirmed that they do still work as expected (i.e. selecting Latin1) after enabling the ISO-2022 encoding.
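To illustrate why the G-set defaults matter, here is a toy model of the behaviour described in the pull request. It is a sketch under stated assumptions, not the Windows Terminal implementation: the class `GSetModel` and its methods are invented for this example, only LS2 (`ESC n`) and LS3 (`ESC o`) are recognised, and Latin1 is modelled as a 96-character set whose left-half positions map onto U+00A0 to U+00FF.

```python
ASCII = "ascii"
LATIN1 = "latin1"  # 96-character Latin-1 supplemental set (modelling assumption)

LOCKING_SHIFTS = {"\x1bn": 2, "\x1bo": 3}  # LS2 and LS3


class GSetModel:
    """Toy model: four G-sets and the one currently invoked into GL."""

    def __init__(self, legacy_defaults: bool = False):
        supplemental = LATIN1 if legacy_defaults else ASCII
        self.gsets = [ASCII, ASCII, supplemental, supplemental]
        self.gl = 0

    def enable_iso2022(self) -> None:
        """Model the switch to ISO-2022 encoding: G2 and G3 become Latin1."""
        self.gsets[2] = self.gsets[3] = LATIN1

    def feed(self, text: str) -> str:
        """Return the characters that would be displayed for the given input."""
        out = []
        i = 0
        while i < len(text):
            pair = text[i:i + 2]
            if pair in LOCKING_SHIFTS:
                self.gl = LOCKING_SHIFTS[pair]
                i += 2
                continue
            ch = text[i]
            if self.gsets[self.gl] == LATIN1 and 0x20 <= ord(ch) <= 0x7F:
                ch = chr(ord(ch) + 0x80)  # GL position maps into U+00A0..U+00FF
            out.append(ch)
            i += 1
        return "".join(out)
```

With the new defaults, `GSetModel().feed("\x1boabc")` returns `"abc"`, so a stray LS3 has no visible effect. With `legacy_defaults=True`, or after `enable_iso2022()`, the same input yields `"áâã"`, which is the kind of output that appears broken to a user who did not ask for it.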
The same commit message appears again in the thread, marked "(cherry picked from commit 27e042b)".
🎉This issue was addressed in #11658, which has now been successfully released as Handy links: |
Windows Terminal version (or Windows build number)
1.8.1521.0
Other Software
OpenSSH_5.3p1
Steps to reproduce
echo -e '\x1bo'
Expected Behavior
I expect that terminal will continue to work properly
Actual Behavior
I can no longer type "actual" text in the terminal; every key press turns into the mess seen in the screenshot. Ctrl+key sequences also stop working.
I stumbled upon this issue while debugging gsm7 encodings in a Python program (\x1b is a common byte in this encoding). After yet another print(), the terminal became a mess.
The issue does not reproduce when I use powershell.exe or cmd.exe without embedding them in WT.
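When debugging byte-oriented encodings like this, one way to keep control bytes from reaching the terminal at all is to escape them before printing. The helper below is a minimal sketch; the name `visible` is hypothetical, and it escapes every C0 control (including tab and newline), DEL, and the C1 range.

```python
def visible(text: str) -> str:
    """Replace C0 controls, DEL, and C1 controls with \\xNN escapes."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x20 or 0x7F <= code <= 0x9F:
            out.append(f"\\x{code:02x}")
        else:
            out.append(ch)
    return "".join(out)


print(visible("price\x1bo100"))  # price\x1bo100
```

Python's built-in `repr()` gives a similar result for quick inspection, at the cost of adding quotes around the text.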