Default all G-sets to ASCII unless ISO-2022 is requested (was: Terminal breaks on \x1b O) #10408

Closed
iamdbychkov opened this issue Jun 11, 2021 · 24 comments · Fixed by #11658
Labels
Area-VT Virtual Terminal sequence support Help Wanted We encourage anyone to jump in on these. Issue-Bug It either shouldn't be doing this or needs an investigation. Issue-Task It's a feature request, but it doesn't really need a major design. Priority-2 A description (P2) Product-Conhost For issues in the Console codebase Product-Terminal The new Windows Terminal. Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release.

Comments

@iamdbychkov

iamdbychkov commented Jun 11, 2021

Windows Terminal version (or Windows build number)

1.8.1521.0

Other Software

OpenSSH_5.3p1

Steps to reproduce

  1. SSH to a Linux server (CentOS in my case, if it matters)
  2. echo -e '\x1bo'

Expected Behavior

I expect the terminal to continue working properly.

Actual Behavior

I can no longer type "actual" text in the terminal; every key press turns into the mess shown in the screenshot. CTRL+Something sequences also no longer work.

[image]

I stumbled upon this issue while debugging GSM-7 encodings in a Python program (\x1b is a common byte in this encoding). After yet another print(), the terminal became a mess.

The issue does not reproduce if I use powershell.exe or cmd.exe without it being embedded in WT.
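
For anyone debugging binary encodings in a similar way, here is a minimal sketch of printing such data without letting a stray ESC byte reach the terminal. The helper name `visible` and the sample payload are hypothetical and only for illustration. In the GSM 03.38 7-bit alphabet, 0x1B is the escape to the extension table (for example, 0x1B 0x65 encodes the euro sign), so raw payloads can easily contain sequences a terminal will interpret.

```python
def visible(data: bytes) -> str:
    """Return data with every non-printable byte rendered as \\xNN."""
    out = []
    for b in data:
        if 0x20 <= b <= 0x7E:
            out.append(chr(b))          # printable ASCII passes through
        else:
            out.append(f"\\x{b:02x}")   # control and high bytes are made visible
    return "".join(out)


# Hypothetical GSM-7 payload containing the escape byte followed by 'o',
# the same two bytes that trigger the problem described in this issue.
sample = b"hello\x1bo"
print(visible(sample))  # the ESC byte is shown as text, not sent to the terminal
```

Printing `repr(data)` achieves much the same thing; the point is only that the raw bytes should not be written to the terminal unmodified.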

@ghost ghost added Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting Needs-Tag-Fix Doesn't match tag requirements labels Jun 11, 2021
@KalleOlaviNiemitalo

ESC 06/15 is LS3 - LOCKING-SHIFT THREE. That was implemented in #4496.

Have you tried the same command in a different terminal, such as xterm or in the Linux console? I expect you'd get the same result with those.
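
The "06/15" notation above is the column/row form used by the ECMA standards: the byte value is column times 16 plus row. A small sketch of the conversion (the function name `from_column_row` is made up for this example):

```python
def from_column_row(column: int, row: int) -> int:
    """Convert ECMA-style column/row notation (e.g. 06/15) to a byte value."""
    if not (0 <= column <= 15 and 0 <= row <= 15):
        raise ValueError("column and row must each be in the range 0..15")
    return column * 16 + row


final = from_column_row(6, 15)
print(hex(final), repr(chr(final)))  # 0x6f 'o'
```

So ESC 06/15 is ESC followed by a lowercase `o`, which is exactly what `echo -e '\x1bo'` sends. By the same arithmetic, ESC 06/14 (LS2) is ESC followed by `n`.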

@iamdbychkov
Author

iamdbychkov commented Jun 11, 2021

ESC 06/15 is LS3 - LOCKING-SHIFT THREE. That was implemented in #4496.

Have you tried the same command in a different terminal, such as xterm or in the Linux console? I expect you'd get the same result with those.

I have no problem running the mentioned command on the same server when using PuTTY as my GUI SSH client, or WinSSHTerm (which is PuTTY), or PenguiNet, and no problems with Debian's default console.

@KalleOlaviNiemitalo

From https://www.man7.org/linux/man-pages/man4/console_codes.4.html, I get the impression that the Linux console does not implement LS3, while xterm does.

@KalleOlaviNiemitalo

Is there already a feature request for an action that resets the character sets to the defaults? Similar to #1882 for the scrollback.

@iamdbychkov iamdbychkov changed the title Winodws terminal breaks on \x1b O if connected to remote server using ssh. Windows terminal breaks on \x1b O if connected to a remote server using ssh. Jun 11, 2021
@j4james
Collaborator

j4james commented Jun 11, 2021

FYI, typing reset in bash should fix this. I also sometimes include a soft reset (\x1b[!p) in my prompt, which fixes most issues like this automatically. But an action would be good too - I think most terminal emulators have something like that. I'm sure we've discussed this before actually, but I can't find an existing issue for it now, so maybe I'm misremembering.

@iamdbychkov
Author

Yes, thank you, typing reset in bash did indeed work, but unfortunately it won't work in environments like the Python console, which is where I initially discovered the issue.

@j4james
Collaborator

j4james commented Jun 13, 2021

If it happens again in a python console, you can use print '\x1bc' to reset. Not suggesting that's a good long term solution, but it's better than having to exit and restart.
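
For a Python 3 console, the same workaround could look roughly like the sketch below. The two escape sequences are the ones mentioned in this thread (`\x1bc` for a full reset, `\x1b[!p` for a soft reset); the function names are hypothetical.

```python
import sys

RIS = "\x1bc"       # ESC c: full reset of the terminal
DECSTR = "\x1b[!p"  # CSI ! p: soft reset


def reset_sequence(soft: bool = False) -> str:
    """Return the escape sequence for a soft or full terminal reset."""
    return DECSTR if soft else RIS


def reset_terminal(soft: bool = False) -> None:
    """Write the reset sequence to stdout and flush so it takes effect at once."""
    sys.stdout.write(reset_sequence(soft))
    sys.stdout.flush()


# Usage from an interactive session whose output has become garbled:
#     reset_terminal()           # full reset, also clears the screen
#     reset_terminal(soft=True)  # soft reset
```

The full reset is the more disruptive of the two, so the soft reset is probably worth trying first.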

@WSLUser
Contributor

WSLUser commented Jun 14, 2021

Other Software
OpenSSH_5.3p1

Can you bump your Win32-OpenSSH to 8.6? https://github.com/PowerShell/Win32-OpenSSH/releases

Before 8.1, it was handling VT sequences itself instead of allowing conhost to do so.

@iamdbychkov
Author

Can you bump your Win32-OpenSSH to 8.6? https://github.com/PowerShell/Win32-OpenSSH/releases

Before 8.1, it was handling VT sequences itself instead of allowing conhost to do so.

Sadly, it didn't make a difference:

PS C:\Program Files\OpenSSH> ./ssh -V
OpenSSH_for_Windows_8.6p1, LibreSSL 3.3.3
PS C:\Program Files\OpenSSH> ./ssh dev35p4
[dbychkov@dev35p4 ~]$ echo -e '\x1bo'

ÛäâùãèëïöÀäåö³µð´ þݤ 


@j4james
Collaborator

j4james commented Jun 15, 2021

I'm fairly certain OpenSSH has nothing to do with this. As KalleOlaviNiemitalo pointed out at the start, LS3 is a standard control sequence that is intended to change the character set - it's working as designed. You'll get the exact same behaviour in a local shell. What we're missing is a way for users to easily reset the terminal when it gets into a state that they weren't expecting.
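
To illustrate why the output looks the way it does, here is a simplified model of what happens once LS3 has invoked a Latin1 set into the left half of the code table: each printable 7-bit character is displayed as the Latin1 character 0x80 higher. The function name `apply_ls3_latin1` is hypothetical, and leaving space (0x20) untouched is a simplifying assumption rather than a statement about how any terminal handles it.

```python
def apply_ls3_latin1(text: str) -> str:
    """Map printable ASCII (0x21..0x7E) to the Latin1 character 0x80 higher."""
    out = []
    for ch in text:
        code = ord(ch)
        if 0x21 <= code <= 0x7E:
            out.append(chr(code | 0x80))  # e.g. 'a' (0x61) becomes 'á' (0xE1)
        else:
            out.append(ch)                # space and control characters unchanged
    return "".join(out)


print(apply_ls3_latin1("[dbychkov@dev35p4 ~]$"))
# ÛäâùãèëïöÀäåö³µð´ þݤ
```

Applied to the shell prompt from the earlier transcript, this reproduces the garbled line shown there, which is consistent with the prompt being printed correctly by the server and merely displayed through the shifted character set.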

@zadjii-msft
Member

It seems totally reasonable to me for us to have a "Reset Terminal" command ~~that's just a sendInput("\x1bc") command,~~ bound by default. Good thinking.

It can't just be a sendInput command - it'll need to immediately trigger it like it was sent as client output, but yea it's a good idea.

@zadjii-msft zadjii-msft added Area-TerminalControl Issues pertaining to the terminal control (input, selection, keybindings, mouse interaction, etc.) Help Wanted We encourage anyone to jump in on these. Issue-Task It's a feature request, but it doesn't really need a major design. Product-Terminal The new Windows Terminal. labels Jun 15, 2021
@ghost ghost removed the Needs-Tag-Fix Doesn't match tag requirements label Jun 15, 2021
@zadjii-msft zadjii-msft added this to the Terminal Backlog milestone Jun 15, 2021
@KalleOlaviNiemitalo

I'm not sure LS3 is really designed to work here, because it is defined in ECMA-35 (which claims to be identical to ISO/IEC 2022:1994), and UTF-8 is registered as a "Coding system different from that of ISO 2022". Perhaps a terminal that starts in UTF-8 mode could ignore LS3 and other ECMA-35 features until it receives DOCS (DESIGNATE OTHER CODING SYSTEM) for switching to ECMA-35 mode.

On the other hand, if Windows Terminal impersonates xterm, and xterm supports LS3, then I suppose Windows Terminal has to likewise support LS3.

@j4james
Collaborator

j4james commented Jun 15, 2021

Perhaps a terminal that starts in UTF-8 mode could ignore LS3 and other ECMA-35 features until it receives DOCS.

I had considered that, but the problem is that a lot of existing software expects a certain amount of ISO-2022 functionality even in UTF-8 mode - the most common case being to select the DEC line drawing character set.
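
As an example of the kind of usage meant here, this is a rough sketch of how an application typically draws a box with the DEC line drawing set: designate it into G0 with `ESC ( 0`, print ordinary ASCII letters that the terminal renders as line segments, then designate ASCII back with `ESC ( B`. The function name `box` is made up for illustration.

```python
import sys

ESC = "\x1b"
DESIGNATE_G0_DEC_GRAPHICS = ESC + "(0"  # G0 := DEC Special Graphics (line drawing)
DESIGNATE_G0_ASCII = ESC + "(B"         # G0 := ASCII


def box(width: int, height: int) -> str:
    """Build the byte stream for a box drawn with the DEC line drawing set."""
    top = "l" + "q" * width + "k"        # l, q, k render as upper-left, horizontal, upper-right
    middle = "x" + " " * width + "x"     # x renders as a vertical line
    bottom = "m" + "q" * width + "j"     # m, j render as lower-left, lower-right
    rows = [top] + [middle] * height + [bottom]
    return DESIGNATE_G0_DEC_GRAPHICS + "\r\n".join(rows) + DESIGNATE_G0_ASCII + "\r\n"


if __name__ == "__main__":
    sys.stdout.write(box(10, 2))
    sys.stdout.flush()
```

Software written this way sends only 7-bit characters, so it works regardless of whether the terminal is in UTF-8 mode, which is why simply ignoring all ISO-2022 sequences in UTF-8 mode would break it.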

@j4james
Collaborator

j4james commented Jun 15, 2021

It seems totally reasonable to me for us to have a "Reset Terminal" command ~~that's just a sendInput("\x1bc") command,~~ bound by default.

Note that this has similar problems to #1882, in that we need the reset to occur on the conhost side of the conpty pipe. So we either need something like #1193, or in the long term #1173.

@orcmid

orcmid commented Jun 15, 2021

@KalleOlaviNiemitalo LS3 is defined in ECMA-35 (which claims to be identical to ISO/IEC 2022:1994), and UTF-8 is registered as a "Coding system different from that of ISO 2022". Perhaps a terminal that starts in UTF-8 mode could ignore LS3 and other ECMA-35 features until it receives [a change of scheme sequence]

On the other hand, if Windows Terminal impersonates xterm, and xterm supports LS3, then I suppose Windows Terminal has to likewise support LS3.

It is stated in the roadmap that WT VT (Virtual Terminal) relies on ECMA-48, which does appeal to ECMA-35. If UTF8 is usable with VT, then some ECMA-35 fidgets might not be directly possible, since UTF8 commandeers the higher order bit of octets and the client/shell might need to cope with that.

Is the line-drawing different than the pixel-graphics that have also been asked for? There are line-drawing code points in Unicode though.

[I have not dug deep into this, just pondering that ECMA-48, UTF-8, and Virtual Terminal need to be examined for exactly how they apply to Windows Terminal.]

@DHowett
Member

DHowett commented Jun 17, 2021

I'm of two minds on this. The one that's currently winning the debate for me right now is that these are control sequences. They're not supposed to be emitted unless an attempt is being made to control a thing.

Is the line-drawing different than the pixel-graphics that have also been asked for? There are line-drawing code points in Unicode though.

Yep, line drawing (emitting full-cell glyphs from the Unicode block starting at U+2500) is a tried-and-true method of drawing TUI-like interfaces. One of the character sets, the DEC Special Graphics set, allows you to access them in the sub-0x7F range.

I'm inclined to follow what other terminal emulators do here. What do gnome-terminal, Konsole and Alacritty do here?

@DHowett DHowett added the Needs-Author-Feedback The original author of the issue/PR needs to come back and respond to something label Jun 17, 2021
@KalleOlaviNiemitalo

KalleOlaviNiemitalo commented Jun 20, 2021

Gnome-terminal uses VTE, from which most ISO 2022 support was removed in 2014 after discussion at https://bugzilla.gnome.org/show_bug.cgi?id=732586

vte/src/parser.cc has a lot of code for parsing the escape sequences that designate or invoke character sets, but vte/src/vteseq.cc now supports only the default charset and line drawing. It uses SI=LS0 and SO=LS1 for this purpose and ignores all other shifts and all character set designations. Also, vte/src/vte.cc decodes UTF-8 and passes Unicode scalar values to the parser, so the line-drawing charset does not disable UTF-8 decoding. Instead, if the line-drawing charset is in use, then vte/src/vte.cc replaces some ASCII characters with line-drawing characters after they come from the parser.
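
The approach described above can be sketched as follows. This is not VTE's code, only a simplified model of the idea: the input is already decoded to Unicode, SO and SI toggle a line-drawing flag, and while the flag is set a handful of ASCII letters are swapped for box-drawing characters after parsing. The sketch assumes the line-drawing set is what SO selects, and the table covers only a subset of the DEC Special Graphics set.

```python
SO = "\x0e"  # LS1: treated here as "switch to line drawing"
SI = "\x0f"  # LS0: back to the default charset

# Subset of the DEC Special Graphics mapping (ASCII letter -> Unicode box drawing).
LINE_DRAWING = str.maketrans({
    "j": "┘", "k": "┐", "l": "┌", "m": "└", "n": "┼", "q": "─",
    "t": "├", "u": "┤", "v": "┴", "w": "┬", "x": "│",
})


def render(decoded: str) -> str:
    """Apply the line-drawing substitution to already-decoded Unicode text."""
    out = []
    line_drawing = False
    for ch in decoded:
        if ch == SO:
            line_drawing = True
        elif ch == SI:
            line_drawing = False
        elif line_drawing:
            out.append(ch.translate(LINE_DRAWING))  # only mapped ASCII letters change
        else:
            out.append(ch)
    return "".join(out)


print(render("a" + SO + "lqk" + SI + "q"))  # a┌─┐q
```

Because the substitution happens on Unicode scalar values, non-ASCII text such as `é` passes through unchanged in either state, which is the property the comment above points out.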

@j4james
Collaborator

j4james commented Jun 20, 2021

I thought of a couple of possible solutions to this.

  1. We just default all the G-sets to ASCII, instead of having Latin1 in G2 and G3 (even XTerm seems to do this). For anyone that needs strict DEC character set compatibility we could always set them correctly when ISO-2022 mode is enabled.
  2. We add support for DECAUPSS (assign user preference supplemental set), and default that to ASCII. Since G2 and G3 are meant to be initialized with UPSS, they'd be ASCII by default, but users would still have a way to change that if necessary.

This gives us the best of both worlds. Users that don't care about character set designations will be much less likely to accidentally switch them, but that functionality is still there for anyone that actually needs it.
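
A toy model of option 1 may make the effect clearer. It is only a sketch under stated assumptions, not the conhost implementation: the class and method names are hypothetical, only GL (the left half of the code table) is modelled, and the Latin1 mapping is simplified to "add 0x80 to printable ASCII".

```python
ASCII, LATIN1 = "ASCII", "Latin1"


class CharsetState:
    """Four G-sets plus the index of the one currently invoked into GL."""

    def __init__(self, legacy_defaults: bool = False):
        upper = LATIN1 if legacy_defaults else ASCII
        self.gsets = [ASCII, ASCII, upper, upper]  # G0, G1, G2, G3
        self.gl = 0

    def locking_shift(self, n: int) -> None:
        """LS0..LS3: invoke Gn into GL (LS2 is ESC n, LS3 is ESC o)."""
        if n not in (0, 1, 2, 3):
            raise ValueError("G-set index must be 0..3")
        self.gl = n

    def select_iso2022(self) -> None:
        """Model of requesting the ISO-2022 encoding: G2 and G3 become Latin1."""
        self.gsets[2] = self.gsets[3] = LATIN1

    def print_char(self, ch: str) -> str:
        """Return the character that would be displayed for the input ch."""
        if self.gsets[self.gl] == LATIN1 and 0x21 <= ord(ch) <= 0x7E:
            return chr(ord(ch) | 0x80)
        return ch


old = CharsetState(legacy_defaults=True)
old.locking_shift(3)
print(old.print_char("a"))  # á  (accidental LS3 garbles output)

new = CharsetState()
new.locking_shift(3)
print(new.print_char("a"))  # a  (G3 is ASCII, so LS3 has no visible effect)
```

With all G-sets defaulting to ASCII, a stray locking shift is harmless, while calling `select_iso2022()` first restores the Latin1 behaviour for software that asks for it.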

@ghost ghost added the No-Recent-Activity This issue/PR is going stale and may be auto-closed without further activity. label Jun 24, 2021
@ghost

ghost commented Jun 24, 2021

This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 4 days. It will be closed if no further activity occurs within 3 days of this comment.

@DHowett DHowett removed No-Recent-Activity This issue/PR is going stale and may be auto-closed without further activity. Needs-Author-Feedback The original author of the issue/PR needs to come back and respond to something labels Jun 24, 2021
@DHowett
Member

DHowett commented Jul 6, 2021

It seems like if we do 1, we can later do 2 with no perceived change to the user experience. Should we be storing the user preference supplemental set somewhere, when we do do 2?

@DHowett DHowett changed the title Windows terminal breaks on \x1b O if connected to a remote server using ssh. Default all G-sets to ASCII unless ISO-2022 is requested (was: Terminal breaks on \x1b O) Jul 6, 2021
@DHowett DHowett added Area-VT Virtual Terminal sequence support Product-Conhost For issues in the Console codebase Issue-Bug It either shouldn't be doing this or needs an investigation. Priority-2 A description (P2) and removed Area-TerminalControl Issues pertaining to the terminal control (input, selection, keybindings, mouse interaction, etc.) Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting labels Jul 6, 2021
@j4james
Collaborator

j4james commented Jul 6, 2021

It seems like if we do 1, we can later do 2 with no perceived change to the user experience.

Yeah, that makes sense to me. Although we may want to skip the bit about setting the G-sets correctly when switching to ISO-2022 mode if we want to remain forward-compatible with DECAUPSS being added later. I'm not totally sure about that, but either way is OK with me. As long as there is some series of escape sequences a user can send to get things the way they want, then I think we're good.

Should we be storing the user preference supplemental set somewhere, when we do do 2?

If you mean in a setting, then ideally yes. It was certainly a user-controlled setting in the original hardware terminals, and I suspect most real terminal emulators would have a setting for it. If combined with a way to start up in ISO-2022 mode, it would enable users to easily run legacy 8-bit applications designed for a particular code page.

That said, I don't think there's a huge demand for that sort of thing anymore, so it's not essential if you don't want WT to go that route. Also, it would have to wait until we have a proper conpty pass-through implementation, since all the character set mapping is currently handled on the conhost side.

@ghost ghost added the In-PR This issue has a related PR label Oct 30, 2021
@ghost ghost closed this as completed in #11658 Nov 3, 2021
@ghost ghost added Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release. and removed In-PR This issue has a related PR labels Nov 3, 2021
ghost pushed a commit that referenced this issue Nov 3, 2021
## Summary of the Pull Request

There is a non-zero subset of applications that randomly output _Locking Shift_ escape sequences which will invoke a character set from G2 or G3 into the left half of the code table. If those G-sets are mapped to Latin1, that can result in the terminal producing output that appears to be broken. This PR now defaults all G-sets to ASCII, to prevent an unintentional _Locking Shift_ from having any effect.

## PR Checklist
* [x] Closes #10408
* [x] CLA signed.
* [ ] Tests added/passed
* [ ] Documentation updated.
* [ ] Schema updated.
* [x] I've discussed this with core contributors already. Issue number where discussion took place: #10408

## Detailed Description of the Pull Request / Additional comments

Most other modern terminals also default to ASCII in all G-sets, so this shouldn't break any modern applications. Legacy 8-bit applications may still expect the G2 and G3 sets mapped to Latin1, but they would also need to have the ISO-2022 encoding enabled, so we can keep them happy by setting G2 and G3 correctly when the ISO-2022 encoding is requested.

## Validation Steps Performed

I've manually confirmed that `echo -e "\en"` and `echo -e "\eo"` no longer have any visible effect on the output (at least without first invoking another character set into G2 or G3). I've also confirmed that they do still work as expected (i.e. selecting Latin1) after enabling the ISO-2022 encoding.
DHowett pushed a commit that referenced this issue Dec 13, 2021
(cherry picked from commit 27e042b)
DHowett pushed a commit that referenced this issue Dec 13, 2021
(cherry picked from commit 27e042b)
@ghost

ghost commented Dec 14, 2021

🎉 This issue was addressed in #11658, which has now been successfully released as Windows Terminal Preview v1.12.3472.0. 🎉

@ghost

ghost commented Dec 14, 2021

🎉 This issue was addressed in #11658, which has now been successfully released as Windows Terminal v1.11.3471.0. 🎉

This issue was closed.