How to _not_ use UTF-8 (connection to server with LANG="en_US.iso885915")? #1855
Comments
😕 ... and how do you apply that to an ssh connection to a Linux host?
... @mgkuhn and your solution is? |
Ah sorry, I had misunderstood your scenario (too quick reading). OpenSSH for Windows only supports UTF-8, so you need to use an encoding translation tool on Linux. There were some developed about 15-20 years ago, when Linux migrated from ISO 8859 and other 8-bit encodings to UTF-8. The main one I recall from that time is luit (Juliusz Chroboczek, ~2001). I haven't used that in more than a decade, but it still seems to work fine on Ubuntu:
(Instead of using the So I don't think OpenSSH for Windows needs to implement anything here. |
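To make the idea of an "encoding translation tool" concrete, here is a minimal Python sketch of the two conversions such a layer has to perform between an ISO-8859-15 server and a UTF-8 terminal. It only illustrates the concept and is not how luit itself is implemented. The function names are made up for this example.

```python
# Conceptual sketch only: illustrates the byte translation, not luit's internals.
SERVER_ENCODING = "iso8859_15"  # matches LANG="en_US.iso885915" on the server
TERMINAL_ENCODING = "utf-8"     # what the UTF-8-only client side expects


def server_to_terminal(data: bytes) -> bytes:
    """Re-encode program output from the server for a UTF-8 terminal."""
    return data.decode(SERVER_ENCODING).encode(TERMINAL_ENCODING)


def terminal_to_server(data: bytes) -> bytes:
    """Re-encode keyboard input from a UTF-8 terminal for the server.

    Characters that do not exist in ISO-8859-15 are replaced by '?'.
    """
    text = data.decode(TERMINAL_ENCODING)
    return text.encode(SERVER_ENCODING, errors="replace")


if __name__ == "__main__":
    # The euro sign is the single byte 0xA4 in ISO-8859-15.
    print(server_to_terminal(b"Price: 10 \xa4"))
    print(terminal_to_server("10 €".encode("utf-8")))
```

A real translator works on a stream, so it would also need incremental decoding to cope with a multi-byte UTF-8 sequence that is split across two reads.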
I've guessed so - and the point was how to let OpenSSH for Windows correctly recognize the encoding/LANG from the server it connects to or the client it connects from (so just set it, then run ssh.exe). Would it be possible for OpenSSH to check LANG and use for example
As you seem to have experience with that: how should I tell ssh.exe to use luit to convert its "only supports UTF-8" encoding, so that the client gets ISO-8859-15 when OpenSSH for Windows sends a UTF-8 character? I currently use it as follows:
Any ideas / insights? |
I would check (with Also, you may want to play with adding ssh.exe option -t in the last three examples, to force allocation of a pseudo-TTY device (pty) on the server, such that luit still thinks it runs inside a real terminal. |
Not a good idea for at least two reasons:
a) This forum is just about the Win32 port of OpenSSH. It is not good practice for the Windows porters to add new functionality here that goes beyond what is specifically only required for Windows. Adding anything else here to the port would just create an ongoing maintenance overhead, i.e. more work when merging in each new upstream release. So if you wanted to make that suggestion, you should take it upstream. I would give the suggestion little chance though, as "luit ssh" probably does already what is needed on Unix-style operating systems.
b) Good software architecture keeps things modular and avoids cramming all possible functionality into every single tool. For example, it would seem to me far neater to port
Or you could talk to the authors of Windows Terminal or other Windows terminal emulators and ask if one of them might be interested in adding ISO 8859 support (for "retro computing" ;-).
Keep in mind that PuTTY is both a terminal emulator and an SSH client in one program, and the multiple-character-encoding support is really more part of the terminal emulator part of PuTTY. The SSH part just transports bytes across the wire and therefore cares little about which flavour of 8-bit ASCII extension you might be using. |
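The point that the SSH transport only moves bytes, and that their meaning is decided by whoever decodes them, can be shown with a short Python sketch. The helper name `interpretations` is made up for this illustration.

```python
def interpretations(raw: bytes) -> dict:
    """Decode the same bytes under several encodings.

    A value of None means the bytes are not valid in that encoding.
    """
    result = {}
    for codec in ("iso8859_15", "iso8859_1", "cp437", "utf-8"):
        try:
            result[codec] = raw.decode(codec)
        except UnicodeDecodeError:
            result[codec] = None
    return result


if __name__ == "__main__":
    # One byte, as it would travel unchanged through the SSH channel.
    for codec, text in interpretations(b"\xa4").items():
        print(f"{codec:>10}: {text!r}")
```

The byte that the server means as a euro sign turns into a different character, or into invalid input, depending on the encoding the terminal side assumes.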
Recent Windows Terminal versions now support many 8-bit character encodings, including all ISO 8859 code pages. But even if we change to e.g. ISO 8859-2 (chcp 28592), OpenSSH switches back to UTF-8 :-( |
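For readers unfamiliar with the numbers passed to chcp: the following Python sketch maps a few Windows code page identifiers to the corresponding Python codec names and shows how the same byte reads under each. The table is deliberately partial and the helper name is made up for this illustration.

```python
# Partial table: Windows code page identifier -> Python codec name.
CODE_PAGE_TO_CODEC = {
    437: "cp437",          # legacy OEM (MS-DOS) code page
    28591: "iso8859_1",    # ISO 8859-1 (Latin-1)
    28592: "iso8859_2",    # ISO 8859-2 (Latin-2), as in "chcp 28592"
    28605: "iso8859_15",   # ISO 8859-15 (Latin-9)
    65001: "utf-8",
}


def codec_for_code_page(code_page: int):
    """Return the codec name for a code page, or None if it is not in the table."""
    return CODE_PAGE_TO_CODEC.get(code_page)


if __name__ == "__main__":
    byte = b"\xf5"
    for cp in (28591, 28592):
        print(cp, byte.decode(codec_for_code_page(cp)))
```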
@tgauth Can you please check if this can make it to a backlog entry for further integration, especially as @zadjii-msft closed other entries in favor of this one? |
Was chcp 28592 run before and after running the ssh command? |
That question is to be answered by @szaszg. The original issue was connecting from Windows Server (tested with several chcp values) to a server that has |
The OpenSSH for Windows tools currently always set their console output encoding to UTF-8, by calling
This currently makes a lot of sense: the vast majority of SSH communication today uses UTF-8, however the two terminal emulators that Microsoft provides (cmd.exe and Windows Terminal) both currently default to CP437, for historic (MS-DOS) backwards-compatibility reasons, and since that encoding has never been used by any other platform, it is not at all a useful default for SSH users. Once some future Windows version changes its terminal emulators to run in UTF-8 by default, the above call can hopefully be removed again. At that point, SSH could again remain agnostic of which ASCII extension is used over its connection, and it would just pass on bytes transparently between application and terminal emulator, like it always has on Unix-like systems.
There are of course always more HACKs possible that could be added to
to something like
One would, of course, also have to look at how input is dealt with. And I don't know if there are any other code pages than CP_437 used by default in some localized versions of the platform, in which case they should be added as well. I'm not suggesting actually doing this. |
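The original code snippets from this comment are not preserved here, so the following Python sketch only guesses at the kind of decision described: switch to UTF-8 only when the console is still on a legacy default code page, and otherwise leave the user's choice alone. All names are hypothetical, and the real port is written in C against the Win32 console API.

```python
CP_UTF8 = 65001

# Code pages treated as "untouched legacy default". Only 437 is named in the
# discussion; localized Windows versions may need further entries.
LEGACY_DEFAULT_CODE_PAGES = {437}


def choose_output_code_page(current_code_page: int) -> int:
    """Return the code page the console should use for SSH output.

    If the console is still on a legacy default, switch to UTF-8.
    If the user selected something else (e.g. via chcp), keep it.
    """
    if current_code_page in LEGACY_DEFAULT_CODE_PAGES:
        return CP_UTF8
    return current_code_page


if __name__ == "__main__":
    for cp in (437, 28605, 65001):
        print(cp, "->", choose_output_code_page(cp))
```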
Thank you for taking the time to analyze and share this issue. Wouldn't it be possible to change the unconditional call so that it first checks an environment variable and/or a Win32-specific OpenSSH setting (I don't know if there are others already), and only if nothing is set does the call to
This would allow one to:
and it should "just work", no?
Yes, that's an open point - but with the change suggested above someone can actually test how this works (it is not unlikely that this is already "enough"). |
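A minimal Python sketch of the opt-out proposed in this comment, assuming an environment variable decides whether the forced switch to UTF-8 happens. The variable name `OPENSSH_KEEP_CONSOLE_CP` is purely hypothetical; no such setting is known to exist in OpenSSH for Windows.

```python
import os

# Hypothetical variable name, invented for this illustration.
OVERRIDE_VAR = "OPENSSH_KEEP_CONSOLE_CP"


def should_force_utf8(environ=None) -> bool:
    """Return True if the console should be switched to UTF-8.

    The switch is skipped when the (hypothetical) override variable is set
    to a non-empty value, so a code page chosen earlier with chcp is kept.
    """
    if environ is None:
        environ = os.environ
    return not environ.get(OVERRIDE_VAR, "").strip()


if __name__ == "__main__":
    print(should_force_utf8({}))                   # variable not set
    print(should_force_utf8({OVERRIDE_VAR: "1"}))  # user opted out
```

As noted in the thread, this would only cover the output side. Input handling and the internal UTF-16 conversions would still need to be checked.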
If you look yourself through the code in Many of these UTF-16 wide-character APIs will also have 8-bit equivalents, but (not being a seasoned Win32 developer myself) I have no idea what fraction of these could deal with a multi-byte encoding like UTF-8. So I assume there may be good reasons for why OpenSSH for Windows currently does a lot of character encoding conversion itself, as opposed to just passing on UTF-8 (or whatever other 8-bit encoding you want) to a multi-byte API. |
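To illustrate why a port built on UTF-16 wide-character APIs ends up converting encodings itself, here is a small Python sketch of the UTF-8 to UTF-16 step and of how the byte counts differ from the character count. The function name is made up, and this only mirrors the idea, not the port's actual C code.

```python
def utf8_to_utf16le(data: bytes) -> bytes:
    """Convert UTF-8 bytes to UTF-16LE, the layout wide-character APIs expect."""
    return data.decode("utf-8").encode("utf-16-le")


if __name__ == "__main__":
    text = "10 €"
    utf8 = text.encode("utf-8")
    utf16 = utf8_to_utf16le(utf8)
    # Characters, UTF-8 bytes, and UTF-16 bytes are three different counts,
    # which is why an API that assumes one byte per character cannot be
    # handed UTF-8 without care.
    print(len(text), len(utf8), len(utf16))
```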
"OpenSSH for Windows" version
OpenSSH_for_Windows_8.6p1, LibreSSL 3.3.3
Server OperatingSystem
RHEL7, CentOS8 (likely doesn't matter)
Client OperatingSystem
Windows Server 2019 Datacenter
What is failing
The server and a bunch of tools installed must use iso885915 encoding, so that is set globally via LANG="en_US.iso885915".
When connecting to the server with PuTTY and configured "Remote character set" ISO-8859-15: 1999 (Latin-9, "euro") (found under Configuration->Window->Translation) then everything works fine - tools generate output in that character set, the client can display them properly and when entering data it gets in as expected.
But that isn't the case with OpenSSH :-(
Expected output
With PuTTY
Actual output
With OpenSSH
Is there any way with this OpenSSH version to not use UTF-8?