Some more characters missing for codepage 437 and 850 console compatibility #205
Comments
I asked this question on the other thread, but it is probably better to continue the discussion here. As for the control characters: I've never actually included those in a font, and I'm not sure I fully understand their purpose. Can you provide more info? Thanks!
CP437 and CP850 map
(Copy of the answer in the other thread, just in case someone else is reading) @aaronbell You can leave #142 closed, as I didn't include many details about the 0x01 to 0x1F range in that bug report besides "providing all 256 characters from CP437 would highly improve compatibility with original PC console/terminal outputs". The C0 control characters themselves are not glyphs in a font and not printable characters in the ASCII standard; they are in-band controls for CUI apps. This means the extra characters stored at those values are not usable by CUI apps that simply write to the console like a stream. They are even less common than high-ASCII, but they can be used by apps that control the whole screen buffer, for example text-based GUI apps. These extra characters are not part of ASCII, but they have been usable since the early IBM PC days. From a font point of view, you don't need to care about the C0 control characters. You only need to know that the original MDA and CGA adapters reused these values to provide a few more characters to apps that handle the text screen buffer directly, and that the corresponding glyphs therefore need to be included in your set if you want your font to be compatible with the original console.
Thanks @PhMajerus. I'm a bit confused, though, since you say both "these are not glyphs in a font" and "therefore these glyphs need to be included in your set". Unless you mean that they don't need to have any content and can just be blank zero-width glyphs? Also, per @ExE-Boss' comment, it sounds like the specific codepages already map to those control characters, so is this more a matter of general console support than of codepage support?
@aaronbell If sent to a terminal over a serial link or through the WriteConsoleA API function, they are in-band controls that are processed and do not render anything. Your font therefore doesn't need anything for the C0 control characters, and you don't need to understand how they work any more than you'd need to understand how VT control sequences set colors and such. However, when writing directly to the text screen memory of a display adapter, every value can be a character, so the original IBM PC MDA reused these values to provide 31 more characters (not 32, because 0x00 is a special case: it is rendered as a space because it is used as the string terminator in the C language, even if technically they could have used it as well). Every graphics adapter and text-screen emulator since then, including the NT console, has inherited that behavior. So you don't need to worry about C0 control characters, as these are not characters from a font point of view, but you do need to care about the MDA extra characters that reuse the same values, because apps that access video memory directly have been able to take advantage of them since the very first IBM PC. As far as I know, localized codepages only change the high-ASCII characters 0x80 to 0xFF, so the 31 characters added in the C0 area (0x01 to 0x1F) should be the same for all codepages. Did I manage to make it a bit clearer? Don't hesitate to let me know; it's a tricky thing to understand even for software developers, as it is legacy behavior rooted in physical serial links and video memory.
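To make the stream-versus-screen-memory distinction above concrete, here is a minimal toy model in Python. It is only a sketch: the `TextScreen` class and its `poke` and `write_stream` methods are hypothetical names invented for illustration, it models just four controls (BEL, BS, LF, CR), and it does not reproduce how any real console or adapter is implemented.

```python
class TextScreen:
    """Toy text screen: one byte per cell (colour attributes omitted)."""

    def __init__(self, cols=8, rows=2):
        self.cols, self.rows = cols, rows
        self.cells = bytearray(b" " * (cols * rows))
        self.cursor = 0

    def poke(self, offset, value):
        """Direct buffer access: any byte value simply becomes a cell."""
        self.cells[offset] = value

    def write_stream(self, data):
        """Stream output: some C0 values act as controls, not as cells."""
        last = len(self.cells) - 1
        for b in data:
            if b == 0x07:      # BEL: would beep, nothing is drawn
                continue
            if b == 0x08:      # BS: move the cursor one cell left
                self.cursor = max(self.cursor - 1, 0)
            elif b == 0x0D:    # CR: back to the start of the current row
                self.cursor -= self.cursor % self.cols
            elif b == 0x0A:    # LF: down one row (no scrolling in this toy)
                self.cursor = min(self.cursor + self.cols, last)
            else:              # everything else is stored as a character
                self.cells[self.cursor] = b
                self.cursor = min(self.cursor + 1, last)


screen = TextScreen()
screen.write_stream(b"AB\rC")   # CR moves the cursor, so C overwrites A
screen.poke(3, 0x0D)            # the same value, poked directly, fills a cell
```

In this model the byte 0x0D never reaches a cell through `write_stream`, but `poke` stores it like any other value, which is the case where the font needs a glyph (♪) for it.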
I think I understand 😅. Anyway, it sounds like everything will be good once the last remaining symbol codepoints are added!
@aaronbell Also, remember that these are just to complete the support for CUI apps that use 8-bit characters for the codepages you choose to support. If you want to make sure all CUI apps work as intended with Cascadia, you probably should try to include all the characters found in Lucida Console at some point. See https://docs.microsoft.com/en-us/typography/font-list/lucida-console for the details and list of codepages supported by Lucida Console.
This version does not yet support codepages 437 and 850 completely.
I guess I should have included more details in #142, but the missing characters are hard to produce with easily reproducible steps in cmd.exe.
The last missing glyphs are in the characters for values 0x01 to 0x1F, which are typically used as C0 control characters, but can be output as characters by CUI apps that directly control the screen buffer.
The same characters with Consolas font:
So for complete compatibility with older CUI apps, the following characters are also required:
Hex 0123456789ABCDEF
0_   ☺☻♥♦♣♠•◘○◙♂♀♪♫☼
1_  ►◄↕‼¶§▬↨↑↓→←∟↔▲▼
Their Unicode codepoints are as follows:
0_ : (NULL), U+263A, U+263B, U+2665, U+2666, U+2663, U+2660, U+2022, U+25D8, U+25CB, U+25D9, U+2642, U+2640, U+266A, U+266B, U+263C
1_ : U+25BA, U+25C4, U+2195, U+203C, U+00B6, U+00A7, U+25AC, U+21A8, U+2191, U+2193, U+2192, U+2190, U+221F, U+2194, U+25B2, U+25BC
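For anyone who wants to preview these characters from raw bytes, the list above can be turned into a small lookup table. This is a minimal Python sketch, not an official mapping: the names `C0_GLYPHS` and `screen_bytes_to_text` are made up for illustration, and mapping 0x00 to a space is an assumption that follows the "rendered as a space" behavior described in this thread. Python's built-in `cp437` codec decodes 0x01 to 0x1F as the control characters, so the sketch substitutes the glyphs afterwards.

```python
# Code points for byte values 0x00-0x1F, copied from the list above.
# 0x00 is mapped to U+0020 (space) here as an assumption.
C0_GLYPHS = [
    0x0020, 0x263A, 0x263B, 0x2665, 0x2666, 0x2663, 0x2660, 0x2022,
    0x25D8, 0x25CB, 0x25D9, 0x2642, 0x2640, 0x266A, 0x266B, 0x263C,
    0x25BA, 0x25C4, 0x2195, 0x203C, 0x00B6, 0x00A7, 0x25AC, 0x21A8,
    0x2191, 0x2193, 0x2192, 0x2190, 0x221F, 0x2194, 0x25B2, 0x25BC,
]


def screen_bytes_to_text(data: bytes) -> str:
    """Decode CP437 screen-buffer bytes, showing 0x00-0x1F as glyphs."""
    decoded = data.decode("cp437")  # leaves 0x00-0x1F as control characters
    return "".join(
        chr(C0_GLYPHS[ord(ch)]) if ord(ch) < 0x20 else ch
        for ch in decoded
    )


# Bytes as they might sit in a text screen buffer.
print(screen_bytes_to_text(b"\x01\x02 Hello \x0e\x0f"))
```

Bytes from 0x20 upwards go through the normal `cp437` decoding unchanged, so only the 32 low values are affected by the table.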
As far as I can tell, the characters at 0x01, 0x02, 0x0E and 0x0F are the ones still missing. Their codepoints are U+263A, U+263B, U+266B and U+263C: the two smiley faces, the double music note and the sun.
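A check like this can be repeated for any font once its set of supported code points is known. The sketch below uses only the Python standard library; `REQUIRED`, `missing_glyphs` and the `supported` set are hypothetical names for illustration, and how `supported` gets filled (for example from the font's character map with a font tool) is left as an assumption. The example set simply simulates the four gaps reported above.

```python
import unicodedata

# Code points required for byte values 0x01-0x1F, from the list above.
REQUIRED = [
    0x263A, 0x263B, 0x2665, 0x2666, 0x2663, 0x2660, 0x2022, 0x25D8,
    0x25CB, 0x25D9, 0x2642, 0x2640, 0x266A, 0x266B, 0x263C, 0x25BA,
    0x25C4, 0x2195, 0x203C, 0x00B6, 0x00A7, 0x25AC, 0x21A8, 0x2191,
    0x2193, 0x2192, 0x2190, 0x221F, 0x2194, 0x25B2, 0x25BC,
]


def missing_glyphs(supported):
    """Return (byte value, code point, Unicode name) for each missing glyph."""
    return [
        (byte, cp, unicodedata.name(chr(cp)))
        for byte, cp in enumerate(REQUIRED, start=0x01)
        if cp not in supported
    ]


# Hypothetical font coverage: everything except the four reported gaps.
supported = set(REQUIRED) - {0x263A, 0x263B, 0x266B, 0x263C}
for byte, cp, name in missing_glyphs(supported):
    print(f"0x{byte:02X}  U+{cp:04X}  {name}")
```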
It would be great to have these included for full console compatibility.