Some PUA characters don't show #275
Comments
Well, since Unicode does not define the characteristics of PUA characters, it's not possible to determine the printable size of each character. Any PUA character could be a normal one-space printable character, or it could be a combining or control character (zero width), a double-width character, or anything else. Treating them as binary seems the safest way to keep the screen display correct. However, I see your point that in most cases the user would want the characters to display directly. Perhaps there could be an extension to the LESSCHARDEF syntax that would allow the user to specify how each PUA character should be treated.
Commit dc4fa8c adds the environment variable LESSUTFCHARDEF, which can be used to set the type of Private Use (or any) characters.
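For readers who want to try this, here is a minimal sketch that builds a LESSUTFCHARDEF value marking every Private Use range as printable. The three ranges are the ones Unicode reserves for private use. The `START-END:type` syntax with `p` for printable is an assumption based on my reading of the less man page, so check it against your installed version. The helper name `lessutfchardef` is hypothetical and not part of less.

```python
# Private Use ranges reserved by Unicode (inclusive code points).
PUA_RANGES = [
    (0xE000, 0xF8FF),      # BMP Private Use Area
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B
]


def lessutfchardef(ranges, char_type="p"):
    """Join ranges as 'START-END:type,...' (assumed LESSUTFCHARDEF syntax)."""
    return ",".join(f"{lo:X}-{hi:X}:{char_type}" for lo, hi in ranges)


if __name__ == "__main__":
    # Print a line that can be pasted into a shell profile.
    print(f"export LESSUTFCHARDEF={lessutfchardef(PUA_RANGES)}")
```

Individual code points that behave differently in a particular font (for example a zero-width or double-width glyph) would need their own entries with a different type letter.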
…hars by default. cf. gwsw/less#275
Can you point out where Unicode says a PUA codepoint can be a control character? I believe this is not the case. I think a much saner approach would be to handle all PUA characters as ordinary printable characters, because that is the usual case. If people have it somehow different in their client (terminal emulator), they could use
Section 23.5 in the Unicode Core Specification says
and
I interpret this to mean that any given PU character might or might not advance one space in the terminal. Before LESSUTFCHARDEF was implemented, the safe approach was to treat all PU chars as control (that is, with unknown behavior). Now that LESSUTFCHARDEF is available as an override, it may make sense to treat PU chars as normal by default, requiring LESSUTFCHARDEF to be used only when a char is not a one-space printable char. It does seem plausible that in most cases PU chars are printable rather than control, combining, etc.
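To illustrate why the display width cannot be derived from Unicode data alone, the sketch below queries Python's `unicodedata` module for a few code points. Private Use code points carry the default General_Category `Co`, which says nothing about how a given font or terminal renders them. The helper names `is_private_use` and `describe` are hypothetical, and this is only an illustration, not how less classifies characters.

```python
import unicodedata


def is_private_use(cp):
    """True if the code point's General_Category is Co (Private Use)."""
    return unicodedata.category(chr(cp)) == "Co"


def describe(cp):
    """Return (label, General_Category, East_Asian_Width) for a code point."""
    ch = chr(cp)
    return (
        f"U+{cp:04X}",
        unicodedata.category(ch),
        unicodedata.east_asian_width(ch),
    )


if __name__ == "__main__":
    # Three Private Use code points and, for contrast, the letter A.
    for cp in (0xE000, 0xF0000, 0x10FFFD, 0x0041):
        print(describe(cp), "private use" if is_private_use(cp) else "assigned")
```

The category is the same for every Private Use code point, so any per-character difference in width has to come from outside the standard, which is the role LESSUTFCHARDEF plays.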
Thanks for the answer! And I believe you are right! In the paragraph
only 'Graphic' ('printables') (Lu, Nd) are mentioned, but it seems possible to set gc = Cc 🤔 Out in the wild I have not seen such a thing (which might say something, or maybe it's completely irrelevant); they were always Graphic characters, sometimes ligatures that should have a different codepoint (like putting 'fi-lig' onto Thank you again, Fini
Hi!
Using less, some Private Use Area characters don't show.
OS: macOS 12.5.1 Monterey
less: v590
Unicode - Private Use Area
It seems that PUA characters are treated as binary.
PUA characters should not be treated as binary, because the Unicode specification does not define their purpose.
I would expect PUA characters to be displayed as-is.
FWIW
Running the following script,
there seems to be a problem with the range definition of Binary too.
note: using nerd-fonts for screenshot.