"\200" and "\377" show as "?" on SPARC Solaris 11 #2194

Closed
dimpase opened this issue Feb 19, 2018 · 4 comments · Fixed by #4022
Labels
status: wontfix Issues that it has been decided will not be addressed

Comments

@dimpase
Member

dimpase commented Feb 19, 2018

Observed behaviour

gap> x:="\0xFF";
"�"
gap> "\377";
"�"
gap> "\200";
"�"

Expected behaviour

gap> x:="\0xFF";
"\377"
gap> "\377";
"\377"
gap> "\200";
"\200"
 *********   GAP 4.8.8-5085-gb6d4e8e-dirty of today
 *  GAP  *   https://www.gap-system.org
 *********   Architecture: sparc-sun-solaris2.11-default64
 Configuration:  gmp 6.1.2, KernelDebug
 Loading the library and packages ...
 Packages:   AClib 1.2, Alnuth 3.1.0, AtlasRep 1.5.1, AutPGrp 1.8, CRISP 1.4.4, Cryst 4.1.13, CrystCat 1.1.6, CTblLib 1.2.2, FactInt 1.6.1, FGA 1.3.1,
             GAPDoc 1.6.1, IRREDSOL 1.4, LAGUNA 3.8.0, Polenta 1.3.8, Polycyclic 2.11, PrimGrp 3.3.0, RadiRoot 2.7, ResClasses 4.7.1, smallgrp 1.2,
             Sophus 1.23, SpinSym 1.5, TomLib 1.2.6, TransGrp 2.0.2, utils 0.53

This is git master as of 19.02.2018, built with gcc 5.4.0
(same with the 4.9 beta of 1 February).


\0x7E and \0x7F (i.e. octal 176 and 177) still work, but \0x80 (i.e. octal 200) does not.
So it looks like only the rightmost 7 bits work, and setting any bit to the left of
the rightmost 7 does not work correctly.

Note that the chars '\200', etc work as they should!
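
As a small illustration of where that boundary falls, the sketch below prints the hex and octal forms of the values mentioned above and checks whether each fits in seven bits. The helper name `fits_in_seven_bits` is made up for this example and is not part of GAP.

```python
def fits_in_seven_bits(byte: int) -> bool:
    """Return True if the byte value has its high (eighth) bit clear."""
    return (byte & 0x80) == 0


# The four values discussed in the report: two that display fine, two that do not.
for value in (0x7E, 0x7F, 0x80, 0xFF):
    print(f"hex {value:#04x}  octal {value:03o}  7-bit: {fits_in_seven_bits(value)}")
```

The values that display correctly are exactly the ones for which the helper returns True, i.e. the plain ASCII range.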


In case it matters, here are the locale settings:

LC_ALL=
LC_COLLATE=
LC_CTYPE=
LC_MESSAGES=
LC_MONETARY=
LC_NUMERIC=
LC_TIME=
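
For context on reading the listing above: under POSIX, an empty LC_* variable counts as unset, and the character-type locale is taken from LC_ALL, then LC_CTYPE, then LANG. The sketch below models that lookup on a plain dictionary. The function name is hypothetical, and returning "C" at the end stands in for the implementation-defined default, which is commonly C/POSIX but is an assumption here.

```python
def effective_ctype_locale(env: dict) -> str:
    """Return the locale name that would apply to LC_CTYPE.

    Precedence is LC_ALL, then LC_CTYPE, then LANG; empty values are
    treated as unset.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name, "")
        if value:
            return value
    # Assumption: the implementation-defined default is the C locale.
    return "C"


# With every LC_* variable empty, the result depends entirely on LANG.
print(effective_ctype_locale({"LC_ALL": "", "LC_CTYPE": "", "LANG": "en_US.UTF-8"}))
print(effective_ctype_locale({"LC_ALL": "", "LC_CTYPE": ""}))
```

Since the listing does not show LANG, it does not by itself determine which locale was in effect.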

Also, here is what I see running tests in testinstall for the 4.9 beta.

Diff in /datapool/dima/Sage/gap-4.9.0/tst/testinstall/strings.tst:44
# Input is:
x:="\0xFF";
# Expected output:
"\377"
# But found:
"<FF>"
########
########> Diff in /datapool/dima/Sage/gap-4.9.0/tst/testinstall/strings.tst:48
# Input is:
x:="A string with \0xFF Hex stuff \0x42 in it";
# Expected output:
"A string with \377 Hex stuff B in it"
# But found:
"A string with <FF> Hex stuff B in it"
########
########> Diff in /datapool/dima/Sage/gap-4.9.0/tst/testinstall/strings.tst:189
# Input is:
List("abcxyzABCXYZ\n\t !019?\377\000$", LowercaseChar);
# Expected output:
"abcxyzabcxyz\n\t !019?\377\000$"
# But found:
"abcxyzabcxyz\n\t !019?<FF>\000$"
########
########> Diff in /datapool/dima/Sage/gap-4.9.0/tst/testinstall/strings.tst:191
# Input is:
List("abcxyzABCXYZ\n\t !019?\377\000$", UppercaseChar);
# Expected output:
"ABCXYZABCXYZ\n\t !019?\377\000$"
# But found:
"ABCXYZABCXYZ\n\t !019?<FF>\000$"
########
########> Diff in /datapool/dima/Sage/gap-4.9.0/tst/testinstall/strings.tst:193
# Input is:
UppercaseString("abcxyzABCXYZ\n\t !019?\377\000$");
# Expected output:
"ABCXYZABCXYZ\n\t !019?\377\000$"
# But found:
"ABCXYZABCXYZ\n\t !019?<FF>\000$"
########
########> Diff in /datapool/dima/Sage/gap-4.9.0/tst/testinstall/strings.tst:195
# Input is:
LowercaseString("abcxyzABCXYZ\n\t !019?\377\000$");
# Expected output:
"abcxyzabcxyz\n\t !019?\377\000$"
# But found:
"abcxyzabcxyz\n\t !019?<FF>\000$"
########
@dimpase dimpase changed the title from "\377" shows as "?" on SPARC Solaris 11 to "\200" and "\377" show as "?" on SPARC Solaris 11 on Feb 19, 2018
@fingolfin
Member

My guess would be that isalpha behaves differently on Solaris than it does on Linux, OS X etc.

What happens if you force the locale to C, i.e. LC_ALL=C?

A simple fix would be to stop using isalpha and instead change our IsAlpha to a custom function. Then it'll be guaranteed to work the same across all platforms.
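
To make the idea of a platform-independent check concrete, here is a minimal sketch of an ASCII-only alphabetic test that never consults the C library or the locale. It is written in Python for illustration only; the name `is_ascii_alpha` is hypothetical and this is not GAP's actual IsAlpha implementation.

```python
def is_ascii_alpha(byte: int) -> bool:
    """Return True only for the byte values of A-Z and a-z.

    The ranges are hard-coded, so the answer cannot vary with the locale.
    """
    return 0x41 <= byte <= 0x5A or 0x61 <= byte <= 0x7A


# Bytes above 0x7F are never reported as alphabetic, whatever the platform.
for value in (ord("a"), ord("Z"), ord("0"), 0x80, 0xFF):
    print(f"{value:#04x}: {is_ascii_alpha(value)}")
```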

@fingolfin
Member

Actually, looking at the code some more, I now think we don't do anything about non-printable characters and just leave it to the OS to format them somehow. It seems that on Linux, macOS etc. this produces garbage (at least in a UTF-8 terminal), while Solaris handles it differently.

I don't see an easy fix for that. Of course we could add custom pre-processing before sending anything to the output, but that would require changing a lot of places, slow things down, and I am not sure I consider it a good idea to start with.
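
For illustration, the kind of pre-processing meant here could look roughly like the sketch below: every byte outside printable ASCII is rewritten as a three-digit octal escape before it reaches the terminal, so the result no longer depends on the OS or locale. This is a simplified Python sketch under that assumption, not GAP kernel code; `escape_bytes` is a hypothetical name, and escapes such as \n are not special-cased the way GAP's own printing does.

```python
def escape_bytes(data: bytes) -> str:
    """Render bytes as printable ASCII, using \\ooo escapes for the rest."""
    parts = []
    for byte in data:
        char = chr(byte)
        if char in ('\\', '"'):
            # Keep the output unambiguous by escaping the escape character
            # and the string delimiter.
            parts.append('\\' + char)
        elif 0x20 <= byte <= 0x7E:
            parts.append(char)
        else:
            parts.append(f"\\{byte:03o}")
    return "".join(parts)


print(escape_bytes(b"A string with \xff Hex stuff \x42 in it"))
```

The per-byte loop also hints at the cost mentioned above: every string would pass through such a filter on its way to the output.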

So I am tempted to say "tough luck for Solaris" and close this as "wontfix".

@dimpase
Member Author

dimpase commented Nov 27, 2019

Note that even on Linux, if you set LC_CTYPE=C you get (in a UTF-8 terminal)

gap> "\377";
"�"

whereas with LC_CTYPE=en_US.UTF-8 or LC_CTYPE= everything is OK.

That is to say, these tests in testinstall are not LC_CTYPE-invariant.
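
One likely reason for the "�" is that a lone byte 0xFF or 0x80 is not valid UTF-8, so a UTF-8 decoder substitutes the replacement character U+FFFD. The Python sketch below shows only that decoding step; that a terminal behaves the same way when it receives the raw byte is an assumption, and `show` is a hypothetical helper.

```python
def show(raw: bytes) -> str:
    """Decode bytes as UTF-8, substituting U+FFFD for invalid sequences."""
    return raw.decode("utf-8", errors="replace")


# 0x7F is plain ASCII and survives; 0x80 and 0xFF are invalid on their own.
for raw in (b"\x7f", b"\x80", b"\xff"):
    print(raw, "->", repr(show(raw)))

# The same byte is a perfectly good character in a single-byte encoding.
print(repr(b"\xff".decode("latin-1")))
```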

@fingolfin
Member

See also gap-system/homebrew-gap#2

@fingolfin fingolfin reopened this Nov 27, 2019
fingolfin added a commit to fingolfin/gap that referenced this issue May 10, 2020
These caused issues in various places, but test nothing useful.

Resolves gap-system#2194
Resolves gap-system#3979
fingolfin added a commit to fingolfin/gap that referenced this issue May 12, 2020
These caused issues in various places, but test nothing useful.

Resolves gap-system#2194
Resolves gap-system#3979
fingolfin added a commit to fingolfin/gap that referenced this issue May 12, 2020
These caused issues in various places, but test nothing useful.

Resolves gap-system#2194
Resolves gap-system#3979
fingolfin added a commit that referenced this issue May 13, 2020
These caused issues in various places, but test nothing useful.

Resolves #2194
Resolves #3979