While building source code, compilers and build tools should ALWAYS process every source file using the encoding in which it was written by its authors and released by its maintainers, and NEVER with the locale inherited from the end user when that introduces any discrepancy whatsoever. The proper thing to do is thus to NEVER, EVER heed the user-inherited locale variables LANG and LC_*: the very idea flies in the face of the determinism that Nix aims for. Not even an impure shell should use the user locale unless explicitly requested, and any discrepancy in encoding should then lead to a prominent warning unless explicitly hushed.
The only defaults that make any sense for the locale are POSIX, C.UTF-8, and en_US.UTF-8. The POSIX default would impose needless pain for no gain now that UTF-8 is a widely accepted and supported standard, so the only sensible and useful defaults are C.UTF-8 and en_US.UTF-8. Between those two, the actual default should be chosen based on which is easiest to deploy in the base Nix environment while breaking the fewest programs. I suspect it's en_US.UTF-8.
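To make the proposal concrete, here is a minimal sketch of what "never heed LANG and LC_*" could look like for a tool that spawns build steps. It is not how Nix, Cabal or Stack are implemented; the helper name `deterministic_build_env` and the choice of `C.UTF-8` as the fixed default are assumptions for illustration, and the chosen locale is assumed to be available in the build environment.

```python
def deterministic_build_env(user_env, default_locale="C.UTF-8"):
    """Return a copy of user_env with every user locale setting replaced.

    Hypothetical helper: drops LANG and all LC_* variables (LC_ALL,
    LC_CTYPE, ...) inherited from the user, then pins LANG to one
    fixed default so every build sees the same locale.
    """
    env = {
        key: value
        for key, value in user_env.items()
        if key != "LANG" and not key.startswith("LC_")
    }
    env["LANG"] = default_locale
    return env


# Example: a user environment with a non-UTF-8 locale.
user_env = {
    "PATH": "/bin",
    "LANG": "fr_FR.ISO-8859-1",
    "LC_ALL": "C",
    "LC_CTYPE": "POSIX",
}
build_env = deterministic_build_env(user_env)
```

The resulting dictionary could then be passed as the `env` argument of `subprocess.run` when launching a build step, so that the step never sees the user's locale unless the caller deliberately overrides `default_locale`.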
I hit this bug while building, with stack, a Haskell program that depends on language-javascript, and had a painful debugging session until I found how to configure a suitable shell.nix for stack.yaml. Drilling down to root causes, I found that it's a fundamental bug in all of Nix, Cabal, Hackage and Stack. Notably, I fixed the very same issue in Common Lisp, where the build system ASDF now assumes that all source code is UTF-8 by default unless overridden by the library maintainers, and never heeds the user locale. The switch was slightly painful: it meant hounding the maintainers of tens of libraries, and actually flipping the switch only a year after warning everyone. The switch should be simpler in Nix, where all software is in the same repository.
-- While building package language-javascript-0.6.0.12 using:
/home/fare/.stack/setup-exe-cache/x86_64-linux-nix/Cabal-simple_mPHDZzAJ_2.4.0.1_ghc-8.6.5 --builddir=.stack-work/dist/x86_64-linux-nix/Cabal-2.4.0.1 build --ghc-options " -ddump-hi -ddump-to-file -fdiagnostics-color=always"
Process exited with code: ExitFailure 1
Logs have been written to: /home/fare/.stack/global-project/.stack-work/logs/language-javascript-0.6.0.12.log
Configuring language-javascript-0.6.0.12...
Preprocessing library for language-javascript-0.6.0.12..
happy: src/Language/JavaScript/Parser/Grammar7.y: hGetContents: invalid argument (invalid byte sequence)
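The "invalid byte sequence" failure above is the generic symptom of decoding UTF-8 bytes with a narrower encoding taken from the locale. The sketch below reproduces that class of failure in Python as an analogy; the byte string is made-up sample content, not the actual contents of Grammar7.y, and `decode_source` is a hypothetical helper.

```python
# Made-up sample: a one-line source comment containing a non-ASCII character,
# stored as UTF-8 bytes the way an upstream author might have saved it.
source_bytes = "-- caf\u00e9\n".encode("utf-8")


def decode_source(data: bytes, encoding: str = "utf-8") -> str:
    """Decode source bytes with an explicitly chosen encoding, never the locale's."""
    return data.decode(encoding)


# Decoding with the encoding the file was written in succeeds.
text = decode_source(source_bytes)

# Decoding as ASCII, which is what a POSIX/C locale typically implies, fails.
try:
    decode_source(source_bytes, "ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True
```

The point of the design is that the encoding is a parameter fixed by whoever ships the source, with UTF-8 as the default, rather than something read from the environment of whoever happens to run the build.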
Technical details
Please run nix-shell -p nix-info --run "nix-info -m" and paste the results.
system: "x86_64-linux"
host os: Linux 5.0.3, NixOS, 19.03.172627.c21f08bfedd (Koi)
multi-user?: yes
sandbox: yes
version: nix-env (Nix) 2.2.2
channels(root): "nixos-19.03.172866.4649b6ef4b5"
channels(fare): ""
nixpkgs: /home/fare/src/nixos/nixpkgs
NB: my nixpkgs was an unmodified nixpkgs-unstable at 61f0936.
See also:
https://www.snoyman.com/blog/2016/12/beware-of-readfile
agda/agda#2922
input-output-hk/cardano-sl@ed8c892