Improve ExitCodeException Show instance #83

Open: 9999years wants to merge 3 commits into master from fix-exitcodeexception-show

9999years
Contributor

Before, the arrangement of newlines in the ExitCodeException Show instance grouped stdout closer to the stderr header than the stdout header:

ghci> readProcess_ $ proc "sh" ["-c", "echo this is stdout; echo this is stderr >&2; false"]
*** Exception: Received ExitFailure 1 when running
Raw command: sh -c "echo this is stdout; echo this is stderr >&2; false"
Standard output:

this is stdout
Standard error:

this is stderr

If there was no trailing newline for the stdout, the output would be formatted with no newline between the end of the stdout and the start of the stderr header:

ghci> readProcess_ $ proc "sh" ["-c", "nix path-info --json nixpkgs#agda && false"]
*** Exception: Received ExitFailure 1 when running
Raw command: sh -c "nix path-info --json nixpkgs#agda && false"
Standard output:

[{"path":"/nix/store/sj2z0h5ywlflqv50dfphwia6p0ij0mlj-agdaWithPackages-2.6.4.3","valid":false}]Standard error:

these 5 paths will be fetched (18.30 MiB download, 133.19 MiB unpacked):
  /nix/store/5q0kb0nqnqcfs7a0ncsjq4fdppwirpxa-Agda-2.6.4.3-bin
  /nix/store/xmximjjnkn0hm4gw7akc9f20ydz6msmk-Agda-2.6.4.3-data
  /nix/store/sj2z0h5ywlflqv50dfphwia6p0ij0mlj-agdaWithPackages-2.6.4.3
  /nix/store/b49sa2q0yb3fd14ppzh6j6rm8vvgr9n6-ghc-9.6.6-with-packages
  /nix/store/vharimf7f2glj4fyhiglzws0qyv4xrry-libraries

Now, the output is grouped more consistently and displays nicely regardless of trailing or leading newlines in the output:

ghci> readProcess_ $ proc "sh" ["-c", "echo this is stdout; echo this is stderr >&2; false"]
*** Exception: Received ExitFailure 1 when running
Raw command: sh -c "echo this is stdout; echo this is stderr >&2; false"

Standard output:
this is stdout

Standard error:
this is stderr

ghci> readProcess_ $ proc "sh" ["-c", "nix path-info --json nixpkgs#agda && false"]
*** Exception: Received ExitFailure 1 when running
Raw command: sh -c "nix path-info --json nixpkgs#agda && false"

Standard output:
[{"path":"/nix/store/sj2z0h5ywlflqv50dfphwia6p0ij0mlj-agdaWithPackages-2.6.4.3","valid":false}]

Standard error:
these 5 paths will be fetched (18.30 MiB download, 133.19 MiB unpacked):
  /nix/store/5q0kb0nqnqcfs7a0ncsjq4fdppwirpxa-Agda-2.6.4.3-bin
  /nix/store/xmximjjnkn0hm4gw7akc9f20ydz6msmk-Agda-2.6.4.3-data
  /nix/store/sj2z0h5ywlflqv50dfphwia6p0ij0mlj-agdaWithPackages-2.6.4.3
  /nix/store/b49sa2q0yb3fd14ppzh6j6rm8vvgr9n6-ghc-9.6.6-with-packages
  /nix/store/vharimf7f2glj4fyhiglzws0qyv4xrry-libraries

The Show instance for ProcessConfig has also been touched up, removing edge cases like an empty "Modified environment" header:

ghci> putStrLn $ show $ setEnv [] $ proc "sh" []
Raw command: sh
Modified environment:

Extraneous trailing newlines in Show instances have also been removed.
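
The grouping described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `section`, `renderFailure`, and `trimNewlines` are hypothetical names, and the sketch assumes the captured output is already a `String` and that empty sections are simply dropped.

```haskell
import Data.List (dropWhileEnd, intercalate)

-- | Hypothetical helper: render a header and body as one section,
-- or nothing at all when the body is empty.
section :: String -> String -> [String]
section header body
  | null body = []
  | otherwise = [header ++ ":\n" ++ body]

-- | Join the non-empty sections with exactly one blank line between
-- them and no trailing newline, whatever newlines the output carried.
renderFailure :: String -> String -> String -> String
renderFailure cmd out err =
  intercalate "\n\n" $
    ["Raw command: " ++ cmd]
      ++ section "Standard output" (trimNewlines out)
      ++ section "Standard error" (trimNewlines err)
  where
    -- Assumption: only '\n' is treated as strippable.
    trimNewlines = dropWhileEnd (== '\n') . dropWhile (== '\n')
```

Because the separators come from `intercalate` rather than from the captured output, the layout is the same whether a tool ends its output with zero, one, or several newlines.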

@tomjaguarpaw
Collaborator

It seems like this uses text, but it has not been added to the dependencies. (If you made changes to the .cabal file, please note that the changes should be made to package.yaml and the .cabal file then regenerated, I think using hpack.)

But I don't really understand why we strip anyway. Isn't that misleading? It seems to me it would be better to add a newline after printing the output regardless of whether it also ended with a newline.

@tomjaguarpaw
Collaborator

I pushed a commit that, I think, corrects a test (I didn't really understand why that was broken) and another one that does no stripping. Personally I prefer the no stripping behavior, because of the principle of least surprise. However, if you want to add showExitCodeExceptionStripped then that's fine by me.

@tomjaguarpaw
Collaborator

(But preferably with the stripping done using ASCII, not text.)

@9999years
Contributor Author

@tomjaguarpaw How about only stripping the end of the output? This would normalize tools writing zero, one, or more newlines but keep leading whitespace intact.
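
A minimal sketch of end-only stripping, assuming the output has already been converted to a `String` and that only `'\n'` counts as strippable. `stripTrailingNewlines` and `normalizeEnd` are hypothetical names, not functions from this PR:

```haskell
import Data.List (dropWhileEnd)

-- | Drop every trailing newline; leading whitespace is left intact.
stripTrailingNewlines :: String -> String
stripTrailingNewlines = dropWhileEnd (== '\n')

-- | Normalize zero, one, or many trailing newlines to exactly one.
normalizeEnd :: String -> String
normalizeEnd s = stripTrailingNewlines s ++ "\n"
```

With this, `"out"`, `"out\n"`, and `"out\n\n\n"` all normalize to `"out\n"`, while `"\n  out\n"` keeps its leading newline and indentation.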

@9999years 9999years force-pushed the fix-exitcodeexception-show branch 2 times, most recently from 2a56450 to dcadf7f Compare September 7, 2024 00:14
@9999years
Contributor Author

Alright, I've pushed a commit to remove the whitespace stripping behavior from the Show instance, but I do think it makes the exceptions harder to read in a lot of cases and less consistent across the board.

Here's the normal case, where standard output ends with a newline. print doesn't expect a Show instance to end with a newline, so it outputs a blank line at the end:

ghci> e stdout stderr = ExitCodeException { ... }
ghci> print $ e "<STDOUT>\n" ""
Received ExitFailure 1 when running
Raw command: echo

Standard output:
<STDOUT>

ghci>

Meanwhile, if we have a command that includes both stdout and stderr and doesn't output a newline at the end of its stdout, the blank line separating the Standard output: and Standard error: sections disappears:

ghci> print $ e "<STDOUT>" "<STDERR>"
Received ExitFailure 1 when running
Raw command: echo

Standard output:
<STDOUT>
Standard error:
<STDERR>

Also, Show instances that end with newlines make values that contain them print quite clumsily (see the line break before the comma here):

ghci> data Foo = Foo { a :: Int, b :: ExitCodeException, c :: String } deriving Show
ghci> Foo 1 (e "<STDOUT>\n" "") "hello"
Foo {a = 1, b = Received ExitFailure 1 when running
Raw command: echo

Standard output:
<STDOUT>
, c = "hello"}

@tomjaguarpaw
Collaborator

Thanks. I appreciate this version doesn't do everything you want, but I'm much more comfortable with it, so if you consider it an improvement let's go for it. You are welcome to subsequently continue to advocate for your desired end goal.

However, this still uses text and assumes UTF-8, which I am not comfortable with. I don't understand why it makes this assumption. It's essentially doing decodeUtf8 and then immediately T.unpacking into a String. Why not just keep the L8.unpack?

@9999years
Contributor Author

9999years commented Sep 7, 2024

> Why not just keep the L8.unpack?

What encoding does L8.unpack presume? It doesn't appear to be documented. Picking an encoding is not optional: the semantics of converting bytes to codepoints need to be defined! I believe UTF-8 is the most reasonable choice here, and I believe it's much better to use UTF-8 explicitly than whatever L8.unpack does implicitly. UTF-8 firmly won the encoding war, both on the web and on macOS and Linux, where it is the default encoding (and is used by many, many programs regardless of locale and encoding settings).

This Show ExitCodeException instance is optimistic — it does not need to always be correct (there will always be niche programs which use different encodings and need specialized logic) but should instead provide the best results for the most cases possible. UTF-8 is (in my opinion) the obvious correct choice here.

To (hopefully) give some weight to my opinion here, I'm a co-author for the L2/21-235 proposal which added the Symbols for Legacy Computing Unicode block.

@9999years
Contributor Author

9999years commented Sep 7, 2024

And in fact L8.unpack does do something obviously wrong and mangles any codepoint higher than U+007F, leading to mojibake errors equivalent to decodeLatin1 . encodeUtf8:

ghci> import Data.ByteString.Lazy.Char8 (unpack)
ghci> import Data.Text.Lazy.Encoding (encodeUtf8)
ghci> import Data.Text.Lazy (pack)
ghci> write text = putStrLn $ unpack $ encodeUtf8 $ pack text
ghci> write "café"
cafÃ©
ghci> write "hello 🥺"
hello �

This is a bug, this is almost always the wrong behavior on any computer newer than the 1990s, and it's easy to fix.
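
To make the difference concrete, here is a small sketch comparing the two decodings on the same bytes. `decodeChar8`, `decodeUtf8Lenient`, and `cafeBytes` are illustrative names, and the use of lenient decoding (invalid bytes become U+FFFD rather than throwing) is an assumption, not necessarily what this PR does:

```haskell
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Char8 as L8
import Data.Text.Encoding.Error (lenientDecode)
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Encoding as TLE

-- | One Char per byte, which is what L8.unpack does.
decodeChar8 :: L.ByteString -> String
decodeChar8 = L8.unpack

-- | Decode as UTF-8; invalid sequences are replaced with U+FFFD.
decodeUtf8Lenient :: L.ByteString -> String
decodeUtf8Lenient = TL.unpack . TLE.decodeUtf8With lenientDecode

-- | The UTF-8 encoding of "café": 'é' (U+00E9) is the two bytes C3 A9.
cafeBytes :: L.ByteString
cafeBytes = L.pack [0x63, 0x61, 0x66, 0xC3, 0xA9]

-- decodeChar8 cafeBytes       == "caf\195\169"  (two Chars, shown as "cafÃ©")
-- decodeUtf8Lenient cafeBytes == "caf\233"      (one Char, shown as "café")
```

The byte-per-character path turns each byte of a multi-byte sequence into its own character, which is the mojibake shown in the transcript above.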

@sol left a comment
Contributor

No comments on the code itself, but some ideas on how to make the tests easier on the eye.

test/System/Process/TypedSpec.hs: 10 review threads, all outdated and resolved
Co-authored-by: Simon Hengel
@tomjaguarpaw
Collaborator

tomjaguarpaw commented Sep 12, 2024

[Sorry, pressed Enter too early. Comment to follow.]

@tomjaguarpaw
Collaborator

> in fact L8.unpack does do something obviously wrong

This is a fair point. I take the point that if we're going to choose anything then UTF-8 is the most inclusive choice. The current version doesn't strip terminal control codes, for example!

My personal preference would be to make the choice explicit by not displaying stdout and stderr in the Show instance at all, and only showing them through specific functions showErrorCodeExceptionUtf8 / showErrorCodeExceptionUtf16 etc.
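
As a rough sketch of what such explicit-encoding functions might look like: everything here is hypothetical, including the `Captured` record standing in for the exception's stdout and stderr fields, the function names, and the choice of lenient decoding and little-endian UTF-16.

```haskell
import qualified Data.ByteString.Lazy as L
import Data.Text.Encoding.Error (lenientDecode)
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Encoding as TLE

-- | Hypothetical stand-in for the output captured in the exception.
data Captured = Captured
  { capturedStdout :: L.ByteString
  , capturedStderr :: L.ByteString
  }

-- | Render captured output with a decoder chosen by the caller.
showCapturedWith :: (L.ByteString -> TL.Text) -> Captured -> String
showCapturedWith decode (Captured out err) =
  "Standard output:\n" ++ TL.unpack (decode out)
    ++ "\n\nStandard error:\n" ++ TL.unpack (decode err)

-- | Explicit UTF-8 variant.
showCapturedUtf8 :: Captured -> String
showCapturedUtf8 = showCapturedWith (TLE.decodeUtf8With lenientDecode)

-- | Explicit UTF-16 (little-endian) variant.
showCapturedUtf16 :: Captured -> String
showCapturedUtf16 = showCapturedWith (TLE.decodeUtf16LEWith lenientDecode)
```

Passing the decoder as a parameter keeps the encoding decision at the call site instead of baking one guess into the Show instance.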

I went back to look at where the choice of L8.unpack was made, and it was by @snoyberg eight years ago: 84dac77

Since he is the primary maintainer of the repository I'll leave the final call to him. Thank you for your patience so far @9999years.
