Fixes #35: fd{Read,Write} with ByteString payload (Breaking change) #219

mmhat · 2022-07-05T17:28:35Z

Version of #186 with API breakage. Feel free to close this one or the other one.

mmhat · 2022-07-05T17:46:47Z

For the record: I favor this one.

hasufell · 2022-07-14T21:40:13Z

If this is merged, the AFPP variant also needs fixing.

Bodigrim · 2022-07-14T21:45:10Z

Could someone review: is it the only place where *.ByteString modules use String?

hasufell · 2022-07-15T16:52:07Z

I think only fdRead and fdWrite are interesting. The other issue is that there's no ByteString variant of https://hackage.haskell.org/package/unix-2.7.2.2/docs/System-Posix-User.html, which includes two types, GroupEntry and UserEntry, that have String records.

hasufell · 2022-07-15T21:50:52Z

I think this PR is fine as-is and doesn't need a wider scope.

@Bodigrim @hs-viktor let's vote on this issue:

1. accept this PR
2. accept PR "Fixes #35: fd{Read,Write} with ByteString payload" #186
3. accept none of them

I clearly vote for 1. 👍

hs-viktor

If we're introducing an API break, we could also deprecate the String variants, they are semantically dubious.

hs-viktor · 2022-07-16T18:47:51Z

System/Posix/IO.hsc

+-- | Read data from an 'Fd' and convert it to a 'String' using the locale encoding.
+-- Throws an exception if this is an invalid descriptor, or EOF has been
+-- reached.
+fdRead :: Fd


It may be worth noting that unless the input is known to be ASCII data, or an 8-bit locale is in use, reading n bytes and expecting a String is rather dubious. One can read and decode lines (in a UTF-8 locale) because UTF-8 is self-synchronising, and LF or CRLF never occurs in the middle of a multi-byte sequence. Otherwise, reading n bytes from a file into a String is a recipe for failure. Use of this function should be discouraged.

I'm not sure I follow.

This issue seems to be present everywhere, including in the handling of filepaths. It's broken: https://gist.github.com/hasufell/c600d318bdbe010a7841cc351c835f92

Anything that uses getLocaleEncoding, getFileSystemEncoding or getForeignEncoding to convert CString to String potentially runs into bugs, no?

Following that, we'd need to deprecate half of base and most of the unix String based API.

(note: I'm up for that... but it's probably out of scope of this PR)

> Anything that uses getLocaleEncoding, getFileSystemEncoding or getForeignEncoding to convert CString to String potentially runs into bugs, no?

No, the issue here is that we're not reading a complete string, we're reading n bytes from a file. These n bytes may well end in the middle of a multi-byte grapheme.

Converting entire file names or similar octet data (rather than n byte fragments of strings) to Strings is much safer under reasonable assumptions.
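To make the point above concrete, here is a minimal sketch of what happens when a read boundary falls inside a multi-byte UTF-8 sequence. It is an illustration, not the code under review: it assumes the `bytestring` and `text` packages, uses `BS.splitAt` as a stand-in for two consecutive n-byte reads, and uses strict UTF-8 decoding rather than the locale encoding that the String variant relies on. The names `payload`, `chunksAt` and `decodesCleanly` are hypothetical.

```haskell
import qualified Data.ByteString as BS
import Data.Either (isRight)
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- "naïve" in UTF-8; 'ï' (U+00EF, written \239) encodes as two bytes: 0xC3 0xAF.
payload :: BS.ByteString
payload = TE.encodeUtf8 (T.pack "na\239ve")

-- Stand-in for two consecutive n-byte reads of the same data.
chunksAt :: Int -> (BS.ByteString, BS.ByteString)
chunksAt n = BS.splitAt n payload

-- True when the bytes form complete, valid UTF-8 on their own.
decodesCleanly :: BS.ByteString -> Bool
decodesCleanly = isRight . TE.decodeUtf8'

main :: IO ()
main = do
  let (a, b) = chunksAt 3          -- the cut falls between 0xC3 and 0xAF
  print (decodesCleanly a)         -- False: trailing incomplete sequence
  print (decodesCleanly b)         -- False: starts with a continuation byte
  print (decodesCleanly (a <> b))  -- True: the whole payload is valid
```

Neither fragment decodes on its own, while the concatenated bytes do, which is why decoding has to happen after the chunks are joined at the byte level.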

> These n bytes may well end in the middle of a multi-byte grapheme.

Right, got it. And when you append the strings, you get garbage.

> > These n bytes may well end in the middle of a multi-byte grapheme.
>
> Right, got it. And when you append the strings, you get garbage.

Yes, the description of the action says it all: read n bytes. That's a ByteString operation, not a String operation. Strings don't consist of bytes.
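For reference, a rough sketch of how the ByteString variants might be used once this change is released. It assumes unix >= 2.8.0.0 (the milestone this PR is tagged with), where `System.Posix.IO.ByteString` is expected to expose `fdRead :: Fd -> ByteCount -> IO ByteString` and `fdWrite :: Fd -> ByteString -> IO ByteCount`. The helper `roundTrip` is hypothetical and only meant for small payloads that fit in the pipe buffer.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BC
import System.Posix.IO.ByteString (closeFd, createPipe, fdRead, fdWrite)

-- Write a small payload into a pipe and read it back as raw bytes.
-- Assumption: the payload is non-empty and small enough to fit in the
-- pipe buffer, so a single write and a single read are sufficient here.
roundTrip :: BS.ByteString -> IO BS.ByteString
roundTrip bytes = do
  (readEnd, writeEnd) <- createPipe
  _ <- fdWrite writeEnd bytes      -- returns the number of bytes written
  closeFd writeEnd
  chunk <- fdRead readEnd 1024     -- reads at most 1024 bytes, no decoding
  closeFd readEnd
  pure chunk

main :: IO ()
main = do
  chunk <- roundTrip (BC.pack "hello")
  BC.putStrLn chunk
```

Since no decoding takes place, chunks can be concatenated freely and decoded once the caller knows it has a complete string. In general a single `fdRead` may return fewer bytes than requested, so real code would loop.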

hs-viktor · 2022-07-16T18:55:31Z

I'm tempted to accept this PR, but have no idea how much breakage would ensue downstream. How widely used are these functions?

Bodigrim · 2022-07-16T19:05:10Z

I think this is an acceptable level of breakage. The usage is quite limited, e.g., https://hackage-search.serokell.io/?q=%5CbfdRead%5Cb - and most of these are for the non-ByteString version.

I vote to accept.

hs-viktor · 2022-07-16T19:21:22Z

I will accept this PR, modulo additional documentation discouraging the String variants, if not outright deprecating them.

Bodigrim · 2022-07-16T21:57:54Z

Let's deprecate String variants as unsafe then. @mmhat could you please add this to the PR?

hasufell · 2022-07-16T22:02:02Z

> Let's deprecate String variants as unsafe then. @mmhat could you please add this to the PR?

I can make a follow-up PR. I'd argue the scope of this PR is just about the ByteString variants.

Fixes haskell#35: fd{Read,Write} with ByteString payload

f2038bb

mmhat changed the title ~~Fixes #35: fd{Read,Write} with ByteString payload~~ Fixes #35: fd{Read,Write} with ByteString payload (Breaking change) Jul 5, 2022

mmhat mentioned this pull request Jul 5, 2022

Fixes #35: fd{Read,Write} with ByteString payload #186

Closed

hasufell mentioned this pull request Jul 10, 2022

Release Planning: unix-2.8.0.0 (GHC 9.6.x) #164

Closed

5 tasks

hasufell added this to the 2.8.0.0 milestone Jul 10, 2022

hasufell added the API: breaking label Jul 10, 2022

hs-viktor reviewed Jul 16, 2022

hasufell approved these changes Jul 16, 2022

Bodigrim approved these changes Jul 16, 2022

Bodigrim merged commit e0f4580 into haskell:master Jul 16, 2022

mmhat deleted the issue-35-bytestring-payload-breaking branch July 16, 2022 22:19
