Compute length at compile time for literal strings #191

Merged
merged 9 commits into from
Aug 25, 2020

andrewthad
Contributor

Do not merge this unless MR 2165 lands.

Add unsafePackLiteral to Data.ByteString.Internal. With GHC 8.10+, use known-key variant of C strlen from GHC.CString that supports constant folding. Also in GHC 8.10, another data constructor of ForeignPtrContents becomes available: LiteralPtr. For string literals, this is now used. It saves space when there are lots of literals, and it improves opportunities for case-of-known data constructor optimizations when a function scrutinizes the length of a ByteString.

This can result in massive optimization opportunities that were previously missed. The following example is contrived but illustrative:

{-# language OverloadedStrings #-}
{-# OPTIONS_GHC -O2 -fforce-recomp -ddump-simpl -dsuppress-all -ddump-to-file #-}

module ConstantLength
  ( stringOne
  , stringTwo
  , biggestStringLength
  ) where

import Data.ByteString (ByteString)
import qualified Data.ByteString as B

stringOne, stringTwo :: ByteString
stringOne = "hello beautiful world"
stringTwo = "howdy"

biggestStringLength :: Int
biggestStringLength = max (B.length stringOne) (B.length stringTwo)

After this PR, GHC is able to optimize the Core to this (I've omitted the irrelevant parts like the module name):

-- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
stringTwo1 = "howdy"#

-- RHS size: {terms: 5, types: 0, coercions: 0, joins: 0/0}
stringTwo = PS stringTwo1 LiteralPtr 0# 5#

-- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
stringOne1 = "hello beautiful world"#

-- RHS size: {terms: 5, types: 0, coercions: 0, joins: 0/0}
stringOne = PS stringOne1 LiteralPtr 0# 21#

-- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
biggestStringLength = I# 21#

In bytestring today, this is what you get when you compile this code:

-- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
stringTwo_addr# = "howdy"#

-- RHS size: {terms: 17, types: 28, coercions: 0, joins: 0/0}
stringTwo
  = case newMutVar# NoFinalizers realWorld# of
    { (# ipv_a9Jk, ipv1_a9Jl #) ->
    case {__pkg_ccall main Addr#
                  -> State# RealWorld -> (# State# RealWorld, Word# #)}_d9tJ
           stringTwo_addr# realWorld#
    of
    { (# ds_d9tH, ds2_d9tG #) ->
    PS
      stringTwo_addr# (PlainForeignPtr ipv1_a9Jl) 0# (word2Int# ds2_d9tG)
    }
    }

-- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
stringOne_addr# = "hello beautiful world"#

-- RHS size: {terms: 17, types: 28, coercions: 0, joins: 0/0}
stringOne
  = case newMutVar# NoFinalizers realWorld# of
    { (# ipv_a9Jk, ipv1_a9Jl #) ->
    case {__pkg_ccall main Addr#
                  -> State# RealWorld -> (# State# RealWorld, Word# #)}_d9tJ
           stringOne_addr# realWorld#
    of
    { (# ds_d9tH, ds2_d9tG #) ->
    PS
      stringOne_addr# (PlainForeignPtr ipv1_a9Jl) 0# (word2Int# ds2_d9tG)
    }
    }

-- RHS size: {terms: 16, types: 11, coercions: 0, joins: 0/0}
biggestStringLength
  = case stringOne of { PS dt_djMS dt1_djMT dt2_djMU dt3_djMV ->
    case stringTwo of { PS dt4_XjNp dt5_XjNr dt6_XjNt dt7_XjNv ->
    case <=# dt3_djMV dt7_XjNv of {
      __DEFAULT -> I# dt3_djMV;
      1# -> I# dt7_XjNv
    }
    }
    }

In practice, I don't expect many users will experience the big win that shows up in the contrived example of biggestStringLength. I think the more common win (a smaller one) is going to be that ByteStrings that originate as literals result in less generated code. So, slightly smaller binaries, better opportunities for the memory caches to be used effectively, the usual jazz. But maybe I'm wrong. Maybe there are a bunch of people that will get some scrutinize-a-known-int optimizations kicking in.
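For anyone who wants to poke at the new entry point directly rather than through OverloadedStrings, here is a minimal sketch. It assumes a bytestring release that exports unsafePackLiteral from Data.ByteString.Internal with the type Addr# -> ByteString (0.11 or later, to my knowledge); the name howdy is just for illustration.

```haskell
{-# LANGUAGE MagicHash #-}

import qualified Data.ByteString as B
import Data.ByteString.Internal (unsafePackLiteral)

-- Assumption: unsafePackLiteral :: Addr# -> ByteString is available.
-- The argument must be a NUL-terminated primitive string literal.
howdy :: B.ByteString
howdy = unsafePackLiteral "howdy"#

main :: IO ()
main = print (B.length howdy)
```

Compiling this with -O2 -ddump-simpl should let you check whether the length shows up as a literal in the Core on your compiler version.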

@andrewthad
Contributor Author

On gitlab, @bgamari asked why LiteralPtr is needed, that is, why PlainForeignPtr is insufficient. I had done this because it was something that seemed like a good idea. In an appeal to symmetry, I would argue something like "Well, we've already got the PlainPtr optimization for MallocPtr's without finalizers, why not do the same for PlainForeignPtr?". But with a little experimenting, I've discovered that the benefits of LiteralPtr exceed what I had expected. In andrewthad@25a6db8, which is not part of this branch, I've made bytestring use PlainForeignPtr (as it does today) rather than the newer LiteralPtr machinery. Let's pull back up ConstantLength.hs with some additional dumping flags:

{-# language OverloadedStrings #-}
{-# OPTIONS_GHC -O2 -fforce-recomp -ddump-simpl -dsuppress-all -ddump-to-file -ddump-cmm -ddump-asm #-}

module ConstantLength
  ( stringOne
  , stringTwo
  , biggestStringLength
  ) where

import Data.ByteString (ByteString)
import qualified Data.ByteString as B

stringOne :: ByteString
stringOne = "hello beautiful world"

stringTwo :: ByteString
stringTwo = "howdy"

biggestStringLength :: Int
biggestStringLength = max (B.length stringOne) (B.length stringTwo)

In the previous comment, I presented the resulting Core for LiteralPtr, but here it is again anyway:

-- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
stringTwo1 = "howdy"#
-- RHS size: {terms: 5, types: 0, coercions: 0, joins: 0/0}
stringTwo = PS stringTwo1 LiteralPtr 0# 5#
-- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
stringOne1 = "hello beautiful world"#
-- RHS size: {terms: 5, types: 0, coercions: 0, joins: 0/0}
stringOne = PS stringOne1 LiteralPtr 0# 21#
-- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
biggestStringLength = I# 21#

If we switch to PlainForeignPtr, we get this instead:

-- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
stringTwo1 = "howdy"#

-- RHS size: {terms: 11, types: 17, coercions: 0, joins: 0/0}
stringTwo
  = case newMutVar# NoFinalizers realWorld# of
    { (# ipv_i7b6, ipv1_i7b7 #) ->
    PS stringTwo1 (PlainForeignPtr ipv1_i7b7) 0# 5#
    }

-- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
stringOne1 = "hello beautiful world"#

-- RHS size: {terms: 11, types: 17, coercions: 0, joins: 0/0}
stringOne
  = case newMutVar# NoFinalizers realWorld# of
    { (# ipv_i7b6, ipv1_i7b7 #) ->
    PS stringOne1 (PlainForeignPtr ipv1_i7b7) 0# 21#
    }

-- RHS size: {terms: 16, types: 11, coercions: 0, joins: 0/0}
biggestStringLength
  = case stringOne of { PS dt_i79x dt1_i79y dt2_i79J dt3_i79K ->
    case stringTwo of { PS dt4_X7a3 dt5_X7a5 dt6_X7ah dt7_X7aj ->
    case <=# dt3_i79K dt7_X7aj of {
      __DEFAULT -> I# dt3_i79K;
      1# -> I# dt7_X7aj
    }
    }
    }

Even though the lengths have been computed at compile time, GHC floats the call to newMutVar# out of the data constructor PS (I wrote it inside the data constructor in the PlainForeignPtr variant of unsafePackLiteral), and it is unwilling to inline the resulting expressions. (By the way, it is totally unsound to actually do anything with the resulting MutVar#, since the compiler is free to common these up as it sees fit.) Consequently, biggestStringLength does not get solved like it did when we used LiteralPtr. Sad day.
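To make the difference concrete, here is a toy model. None of these types are the real GHC.ForeignPtr or bytestring definitions; ToyContents, ToyBytes, literalStyle and plainStyle are all made-up names. It only mirrors the shape of the problem: one value is a plain constructor application, the other has to run an allocation before the constructor can be built.

```haskell
import Data.IORef (IORef, newIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Toy stand-ins, NOT the real GHC.ForeignPtr definitions.
data ToyContents
  = ToyPlain (IORef [IO ()])  -- needs a freshly allocated mutable cell
  | ToyLiteral                -- nullary: nothing to allocate

-- Contents tag plus a length field.
data ToyBytes = ToyBytes ToyContents Int

-- A plain constructor application; the length 5 is visible to the compiler.
literalStyle :: ToyBytes
literalStyle = ToyBytes ToyLiteral 5

-- The constructor can only be built after an allocation has run.
plainStyle :: ToyBytes
plainStyle = unsafePerformIO $ do
  ref <- newIORef []
  pure (ToyBytes (ToyPlain ref) 5)
{-# NOINLINE plainStyle #-}

toyLength :: ToyBytes -> Int
toyLength (ToyBytes _ n) = n

main :: IO ()
main = print (toyLength literalStyle, toyLength plainStyle)
```

Both lengths are the same at runtime; the point is only that the first definition is a candidate for a static closure and case-of-known-constructor, while the second has to stay a thunk.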

But maybe that's not convincing enough. One might argue that that's just a bug in the optimizer, that really GHC should just do a better job optimizing code with unsound productions of MutVar# (to be read a little tongue-in-cheek). The good news is that there are even more benefits conferred by LiteralPtr. If we remove biggestStringLength from the source code and compile again, dumping cmm, we get this:

// There are 91 lines total in this file. Again, that's with biggestStringLength removed.
// Most lines are omitted so we can focus on the difference. We show only stringOne.
[section ""data" . stringOne_closure" {
     stringOne_closure:
         const PS_con_info;
         const LiteralPtr_closure+2;
         const stringOne1_bytes;
         const 0;
         const 21;
         const 3;
 }]

With PlainForeignPtr instead, we get this:

// There are 188 lines total in this file. Again, we omit most of them to focus on the
// difference between the two approaches. We show only stringOne.
[stringOne_entry() { //  [R1]
         { info_tbls: [(cBOI,
                        label: block_cBOI_info
                        rep: StackRep []
                        srt: Nothing),
                       (cBOM,
                        label: stringOne_info
                        rep: HeapRep static { Thunk }
                        srt: Nothing)]
           stack_info: arg_space: 8 updfr_space: Just 8
         }
     {offset
       cBOM:
           if ((Sp + -24) < SpLim) (likely: False) goto cBON; else goto cBOO;
       cBON:
           R1 = R1;
           call (stg_gc_enter_1)(R1) args: 8, res: 0, upd: 8;
       cBOO:
           (_cBOF::I64) = call "ccall" arg hints:  [PtrHint,
                                                    PtrHint]  result hints:  [PtrHint] newCAF(BaseReg, R1);
           if (_cBOF::I64 == 0) goto cBOH; else goto cBOG;
       cBOH:
           call (I64[R1])() args: 8, res: 0, upd: 8;
       cBOG:
           I64[Sp - 16] = stg_bh_upd_frame_info;
           I64[Sp - 8] = _cBOF::I64;
           I64[Sp - 24] = cBOI;
           R1 = NoFinalizers_closure+1;
           Sp = Sp - 24;
           call stg_newMutVar#(R1) returns to cBOI, args: 8, res: 8, upd: 24;
       cBOI:
           Hp = Hp + 56;
           if (Hp > HpLim) (likely: False) goto cBOR; else goto cBOQ;
       cBOR:
           HpAlloc = 56;
           R1 = R1;
           call stg_gc_unpt_r1(R1) returns to cBOI, args: 8, res: 8, upd: 24;
       cBOQ:
           I64[Hp - 48] = PlainForeignPtr_con_info;
           P64[Hp - 40] = R1;
           I64[Hp - 32] = PS_con_info;
           P64[Hp - 24] = Hp - 47;
           I64[Hp - 16] = stringOne1_bytes;
           I64[Hp - 8] = 0;
           I64[Hp] = 21;
           R1 = Hp - 31;
           Sp = Sp + 8;
           call (P64[Sp])(R1) args: 24, res: 0, upd: 24;
     }
 },
 section ""data" . stringOne_closure" {
     stringOne_closure:
         const stringOne_info;
         const 0;
         const 0;
         const 0;
 }]

I won't bother showing the assembly since I'm no good at reading or analyzing it, but I can offer a hand-wavy approximation of the binary bloat that PlainForeignPtr leads to. Here is a gist with Haskell source code for ten bytestring literals. I'm building with the NCG on x86_64. Inspecting the size of the resulting object files with wc -c shows:

  • LiteralPtr: 4280 bytes
  • PlainForeignPtr: 8808 bytes

The delta here is 4528 bytes. Divide by ten and round a little, and we can approximate that every ByteString literal bloats the binary by roughly 450 additional bytes when we are using PlainForeignPtr. In projects I work on, it's not uncommon to see at least 1000 bytestring literals. That's 450KB of bloat that is trimmed off by LiteralPtr. These are usually 50-100MB executables, so 450KB isn't huge, but it's not nothing.
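The back-of-the-envelope arithmetic above, spelled out as a small sketch. The two object-file sizes and the count of ten literals come from the measurement above; the function name projectedBloatBytes is just for illustration.

```haskell
-- Object-file sizes reported by wc -c for the ten-literal test module.
literalPtrBytes, plainForeignPtrBytes, literalCount :: Int
literalPtrBytes      = 4280
plainForeignPtrBytes = 8808
literalCount         = 10

-- Total extra bytes attributable to PlainForeignPtr.
deltaBytes :: Int
deltaBytes = plainForeignPtrBytes - literalPtrBytes

-- Extra bytes per literal, before rounding down to "about 450".
perLiteralBytes :: Int
perLiteralBytes = deltaBytes `div` literalCount

-- Projected bloat for a project with n ByteString literals.
projectedBloatBytes :: Int -> Int
projectedBloatBytes n = n * perLiteralBytes

main :: IO ()
main = do
  print deltaBytes                  -- 4528
  print perLiteralBytes             -- 452
  print (projectedBloatBytes 1000)  -- 452000
```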

There's another angle that I don't have any hard numbers on, another way that LiteralPtr could possibly help: the GHC runtime. At runtime, newMutVar# gets invoked once for every one of these ByteStrings that is used at least once. The resulting MutVar#s take up 16 bytes each on a 64-bit architecture because they are represented as

typedef struct {
    StgHeader   header;
    StgClosure *var;
} StgMutVar;

So with 1000 literal bytestrings that actually got used at some point in time as a program ran, you'd be paying 16KB that you didn't need to. And then maybe on top of that, during garbage-collection ... mumble mumble ... GC roots ... something something ... mutable list ... errrr ... longer pause times, maybe? My understanding of the garbage collector is not great. Perhaps someone else might be able to predict whether or not it would be reasonable to expect any improvement there.
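A quick sketch of that heap estimate, assuming a non-profiling build where StgHeader is a single word, so that an StgMutVar is two words. The names wordBytes, mutVarBytes and heapCost are made up for this illustration.

```haskell
import Foreign.Storable (sizeOf)

-- Size of a machine word in bytes (8 on a 64-bit architecture).
wordBytes :: Int
wordBytes = sizeOf (0 :: Int)

-- Assumption: one header word plus one pointer word per MutVar#.
mutVarBytes :: Int
mutVarBytes = 2 * wordBytes

-- Heap spent on MutVar#s for n literals that were forced at least once.
heapCost :: Int -> Int
heapCost n = n * mutVarBytes

main :: IO ()
main = print (heapCost 1000)
```

On a 64-bit machine this comes out to the 16KB figure above; it says nothing about the GC side of the question.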

Hopefully, this has been somewhat convincing. I suspect that at least one of these three arguments (more Core optimization opportunities, small binaries, smaller heap at runtime) will resonate with most readers. I don't have any numbers from any real-world applications, but if that's important, I can try to come up with something.

@hvr hvr added the "blocked: ghc" label (This is blocked on a feature or primitive not yet available in a released GHC version) Dec 19, 2019
@vdukhovni
Contributor

Have you considered combining this with (rebasing onto) #175 ?

Ideally, both optimizations would happen; one or the other may yield more benefits depending on the application, and the two should not a priori be in conflict. Perhaps #175 should happen first: it requires at least GHC 8.0 to provide compatibility in the Internal module, while this PR will require 8.10 or later, so it may be fine to start with #175 as a base.

@andrewthad
Contributor Author

I've not rebased onto that because it's not clear to me that it will actually be merged. Both PRs improve performance, but #175 will break a lot of downstream dependencies, and progress appears to have stalled on it. I'm not convinced that it would be a good idea to chain this PR to that one given that they are orthogonal, and given that this one may very well end up being merged first.

@vdukhovni
Contributor

I've not rebased onto that because it's not clear to me that it will actually be merged. Both PRs improve performance, but #175 will break a lot of downstream dependencies, and progress appears to have stalled on it.

It was my impression that #175 remains backwards-compatible provided pattern synonyms (8.0) are available. If progress has stalled, it is perhaps mostly reticence. If, as it seems, this PR requires 8.10 or later, then when that is released it would IMHO be a reasonable opportunity to drop support for GHC < 8.0, and then both could be merged?

I do agree that the two are orthogonal in their feature set, and either or both could be adopted, but I really think that if we're focused on optimizing ByteString overhead it should be both. As for who goes first and who then needs to rebase, that's less clear, perhaps this PR does go first in the end, but it seemed to me like in the grand scheme of things, if I were merging both, I'd first merge #175 .

@andrewthad
Contributor Author

I'm going to guard with CPP so that GHCs older than 8.12 can still build bytestring (they just won't get any performance benefit from these changes). Also, bytestring probably won't drop support for GHC 8.0 for a while. It has a huge support window, so if that's what the other PR is waiting on, it might be two or three years before it can be merged.
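Roughly the kind of guard I have in mind, as a standalone sketch rather than the actual patch. literalStrategy is a made-up placeholder for the two code paths, and the 812 threshold is an assumption taken from the version number mentioned above; the real guard may well test a library version instead.

```haskell
{-# LANGUAGE CPP #-}

-- Hypothetical stand-in for the two definitions of the literal-packing
-- code. Older compilers still build; they just take the fallback path.
literalStrategy :: String
#if __GLASGOW_HASKELL__ >= 812
literalStrategy = "compile-time length, no MutVar# allocation"
#else
literalStrategy = "runtime strlen, PlainForeignPtr"
#endif

main :: IO ()
main = putStrLn literalStrategy
```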

@vdukhovni
Contributor

Also, bytestring probably won't drop support for GHC 8.0 for a while. It has a huge support window, so if that's what the other PR is waiting on, it might be two or three years before it can be merged.

For the record, #175 looks compatible with 8.0, it is 7.10 and earlier support that would have to be dropped. Dropping 7.10 by the time 8.12 ships seems plausible...

@vdukhovni
Contributor

@andrewthad On a separate note, I was curious (ignorant) of how the proposed GHC MR handles string literals with embedded NULs. At first glance they are truncated unexpectedly, but I may be missing something that excludes such strings from the proposed string literal optimisation:
https://gitlab.haskell.org/ghc/ghc/merge_requests/2165#note_256273
Would you care to comment?

@sjakobi
Member

sjakobi commented Feb 25, 2020

For the record, #175 looks compatible with 8.0, it is 7.10 and earlier support that would have to be dropped. Dropping 7.10 by the time 8.12 ships seems plausible...

What's the compatibility problem with 7.10 and earlier?

@vdukhovni
Contributor

For the record, #175 looks compatible with 8.0, it is 7.10 and earlier support that would have to be dropped. Dropping 7.10 by the time 8.12 ships seems plausible...

What's the compatibility problem with 7.10 and earlier?

The internal representation of ByteStrings changes (dropping the offset), which means that Data.ByteString.Internal is no longer compatible for applications that directly peek at the PS constructor payload. This is solved for 8.0 by making PS a pattern synonym that manufactures the right shape with a 0 offset. For 7.x and earlier, pattern synonyms are not available.
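A toy sketch of that compatibility trick, with a String standing in for the real pointer payload. ToyBS, BS and legacyLength are illustrative names only, not the definitions from #175; the shape is what matters: the old three-field PS becomes a bidirectional pattern synonym over a two-field constructor, always reporting offset 0 when matching and folding the offset into the payload when constructing.

```haskell
{-# LANGUAGE PatternSynonyms, ViewPatterns #-}

-- Toy representation without an offset field: payload and length.
data ToyBS = BS String Int
  deriving (Eq, Show)

-- Legacy view with an offset. Matching always yields offset 0;
-- construction applies the offset to the payload up front.
pattern PS :: String -> Int -> Int -> ToyBS
pattern PS payload off len <- BS payload ((,) 0 -> (off, len))
  where
    PS payload off len = BS (drop off payload) len

-- Old-style code that peeks at the three fields keeps compiling.
legacyLength :: ToyBS -> Int
legacyLength (PS _ _ len) = len

main :: IO ()
main = do
  print (PS "xxhowdy" 2 5)
  case PS "xxhowdy" 2 5 of
    PS payload off len -> print (payload, off, len)
```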

@sjakobi
Member

sjakobi commented Feb 25, 2020

Ah, I had thought that you were pointing out that #175 itself wouldn't build with GHC <= 7.10. Sorry for the noise!

@vdukhovni
Contributor

Ah, I had thought that you were pointing out that #175 itself wouldn't build with GHC <= 7.10. Sorry for the noise!

No, #175 will build fine with GHC 7.x, just won't offer a backwards compatible PS to code that peeks under the covers via Data.ByteString.Internal.

FWIW, on a whim I rebuilt my DANE survey engine with ByteString patched a la #175 and GHC 8.8.2. This meant also rebuilding AttoParsec, Hasql, Conduit, DNS, TLS, ... all to use the new ByteStrings. Everything built cleanly and runs just fine. So #175 just works across a decent swath of the library ecosystem with 8.x. My code does not support 7.x, so I can't speak to what would break without attempting a backport that is likely not a good use of my time.

The engine spends most of its time waiting for DNS replies from the network (~4k DNS qps, going much faster would likely trigger remote rate limits), so no visible change in throughput. Testing just a simple DNS RRset -> Builder codebase streaming out cooked DNS records, I see about a 3.5% reduction in runtime with #175, but this does not feel much memory pressure, it runs in a small constant space, so dramatic savings are not expected. Given all the other costs, a measurable ~3.5% is not bad at all.

@vdukhovni
Contributor

Having finally resolved my confusion about MR2165, I should say that I support this pull request, once the pre-requisite MR lands. Whether that would also be a good opportunity to revisit #175 is I guess a separate decision.

FWIW, in the 2019 survey we see that GHC 7.x was used by just 4% of responders, and the accompanying commentary was:

Although the version numbers obviously change, the distribution remains remarkably consistent. The three most recent major versions (8.8, 8.6, 8.4) cover the vast majority of users. I’ve said this before and I’ll say it again: Don’t spend too much time maintaining support for older versions of GHC. Especially if it lets you simplify your package description by removing conditionals or simplify your code by removing CPP.

So, as a separate matter, it is perhaps time to consider merging #175 at some point.

Cc: @cartazio

Add unsafePackLiteral to Data.ByteString.Internal. With GHC-8.10+,
use known-key variant of C `strlen` from `GHC.CString` that supports
constant folding. Also in GHC 8.10, another data constructor of
ForeignPtrContents becomes available: LiteralPtr. For string literals,
this is now used. It saves space when there are lots of literals, and
it improves opportunities for case-of-known data constructor optimizations
when a function scrutinizes the length of a ByteString.
@andrewthad andrewthad force-pushed the constant_fold_literal_length branch from cdc5d36 to 103727b Compare May 24, 2020 00:10
@andrewthad
Contributor Author

MR 2165 has landed in GHC. This PR is ready for final review.

@vdukhovni
Contributor

Good to see that nothing breaks tests with GHC 8.10.1 and earlier, but sadly we don't yet have tests for (yesterday's) GHC 8.11, so I am not sure this can be merged just yet (except of course by code inspection without tests).

Speaking of tests, I've built GHC 8.11 for myself, and bytestring with this PR, but I'm having a bit of a struggle getting the dependencies needed for testing built...

@vdukhovni
Contributor

Speaking of tests, I've built GHC 8.11 for myself, and bytestring with this PR, but I'm having a bit of a struggle getting the dependencies needed for testing built...

After a bunch of effort I got tests built, and ran into a problem. The unsafeFinalize function was failing tests (for values supplied by QuickCheck), when trying to clean up PlainPtr values. I have a work-around, but perhaps the better solution is per the linked note?

@sjakobi
Member

sjakobi commented May 24, 2020

@andrewthad What would it take to extend this optimization to ShortByteString literals?

@vdukhovni
Contributor

@andrewthad What would it take to extend this optimization to ShortByteString literals?

Separately, when I built GHC-itself with the patched bytestring library, I managed to get a compiler panic trying to later build the bytestring testsuite, that I'm still following up, perhaps a bug in GHC master, but just in case I'm now testing a version of GHC built with the stock bytestring...

@vdukhovni
Contributor

vdukhovni commented May 24, 2020

@andrewthad What would it take to extend this optimization to ShortByteString literals?

Separately, when I built GHC-itself with the patched bytestring library, I managed to get a compiler panic trying to later build the bytestring testsuite, that I'm still following up, perhaps a bug in GHC master, but just in case I'm now testing a version of GHC built with the stock bytestring...

Good news for bytestring at least, the compiler panic is there also without this MR, so we'll just have to wait a bit for that to get sorted...

@andrewthad
Contributor Author

@sjakobi The best way to get better ShortByteString literals is more complicated. It requires implementing the byte array literals proposal. I've started on this in MR 2971. Tangentially, there are several interesting optimizations that the compiler can perform on ByteArray# that it cannot perform on Addr#, all stemming from the fact that ByteArray# supports a much more useful notion of equality than Addr# does. I have an always-bytearray-backed variant of bytestring at byteslice, and I'm trying to improve GHC to make it faster, which conveniently makes ShortByteString better as well.

@vdukhovni I've addressed the requested changes. That GHC panic is a bummer, but I'm sure that'll get sorted out soon.

@sjakobi
Member

sjakobi commented May 24, 2020

Thanks for working on this stuff @andrewthad! :)

I was curious about ShortByteString because I'm using it to improve some string matching code in dhall in the context of dhall-lang/dhall-haskell#1804. I had noticed that the core was fairly terrible, comparing the scrutinee against each pattern in a series of case-expressions as if the patterns were completely opaque! I first went from Text to ByteArray in dhall-lang/dhall-haskell@666aae0 and then tried to reduce the number of length comparisons with dhall-lang/dhall-haskell@0cc0e9c. If you have any suggestions how to improve this further I'm all ears! :)

EDIT: The PR including these patches has some more info: dhall-lang/dhall-haskell#1810

@sjakobi sjakobi removed the "blocked: ghc" label (This is blocked on a feature or primitive not yet available in a released GHC version) May 24, 2020
@vdukhovni
Contributor

Good news for bytestring at least, the compiler panic is there also without this MR, so we'll just have to wait a bit for that to get sorted...

Back to this MR, are you looking to have it merged into 0.10.10.1 (a bug-fix + CI + backlog of safe/backwards-compatible PRs) or just reviewed and approved for a later release?

If releases of bytestring are timed to coincide with GHC releases, I am not sure this will make it into GHC 8.12 if it does not go into 0.10.10.1. It should be a no-op in earlier GHC versions (and the CI is green), but testing to make sure it is ready for 8.12 is still rather difficult. Are you planning to add the guards for unsafeFinalize, or do we expect that base will have that covered (either not have FinalPtr or have it only with the relaxed pure () finalisation behaviour)?

Cc: @hvr, @bgamari, @cartazio

Co-authored-by: Simon Jakobi
@Bodigrim
Contributor

Looks good to me, but since the code depends on yet unreleased features of GHC, I would rather delay merging until GHC 9.0.

@andrewthad
Contributor Author

Agreed. At the least, I think we should wait until there's a release candidate, but I don't mind waiting until an actual release either.

@sjakobi sjakobi removed this from the 0.10.12.0 milestone Aug 1, 2020
@vdukhovni
Contributor

While this did not land in 0.10.12.0 (makes sense), now that 0.11.0.0 is coming up, perhaps this is ready?

@andrewthad
Contributor Author

andrewthad commented Aug 20, 2020

Except for the change log merge conflict, this PR is ready to be merged. Let me know if you're ready to merge it into master, and if so, I'll resolve the conflict and squash.

@Bodigrim
Contributor

@andrewthad yes, please proceed.

@Bodigrim Bodigrim added the "blocked: patch-needed" label (somebody needs to write a patch or contribute code) and removed the "blocked: needs-review" label Aug 21, 2020
@Bodigrim Bodigrim added this to the 0.11.0.0 milestone Aug 21, 2020
@Bodigrim
Contributor

@andrewthad sorry for reminding, but I intend to cut a new release soon, ideally by the end of the week. Actually I can rebase your PR myself, if you are busy.

@andrewthad
Contributor Author

Go ahead and do it. I tried to this morning but then got stuck because there’s a merge commit in there.

@vdukhovni
Contributor

Go ahead and do it. I tried to this morning but then got stuck because there’s a merge commit in there.

Speaking of merge-commits. Can we please BAN THEM from this project. They make rebasing patches essentially impossible, and make it difficult to understand the commit history. There's a pull request for OpenSSL I've been unable to help the author to progress, because he updated it a few times by "merging", rather than rebasing, and now it is completely unusable.

In the actual OpenSSL repository, merge commits are disallowed. The history is (correctly) forced to be linear.

@sjakobi
Member

sjakobi commented Aug 25, 2020

I've addressed the merge conflicts with yet another merge commit – I hope I got it right.

In other repositories, I usually try to keep master linear and clean of merge commits. PRs can freely use merges as long as they are squashed in the end.

Member

@sjakobi sjakobi left a comment

A few copy-and-paste mistakes on my part. :/

Contributor

@hsyl20 hsyl20 left a comment

LGTM

@Bodigrim Bodigrim removed the "blocked: patch-needed" label (somebody needs to write a patch or contribute code) Aug 25, 2020
@Bodigrim Bodigrim merged commit 371f224 into haskell:master Aug 25, 2020
@vdukhovni
Contributor

This got merged into master on 2020-08-25, and the necessary supporting features are going to be in GHC 9.0.1, but whether this is an oversight or an explicit choice, this MR is not (presently) in the bytestring snapshot that is scheduled to be shipped as a boot library with GHC 9.0.1.

I am guessing that at this point it may be too late to change that, but if not, or if there's something we should have done or should still do to facilitate its inclusion at the most appropriate time, I thought it may be sensible to bring this up for discussion...

Should this have been handled differently? Is this MR deliberately postponed to 9.2? Just lack of cycles? ...

@sjakobi
Member

sjakobi commented Jan 22, 2021

@vdukhovni I was expecting this patch to be bundled with GHC 9.0.1 as a part of bytestring-0.11.0.0.

However, when we discussed using bytestring-0.11.0.0 in GHC-9.0.1 with Ben Gamari in early October, Ben expected that updating GHC and the core libraries to use bytestring-0.11.0.0 would take too long.

I think I have underestimated how long it would take to update the other core libraries for compatibility with bytestring-0.11. The underlying problem is of course that some core libraries don't have very responsive maintainers.

@vdukhovni
Contributor

I think I have underestimated how long it would take to update the other core libraries for compatibility with bytestring-0.11. The underlying problem is of course that some core libraries don't have very responsive maintainers.

Do you remember which packages were blockers? At this point both text and unix have merged commits into their master branches that allow bytestring-0.11.

It would sure be great if at least the boot packages were mutually responsive to support ongoing evolution amongst their peers. Perhaps helping to coordinate this better is a worthy topic for the foundation to explore...

@sjakobi
Member

sjakobi commented Jan 22, 2021

Do you remember which packages were blockers?

I know that Cabal, text and unix required patches. @Bodigrim might be aware of others.

It would sure be great if at least the boot packages were mutually responsive to support ongoing evolution amongst their peers. Perhaps helping to coordinate this better is a worthy topic for the foundation to explore...

Yeah, it would be great if the Haskell foundation could help with this.

@sjakobi
Member

sjakobi commented Feb 2, 2021

@vdukhovni You can check the progress regarding using bytestring-0.11 in GHC here: https://gitlab.haskell.org/ghc/ghc/-/issues/19091

@Bodigrim Bodigrim mentioned this pull request May 24, 2021