Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace boilerplate Monoid/Semigroup instances with generics #3169

Closed
23Skidoo opened this issue Feb 19, 2016 · 20 comments
Closed

Replace boilerplate Monoid/Semigroup instances with generics #3169

23Skidoo opened this issue Feb 19, 2016 · 20 comments

Comments

@23Skidoo
Copy link
Member

We have a large number of boilerplate Monoid/Semigroup instances for large records in D.S.Setup and D.C.Setup (as well as in #3156). Those should be replaced with generics.

Relevant Hackage package: https://hackage.haskell.org/package/generic-deriving-1.10.1/docs/Generics-Deriving-Monoid.html.

@hvr
Copy link
Member

hvr commented Feb 20, 2016

@23Skidoo the generic-deriving package may be overkill here... we need as little as (stolen from semigroups) to be added to Distribution.Compat.Semigroup:

import Data.Semigroup
import GHC.Generics

-- | Generically generate a 'Semigroup' ('<>') operation for any type
-- implementing 'Generic'. This operation will append two values
-- by point-wise appending their component fields. It is only defined
-- for product types.
--
-- @
-- 'gmappend' a ('gmappend' b c) = 'gmappend' ('gmappend' a b) c
-- @
gmappend :: (Generic a, GSemigroup (Rep a)) => a -> a -> a
gmappend x y = to (gmappend' (from x) (from y))

class GSemigroup f where
  gmappend' :: f p -> f p -> f p

instance GSemigroup U1 where
  gmappend' _ _ = U1

instance GSemigroup V1 where
  gmappend' x y = x `seq` y `seq` error "GSemigroup.V1: gmappend'"

instance Semigroup a => GSemigroup (K1 i a) where
  gmappend' (K1 x) (K1 y) = K1 (x <> y)

instance GSemigroup f => GSemigroup (M1 i c f) where
  gmappend' (M1 x) (M1 y) = M1 (gmappend' x y)

instance (GSemigroup f, GSemigroup g) => GSemigroup (f :*: g) where
  gmappend' (x1 :*: x2) (y1 :*: y2) = gmappend' x1 y1 :*: gmappend' x2 y2

-- | Generically generate a 'Monoid' 'mempty' for any product-like type
-- implementing 'Generic'.
--
-- It is only defined for product types.
--
-- @
-- 'gmappend' 'gmempty' a = a = 'gmappend' a 'gmempty'
-- @

gmempty :: (Generic a, GMonoid (Rep a)) => a
gmempty = to gmempty'

class GSemigroup f => GMonoid f where
  gmempty' :: f p

instance GMonoid U1 where
  gmempty' = U1

instance (Semigroup a, Monoid a) => GMonoid (K1 i a) where
  gmempty' = K1 mempty

instance GMonoid f => GMonoid (M1 i c f) where
  gmempty' = M1 gmempty'

instance (GMonoid f, GMonoid g) => GMonoid (f :*: g) where
  gmempty' = gmempty' :*: gmempty'

@23Skidoo
Copy link
Member Author

@hvr Thanks! That link was just for reference in case someone wanted to tackle this ticket.

hvr added a commit to hvr/cabal that referenced this issue Feb 26, 2016
hvr added a commit that referenced this issue Feb 26, 2016
@hvr
Copy link
Member

hvr commented Feb 26, 2016

@23Skidoo I've pushed some WIP changes to https://github.com/haskell/cabal/commits/wip/issue-3169 in case you want to take a peek...

This results in the following diff-stats:

 Cabal/Distribution/PackageDescription.hs |  17 ++---
 Cabal/Distribution/Simple/Haddock.hs     |  44 ++-----------
 Cabal/Distribution/Simple/InstallDirs.hs |  17 +----
 Cabal/Distribution/Simple/Program/GHC.hs | 117 ++-------------------------------
 Cabal/Distribution/Simple/Setup.hs       | 248 +++++++++++-----------------------------------------------------------
 5 files changed, 56 insertions(+), 387 deletions(-)

There's a huge block of boilerplate we could exterminate if the following FIXME was addressed:

instance Monoid ConfigFlags where
   mempty = ConfigFlags {
      configPrograms      = error "FIXME: remove configPrograms",
      ...
  }

This could be workarounded by moving the error into a dummy Monoid instance for ProgramConfiguration...

@hvr
Copy link
Member

hvr commented Feb 26, 2016

@dcoutts suggests to change the field to be configPrograms :: Last ProgramConfiguration

that way we'd get a proper/less-hacky Monoid ConfigFlags and move the error-ing to the use-sites when getLast results in Nothing.

hvr added a commit that referenced this issue Feb 27, 2016
This implements the suggestions mentioned at
#3169 (comment)

The main benefit of this change is turning 'ConfigFlags' into a uniform
product-type suitable for generic derivation of pointwise
`Semigroup`/`Monoid` instances.
hvr added a commit that referenced this issue Feb 27, 2016
This implements the suggestions mentioned at
#3169 (comment)

The main benefit of this change is turning 'ConfigFlags' into a uniform
product-type suitable for generic derivation of pointwise
`Semigroup`/`Monoid` instances.
hvr added a commit to hvr/cabal that referenced this issue Feb 27, 2016
This implements the suggestions mentioned at
haskell#3169 (comment)

The main benefit of this change is turning 'ConfigFlags' into a uniform
product-type suitable for generic derivation of pointwise
`Semigroup`/`Monoid` instances.

NB: This changes the `Binary` serialisation of `ConfigFlags` since there's
now an additional `Maybe` inserted in `configPrograms`'s type
hvr added a commit to hvr/cabal that referenced this issue Feb 27, 2016
This implements the suggestions mentioned at
haskell#3169 (comment)

The main benefit of this change is turning 'ConfigFlags' into a uniform
product-type suitable for generic derivation of pointwise
`Semigroup`/`Monoid` instances.

NB: This changes the `Binary` serialisation of `ConfigFlags` since there's
now an additional `Maybe` inserted in `configPrograms`'s type
hvr added a commit to hvr/cabal that referenced this issue Feb 27, 2016
This implements the suggestions mentioned at
haskell#3169 (comment)

The main benefit of this change is turning 'ConfigFlags' into a uniform
product-type suitable for generic derivation of pointwise
`Semigroup`/`Monoid` instances.

NB: This changes the `Binary` serialisation of `ConfigFlags` since there's
now an additional `Maybe` inserted in `configPrograms`'s type
23Skidoo added a commit that referenced this issue Feb 27, 2016
Add `gmappend`/`gmempty` Generics-helpers (re #3169)
@23Skidoo
Copy link
Member Author

@hvr

56 insertions(+), 387 deletions(-)

👍
And there will be more in cabal-install and nix-local-build code.

@hvr
Copy link
Member

hvr commented Feb 27, 2016

indeed... :-)

23Skidoo pushed a commit that referenced this issue Feb 27, 2016
hvr added a commit to hvr/cabal that referenced this issue Feb 27, 2016
This implements the suggestions mentioned at
haskell#3169 (comment)

The main benefit of this change is turning 'ConfigFlags' into a uniform
product-type suitable for generic derivation of pointwise
`Semigroup`/`Monoid` instances.

NB: This changes the `Binary` serialisation of `ConfigFlags` since there's
now an additional `Maybe` inserted in `configPrograms`'s type
@23Skidoo
Copy link
Member Author

For some reason I don't see a reference to #3193 here, so here it is.

hvr added a commit that referenced this issue Feb 27, 2016
@hvr
Copy link
Member

hvr commented Feb 27, 2016

I've just updated the wip/issue-3169 branch; 19f2e5c now reports

 Cabal/Distribution/PackageDescription.hs |  17 +++-------
 Cabal/Distribution/Simple/Haddock.hs     |  44 ++++---------------------
 Cabal/Distribution/Simple/InstallDirs.hs |  17 +---------
 Cabal/Distribution/Simple/Program/GHC.hs | 117 ++++-------------------------------------------------------------
 Cabal/Distribution/Simple/Setup.hs       | 343 +++++++++++++++++++++-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 5 files changed, 56 insertions(+), 482 deletions(-)

@23Skidoo
Copy link
Member Author

BTW, it looks like -O2 is required to generate sufficiently tight Core.

Given the following definition:

data Foo = Foo {
  foo :: String, bar :: Maybe [Int], baz :: [Int]
  } deriving Generic

Without -O2 I get:

mymappend :: Foo -> Foo -> Foo
[GblId, Arity=2, Str=DmdType]
mymappend =
  \ (x_a1lm :: Foo) (y_a1ln :: Foo) ->
    to
      @ Foo
      Main.$fGenericFoo
      @ GHC.Prim.Any
      (case (from @ Foo Main.$fGenericFoo @ GHC.Prim.Any x_a1lm)
            `cast` ...
       of _ [Occ=Dead] { :*: x1_a1pc x2_a1pd ->
       case (from @ Foo Main.$fGenericFoo @ GHC.Prim.Any y_a1ln)
            `cast` ...
       of _ [Occ=Dead] { :*: y1_a1pe y2_a1pf ->
       (GHC.Generics.:*:
          @ (M1 S Main.S1_0_0Foo (Rec0 String))
          @ (S1 Main.S1_0_1Foo (Rec0 (Maybe [Int]))
             :*: S1 Main.S1_0_2Foo (Rec0 [Int]))
          @ GHC.Prim.Any
          ((<>
              @ [Char]
              (GHC.Base.$fMonoid[] @ Char)
              (x1_a1pc `cast` ...)
              (y1_a1pe `cast` ...))
           `cast` ...)
          (case x2_a1pd of _ [Occ=Dead] { :*: x4_X1qQ x5_X1qS ->
           case y2_a1pf of _ [Occ=Dead] { :*: y4_X1qX y5_X1qZ ->
           GHC.Generics.:*:
             @ (M1 S Main.S1_0_1Foo (Rec0 (Maybe [Int])))
             @ (M1 S Main.S1_0_2Foo (Rec0 [Int]))
             @ GHC.Prim.Any
             ((<>
                 @ (Maybe [Int])
                 $dMonoid_r1K0
                 (x4_X1qQ `cast` ...)
                 (y4_X1qX `cast` ...))
              `cast` ...)
             ((<>
                 @ [Int]
                 (GHC.Base.$fMonoid[] @ Int)
                 (x5_X1qS `cast` ...)
                 (y5_X1qZ `cast` ...))
              `cast` ...)
           }
           }))
       `cast` ...
       }
       })

With -O2 this becomes:

mymappend =
  \ (w_s4hW :: Foo) (w1_s4hX :: Foo) ->
    case w_s4hW of _ [Occ=Dead] { Foo ww1_s4i0 ww2_s4i1 ww3_s4i2 ->
    case w1_s4hX of _ [Occ=Dead] { Foo ww5_s4i6 ww6_s4i7 ww7_s4i8 ->
    Main.Foo
      (++ @ Char ww1_s4i0 ww5_s4i6)
      (Main.$s$fMonoidMaybe1 ww2_s4i1 ww6_s4i7)
      (++ @ Int ww3_s4i2 ww7_s4i8)
    }
    }

@23Skidoo
Copy link
Member Author

OK, looks like -O (which is what Cabal uses by default) is sufficient.

@23Skidoo
Copy link
Member Author

The types/casts this small example generates are nuts, however, I can see why generics increase compile time so much.

@hvr
Copy link
Member

hvr commented Feb 27, 2016

@23Skidoo have you compared how more compact the resulting Core is compared to a manual definition

instance Semigroup Foo where
    a <> b = Foo { foo = combine foo
                 , bar = combine bar
                 , baz = combine baz
                 }
       where combine field = field a `mappend` field b

?

@23Skidoo
Copy link
Member Author

@hvr Well, if you look at the optimised Core I posted above, you'll see it's basically equivalent.

@hvr
Copy link
Member

hvr commented Feb 27, 2016

I was looking at what GHC 8.0 generates with -O1 for the manual definition I gave above, and it's:

-- RHS size: {terms: 44, types: 75, coercions: 0}
$w$c<>
$w$c<> =
  \ w_s4ij w1_s4ik ->
    (# case w_s4ij of _ { Foo ds_d45s ds1_d45t ds2_d45u ->
       ++
         ds_d45s
         (case w1_s4ik of _ { Foo ds3_X46j ds4_X46l ds5_X46n -> ds3_X46j })
       },
       case w_s4ij of _ { Foo ds_d45k ds1_d45l ds2_d45m ->
       case ds1_d45l of wild1_a49t {
         Nothing ->
           case w1_s4ik of _ { Foo ds3_X46B ds4_X46D ds5_X46F -> ds4_X46D };
         Just ipv_a49x ->
           case w1_s4ik of _ { Foo ds3_X46C ds4_X46E ds5_X46G ->
           case ds4_X46E of _ {
             Nothing -> wild1_a49t;
             Just ipv1_a49D -> Just (++ ipv_a49x ipv1_a49D)
           }
           }
       }
       },
       case w_s4ij of _ { Foo ds_d45o ds1_d45p ds2_d45q ->
       ++
         ds2_d45q
         (case w1_s4ik of _ { Foo ds3_X46f ds4_X46h ds5_X46j -> ds5_X46j })
       } #)

-- RHS size: {terms: 11, types: 18, coercions: 0}
$fSemigroupFoo_$c<>
$fSemigroupFoo_$c<> =
  \ w_s4ij w1_s4ik ->
    case $w$c<> w_s4ij w1_s4ik
    of _ { (# ww1_s4iT, ww2_s4iU, ww3_s4iV #) ->
    Foo ww1_s4iT ww2_s4iU ww3_s4iV
    }

-- RHS size: {terms: 4, types: 1, coercions: 0}
$fSemigroupFoo
$fSemigroupFoo =
  C:Semigroup
    $fSemigroupFoo_$c<>
    $fSemigroupFoo_$csconcat
    $fSemigroupFoo_$cstimes

which looks less compact than what I get for the Generics based (<>) = gmappend one...

@23Skidoo
Copy link
Member Author

@hvr

So I double-checked, and the following Haskell fragment:

instance Monoid Foo where
  mempty = gmempty
  mappend = gmappend

generates the following Core with -O1:

Main.$fMonoidFoo_$sgmempty =
  Main.Foo
    (GHC.Types.[] @ Char)
    (GHC.Base.Nothing @ [Int])
    (GHC.Types.[] @ Int)

Main.$fMonoidFoo_$cmappend =
  \ (w_s4jT :: Foo) (w1_s4jU :: Foo) ->
    case w_s4jT of _ [Occ=Dead] { Foo ww1_s4jX ww2_s4jY ww3_s4jZ ->
    case w1_s4jU of _ [Occ=Dead] { Foo ww5_s4k3 ww6_s4k4 ww7_s4k5 ->
    Main.Foo
      (++ @ Char ww1_s4jX ww5_s4k3)
      (Main.$fGMonoid:*:3 ww2_s4jY ww6_s4k4)
      (++ @ Int ww3_s4jZ ww7_s4k5)
    }
    }

This is with GHC 7.10.3.

@23Skidoo
Copy link
Member Author

@hvr With 7.10 and -O1

instance Monoid Foo where
  mempty = gmempty
  a `mappend` b = Foo { foo = combine foo
                      , bar = combine bar
                      , baz = combine baz
                      }
    where combine field = field a `mappend` field b

gives rise to

Main.$fMonoidFoo_$cmappend =
  \ (w_s4jT :: Foo) (w1_s4jU :: Foo) ->
    case w_s4jT of _ [Occ=Dead] { Foo ww1_s4jX ww2_s4jY ww3_s4jZ ->
    case w1_s4jU of _ [Occ=Dead] { Foo ww5_s4k3 ww6_s4k4 ww7_s4k5 ->
    Main.Foo
      (++ @ Char ww1_s4jX ww5_s4k3)
      (Main.$fGMonoid:*:3 ww2_s4jY ww6_s4k4)
      (++ @ Int ww3_s4jZ ww7_s4k5)
    }
    }

Main.$fGMonoid:*:3 =
  \ (ds_a1X4 :: Maybe [Int]) (m_a1X5 :: Maybe [Int]) ->
    case ds_a1X4 of wild_a1X6 {
      Nothing -> m_a1X5;
      Just ipv_a1Xa ->
        case m_a1X5 of _ [Occ=Dead] {
          Nothing -> wild_a1X6;
          Just ipv1_a1Xg ->
            GHC.Base.Just @ [Int] (++ @ Int ipv_a1Xa ipv1_a1Xg)
        }
    }

So it looks like they're the same.

@hvr
Copy link
Member

hvr commented Feb 27, 2016

@23Skidoo ok, now I'm totally confused; even with GHC 7.10.3 I don't seem to get what you get: https://gist.github.com/hvr/cdc4996fde37814d1d10

@23Skidoo
Copy link
Member Author

@hvr Apparently, gmempty in my example affects the result somehow. When I change it to Foo mempty mempty mempty, Core looks like what you're getting.

hvr added a commit to hvr/cabal that referenced this issue Feb 28, 2016
hvr added a commit to hvr/cabal that referenced this issue Feb 28, 2016
This is preparatory work for implementing haskell#3169
it's kept in a different commit in order to facilitate
comparing code-generation.
hvr added a commit to hvr/cabal that referenced this issue Feb 28, 2016
This increases compile-time (until GHC becomes more clever) but the
generated code is expected to be at least as good (if not better) than
the manually generated code.

This addresses haskell#3169
hvr added a commit to hvr/cabal that referenced this issue Feb 28, 2016
This increases compile-time (until GHC becomes more clever) but the
generated code is expected to be at least as good (if not better) than
the manually generated code.

This addresses haskell#3169
hvr added a commit to hvr/cabal that referenced this issue Feb 28, 2016
This is preparatory work for implementing haskell#3169
it's kept in a different commit in order to facilitate
comparing code-generation.
hvr added a commit to hvr/cabal that referenced this issue Feb 28, 2016
This increases compile-time (until GHC becomes more clever) but the
generated code is expected to be at least as good (if not better) than
the manually generated code.

This addresses haskell#3169
hvr added a commit to hvr/cabal that referenced this issue Feb 28, 2016
This increases compile-time (until GHC becomes more clever) but the
generated code is expected to be at least as good (if not better) than
the manually generated code.

While at it, this removes -XCPP usage from all modules touched.

This addresses haskell#3169
hvr added a commit to hvr/cabal that referenced this issue Feb 28, 2016
This increases compile-time (until GHC becomes more clever) but the
generated code is expected to be at least as good (if not better) than
the manually generated code.

While at it, this removes -XCPP usage from all modules touched.

This addresses haskell#3169
@23Skidoo
Copy link
Member Author

ok, now I'm totally confused; even with GHC 7.10.3 I don't seem to get what you get:

Turns out I was looking at unoptimised code. With -O1 I also see unboxed pair construction. Doesn't actually look like an optimisation with all the repeated evaluations of Foo _ _ _.

@23Skidoo 23Skidoo changed the title Replace Boilerplate Monoid/Semigroup instances with generics Replace boilerplate Monoid/Semigroup instances with generics Feb 29, 2016
23Skidoo pushed a commit that referenced this issue Mar 1, 2016
This implements the suggestions mentioned at
#3169 (comment)

The main benefit of this change is turning 'ConfigFlags' into a uniform
product-type suitable for generic derivation of pointwise
`Semigroup`/`Monoid` instances.

NB: This changes the `Binary` serialisation of `ConfigFlags` since there's
now an additional `Maybe` inserted in `configPrograms`'s type

(cherry picked from commit 62c3aa6)
23Skidoo pushed a commit that referenced this issue Mar 1, 2016
This is preparatory work for implementing #3169
it's kept in a different commit in order to facilitate
comparing code-generation.

(cherry picked from commit dd5fe69)
23Skidoo pushed a commit that referenced this issue Mar 1, 2016
This is preparatory work for implementing #3169
it's kept in a different commit in order to facilitate
comparing code-generation.

(cherry picked from commit 9b38b38)
23Skidoo pushed a commit that referenced this issue Mar 1, 2016
This increases compile-time (until GHC becomes more clever) but the
generated code is expected to be at least as good (if not better) than
the manually generated code.

This addresses #3169

(cherry picked from commit ec6dd74)
23Skidoo pushed a commit that referenced this issue Mar 1, 2016
This increases compile-time (until GHC becomes more clever) but the
generated code is expected to be at least as good (if not better) than
the manually generated code.

While at it, this removes -XCPP usage from all modules touched.

This addresses #3169

(cherry picked from commit 3bcae47)
garetxe pushed a commit to garetxe/cabal that referenced this issue Mar 5, 2016
garetxe pushed a commit to garetxe/cabal that referenced this issue Mar 5, 2016
This implements the suggestions mentioned at
haskell#3169 (comment)

The main benefit of this change is turning 'ConfigFlags' into a uniform
product-type suitable for generic derivation of pointwise
`Semigroup`/`Monoid` instances.

NB: This changes the `Binary` serialisation of `ConfigFlags` since there's
now an additional `Maybe` inserted in `configPrograms`'s type
garetxe pushed a commit to garetxe/cabal that referenced this issue Mar 5, 2016
This is preparatory work for implementing haskell#3169
it's kept in a different commit in order to facilitate
comparing code-generation.
garetxe pushed a commit to garetxe/cabal that referenced this issue Mar 5, 2016
This is preparatory work for implementing haskell#3169
it's kept in a different commit in order to facilitate
comparing code-generation.
garetxe pushed a commit to garetxe/cabal that referenced this issue Mar 5, 2016
This increases compile-time (until GHC becomes more clever) but the
generated code is expected to be at least as good (if not better) than
the manually generated code.

This addresses haskell#3169
garetxe pushed a commit to garetxe/cabal that referenced this issue Mar 5, 2016
This increases compile-time (until GHC becomes more clever) but the
generated code is expected to be at least as good (if not better) than
the manually generated code.

While at it, this removes -XCPP usage from all modules touched.

This addresses haskell#3169
@23Skidoo
Copy link
Member Author

This has been fixed in #3196 (thanks again, @hvr!), so closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants