Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make ($) fully representation polymorphic #132

Closed
MangoIV opened this issue Feb 11, 2023 · 99 comments
Closed

Make ($) fully representation polymorphic #132

MangoIV opened this issue Feb 11, 2023 · 99 comments
Labels
approved Approved by CLC vote base-4.19 Implemented in base-4.19 (GHC 9.8)

Comments

@MangoIV
Copy link

MangoIV commented Feb 11, 2023

Motivation

At the moment the type of ($) is:

($) :: forall r a (b :: TYPE r). (a -> b) -> a -> b

which means that a is not representation polymorphic. It is indeed not possible to make it fully representation polymorphic with the current implementation, which is:
f $ x = f x due to us not knowing what representation x would be, see this note in the ghc user's guide, however, in the case of ($) this is an implementation detail, i.e. it could be implemented as ($) f = f which would make it unnecessary to pass a representation polymorphic argument.

For some more motivation on why this is necessary, please refer to this paste.

Proposal

Like this, we could rewrite the ($) to:

($)
  :: forall
    repa
    repb
    (a :: TYPE repa)
    (b :: TYPE repb)
   . (a -> b)
  -> a
  -> b
($) f = f
{-# INLINE ($) #-}

(Indeed, if we could not write ($) like this, we would even fail to prove (lifted) id in Haskell, which would be really bad)


Note
I have done impact assessment on the clc-hackage, built with a patched ghc927
I have opened a PR to GHC


Warning
as pointed out by @monoidal, this will make ($) more strict, i.e.
seq (undefined $) () == () will turn into seq (undefined $) () = _|_
This means, that this proposal is a breaking change
This proposal may also affect arity analysis

Alternatives

As pointed out by participants to the conversation, there are multiple alternatives in different directions

Changes to the proposal

  • make the type ($) :: forall a. a -> a, this will make the function passed to ($) possibly multiplicity polymorphic, this type might however be hard to understand for beginners
  • make the type also multiplicity polymorphic, I consider this out of scope of this proposal

Alternatives without the proposal

  • alternative unlifted-base which doesn't require the change to actual base
  • redefining ($) locally: for this to be convenient, this will require an alternative Prelude, e.g. with cabal mixins and some conveniences will fall away e.g. the special rule that makes ($) impredicative everywhere
  • not using ($) but parens: inconvenient
@Bodigrim
Copy link
Collaborator

Why not ($) = id?

@MangoIV
Copy link
Author

MangoIV commented Feb 11, 2023

I thought it would be nicer not to rely on the properties of id, e.g. if we wanted to change ($) to be multiplicity polymorphic, we would first have to change id to be. Also, same number of chars, so, we neither win nor lose anything.

@phadej
Copy link

phadej commented Feb 11, 2023

Why not ($) = id?

I remember discussion why we don't have ($) :: a -> a; ($) = \x -> x. That would solve plenty of problems (e.g. it becomes multiplicity polymorphic as well, i.e. maps ordinary functions to ordinary functions and linear functions to linear functions). ghc-proposals/ghc-proposals#274 (comment) - And I actually don't buy an argument that having ($) with "restricted" type helps type-inference/errors that much.

@MangoIV
Copy link
Author

MangoIV commented Feb 11, 2023

And I actually don't buy an argument that having ($) with "restricted" type helps type-inference/errors that much.

it might be worth considering that there might be beginners who do not immediately understand how ($) :: a -> a can be specialized to the current type of ($). Given that the implementer of the current ($) neither seemed to have realized this, that doesn't seem to be completely absurd.

@phadej
Copy link

phadej commented Feb 11, 2023

it might be worth considering that there might be beginners who do not immediately understand how ($) :: a -> a can be specialized to the current type of

Beginner argument can be used to argue against you as well: ($) type should stay at (be reverted to) forall (a b :: Type). (a -> b) -> (a -> b) type. If you need fancier representation, multiplicity, etc generalization, define your own.

E.g. in haddocks it is currently displayed in its full glory
Screenshot from 2023-02-11 19-11-49

Terrible for beginners, and you want to make it even worse!

@MangoIV
Copy link
Author

MangoIV commented Feb 11, 2023

E.g. in haddocks it is currently displayed in its full glory

Right, I’d really consider this a problem in the UI of haddock, tbh, given that :t doesn’t yield the most general type without flags, but I get your point.

@dixonary
Copy link

dixonary commented Feb 11, 2023

To be honest, the haddocks for ($) are already not great even outside of the displayed type signature. It's flatly wrong and confusing to say that ($) is redundant.

@Kleidukos
Copy link
Member

Kleidukos commented Feb 11, 2023

@dixonary please ping me when you open a merge request to improve these docs, I'll review it :)

@tomjaguarpaw
Copy link
Member

It's flatly wrong and confusing to say that ($) is redundant

Right, one may as well say any function is redundant, since you can replace it with its body.

@treeowl
Copy link

treeowl commented Feb 11, 2023

I very much like ($) :: a -> a. One thing to look out for is how this change will affect arity analysis. Today, (f $) is analysed as having an arity of at least 1, even if nothing is known about f. If ($) = id, then (f $) = f. That's sure to be an improvement in many cases, but might it be worse in others?

@Kleidukos
Copy link
Member

Kleidukos commented Feb 11, 2023

@dixonary documentation ticket opened at https://gitlab.haskell.org/ghc/ghc/-/issues/22963

@Vlix
Copy link

Vlix commented Feb 12, 2023

I very much like ($) :: a -> a. One thing to look out for is how this change will affect arity analysis. Today, (f $) is analysed as having an arity of at least 1, even if nothing is known about f. If ($) = id, then (f $) = f. That's sure to be an improvement in many cases, but might it be worse in others?

Would that break something like ($ 5) <$> [(5 +), (5 -), (5 *)]? Or does that just get reduced to (\x -> ($) x 5)?

@treeowl
Copy link

treeowl commented Feb 12, 2023

@Vlix It wouldn't break anything (aside from potential type inference issues). The question is how it could affect performance. It won't change performance in your example.

@hasufell
Copy link
Member

hasufell commented Feb 12, 2023

aside from potential type inference issues

Seems like a huge footnote.


Wrt the actual proposal, can we get the input from @goldfirere since he's been involved in that?

I also find there's lack of motivation. The discourse thread links to a theoretical code example.

However, that's not really sufficient motivation, IMO. Do users use existing workarounds (such as re-defining ($)) right now due to this issue? What is the actual value, regardless of "we could have been more general"?

Even if it doesn't break anything and has zero performance impact... it will change the type signature of one of the most used functions in Haskell... again. This shouldn't be something we regularly iterate over.

@phadej
Copy link

phadej commented Feb 12, 2023

I run into this issue in a non+theoretical example recently.

The definition of ResultType in ghc was changed to be unlifted type (using UnliftedNewtypes and UnboxedSums), the pattern synonyms made it look almost like the previous definition. However I had to change

-fromParseResult dflags $ parseFile (toFilePath modpath) dflags contents'
+fromParseResult dflags (parseFile (toFilePath modpath) dflags contents')

Yes, there's an easy work-around, but I did wonder why the previous definition just not work out of the box.


This shouldn't be something we regularly iterate over.

That's why change to ($) :: a -> a is tempting!

@MangoIV
Copy link
Author

MangoIV commented Feb 12, 2023

I also find there's lack of motivation. The discourse thread links to a theoretical code example.

@hasufell its not purely theoretic (although I guess I should make a better example), I encountered this issue when I tried to pass a Proxy# instead of a Proxy and use ($) for that. So the motivation is not theoretical but really practical. The reason why I even thought about it is because of something I need not kindchecking.


Wrt changing it to a -> a, I just have to reiterate, it's tempting but I think it wouldn't solve as many issues as @phadej is saying at the cost of being really confusing to newcomers. E.g. it would now allow to pass multiplicity polymorphic functions but it would itself not be multiplicity polymorphic. I'd say my proposal is a lot less intrusive than that. I understand that it's annoying that we'd have to change the implementation of ($) but I think the signature as it is now is a bug, it's inconsistent and it's unnecessary and it can cause unintuitive issues.


Also wrt @phadej's point, consider that we, after all, even have monadic versions of some functions that only need Applicative, I think this falls in the same range - although the simplification would be possible, it might cause confusion so we keep it.

@MangoIV
Copy link
Author

MangoIV commented Feb 12, 2023

I yesterday built GHC 9.4.4 with this change, I've yet to try it out on the clc-hackage though, as it doesn't support ghc 9.4.4 yet (haskell/clc-stackage#6).

@phadej
Copy link

phadej commented Feb 12, 2023

even have monadic versions of some functions that only need Applicative

This is not as simple as that.

  • We keep their names.
  • liftM2 is defined using >>=, which is different than using <$> and <*>; for some Monads that makes a difference.
  • mapM is a method of Traversable, as there can be more efficient implementation (e.g. Vector)

OTOH, e.g. forever was generalized to use Applicative instead of Monad, as having >> as something else than *> is probably not a good idea.


($) :: a -> a serves the purpose of $ has low, right-associative binding precedence, so it sometimes allows parentheses to be omitted as well.

Instead of thinking of f $ ... like "f applied to ...", you may think it as id f (...) (which is f (...)) but in infix form to omit parentheses.

@MangoIV
Copy link
Author

MangoIV commented Feb 12, 2023

As this was requested by @hasufell, here is the actual motivation for this and the reasoning summarized. I will attach this to the original post.

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE TypeFamilies #-}
{-# OPTIONS_GHC -Wall #-}

module Dollar where

import Data.Kind (Type)
import GHC.Exts (Proxy#, TYPE, proxy#)
import Prelude hiding (($))

-- allows @a@ and @b@ to be representation polymorphic
-- doesn't allow the arrow from @a@ to @b@ to be multiplicity
-- polymorphic
($)
  :: forall
    repa
    repb
    (a :: TYPE repa)
    (b :: TYPE repb)
   . (a -> b)
  -> a
  -> b
($) f = f
{-# INLINE ($) #-}

infixr 0 $

-- does allow for a function @a -> b@ to be multiplicity and
-- representation polymorphic, but is itself not multiplicity
-- polymorphic until we have something like @id :: forall m a. a %m -> a@
-- this consideration would go away if we'd implement it like
-- @\x -> x@ (same argument for why to use @($) f = f@ instead of
-- @($) = id@)
dollar :: forall (a :: Type). a -> a
dollar = id
{-# INLINE dollar #-}

infixr 0 `dollar`

-- for a motivation on why this is useful and how I encountered
-- this issue with @($)@, consider the following (more contrived,
-- but less complicated than the original) example

type family TF a

class C a where
  ca :: Proxy# a -> TF a

-- this would err when using the @($)@ provided by Prelude, although
-- it should literally act as  @`id`@
ca' :: forall {proxy} a. C a => proxy a -> TF a
ca' _ = ca $ proxy# @a

https://paste.sr.ht/~mangoiv/6fef7284c59202b63de7f26c962ad605b39f7b7f

@MangoIV
Copy link
Author

MangoIV commented Feb 13, 2023

evil idea to solve @phadej issues - how about additionally proposing to give id an infixr 0?

@treeowl
Copy link

treeowl commented Feb 13, 2023

@MangoIV some people would prefer an application operator that's infixl 0.

@Bodigrim
Copy link
Collaborator

I usually find free-flowing discussions very fruitful, but I know that it can be intimidating / overwhelming for proposers. As a point of order, @MangoIV, you are not obliged to answer every suggestion / generalization / counterproposal.

I yesterday built GHC 9.4.4 with this change, I've yet to try it out on the clc-hackage though, as it doesn't support ghc 9.4.4 yet (haskell/clc-stackage#6).

The latter is not going to change by magic. In the meantime you can backport your changes to ghc-9.2 branch and give it a try.

However, that's not really sufficient motivation, IMO. Do users use existing workarounds (such as re-defining ($)) right now due to this issue? What is the actual value, regardless of "we could have been more general"?

I work a lot with low-level primops and it is very annoying indeed that one cannot use ($) with Int#, Word#, etc. I can vouch that the pain is real.

@treeowl
Copy link

treeowl commented Feb 13, 2023

@Bodigrim , how much pain would this change alleviate? Wouldn't (.) still fail to work with those things?

@hasufell
Copy link
Member

I work a lot with low-level primops and it is very annoying indeed that one cannot use ($) with Int#, Word#, etc. I can vouch that the pain is real.

What do you usually do then? Use parentheses or locally define an operator?

@Bodigrim
Copy link
Collaborator

What do you usually do then? Use parentheses or locally define an operator?

I was too stupid to realise that I can redefine ($) locally in a representation-polymorphic way, so I used parentheses.

@davean
Copy link

davean commented Feb 14, 2023

I often have to code around this issue also. Sadly most of the Haskell 'base' library doesn't work well with this stuff, so my code is often very custom.

I don't think you'll see much code that can use this in the wild until you fix the problems because it makes it very expensive to write in Haskell and end up as a custom parallel micro-ecosystem in my experience.

@Bodigrim
Copy link
Collaborator

@MangoIV how would you like to proceed? Please raise an MR with your preferred version of ($) and complete impact assessment to learn if any packages are broken by increased polymorphism.

@MangoIV
Copy link
Author

MangoIV commented Mar 14, 2023

Hi, I'd like to make an impact assessment, but should I build that against ghc 927 or am I blocked until clc stackage has been upgraded to 944 or 961 respectively? If so, I guess I'd volunteer helping to fix it.

@Bodigrim
Copy link
Collaborator

You can use GHC 9.2.7, I would not expect a material difference for this proposal.

@MangoIV
Copy link
Author

MangoIV commented Mar 15, 2023

I guess this is the "success" screen:

[1 of 1] Compiling Lib              ( src/Lib.hs, /home/mangoiv/Devel/clc-stackage/dist-newstyle/build/x86_64-linux/ghc-9.2.7/clc-stackage-0.1.0.0/build/Lib.o, /home/mangoiv/Devel/clc-stackage/dist-newstyle/build/x86_64-linux/ghc-9.2.7/clc-stackage-0.1.0.0/build/Lib.dyn_o )
gcc: fatal error: cannot execute ‘/nix/store/qm03mh5x2j4ls0y80y8h3zsyj4hb52w2-gcc-11.3.0/libexec/gcc/x86_64-unknown-linux-gnu/11.3.0/collect2’: execv: Argument list too long
compilation terminated.
`cc' failed in phase `Linker'. (Exit code: 1)

How do I proceed?

@angerman
Copy link

I'm sympathetic to this change. My understanding is that this change does not make GHC reject code it accepted so far; as such any real world impact should be (assuming all other changes keep the compiler accepting previously accepted code) fairly obvious in alpha/beta/rcs that include this change. I'd still advise to keep an eye out for reported regressions that could be due to this change (or interactions of this change with others). +1 from me.

@Bodigrim
Copy link
Collaborator

The reason it's slower is because with current ($), an arity-2 unknown call to ($) f x hits the fast path. With proposed ($), an arity-2 unknown call to ($) f x is overapplied, so it has to go through the generic apply function. zipWith ($) that doesn't inline is the only case I can think of where this might happen in practice.

Thanks, that's a very interesting observation, a rare case when STG operational semantics becomes visible (https://gitlab.haskell.org/ghc/ghc/-/wikis/commentary/compiler/generated-code#dealing-with-generic-application). I agree that in order to be observable this effect requires a combination of factors, most importantly something preventing ($) from being inlined, to remain an unknown call. This is very rare in practice, and basically if ($) has not inlined, performance will be wanting either way.

@treeowl
Copy link

treeowl commented Mar 26, 2023

zipWith for lists will almost always inline, yes, but what about similar higher order functions for other things? What happens if you swap in Data.Sequence.Seq or Data.Primitive.Array.Array?

@treeowl
Copy link

treeowl commented Mar 26, 2023

Actually, the Array one probably inlines too, but I doubt the Seq one will.

@chshersh
Copy link
Member

+1 from me


I still think that the existing documentation for ($) is unacceptable. So I made an attempt to improve it by explaining the intention behind this operator with a simple example.

Discussed in GHC #22963

docs for dollar

@tomjaguarpaw
Copy link
Member

-1


This is an absolutely fundamental Haskell operator and I'm very cautious about the risk we run in changing it without having sufficient experience with it. I think it should live in unlifted-base first, and after a year or so we can think about making a change to ($) in base too. Packages that can't depend on base, such as boot packages, can just use a local definition if they really want it.

I also agree with the strong arguments in favour of this change. I think we should eventually do it, if nothing surprising happens when it's used, but I don't think we have sufficient experience with it to take the risk.

@cartazio
Copy link

cartazio commented Mar 27, 2023 via email

@hasufell
Copy link
Member

@tomjaguarpaw I sympathize, but frankly: the Haskell report demands that both definitions are equivalent.

If this causes programs to crash or performance to regress, I'm calling out a bug in GHC.

@treeowl
Copy link

treeowl commented Mar 28, 2023

@hasufell GHC doesn't implement precise Report Haskell and never has. That's not a bug.

@hasufell
Copy link
Member

@hasufell GHC doesn't implement precise Report Haskell and never has. That's not a bug.

I'm aware. And I'd still consider it a bug that will have to be fixed in GHC.

If we can't swap out f $ x = f x with ($) f = f without causing havoc, then our compiler is broken and I'm not sure why I'm writing Haskell anymore.

@treeowl
Copy link

treeowl commented Mar 28, 2023

@hasufell the fact that those are different is actually required by the report.

@hasufell
Copy link
Member

@hasufell the fact that those are different is actually required by the report.

Please point me to it.

@treeowl
Copy link

treeowl commented Mar 28, 2023

@hasufell Polymorphic seq (which is the root cause of the semantic difference between two argument and one argument ($), and which I would blame for the whole problem) is described in section 6.2:

seq is usually introduced to improve performance by avoiding unneeded laziness. Strict datatypes (see Section 4.2.1) are defined in terms of the $! operator. However, the provision of seq has important semantic consequences, because it is available at every type. As a consequence, is not the same as \x -> ⊥, since seq can be used to distinguish them. For the same reason, the existence of seq weakens Haskell’s parametricity properties.

In the context of the current, two-argument ($),

($)  = \y -> 

whereas with a one-argument ($), we have

($)  = 

The Report indicates that seq distinguishes these.

@hasufell
Copy link
Member

Ah, I thought you're referring to 4.5.5, which I found largely irrelevant here.

I think we established that the different strictness properties is something everyone considers impossible to trigger.

I was mainly pointing towards:

  1. if there's a semantic difference beyond bottom, we're screwed
  2. if there's issues with inlining/optimization, I will blame GHC (inlining is not stable anyway... e.g. almost every major GHC release breaks streamly)

@Bodigrim
Copy link
Collaborator

Thanks all, 5 vote in favour out of 7 possible are enough to approve.

@Bodigrim Bodigrim added the approved Approved by CLC vote label Mar 28, 2023
sthagen pushed a commit to sthagen/ghc-ghc that referenced this issue Apr 1, 2023
- this change was approved by the CLC in [1] following a CLC proposal [2]
- make ($) representation polymorphic (adjust the type signature)
- change ($) implementation to allow additional polymorphism
- adjust the haddock of ($) to reflect these changes
- add additional documentation to document these changes
- add changelog entry
- adjust tests (move now succeeding tests and adjust stdout of some
  tests)

[1] haskell/core-libraries-committee#132 (comment)
[2] haskell/core-libraries-committee#132
@andreasabel
Copy link
Member

Please establish uniformity between ($) and (&). They are currently out of sync (base-4.18). So, please apply the same generalization to (&) as you do to ($).

Current status:
Screenshot 2023-04-11 at 16 28 52

@hasufell
Copy link
Member

hasufell commented Apr 11, 2023

Please establish uniformity between ($) and (&). They are currently out of sync (base-4.18). So, please apply the same generalization to (&) as you do to ($).

Can you create a proposal?

@treeowl
Copy link

treeowl commented Apr 11, 2023

@andreasabel I believe that's impossible with current GHC. There's no way to avoid a representation polymorphic binding in the & you seem to suggest.

@cartazio
Copy link

cartazio commented Apr 11, 2023 via email

@tomjaguarpaw
Copy link
Member

Which analogous type? The type being proposed for $ is a -> a.

@andreasabel
Copy link
Member

I meant that the type of (&) should be as general as ($). Currently, there is a mismatch.
If ($) becomes a -> a, then of course (&) cannot be as general as ($), but it should be as general as ($) was before it became a -> a.

@phadej
Copy link

phadej commented Apr 11, 2023

@tomjaguarpaw Are you sure? You voted on MR which has

-($) :: forall r a (b :: TYPE r). (a -> b) -> a -> b
+($) :: forall repa repb (a :: TYPE repa) (b :: TYPE repb). (a -> b) -> a -> b

We cannot have

(&) :: forall repa repb (a :: TYPE repa) (b :: TYPE repb). a -> (a -> b) -> b
x & f = f x

that would require representation polymorphic code generation.

At best we could have

(&) :: forall r a (b :: TYPE r). a -> (a -> b) -> b
x & f = f x

i.e. analogous to previous type of $.

@andreasabel
Copy link
Member

@tomjaguarpaw
Copy link
Member

Oh whoops, for some reason I thought that the proposal had changed to a -> a (I'm even more opposed to the one we've actually got.)

@Bodigrim Bodigrim added the base-4.19 Implemented in base-4.19 (GHC 9.8) label Jun 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Approved by CLC vote base-4.19 Implemented in base-4.19 (GHC 9.8)
Projects
None yet
Development

No branches or pull requests