Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add documentation for GHC-45510 #505

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

Kleidukos
Copy link

@Kleidukos Kleidukos commented Jun 6, 2024

@s-and-witch would you mind checking that the explanation is correct?

Copy link

@s-and-witch s-and-witch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how to describe this error without going deeply into how GHC works and using "whoa, we introduced this because of implementation reasons".

Perhaps the way forward is to show that illegal examples are syntactically equivalent to legal ones.

introduced: 9.10.1
---

Through using [`RequiredTypeArguments`], Haskell gets support for programming with Dependent Types.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not true. forall-> has the same expressiveness as forall., the only difference between them is that forall-> is required by default and forall. is inferred. Consider this example:

idInvis :: forall a. a -> a
idInvis          @t  x  = x :: t

idVis :: forall a -> a -> a
idVis           t    x =  x :: t

Here, @t in idInvis is an explicit form of accepting invisible type argument, while t in idVis should be explicit. However, you can write both examples in implicit form:

idInvis :: forall a. a -> a
idInvis              x  = x

idVis :: forall a -> a -> a
idVis           _    x =  x

The same stands for the call site. These are explicit type applications:

f :: Int -> Int
f = idInvis @Int

g :: Int -> Int
g = idVis Int

And these are implicit:

f :: Int -> Int
f = idInvis

g :: Int -> Int
g = idVis _

Summary: you may opt in inference for forall-> while this is an "opt-out" option for forall., and this is the only difference between them.

Since we don't call forall. dependent, let's not use this word for forall-> either, because they are equivalent.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a much more solid explanation of why forall-> doesn't add dependent types to the Haskell in GHC User's Guide: https://downloads.haskell.org/ghc/latest/docs/users_guide/exts/required_type_arguments.html

With terminology, examples and so on.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think the terminology can be confusing here. Both of these statements are true: (1) forall. and forall -> are dependent quantifiers, and (2) forall a. t and forall a -> t are types. But if you try to combine these facts and say that these are dependent types, then it's no longer correct.

RequiredTypeArguments's visible dependent quantification forall a -> t is a step towards dependent types when it comes to syntax, but its semantics coincide with the old-fashioned forall a. t.

@@ -0,0 +1,19 @@
---
title: Term variables cannot be used here
summary: A variable defined at the level of terms cannot be directly used in type or kind signatures

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cannot be directly used

And nor indirectly! You just have no option to use term-level variable at the type level.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, GHC is not yet capable of promoting term variables.

---

Through using [`RequiredTypeArguments`], Haskell gets support for programming with Dependent Types.
As such, the variables at the level of terms exist at the level of types.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, they aren't, however we can't now syntactically distinguish

f1 :: Int -> a -> a
f1 t a = a :: t

and

f2 :: forall a -> a -> a
f2 t a = a :: t

Notice that f t a = a :: t is the same for both of the examples, so we can't say that f1 is malformed code while f2 is totally OK, until we get type signature into consideration.

So, we have to accept the presence of term-level variables at the type-level by the renamer, but we reject them later, at type checking (thus, producing the error that we want to describe here).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Is there a better way to rephrase this without having to explain internal workings of GHC too much? From an outsider's perspective what does it really look like?

Copy link

@int-index int-index Jul 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From an outsider's perspective what does it really look like?

Right, so we definitely shouldn't talk about the GHC renamer, but the user still needs to understand what's going on. It's not just implementation details: it's all observable in the user-facing language.

There are several topics that need to be covered: erasure, namespaces, and implicit quantification. I know it's a lot, but we need to sort this out if we want to minimize confusion and misunderstanding.

Erasure

In Haskell, every variable is either erased or retained. Let's say I define this type:

newtype Phantom (n :: Nat) = MkPhantom Int

I can then call its constructor as MkPhantom @50 42, passing two arguments two it: 50 and 42. But only 42 will be actually stored at runtime, 50 is erased.

We would avoid a great deal of problems if we called those erased arguments and retained arguments. But for historical reasons, we call them type arguments and term arguments (even though 50 is clearly not a type, it is a number).

The same applies to function declarations. Consider:

f :: forall a. a -> ()
f = \x -> ()

The forall-bound variable a is erased, while the lambda-bound variable x is retained. We could call them an erased variable and a retained variable, but we say type variable and term variable. And we say that even when the erased variable does not stand for a type

g :: forall (n :: Nat). Proxy n -> ()
g _ = ()

We still say that n is a type variable, even though n is not a type, it's a number!

We say "type" when we mean "erased", and we say "term" when we mean "retained". Ideally, when DataKinds was added to the language, we should've adjusted how we talk about variables, but we didn't, and now this is the terminology we're stuck with. But it gets worse.

Namespaces

We say that in Haskell we have the "type" namespace and the "term" (or "data") namespace. With RequiredTypeArguments, those are misnomers. Yes we have two namespaces, but those are not really in direct correspondence with types/terms. We might as well talk about "Namespace A" and "Namespace B", or, for ease of visual representation, "Blue Variables" and "Red Variables".

If you look at any piece of Haskell code, there are specific rules for coloring each variable red or blue. Example:

image

The rules are roughly as follows:

  • At binding sites a variable is colored blue in type declarations type T a = ..., in universal quantification forall a. ..., and in type abstractions f @a = .... It is colored red if it's bound in a lambda \a -> ... or simply on the LHS of a function equation f a = ....
  • At use sites we prefer to select blue variables if we are in a type declaration, type annotation, or a type application, otherwise we prefer red variables. I say "prefer" because if a variable of the expected color is not in scope, there are also fallback rules for selecting the other color (which vary depending on the enabled extensions, God help us).

Historically, it used to be the case that all red-colored variables were retained and all blue-colored variables were erased, so it (kind of) made sense to call them term variables and type variables. But this isn't the case anymore. Consider:

idVis :: forall a -> a -> a
idVis t x = x :: t

According to our rules, a is blue-colored, while t and x are red-colored:

image

But from the erasure perspective, t and a both stand for the erased argument of idVis, while x stands for the retained argument!

We are in trouble if we want to stick to our old terminology "type variable" and "term variable", it is inadequate, it no longer describes the reality of Haskell. It's "blue" and "red" from now on, at least within the scope of this GitHub comment :-)

Implicit quantification

Finally, to understand why GHC-45510 occurs, we need to remember how implicit quantification works in Haskell.

Normally, any variable you use must be explicitly brought into scope. So e.g. f = \x -> x is always a valid equation, whereas y = x is only valid if x is already in scope, otherwise you're going to see this error message:

ghci> y = x
<interactive>:1:5: error: [GHC-88464]
    Variable not in scope: x

Common sense, right? Well, in type signatures common sense does not apply. Let's say I write this:

f :: a -> a

What is this a? Where is it coming from? If we used the same logic as y = x, we'd expect to see this error message

<interactive>:1:5: error: [GHC-88464]
    Variable not in scope: a

but instead it is assumed that you meant to implicitly quantify over a with a forall and didn't do it for reasons of brevity, so what you actually meant must've been this:

f :: forall a. a -> a

So the language is designed to guess when you omitted a forall, and when such a guess takes place, it's called implicit quantification. An important aspect of this is that only out-of-scope variables are implicitly quantified. Consider this example:

{-# LANGUAGE ScopedTypeVariables #-}
f :: forall a. Num a => a -> a
f x = g (x+1)
  where
    g :: a -> a
    g = (*2)

This is a type-correct definition, where the type signature g :: a -> a refers to an a that is already in scope. It is bound by f :: forall a. via ScopedTypeVariables. If we ignored this fact, implicit quantification would get overzealous. It would replace g :: a -> a with g :: forall a. a -> a, making the above declaration incorrect.

So the rule is: only out-of-scope variables are implicitly quantified

Very important. If the variable is already in scope, just use it!

But wait, something doesn't add up... there's one last piece to this puzzle

Colorblind quantification

We have just seen how out-of-scope variables are implicitly quantified, whereas in-scope variables aren't. Now consider this example:

a = 42

f :: a -> a
f = \x -> x

Is the name a in scope? Of course, it's defined right there: a = 42. But is it implicitly quantified in f :: a -> a? According to the rule outlined above, it shouldn't be, because in-scope variables are used as they are. And this is exactly what happens with RequiredTypeArguments. In f :: a -> a, the a refers to the one declared in a = 42. However, it is not actually valid to use it there because f :: 42 -> 42 isn't exactly a reasonable type signature, so we get an error

Test.hs:5:6: error: [GHC-45510]
    • Term variable ‘a’ cannot be used here
        (term variables cannot be promoted)
    • In the type signature: f :: a -> a
  |
5 | f :: a -> a
  |      ^

(Note the error code: this is the error you're documenting in this PR)

The weird thing happens when we don't enable RequiredTypeArguments. Without the extension, somehow the a is implicitly quantified anyway and this program is accepted.

So it turns out that by default, the implicit quantification rule only considers blue-colored variables to be in scope for the purposes of implicit quantification. Based on the declaration a = 42, the a is a red-colored variable and implicit quantification pretends it doesn't exist. RequiredTypeArguments fixes this.

With RequiredTypeArguments, we don't discriminate based on color :-)

Through using [`RequiredTypeArguments`], Haskell gets support for programming with Dependent Types.
As such, the variables at the level of terms exist at the level of types.

If you mistakenly refer to an entity that already exists as a term in your kind signature, you will get

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's cover more contexts in the examples, because the explanation mentions a lot of kind signatures, while the error could be produced by every type construct. That's what internal GHC documentation says about the error:

  As a side effect of demotion, the renamer accepts all these examples:
    t = True         -- Ordinary term-level binding
    x = Proxy @t     -- (1) Bad usage in a HsExpr
    type T = t       -- (2) Bad usage in a TyClDecl
    f :: t -> t      -- (3) Bad usage in a SigDecl

  However, GHC doesn't promote arbitrary terms to types. See the "T2T-Mapping"
  section of GHC Proposal #281: "In the type checking environment, the variable
  must stand for a type variable". Even though the renamer accepts these
  constructs, the type checker has to reject the uses of `t` shown above.

  All three examples are rejected with the `TermVariablePE` promotion error.

The (3) can be fixed by forall t. while the others have no such option and their fix depends on what the user actually wanted to express.

@int-index
Copy link

This discussion prompted me to submit ghc-proposals/ghc-proposals#656. Check out the updated proposal text and examples, I think it explains quite clearly what's actually happening.

@int-index
Copy link

int-index commented Jun 7, 2024

Whatever explanation we end up with for GHC-45510, the main ideas that need to be communicated are

  • only out-of-scope type variables are implicitly quantified
  • with RequiredTypeArguments, all variables are in scope in type/kind signatures, regardless of their namespace
  • but the namespace of a variable is decoupled from whether it's actually a type or term variable, so sometimes we end up with term variables accidentally used in type/kind-level contexts, which triggers GHC-45510

@Kleidukos
Copy link
Author

Thanks a lot @s-and-witch and @int-index

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants