Skip to content
This repository has been archived by the owner on Jun 5, 2023. It is now read-only.

Allow Typeclasses to Declare Themselves Coherent #4

Open
odersky opened this issue Mar 3, 2017 · 101 comments
Open

Allow Typeclasses to Declare Themselves Coherent #4

odersky opened this issue Mar 3, 2017 · 101 comments

Comments

@odersky
Copy link

odersky commented Mar 3, 2017

Motivation

Unlike Haskell, Scala allows multiple implicit instances of a type. This makes implicits cover a lot more use cases. Besides type classes, implicits can also express

  • capabilities
  • configurations, which may be locally overridden
  • contexts in their full generality

When used for implementing typeclasses, implicits support different implementations of a type class in different modules. For instance, Boolean could implement Monoid with true and & in one part of a program and with false and | in another.

By contrast, Haskell's type classes have a coherence requirement which states that a type can implement a type class only in one way anywhere in a program. Coherence restricts what one can do with type classes. It feels a bit anti-modular since it establishes a condition for "a whole program" (in Scala or Java one would not even know what that is!). But it also solves two problems of Scala's unrestricted use of implicits.

The first problem is that one can accidentally mix two different implicit implementations where this is non-sensical. The classic example is a priority queue which requires an Ordered element type. If one declares priority queue like this:

class PriorityQueue[T: Ordered] { 
   ...
   def union(other: PriorityQueue[T]): PriorityQueue[T]
}

there's no way to guarantee that the two queues in a union operation agree on the ordering of type T. This can lead to nonsensical results. I believe this is a bit of a red herring however, as it is perfectly possible to design data structures that do not suffer from that problem by using some nesting (or functorizing, to borrow some ML terminology). For priority queues, Chris Okasaki's scads is a nice solution, which generalizes readily. In essence, all that's needed is to move the type parameter T one level further out:

class PriorityQueues[T: Ordered] {
  class Queue {
    ...
    def union(other: Queue): Queue
  }
}

So far, so good. But there's also a second problem which is in a sense the dual of the first. The first problem was "how do we prevent some programs from compiling when implementations are not coherent"? (i.e. we have multiple implementations of the same type class). The second problem is "how do we get some programs to compile when implementations are coherent"? In Scala, implicit resolution may lead to ambiguities, which are compile-time errors. But in a coherent world, ambiguities do not matter: picking any one of multiple applicable alternatives would give the same observable result. So if we know that all implementations of some type are coherent, we could suppress all ambiguity errors related to implicit searches of that type.

This is a real issue in practice, which mars attempts to implement in Scala typeclass hierarchies that are standard in Haskell. #2029 is a good starting point to explore the issue. Here's a somewhat more figurative explanation of the problem: Say you want to model capabilities representing driving permits as implicit values. There's a basic capability "has drivers license", call it DL. There are two more powerful capabilities "has truck driver's license" (TL) and "has cab license" (CL). These would be modelled as subtypes of the "drivers license" type. Now, take a function that requires the capabilities to drive a truck and to drive a cab, i.e. it takes two implicit parameters of type TL and CL. The body of the function needs at some point the capability of driving a car (DL). In the real world, this would be no problem since either of the function's capabilities TL and CL imply the capability DL. But in current Scala, you'd get an ambiguity, precisely because both implicit parameters of types TL and CL match the required type DL. In actual Scala code, this is expressed as follows:

trait DL
trait TL extends DL
trait CL extends DL

object Driving {
  def drive(implicit dl: DL) = ()
  def f(implicit tl: TL, cl: CL) = drive
}

Compiling this results in an ambiguity error:

-- Error: coherent.scala -------------------------------------------------------
33 |  def f(implicit tl: TL, cl: CL) = drive
   |                                        ^
   |ambiguous implicits: both value tl and value cl match type DL of parameter dl of method drive in object Driving

#2029 and @adelbertc's note contain discussions of possible workarounds. One workaround uses aggregation instead of subtyping. The other tries to bundle competing implicit parameters in a single implicit in order to avoid ambiguities. Neither approach feels both simple and scalable in a modular fashion.

Proposal

The proposal is to introduce a new marker trait scala.typeclass.Coherent.

package scala.typeclass
trait Coherent

If a type T extends directly or indirectly the Coherent trait, implicit searches for instances of type T will never issue ambiguity errors. Instead, if there are several implicit values of type T (where no value is more specific than any other) an arbitrary value among them will be chosen.

For instance, the Driving example above can be made to compile by changing the definition of DL to

trait CanDrive extends scala.typeclass.Coherent

Soundness

The rule for dealing with coherence makes some implicit selections unspecified. A global coherence condition is needed to ensure that such underspecifications cannot lead to undetermined behavior of a program as its whole. Essentially, we need to ensure that if there is more than one implicit value implementing a coherent type T, all such values are indistinguishable from each other. (In)distinguishability depends on the operations performed by the program on the implicit values. For instance, if the program compares implicit values using reference equality (eq), it can always distinguish them. One good rule to follow then is to avoid calling eq, as well as all other operations defined in AnyRef on values of coherent types. Furthermore, one needs to also ensure that all members of a coherent type are implemented in the same way in all implicit instances of that type.

It's conceivable that sufficient coherence conditions can be established and checked by a tool on a project-wide basis. This is left for future exploration, however. The present proposal defers to the programmer to check that coherence conditions are met.

Mixing coherent and normal abstractions

Sometimes one has conflicting requirements whether a type should be coherent or not. Coherency avoids ambiguity errors but it also prohibits possibly useful alternative implementations of a type. Fortunately, there's a way to have your cake and eat it too: One can use two types - a non-coherent supertype together with a coherent subtype.

For instance one could have a standard Monoid trait that admits multiple implementations:

package kernel
trait Monoid[T] {
   def empty: T
   def combine(x: T, y: T): Monoid[T]
}

One can then declare a coherent subtrait:

object Canonical {
   trait Monoid[T] extends kernel.Monoid[T] with Coherent
}

One could also several coherency domains by using a class instead of an object for Canonical and instantiating that class for each domain.

Alternatives

@djspiewak has proposed a different approach to coherence based on parameterizing implicit definitions with coherency domains. #2029 contains a discussion of this approach in relation to the present proposal.

Implementation Status

The proposal is implemented in #2046.

@rkuhn
Copy link

rkuhn commented Mar 3, 2017

This looks like a good solution for capabilities. It would be good, though, to document the factorization approach prominently such that nobody is tempted to make Ordering coherent, otherwise we’ll soon need ReverseInt or all sorts of reverseX functions on ordered collections.

@alexandru
Copy link

alexandru commented Mar 3, 2017

I don't believe this would be a good thing. If a type-class has a coherence requirement, I would expect the compiler to throw an error or at least a warning instead of picking arbitrarily.

We already have big problems with implicit resolution where pieces of code change behaviour depending on the order of imports. Dealing with @Coherent type-class instances would make this problem worse, especially because it's going to be an easy way to fix compiler ambiguity errors.

You folks talked of Monoid and Ordering that shouldn't come with a Coherent flag, but Applicative and Monad can have the same problem as well.

An Applicative in the context of effect triggering data types, like IO or Task, can decide how map2 and ap behaves - does it do parallelism or not? This is why in Monix (and I believe in Scalaz as well) one has to do a special import in order to get parallel execution by means of Applicative, because parallelism is expensive and isn't described by Applicative's laws - all Applicative specifies by map2 and ap is data independence, not how things get evaluated:

import monix.eval.Task.nondeterminism

// Now it does parallel execution
Applicative[Task].map2(task1, task2)(_ + _)

But if Applicative ends up being Coherent, well, this will no longer work, because the compiler will be totally silent about conflicts. To make it worse, the implementation of mapBoth that powers this parallel Applicative is actually pretty complicated, since it has to deal with synchronisation in a multi-threaded environment. I can see the need for users to override how this operation behaves for their own purposes, which I thought is one of the perks of going with type-class based polymorphism in Scala.

One proposed alternative, mentioned by @adelbertc, doesn't get enough attention: not using subtype relationships, but use composition instead. I first saw this idea in Scato by @aloiscochard, which is the design that's being pushed in Scalaz 8. I have also used a variation of it in the type-class hierarchy described in monix-types, which can be seen in action in a stream-like type, a work in progress making use of higher-kinded polymorphism (see implementation).

Of course, there are legitimate concerns against this approach. One of them is that you'd have to explicitly import the relationships in scope:

import monix.types._
import monix.types.syntax._

def eval[F[_], A](a: => A)(implicit F: Monad[F]): F[A] = {
  import F.{applicative => A}
  A.pure(()).flatMap(_ => A.pure(a))
}

def filteredEvalSomething[F[_],A](a: => A, f: A => Boolean)
  (implicit MF: MonadFilter[F], ME: MonadError[F,Throwable]) = {

  import MF.{applicative => A, monad => M}
  A.pure(())
    .flatMap(_ => try A.pure(a) catch { case NonFatal(ex) => ME.raiseError(ex) })  
    .filter(f)
}

It's annoying for sure, but then working with higher-kinded polymorphism is annoying too. In my opinion this proposal would make MTL-style code easier but at the expense of use-cases that Scala already handles well.

@djspiewak
Copy link

I don't believe this would be a good thing. If a type-class has a coherence requirement, I would expect the compiler to throw an error or at least a warning instead of picking arbitrarily.

If it does this, then MTL is impossible again. Motivating example:

def foo[F[_]](implicit MS: MonadState[F, String], MO: MonadOption[F]) =
  Monad[F] pure 42

There's a clear ambiguity here. The goal is to get rid of the compile error, not add more of them.

An Applicative in the context of effect triggering data types, like IO or Task, can decide how map2 and ap behaves - does it do parallelism or not? This is why in Monix (and I believe in Scalaz as well) one has to do a special import in order to get parallel execution by means of Applicative, because parallelism is expensive and isn't described by Applicative's laws - all Applicative specifies by map2 and ap is data independence, not how things get evaluated

Whether or not parallel Applicative is valid is a longer discussion with a lot of nuance, and I subjectively don't think this is a valid application of incoherence. But it's clearly an application of incoherence, and one which should be resolvable in the same way that Monoid is. I'm not sure I see why that encoding doesn't apply.

One proposed alternative, mentioned by @adelbertc, doesn't get enough attention: not using subtype relationships, but use composition instead.

There's a lot of problems with Scalaz 8's encoding. It solves many, many problems, but the encoding is quite bulky, the type errors are unclear, and as you said it comes with other use-site problems which are even more annoying. Also, fundamentally, it doesn't really solve the problem of incoherence in so much as it encodes a type level semi-lattice which may be used to resolve conflicts.

@puffnfresh
Copy link

It's not just MTL, this happens with simple stuff like:

trait Functor[F[_]]
trait Bind[F[_]] extends Functor[F]
trait Apply[F[_]] extends Functor[F]
trait Foldable[F[_]]
trait Traverse[F[_]] extends Foldable[F] with Functor[F]

case class Id[A](a: A)

object Id {
  implicit val IdTraverse: Traverse[Id] = new Traverse[Id] {}
  implicit val IdApply: Apply[Id] = new Apply[Id] {}
}

object Example {
  def example[F[_]](implicit T: Traverse[F], A: Apply[F]): Int = {
    println(implicitly[Functor[F]])
    0
  }

  example[Id]
}

Adding the Coherent typeclass to Functor allows the above to compile. That's very useful.

It doesn't solve all of the problems with coherency, it's still possible to pass dictionaries explicitly. It's still possible to make a higher priority implicit:

trait Default[A] { def value: A }
object Default {
  implicit val IntDefault: Default[Int] = new Default[Int] {
    val value = 0
  }
}

object Example {
  implicit val MyDefaultInt: Default[Int] = new Default[Int] {
    val value = 1
  }

  println(implicitly[Default[Int]].value) // prints 1
}

So I wouldn't call this the "Coherent typeclass" but instead the "ignore ambiguities annotation" - this is useful, I would use it 👍

Using implicits for things like changing evaluation order causes problems with refactoring, which is the usual argument for coherency. The best way to handle that situation is to create a new type.

:shipit:

@djspiewak
Copy link

djspiewak commented Mar 3, 2017

So I wouldn't call this the "Coherent typeclass" but instead the "ignore ambiguities annotation" - this is useful, I would use it 👍

Absent a more involved language revision to allow constructively coherent definitions (Miles has some ideas on this front), I think this is by far the best approach we could have. And I would very much use this as well.

Using implicits for things like changing evaluation order causes problems with refactoring, which is the usual argument for coherency. The best way to handle that situation is to create a new type.

I'm generally not in favor of newtyping just to get different implicit semantics, but I 100% agree that it's the most appropriate solution here. (edit: where by "here" I mean w.r.t. the Applicative example)

@adelbertc
Copy link

There's a lot of problems with Scalaz 8's encoding. It solves many, many problems, but the encoding is quite bulky, the type errors are unclear, and as you said it comes with other use-site problems which are even more annoying. Also, fundamentally, it doesn't really solve the problem of incoherence in so much as it encodes a type level semi-lattice which may be used to resolve conflicts.

It's certainly bulky, but only because it has to work with what is currently given in Scala. If you imagine a hypothetical language extension, similar to what we have here with Coherent, maybe we can come up with another proposal that achieves the same thing modulo the syntactic bulk. Something that gets you the said semi-lattice of type classes.

Though perhaps this is what Coherent solves.

@nafg
Copy link

nafg commented Mar 5, 2017

I'm guessing the plan is that in the coherent case, where it will not error on ambiguous Implicits, instead there will be some well-defined (if objectively arbitrary) and specified set of rules used to select the implicit?

If yes, that might mitigate the "what if I want to override an implicit that a library shortsightedly marked coherent" concern...

@adelbertc
Copy link

One concern I have about this approach is it seems to be an ad-hoc mechanism nailed on top of the existing implicit resolver. Maybe such a concern is ill-founded, this is just my intuition speaking at this point.

@twanvanderschoot
Copy link

Though it seems an interesting and promising concept, relying on a programmer's discipline to check the conditions for coherence is not scalable; managing programmer team discipline over organisational change, scope an time is extremely difficult. Without embedding this check into the compiler, I strongly believe this will increase the problems with implicits even more!

I inherited a Scala code base of about 5 years old with several "historical" layers of maturity.
The biggest issue is the abuse of implicits combined with an immature type hierarchy (too deep, too much generic types) , effectively making it a "go to". That is, when locally inspecting code, it is very difficult to find the actual conversion or implicit used: It has become the new global.

Nor the IntelliJ Scala plugin or the Scala Eclipse version are able to offer reliable support; they can't handle the depth of the implied type proofs and either halt or crash or misidentify the culprit. Only compiling all code can "prove" what happens.

I do see a use for implicits, but it is too easy to abuse them. Therefore any proposal extending their use not accompanied with compiler support enforcing the rule of application will invariably incur heaps of technical debt.

@ritschwumm
Copy link

i recently run into problems using cats where i imported the same syntax extension (via an implicit class) over two different inheritance paths and had a hard time figuring out why the compiler told me the extension method does not exist.

i think those two problems are related: the compiler does not know two things (a parent typeclass or an extension) are actually the same, codewise. it might make more sense to attack the problem at the point where the confusion occurs instead of where it originates from: in the functor/traversable/monad example it might make more sense to attach the information that they are inheriting "the same" functor to travesable and monad, instead telling the compiler that whereever the functor occurs it's probably the same one.

@caente
Copy link

caente commented Mar 5, 2017

perhaps this has been discussed before, and discarded, http://docs.scala-lang.org/sips/pending/trait-parameters.html, but what about using implicit arguments on the trait to achieve composition of typeclasses?

trait DL
trait TL(implicit dl : DL)
trait CL(implicit dl : DL)

object Driving {
  def drive(implicit dl: DL) = ()
  def f(implicit tl: TL, cl: CL) = drive // you'll have to pass dl "explicitly" as an implicit, if you don't the compiler will ask for it
}

so, instead of

trait Bind[F[_]] extends Functor[F]
trait Apply[F[_]] extends Functor[F]

we could do

trait Bind[F[_] : Functor]{
  // now F can use fmap
}

trait Apply[F[_] : Functor]{
  // now F can use fmap
}

but this means that at the call site we would need to do this:

def foo[F](implicit b:Bind[F], a:Apply[F], f:Functor[F])

but also means that there will be no ambiguity as what Functor implementation is being used

the only downside I see, besides that I don't know what would take to make this happen, is that we would need to provide implicits for all the typeclasses required, but in a way I think it adds to clarity

@odersky
Copy link
Author

odersky commented Mar 5, 2017

@twanvanderschoot

Though it seems an interesting and promising concept, relying on a programmer's discipline to check the conditions for coherence is not scalable; managing programmer team discipline over organisational change, scope an time is extremely difficult. Without embedding this check into the compiler, I strongly believe this will increase the problems with implicits even more!

I agree it is a worry. Technically the check cannot be put into the compiler, since a compiler has no knowledge what program is. Maybe we can put it into a build tool. But that's a long way off, for sure.

I inherited a Scala code base of about 5 years old with several "historical" layers of maturity.
The biggest issue is the abuse of implicits combined with an immature type hierarchy (too deep, too much generic types) , effectively making it a "go to". That is, when locally inspecting code, it is very difficult to find the actual conversion or implicit used: It has become the new global.

It would be interesting to know more about this. Implicit parameters are the core of Scala. We cannot simply suppress them and the recent addition of implicit function types has made them even more powerful than before. Implicit conversions we could and probably should restrict further. So, to make progress in the field it would be really important to learn specifics of past mistakes. This would help us develop patterns or maybe even compiler-checked rules that avoid similar mistakes in the future.

@Atry
Copy link

Atry commented Mar 6, 2017

Looks like shapeless's cachedImplicit

@DavidGregory084
Copy link

It would be interesting to know more about this. Implicit parameters are the core of Scala. We cannot simply suppress them and the recent addition of implicit function types has made them even more powerful than before. Implicit conversions we could and probably should restrict further. So, to make progress in the field it would be really important to learn specifics of past mistakes. This would help us develop patterns or maybe even compiler-checked rules that avoid similar mistakes in the future.

@odersky @twanvanderschoot

I think you are right that implicits are at the core of the Scala language, but implicit resolution is definitely an area in which the compiler could provide more help - this does not necessarily have to be in the form of static checks.

Consider that the tool that scalac offers currently is the incredibly verbose output of -Xlog-implicits, which cannot be targeted to explain a specific compilation error and which was written primarily for those with a passing knowledge of the compiler internals (for obvious reasons).

Perhaps compiler plugins like tek/splain and Scala Clippy could be used as inspiration for more concise and user-friendly output in Dotty, or even a slightly more graphical output in keeping with the beautiful new compilation error messages?

However, since this is veering quite a long way off topic I think it would probably be better to discuss this further on https://contributors.scala-lang.org if there is some interest in doing so?

@odersky
Copy link
Author

odersky commented Mar 6, 2017

Yes, let's take discussion about implicit best practices to https://contributors.scala-lang.org.
It's best to keep this issue focussed on coherence.

@odersky
Copy link
Author

odersky commented Mar 7, 2017

What I took away from the very good discussion here (and also on the Scala reddit) is that an unchecked Coherent is probably just too tempting to use as a quick fix for ambiguity problems. So it would likely be abused.

I now think that we should try to solve the question first how to check that usage of a typeclass is actually coherent. To be decidable by the regular compiler, the check has to be quite restrictive. For instance, we'd almost certainly need a rule that implicit definitions that make a type T an instance of the coherent type class C have to be defined in the same compilation unit as either T or C. But I think a restrictive verification of coherence is better than opening the floodgates and let every type class be coherent by declaration.

@smarter
Copy link
Member

smarter commented Mar 7, 2017

For instance, we'd almost certainly need a rule that implicit definitions that make a type T an instance of the coherent type class C have to be defined in the same compilation unit as either T or C.

I like this. It's very reminiscent of GHC orphan instance warning (See https://wiki.haskell.org/Orphan_instance, https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/separate_compilation.html#orphan-modules-and-instance-declarations)

@alexandru
Copy link

alexandru commented Mar 7, 2017

What I took away from the very good discussion here (and also on the Scala reddit) is that an unchecked Coherent is probably just too tempting to use as a quick fix for ambiguity problems. So it would likely be abused.

That's my primary concern expressed in my comment above.

For instance, we'd almost certainly need a rule that implicit definitions that make a type T an instance of the coherent type class C have to be defined in the same compilation unit as either T or C.

I also like this rule very much.

I can think of problems with such restrictions though, for example the reason for why I have a monix-types in my project, along with why @djspiewak's Shims project exists, is because in the community some of us like to provide compatibility for both Cats and Scalaz 😕

So what happens is that if you have defined a shims.Monad[T], it can be converted to either a scalaz.Monad[T] or to a cats.Monad[T]. And people have fine grained control over the dependencies, so you don't necessarily have to import Scalaz or Cats into your classpath, if you don't want it. But with such a restriction then I don't think such conversions work anymore. And dependencies on Cats and Scalaz can no longer be optional.

Of course, using shims is far from ideal, it sucks actually and this isn't a problem that Haskell has AFAIK. Just mentioning this for awareness.

@djspiewak
Copy link

djspiewak commented Mar 7, 2017

@odersky A restriction to only allowing coherence when the instances are defined within the same compilation unit is going to make something like this almost impractically narrowly applicable. The whole point of typeclasses is that they're an open extension mechanism, but that benefit is completely lost if you're trying to achieve same-unit coherence at the same time. In practice, I'm relatively confident that even frameworks which highly value coherence (like Scalaz or Cats) would simply forego that potential benefit, since the costs would be too high.

I agree that a system which implements actual checks for coherence, rather than functioning as an assertion, would be strongly preferable. But it's important that such a system doesn't come at the expense of all of the other benefits of typeclasses.

Edit: I misread the proposal. See below.

@smarter
Copy link
Member

smarter commented Mar 7, 2017

@djspiewak I think "impractically narrowly applicable" is a bit of an exaggeration, as far as I know, it's considered good practice in the Haskell community to avoid orphan instances.

@djspiewak
Copy link

@smarter Rereading @odersky's proposal, I see that it suggests it might be in the compilation units for either T or C, which is less restrictive than it looked on my phone this morning. That would be a reasonable restriction in most cases, though it would prevent things like a checked-coherent Monad[Option], which seems unfortunate. It would at least allow most common cases.

@smarter
Copy link
Member

smarter commented Mar 7, 2017

though it would prevent things like a checked-coherent Monad[Option], which seems unfortunate

Why couldn't you have the Monad[Option] instance be defined where Monad is defined?

@djspiewak
Copy link

You could, but that would force users to bring that instance in, whether they want it or not. That might be a tradeoff that frameworks would make, but Scalaz decided explicitly against doing that back when Scalaz 7 was designed (Scalaz 6 used to bring in stdlib instances by default), and Cats has held to that design for largely the same reasons.

@smarter
Copy link
Member

smarter commented Mar 7, 2017

If it's in the same compilation unit but in a separate object, it wouldn't force users to bring the instance in, right?

@djspiewak
Copy link

@smarter Actually even better, I suppose it could be defined as a trait which is later mixed into objects elsewhere (e.g. scalaz.std.option). So ultimately, due to inheritance, it only compromises file organization. Which is reasonable.

Actually for that very reason, it sort of surprises me that same-compilation unit would be a sufficient restriction.

@smarter
Copy link
Member

smarter commented Mar 7, 2017

Good point, maybe besides same-compilation unit we also need to require the typeclass instance to be defined in a sealed/final class or an object?

@djspiewak
Copy link

Good point, maybe besides same-compilation unit we also need to require the typeclass instance to be defined in a sealed/final class or an object?

That would be annoying, but wouldn't compromise any of the things Scalaz/Cats want to do with their import organization. In fact, we might actually get to see more use of the package foo { ... } construct, as a way of reorganizing the namespacing of objects stored in the same compilation unit solely to achieve coherence.

@adelbertc
Copy link

adelbertc commented Mar 7, 2017 via email

@DarkDimius
Copy link

DarkDimius commented Mar 8, 2017

@twanvanderschoot

relying on a programmer's discipline to check the conditions for coherence is not scalable

I agree. There is an alternative version of this proposal that moves the coherency decision from definition side of the class to the use-site. This has an advantage of not polluting the entire code-base if programmers aren't disciplined enough, as it will only harm a single method.

This can be achived by defining a magical class Coherent[T](u:T) extends AnyVal that typer will be aware of.
in case implicit search looks for a Coherent[T] it means that any value would suffice. Being a value class, it will be erazed to underlying implicit type. Eg, in the example below if you want to build a red-black tree and you don't care what kind of ordering you get, you can write like this

def makeRBTree[T](implicit a: Coherent[Ordering[T]])

I think this alternative provides a safer mechanism to be used in big code-bases. It will be slightly harder to implement due to more magic involved inside compiler

@djspiewak
Copy link

@DarkDimius I had an idea very similar to this a few months ago. It's actually implementable on current scalac, but it does of course require some significant use of dark magic to manipulate the compiler's internal processes via reflection in a macro, but it is possible. I don't really like it as much as the declaration-site proposals though. And particularly if it's possible to have a checked coherence with just same-file restrictions.

@TomasMikula
Copy link

TomasMikula commented Mar 13, 2017

@djspiewak yup, that's what I mean by "assert". I, the programmer, tell the compiler that they do. If they don't, my problem.

@Blaisorblade
Copy link

@djspiewak Program equivalence is certainly not decidable in general for Turing-complete languages, though many instances are more restrictive. But yeah asserting coherence makes sense—one extra example from @adelbertc is below.

Also, let's keep in mind both scalaz and cats have maintained coherent hierarchies, so there's a strong case (two groups of candidate users) for giving careful programmer escape hatches and trust they manage.

@Blaisorblade Good counter-example. The easiest fix is to indeed demand that the class symbols
of the instance types are different. I wonder whether this is too restrictive?

I suspect so, but instead of showing why you want two instances for type Option[X] let me show two instances for X. I hand-waved an example when I mentioned Functor, let me copy-paste a proper version from @adelbertc's post (http://typelevel.org/blog/2016/09/30/subtype-typeclasses.html). BTW note that while the given implementations are not coherent, for law-abiding instances the methods will behave the same, and we certainly don't expect any automated tool to grok by itself the category theory required for the equivalence proof.

BTW, code in this spirit requires OverlappingInstances in Haskell (though maybe not this snippet).

trait FunctorConversions1 {
  implicit def applicativeIsFunctor[F[_]: Applicative]: Functor[F] =
    implicitly[Applicative[F]].functor
}

trait FunctorConversions0 extends FunctorConversions1 {
  implicit def traverseIsFunctor[F[_]: Traverse]: Functor[F] =
    implicitly[Traverse[F]].functor
}

object Prelude extends FunctorConversions0

@odersky
Copy link
Author

odersky commented Mar 14, 2017

@Blaisorblade I have updated the pseudo-code to include a more refined inequality test of instances.

@Blaisorblade
Copy link

Blaisorblade commented Mar 14, 2017

@odersky Thanks — I agree the test should be disjoint.

I still have the same quibbles on how disjoint is implemented, with the same counterexample (see details below), unless disjoint caller ensures some strong precondition I miss.

But we don't need to reinvent disjoint! The pattern matcher needs a good implementation of disjointness (ping @smarter)?

Here, for soundness, we want disjoint(T1, T2) = true if there exists no fully concrete type T (with no leftover type variables) that's an instance of both T1 and T2. The test can be conservative, so spurious false results are allowed, but we want as few as possible to keep expressiveness. The pattern matcher needs the same test to give this error (taken from Scalac 2.11.7):

scala> def f[X](x: Option[X]) = x match { case (y, z) => x}
<console>:10: error: constructor cannot be instantiated to expected type;
 found   : (T1, T2)
 required: Option[?X1] where type ?X1 <: X (this is a GADT skolem)
       def f[X](x: Option[X]) = x match { case (y, z) => x}
                                               ^

In particular, the test in the pattern matcher must be conservative in the same direction, so that Dotty/Scalac only reject matches that can never be triggered. Allowing further matches that could be rejected is okay. Rejecting matches that can be triggered is not OK.


Why this implementation is not good enough:

def norm(T: Type) = 
   "T where every occurrence of a type variable or abstract type X is replaced by
    a fresh, instantiatable type variable Y. If the bounds of X are S..U 
    then the bounds of Y are norm(S)..norm(U)."

def disjoint(T: Type, U: Type) = 
   !(norm(T) <: norm(U) || norm(U) <: norm(T))

The freshening helps a bit (though I wasn't worrying about that yet).
The key problem is that you can't check if two sets A and B are disjoints by checking if either is included in the other, and it still seems you need unification or something similar.

Look again at

class C1[A, B, C] extends Arbitrary[A => (B => C)] { ... }
class C2[A, B, C] extends Arbitrary[(A => B) => C] { ... }

I want disjoint(Arbitrary[A => (B => C)], Arbitrary[(A => B) => C]) = false, but the current impl returns true.
In particular, the freshening by norm makes it equal to disjoint(Arbitrary[X1 => (X2 => X3)], Arbitrary[(Y1 => Y2) => Y3]), and that's good. But while they unify, neither is a subtype of the other.
(OTOH, two occurrences of the abstract type should probably be freshened by the same variable, or the test will be too conservative).

@odersky
Copy link
Author

odersky commented Mar 14, 2017

Here, for soundness, we want disjoint(T1, T2) = true if there exists no fully concrete type T (with no leftover type variables) that's an instance of both T1 and T2.

This would be draconian, as it would mean that even implementing a coherent trait with two different trait arguments would violate disjointness. After all, you can always define a class that inherits both traits.

But I don't think this would be required anyway. What could the common type of T1 and T2 do that would violate coherence?

@Blaisorblade
Copy link

First: maybe we should split a separate issue for coherence checking—the original proposal did not include it and the topic is non-trivial.

Here, for soundness, we want disjoint(T1, T2) = true if there exists no fully concrete type T (with no leftover type variables) that's an instance of both T1 and T2.

But I don't think this would be required anyway. What could the common type of T1 and T2 do that would violate coherence?

That restriction seems required to enforce coherence, in particular the invariant:

If a, b are instances of a coherent class C and m is a member of C, then a.m and b.m refer at runtime to the same implementation of m

In my example (assuming Arbitrary <: Coherent)

class C1[A, B, C] extends Arbitrary[A => (B => C)] { def f = 1; ... }
class C2[A, B, C] extends Arbitrary[(A => B) => C] { def f = 2;... }

you can have two instances a, b of C = Arbitrary[(A => B) => C => D] (for any specific A, B, C, D), one of C1 and one of C2. Let a = new C1() and b = new C2(), now a.f = 1 and b.f = 2. That's a coherence violation, isn't it?

This would be draconian, as it would mean that even implementing a coherent trait with two different trait arguments would violate disjointness. After all, you can always define a class that inherits both traits.

I'm confused, but if you mean trait Foo extends Arbitrary[T1] with Arbitrary[T2] { methodsFoo }, all methods in methodsFoo would pass check require (CM1 == CM2) || disjoint(instance(D1, B), instance(D2, B)) by passing the CM1 == CM2 part. (But I thought that only works if Arbitrary is covariant and either T1 <: T2; in fact trait T extends Seq[AnyRef] with Seq[String] appears to fail).

But more in general, the needed restriction might well be draconian—but we should not ask for the impossible and look at GHC to understand well the "simple" case. Scare quotes are appropriate—understanding that simple case well enough might take weeks.

In GHC, checking coherence when instances are declared indeed rejects my example and many others, including things which are pretty desirable. I suspect the example with applicativeIsFunctor and traverseIsFunctor would be rejected—indeed, each Haskell Applicative instance needs a boilerplate Functor instance, essentially instance Functor Foo where { fmap = fmapFromApplicative }. (In Scala this could be reduced a bit to implicit def fooIsFunctor = new FunctorFromApplicative[Foo]).
@adelbertc Would you consider such boilerplate acceptable?

I'm no expert there, but a quick look suggests we should check coherence at instance search time. Let's look at Haskell'98 and at OverlappingInstances.

  1. It appears my example would be rejected in Haskell'98—which in fact also rejects
instance context1 => C Int a     where ...  -- (A)
instance context2 => C a   Bool  where ...  -- (B)

see https://downloads.haskell.org/~ghc/8.0.2/docs/html/users_guide/glasgow_exts.html#instance-resolution

I hope we agree that's extremely restrictive.

  1. So let's look at OverlappingInstances (we should in fact look at the OVERLAPPABLE and OVERLAPPING pragmas but I won't describe that here). In that scenario, my example is accepted, but implicit search for Arbitrary[(A => B) => C => D] fails since it actually runs into the ambiguity. Worse, even when no ambiguity is detected, instance search checks if other implicits might become applicable:

[...] find all instances that unify with the target constraint, but do not match it. Such non-candidate instances might match when the target constraint is further instantiated. If all of them are incoherent, the search succeeds, returning the selected candidate; if not, the search fails.

See https://downloads.haskell.org/~ghc/8.0.2/docs/html/users_guide/glasgow_exts.html#instance-overlap.

@odersky
Copy link
Author

odersky commented Mar 14, 2017

you can have two instances a, b of C = Arbitrary[(A => B) => C => D] (for any specific A, B, C, D), one of C1 and one of C2. Let a = new C1() and b = new C2(), now a.f = 1 and b.f = 2. That's a coherence violation, isn't it?

Sure. And it would be covered by the new disjoint rules.

? => (? => ?)    and     (? => ?) => ?    unify.

That's what I intended to convey when I said "instantiatable type variables", but maybe that was not very clear.

@Blaisorblade
Copy link

Ah OK! So we're in vociferous agreement. And then I must read norm(T) <: norm(U) in

def disjoint(T: Type, U: Type) = 
   !(norm(T) <: norm(U) || norm(U) <: norm(T))

as "there is substitution sigma instantiating of type variables in T and U such that sigma T <: sigma U"? I must catch up with Dotty lexicon.

@vladap
Copy link

vladap commented Mar 16, 2017

This is probably very stupid point. But who knows. Spring has very simple feature which is not provided by Scala implicit system. If I have an interface A and I will configure SpringContext to contain multiple implementations which could be provided it fails on startup.

But if I @Autowired constuctor value as of type List[A] than SpringContext will inject List of all possible implementations instead of failing. Typically it is used for plugin systems.

If such feature would be there than developer could declare this way he wants to accepts multiple implementations and decide himself inside method implementation what to do. It would be similar like letting C++ developers to solve inheritance diamond themselves.

There could be special collection like type for it.

It just came to my mind.

EDIT: I think it wouldn't help at all. You are after something else.

@scottcarey
Copy link

scottcarey commented Mar 22, 2017

re: design choices and boilerplate

If given two choices:

  • One solution that takes a little bit more typing to achieve it, maybe a little bit of boilerplate, but is refactoring friendly and is easy to modify or rearrange without introducing painful surprises
  • One solution that is super elegant and concise when writing the code the first time, but is harder to maintain because it is not refactoring friendly and leads to surprises or brittleness when rearranged later

I would choose the first one every time. Code is re-read, re-organized, and re-written more often than it is written the first time, in any non-trivial project. The brittle-ness of various common patterns in Scala is a real problem.

re: ugliness of having to type something like [T <: ParametricAny] all over the place for coherent typeclasses. How about a shortcut to represent that like [*T] ? From what I can tell, these can't have variance annotations anyway.

And now for a related digression: Because Dotty now has union types, and we are discussing Top and Bottom types... Removing Null as a bottom type (or equivalently, NonNullRef) would excite my soul. Why? Well then we can define types where Null is not a valid value. At this time, it is only by convention that null is not used as a value, but now with union types Dotty has the power to track null introduction and elimination, and reason about when a type goes from A | Null to A and vice-versa, but there is no way to express F[A] where A is not null. Other languages have chosen things like A? to represent A | Null, I wouldn't mind that. I suspect that many type-classes would love to place such a restriction on the types they accept. What exactly does Monoid[Option[A]] do when the Option is null anyway? Undefined? None? null? Valid choices can be made, but last I looked scalaz threw a runtime exception failing to pattern match on null. As most scala code is written as if type parameterA really means A that is not null, and we aren't getting NPEs all the time, I suspect that converting the implicit convention of not using null to being an explicit request to use null won't break much code.

This also leads to exciting possibilities like a box-less option (and Either)

// A? is short for A | Null; AnyRef = NonNullRef | Null , etc
class Opt[A <: NonNullRef](a: A?) extends AnyVal {  
  def isEmpty = a == null
  ...
}

An inhabited bottom type may be required in some cases with variant types, and honestly I haven't thought through that. Perhaps in the case where an inhabitable bottom is required, one could just use A | Null instead of requiring that A have an inhabitable bottom?
Dealing with Java inter-op is fairly straight-forward though: If Java returns a String, scala would see that as String | Null unless the javax.annotation.NonNull (or simple) was applied to the method.

So... parting thoughts on a new Top type for the Coherence problem: Since Dotty has Union types, we no longer have to think of all of them as in a pure inheritance hierarchy. For example, Instead of something like:

       Any
        |
  AnyWithMethods 
     /     \
AnyVal      AnyRef

You can have:

  Any   =  AnyWithMethods | ParametricAny  // _horrible_ names

   AnyWithMethods
      /       \
  AnyVal     AnyRef

(the difference is subtle here, but in more complicated cases union types give more flexibility -- e.g. if one adds in "nullability" and "identity" orthogonal concepts, these don't fit into an inheritance tree nicely)

@TomasMikula
Copy link

Regarding parametric top, I think it would be beneficial to have a version of AnyVal without methods as well, i.e. a value class that is not a subtype of AnyObj. It seems to me that such value class would allow to avoid boxing when passed to a generic method (if the underlying type is already a ref type), because there are no methods defined on the value class that need to be preserved.

@Blaisorblade
Copy link

It seems to me that such value class would allow to avoid boxing when passed to a generic method (if the underlying type is already a ref type), because there are no methods defined on the value class that need to be preserved.

I'm afraid I fail to see how.

If you have a generic method f[T](t: T) and don't use specialization, it needs to be compiled to f(Object). Because what else should it compile to? Having methods doesn't matter: Haskell has (more) parametricity and does the same. And specialization in this case already avoids boxing successfully, but requires duplicating object code. This is not even a JVM limitation BTW, it's fundamental to implementing polymorphism. Some unboxed types simply take a different amount of bits. You can check for instance the discussion in Sec. 4.7 of the paper "Implementing lazy functional languages on stock hardware - the Spineless Tagless G-machine". "Types Are Calling Conventions", from 2009, mentions the same point.

For reference, I'm looking again at

       Any
        |
      AnyObj
     /     \
AnyVal      AnyRef

from https://github.com/lampepfl/dotty/issues/2047#issuecomment-286187515, but I don't think this makes a difference.

@TomasMikula
Copy link

If you have a generic method f[T](t: T) and don't use specialization, it needs to be compiled to f(Object).

Yes. The key was "if the underlying type is already a ref type." If I have

class Foo(val s: String) extends AnyVal

then passing Foo to f could just pass the underlying String?

@sjrd
Copy link

sjrd commented Apr 27, 2017

Even if you have no method, you still have to support x.isInstanceOf[Foo].

@TomasMikula
Copy link

Ah, that ruins it 😞

@Blaisorblade
Copy link

@sjrd I think that @TomasMikula's suggestion in fact works, which is cool! isInstanceOf is not accessible on ParametricAny/AnyObj! Unless I miss something and @sjrd shows why, I suggest filing a separate issue.

The key was "if the underlying type is already a ref type."

Ah, I saw that and wondered what that means. Thanks!

Even if you have no method, you still have to support x.isInstanceOf[Foo].

Not entirely... Parametricity forbids isInstanceOf—see Martin above, Scalazzi rules, or Haskell. Below is what I actually mean. I use ParametricAny and variants to be clear on what should be parametric.

class Foo(val s: String) extends ParametricAnyVal
class Foo2(val s: String) extends ParametricAnyVal

new Foo("hi").isInstanceOf[Foo] // I think we *could* allow this without violating parametricity.
//But this example is always `true`.
new Foo("hi").isInstanceOf[Foo2] //allowable, always false.

//But I'm sure `isInstanceOf` *cannot* be a member of `ParametricAnyVal`, because...

def weirdId[T <: ParametricAnyVal](t : T) = { 
  //x.isInstanceOf[Foo] // this must 100% be a compile error to support parametricity
  x
}

def weirdId2[T <: ParametricAny](t : T) = { 
  //x.isInstanceOf[Foo] // this must 100% be a compile error to support parametricity
  x
}

Here, weirdId and weirdId2 are supposed to be parametric. Neither can tell Foo and Foo2 apart. weirdId2 can also be invoked on Strings.

I mean, parametricity specifically means that weirdId2[String](s) == weirdId2[Foo](Foo(s)) == weirdId2[Foo2](Foo2(s)), because wrapping/unwrapping in Foo is an allowed change of representation. Change of representation have to be invisible to weirdId2 for it to be parametric.
In fact, in this program I think you could even identify at runtime Foo and Foo2.

It might still be hard to implement the optimization, especially if new Foo("hi").isInstanceOf[Foo] is allowed. It's much easier if Foo is never generated—like Haskell newtypes.

It's hard to construct examples where isInstanceOf can't be simplified statically—all the ones I can imagine require hierarchies of value classes/traits:

class Foo3(s: String) extends Foo(s) //with ParametricAnyVal //please, still a parametric value class.

def f(x: Foo) = x.isInstanceOf[Foo3] //actually requires a runtime test.

Are such hierarchies allowed for value classes? Do we need them for ParametricAnyVal, or can we forbid them here?

@TomasMikula
Copy link

@Blaisorblade It would be cool if this could be salvaged!

Are such hierarchies allowed for value classes?

All value classes are final, at least up to Scala 2.12.2.

@odersky
Copy link
Author

odersky commented Apr 27, 2017 via email

@Blaisorblade
Copy link

@odersky I was working based on one of your proposals

       Any
        |
      AnyObj
     /     \
AnyVal      AnyRef

from https://github.com/lampepfl/dotty/issues/2047#issuecomment-286187515, extended with some unspecified ParametricAnyVal. But ParametricAnyVal adds no methods to Any, so it could be the same. Maybe you can specify "valueness" with an annotation.

For reference, the constraint from @TomasMikula is

I think it would be beneficial to have a version of AnyVal without methods as well, i.e. a value class that is not a subtype of AnyObj.

@odersky
Copy link
Author

odersky commented Apr 27, 2017 via email

@TomasMikula
Copy link

My idea was this

    (Parametric)Any
               /   \
ParametricAnyVal   AnyObj
               \   /    \
              AnyVal     AnyRef

@neko-kai
Copy link

neko-kai commented Sep 10, 2018

It's strange to me that everyone was so busy with making rules for a global coherence requirement,
that no one has proposed an obvious solution for the original problem statement, ambiguities in implicit search. It's true that with global coherence ambiguities don't matter, but with all the restrictions on modularity and expressivity that coherence imposes, I think we should search for any other possible solution first, before trying to touch coherence.

If we forget about trying to impose global coherence, and come back to the ambiguity question, let's ask – what prevents us from choosing an arbitrary instance in case of ambiguity?
It's only the fact that when we get two opaque implicit values in our implicit search: Monad[F] and Traversable[F], we can't possibly know if the underlying Functor[F] is the same for both.

If we knew, we could pick either of them. But we could just directly check if they're the same!

Let's start with the usual suspects:

trait Functor[F[_]] { def map[A, B](f: A => B): F[B] }
trait Monad[F[_]] extends Functor[F]
trait Traversable[F[_]] extends Functor[F]

trait Maybe[A]
object Maybe {
  trait FunctorImpl extends Functor[Maybe] {
    final def map[A, B](f: A => B): Maybe[B] = ???
  }

  implicit val maybeMonad: Monad[Maybe] = new Monad[Maybe] with FunctorImpl { ??? }
  implicit val maybeTraversable: Traversable[Maybe] = new Traversable[Maybe] with FunctorImpl { ??? }
}

When we try to summon an implicit value, we'll get an ambiguity error.

  implicitly[Functor[Maybe]] // error

We know that both maybeMonad and maybeTraversable are inheriting the implementation of Functor[Maybe] from FunctorImpl, we also know that map in FunctorImpl is final, so neither of them could override it.
We have ample evidence that both of these implementations are equivalent with respect to Functor, but unfortunately, we hid it from the compiler by widening the type back to Monad[Maybe] and removing the mention of FunctorImpl, let's fix that mistake:

  implicit val maybeMonad: Monad[Maybe] with FunctorImpl = new Monad[Maybe] with FunctorImpl { ??? }
  implicit val maybeTraversable: Traversable[Maybe] with FunctorImpl = new Traversable[Maybe] with FunctorImpl { ??? }

Or just omit the signatures and let the compiler infer the narowest type:

  implicit final val maybeMonad = ???
  implicit final val maybeTraversable = ???

Now the compiler, too, has direct unfalsifiable evidence that both of the implementations are the same, it only needs to glance at the type signature to verify it. At this point it can safely pick either instance, knowing that they both point to the same trait.

  implicitly[Functor[Maybe]] // good, all the instances are just FunctorImpl in disguise

In more detail, when encountering an ambiguity over some type T, for each ambiguous instance,
the resolver should try to find the latest base class that implements T with all T methods as final, if all found base classes are the same for every instance, the compiler now has direct evidence that all implementations of T in the search are the same and can pick any of them.
Alternatively, a more exhaustive scheme would be to extract the set of methods of T and verify that for each ambiguous instance, there is a final implementation of each method with the same 'source'. This would allow two methods of T to be finaly implemented by two different traits, but still correctly disambiguated.

For library authors, the scheme would entail two minor changes to their workflow:

  • they would have to specify the narrowest type for their implicit values, instead of widening them
  • they would have to final their methods

For users of implicit there would be no changes at all – they would just suddenly stop getting ambiguity errors! They wouldn't have to specify coherence domains in their implicit summons and they wouldn't lose the ability to declare orphans and local instances.

This scheme does not substitute for coherence in all cases, but it solves the problem of ambiguity in a straightforward and obvious manner without imposing any draconian restrictions on the users.

Implementation: this scheme can be implemented easily today as an implicit macro or as a compiler plugin.

Caveats: If an implicit value is taken and re-assigned to a variable with a widened type, it's disambiguation properties are lost. However, coherent domains exhibit the same properties if an implicit in a domain is re-assigned to a domainless value.

A variant of this scheme can also be implemented if Scala gets first-class delegation, e.g:

  final val functorMaybe: Functor[Maybe] = ???

  implicit val maybeMonad: Monad[Maybe] with (Functor[Maybe] by functorMaybe.type)
  implicit val maybeTraversable: Traversable[Maybe] with (Functor[Maybe] by functorMaybe.type)

Would be resolved because the singleton types which implement the Functor class are proven the same.

Addendum: to prevent circumventing finality with e.g. final def map = realMap; protected def realMap, methods should be considered unambiguous IF they're final AND don't call any non-final/non-private members of this. Singleton-delegation variant can't be circumvented like this

@aloiscochard
Copy link

@Kaishh FWIW, what you just described is basically (delta few integration details with Dotty) the encoding of the TC system in Scalaz8.

@neko-kai
Copy link

neko-kai commented Sep 10, 2018

@aloiscochard
Doesn't Scalaz 8 define relationships between type classes themselves ahead of time?

This scheme applies only to instances, i.e. Monad and Traversable are completely free to be ambiguous on Functor for one type, but coherent for another, depending on whether the instances were implemented through a common ancestor or not.

[not similar, scalaz defines type class relations here. The thing scalaz 8 does vs. ambiguity is erase subtype relations with a newtype, then redefine custom type class relations via implicit conversions, which is different from an error recovery strategy that tries to resolve any subtype relation only in case of ambiguity]

@liufengyun liufengyun transferred this issue from scala/scala3 May 28, 2019
@pjrt
Copy link

pjrt commented Jun 17, 2019

If a type T extends directly or indirectly the Coherent trait, implicit searches for instances of type T will never issue ambiguity errors. Instead, if there are several implicit values of type T (where no value is more specific than any other) an arbitrary value among them will be chosen.

Isn't that he opposite of coherence? Not to mention terrifying (non-deterministic behavior).
Why not just throw the common ambiguous error but, like Haskell, prevent Coherent typeclass instances from:

  1. be not exported (must be public)
  2. be not imported (will always be imported if the user imports the module)
  3. automatically exported once imported

This is how Haskell does it, and seems to work well enough.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests