Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Method return type checks avoid conversion when the type fits #10882

Closed
wants to merge 3 commits into from

Conversation

JaroslavTulach
Copy link
Member

@JaroslavTulach JaroslavTulach commented Aug 23, 2024

Pull Request Description

Discussion about type classes revealed that Enso is able to support Haskell 8 type classes with a single variable. However more complicated type classes (enabled with -XMultiParamTypeClasses and by default on in Haskell 9) are more complicated. Turns out one may do it at the end, but only with intersection types returned from functions and conversion methods. This PR enables such behavior.

Related

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • The documentation has been updated, if necessary.
  • All code follows the
    Scala,
    Java,
  • Unit tests have been written where possible.

@JaroslavTulach
Copy link
Member Author

JaroslavTulach commented Aug 23, 2024

The original Haskell program provided by @radeusgd:

class Connection a where
    get_tables :: a -> [String]
    
data Snowflake_Connection = Snowflake_Connection_Impl -- here we'd have internals of the connection

instance Connection Snowflake_Connection where
    get_tables c = ["some Snowflake table"] -- we'd call JDBC here ofc.
    
get_warehouses :: Snowflake_Connection -> [String]
get_warehouses c = ["SNOWFLAKE_WAREHOUSE"]

database_connect :: Connection c => Connection_Details d c => d -> c
database_connect details = make_connection_from_details details

class Connection_Details d c where
    make_connection_from_details :: d -> c
    
data Snowflake_Connection_Details = Snowflake_Connection_Details String String String -- username, password etc.

instance Connection_Details Snowflake_Connection_Details Snowflake_Connection where
    make_connection_from_details details = Snowflake_Connection_Impl
    
main :: IO ()
main = do
  let snowflake_connection = database_connect (Snowflake_Connection_Details "foo" "bar" "baz")
  print (get_tables snowflake_connection)
  print (get_warehouses snowflake_connection)

The program rewritten to Enso is made part of our test suite.

@JaroslavTulach JaroslavTulach changed the title Return value from a typed method stays uncoverted if its type fits Method return type checks avoid conversion when the type fits Aug 23, 2024
@JaroslavTulach
Copy link
Member Author

JaroslavTulach commented Aug 23, 2024

Just a recap how one is supposed to write type classes in Enso.

Define API and "Dictionary"

Here is a way to define Set type class:

from Standard.Base import Error
from Standard.Base.Errors.Common import Type_Error
from  Standard.Base.Errors import Illegal_State

type Set a
    Value o s:a

    is_empty self = self.o.is_empty self.s
    contains self e = self.o.contains self.s e
    length self = self.o.length self.s
    insert self e = Set.Value self.o <| self.o.insert self.s e
    union self (other:Set) = if self.o != other.o then Error.throw (Type_Error "Incompatible sets") else
        Set.Value self.o <| self.o.union self.s other.s

type Operator a
    is_empty (_ : Set) = Error.Throw (Illegal_State.Error "is_empty not implemented")
    contains (_ : Set) _ = Error.Throw (Illegal_State.Error "contains not implemented")
    length (_ : Set) _ = Error.Throw (Illegal_State.Error "length not implemented")
    insert (_ : Set) _ = Error.Throw (Illegal_State.Error "insert not implemented")
    union (_ : Set) _ = Error.Throw (Illegal_State.Error "union not implemented")

The type Set defines is_empty, contains, length, insert and union operations. Once one has such a set up, one can define an implementation.

Implementation with Vector Backing Store

Let's use Vector Any to store the elements as shown here:

from Standard.Base import Vector

import project.Set.Set


type Vector_Impl a
    empty = Set.Value Vector_Impl []

    is_empty (elements : Vector) = elements.length == 0
    contains (elements : Vector) e = elements.contains e
    length (elements : Vector) = elements.length
    insert (elements : Vector) e = if elements.contains e then elements else elements + [ e ]
    union (us : Vector) (them : Vector) = (us + them).distinct

the preferred way of creating type Set is to define a conversion to turn any Vector into a Set:

Set.from (that:Vector) = Vector_Impl.empty . insert that

Using Set

To use the set one can start with a conversion and they operate on the set as desired:

v = [1, 2, 3]
s = v:Set
four = s . insert 4
four . length . should_equal 4

Copy link
Member

@radeusgd radeusgd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's really cool that we are able to implement this example in Enso.

But I'm actually worried about integrating this (I know I gave you the example, but this was just a way to show that this kind of type refinement is doable in Java or Haskell, I did not intent to argue that Enso should do it the same way).

We will probably want some way to be able to nicely typecheck Database.connect to return 'more specific' types that expose additional methods, like Snowflake_Connection.warehouses.

However, doing it this way is actually breaking the assumptions I made in #9812. I'm assuming that if I have a variable x : Integer, it will only contain methods available on Integer (and supertypes). Only if I have a variable of type a : Any I assume that this can be 'anything' and allow calling any kind of methods on it.

If now we allow x : Integer to actually be x : Integer & Mixin and allow methods of Mixin type to be called on x, then no static type checking of method calls can be done - we are becoming fully dynamic as everywhere I can have random methods being added to types by such mixins.


This PR was supposed to solve the typechecking of Database.connect, but it seems to me that the 'cost' of solving that problem is completely disabling us from doing static type analysis. At this point, correct typechecking of Database.connect has no longer much benefit, as we are unable to do any typechecking at all. The only benefit is that we can 'correctly' typecheck this at runtime - but at runtime we know the concrete type of the object anyway so it is of little gain.

Please don't take this as criticism of the overall idea - I'm really glad we were able to implement this extension of the 'typeclass' mechanims in Enso with such a relatively small change. But I'm actually worried that this will cause us more problems than benefit.


Of course there is always possibility that I misunderstood something. If I'm wrong about preventing to do static type analysis, please let's discuss - I'll be happy to be shown that we can actually still do type analysis with such change. But I just don't see such a possibility.

@radeusgd
Copy link
Member

As for the typing of Database.connect and similar examples - I think we need to keep looking for a solution that allows us to refine the types somehow, but still keep some form of typechecking (although at first it does sound like it may be a contradictory requirement...).

For now, I think it is just safer to keep Any return type on Database.connect - as apart from documentation purposes, it gives us similar guarantees as Connection typeclass return type in the new 'paradigm'.

@radeusgd
Copy link
Member

Hmm, I tried compiling an example:

from Standard.Base import all

type My_Type
    Value x

    get_a self -> Integer = 42

type Mixin_1
    Value x

    get_a self -> Boolean = False

type Mixin_2
    Value x

    get_a self -> Text = "Hello"

Mixin_1.from (that : My_Type) = Mixin_1.Value that.x
Mixin_2.from (that : My_Type) = Mixin_2.Value that.x

do_call (mt : My_Type) =
    IO.println <| "calling get_a on " + mt.to_text + " ( : My_Type)"
    result = mt.get_a
    IO.println <| "result of mt.get_a (expected -> Integer): " + result.to_text

main =
    x = My_Type.Value 42
    do_call x

    m1 = (x : Mixin_1 & My_Type)
    do_call m1

    m2 = (m1 : Mixin_2 & My_Type)
    do_call m2

    do_call (x : My_Type & Mixin_1 & Mixin_2)
    do_call (x : Mixin_1 & Mixin_2 & My_Type)

and I see it is more nuanced than I thought - I thought the example will be troubling, sometimes calling the My_Type.get_a and sometimes ones from Mixins, but it seems that you are doing the special logic only on -> T check and not on : T checks, so the argument type check of the function is actually still behaving like I'd expect. So maybe it's not as worrying as I thought.

I'm still not entirely convinced why the -> T check should behave differently to : T, it feels quite non-uniform. I don't feel like I understand enough of implications of such change to introduce it rashly.

@radeusgd
Copy link
Member

Thinking through implications for #9812 - my static type analysis logic relies on -> behaving the same as :. I'm trying to run actual type inference, e.g. when I have a function foo (arg Text) -> Integer = ... and I see a call x = foo "abc", then my static analysis is inferring that the type of x should be Integer. And that inference is the same as if I had an argument x : Integer - in both cases I assume that I have enough info to infer that x is an Integer.

And if on such x known to be Integer (in both cases), I assume that calling non-Integer methods is an error.

But with this PR, only x : Integer would have such guarantees, and the Integer returned from foo (arg : Integer) -> Text = ..., a special multi-value may come out such that it may have additional methods that are not expected. Thus I can no longer do my method checks. So I'd have to introduce a type 'surely Integer' and 'Integer with maybe extra stuff'. I'm not sure this is a good idea.

@radeusgd
Copy link
Member

Btw. is it expected that

from Standard.Base import all

type My_Type
    Value x

    get_a self -> Integer = 42

type Mixin_1
    Value x

    get_a self -> Boolean = False

type Mixin_2
    Value x

    get_a self -> Text = "Hello"
    get_b self -> Text = "World"

Mixin_1.from (that : My_Type) = Mixin_1.Value that.x
Mixin_2.from (that : My_Type) = Mixin_2.Value that.x

returns_mt x =
    My_Type.from x

do_call arg =
    mt = returns_mt arg
    IO.println <| "returned -> My_Type mt = " + mt.to_text
    result = mt.get_a
    IO.println <| "mt.get_a: " + result.to_text # expecting Integer

main =
    x = My_Type.Value 42
    do_call x

    m1 = (x : Mixin_1 & My_Type)
    do_call m1

    m2 = (m1 : Mixin_2 & My_Type)
    do_call m2

    do_call (x : My_Type & Mixin_1 & Mixin_2)
    do_call (x : Mixin_1 & Mixin_2 & My_Type)

prints

returned -> My_Type mt = (My_Type.Value 42)
mt.get_a: 42
Execution finished with an error: Could not find a conversion from `Mixin_1` to `My_Type`.
        at <enso> multivalue.returns_mt(multivalue.enso:23:5-18)
        at <enso> multivalue.do_call(multivalue.enso:26:10-23)
        at <enso> multivalue.main(multivalue.enso:36:5-14)

?

I thought that Mixin_1 & My_Type should be convertible to My_Type.

@JaroslavTulach
Copy link
Member Author

If now we allow x : Integer to actually be x : Integer & Mixin and allow methods of Mixin type to be called on x,

That's how OOP works and that's how it works in Java compiler or IDEs. If java.lang.Runnable x, then the only non-java.lang.Object method that can be called on it is x.run(). The fact that there may be other methods - for example when Runnable x = new Thread() is irrelevant. IDEs will not display the additional Thread methods and the compiler will refuse to call them. These additional methods are only available via reflection. The Enso case (after this PR) is however a bit different:

  1. all the Integer & Mixin methods can be called (different)
  2. IDE (e.g. component browser) would see all of them being available (same)

then no static type checking of method calls can be done - we are becoming fully dynamic as everywhere I can have random methods being added to types by such mixins.

I can understand why you think issue no. 1 complicates your work on static type checking. I assume if we could get no. 2 and allow only methods of Integer to be called on such a mixin value you'd find the change acceptable.

@JaroslavTulach
Copy link
Member Author

not entirely convinced why the -> T check should behave differently to : T, it feels quite non-uniform

Yes, the difference is unfortunate. I don't need this behavior on regular methods. It could be disabled.

The motivating example only needs it for conversion methods. I need from to be able to return the mixin (kind what the Haskell example does). That used to be possible until #10468 - got integrated.

@JaroslavTulach JaroslavTulach added the s-research-needed Status: the task will require heavy research to complete label Aug 30, 2024
@JaroslavTulach
Copy link
Member Author

In any case: at current state this PR cannot be integrated. Closing.

mergify bot pushed a commit that referenced this pull request Dec 12, 2024
Implementation of **type checks** for **intersection types**. The idea is to split the list of `types[]` in `EnsoMultiValue` into two parts:
- first `methodDispatchTypes` represent the types the value _"has been cast to"_
- the rest of the types represent the types the value _"can be cast to"_

By performing this separation we address the #10882 requirements. After a type check only methods available on the `methodDispatchTypes` can be invoked. However the value can still be cast to all the possible types.
jdunkerley pushed a commit that referenced this pull request Dec 13, 2024
Implementation of **type checks** for **intersection types**. The idea is to split the list of `types[]` in `EnsoMultiValue` into two parts:
- first `methodDispatchTypes` represent the types the value _"has been cast to"_
- the rest of the types represent the types the value _"can be cast to"_

By performing this separation we address the #10882 requirements. After a type check only methods available on the `methodDispatchTypes` can be invoked. However the value can still be cast to all the possible types.

(cherry picked from commit 2964457)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
s-research-needed Status: the task will require heavy research to complete
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants