Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only allow the correct type to be used for self in the static method call syntax #8805

Closed
2 tasks
radeusgd opened this issue Jan 18, 2024 · 10 comments · Fixed by #8867
Closed
2 tasks

Only allow the correct type to be used for self in the static method call syntax #8805

radeusgd opened this issue Jan 18, 2024 · 10 comments · Fixed by #8867
Assignees
Labels
-compiler -type-system Category: type system and type inference

Comments

@radeusgd
Copy link
Member

radeusgd commented Jan 18, 2024

A discussion during today's standup led me to try discussing this suggestion.

Expectations

When I'm declaring member methods on a type, I definitely expect self within those methods to have that type:

type My_Type
    Ctor_A x
    Ctor_B x y

    sum self = case self of
        My_Type.Ctor_A x -> x
        My_Type.Ctor_B x y -> x + y

    my_method self z =
        self.sum + z

Within sum, I definitely expect that self has type My_Type, so that I know that my case-of statement covering the 2 constructors of it is definitely exhaustive.

Within my_method, I definitely expect that self has type My_Type, so that I know I can safely call the other method sum on it.

Unexpected reality

We have static syntax for calling member methods, which allows us to replace x.my_method 200 with My_Type.my_method x 200.

This syntax allows us to kind-of 'manually' specify the self parameter to the method, by referring to it statically.

However, it does not enforce that such-specified self has the correct type. Thus the following program:

from Standard.Base import all

type My_Type
    Ctor_A x
    Ctor_B x y

    sum self = case self of
        My_Type.Ctor_A x -> x
        My_Type.Ctor_B x y -> x + y

    my_method self z =
        self.sum + z

    simple_method self = self.x+1000

type Other_Type
    Value x

main =
    o = Other_Type.Value 23

    IO.println <| My_Type.simple_method o
    IO.println <| Panic.recover Any <| My_Type.my_method o 200
    IO.println <| Panic.recover Any <| My_Type.sum o

will yield

1023
(Error: (No_Such_Method.Error (Other_Type.Value 23) UnresolvedSymbol<sum>))
(Error: (Inexhaustive_Pattern_Match.Error (Other_Type.Value 23)))

The simple_method will just work, because Other_Type provides all it needs - a getter x. The other 2 methods however, will break in wonderful (read: harder to debug) ways somewhere within the internals of the implementation.

As @JaroslavTulach pointed out, this basically rules out the ability to infer self : My_Type inside of any of these methods, because as shown in this example, the self may be whatever type we put there.

This defeats almost all of our type checking on member methods (which constitute most of Enso libraries). If I don't know the type of self, I cannot statically infer the type of any methods called on self as well. Thus if a member method relies on other member methods/fields, I cannot really perform type inference within it. And lots of methods do refer to others in such a way.

I don't think this is good, so I'd suggest to change that.

Proposal

Let's enforce the type of self by checking it during the static-syntax dispatch.

To keep sanity of our code, I think we need to ensure that self is what the developer expects it to be. The proposal is to add a check that will only allow the correct type within the static dispatch.

That is My_Type.sum (My_Type.Ctor_A 2) will happily work, however My_Type.sum (Other_Type.Value 3) will raise a Type_Error.

Then the behaviour of the program I pasted above would be following:

(Error: (Type_Error.Error My_Type (Other_Type.Value 23) 'self'))
(Error: (Type_Error.Error My_Type (Other_Type.Value 23) 'self'))
(Error: (Type_Error.Error My_Type (Other_Type.Value 23) 'self'))

All these invocations would fail, because we cannot call a My_Type member method on a non-My_Type instance, even (ab)using the static call syntax.

I think this is the right thing to do, because then, within those member methods, we can safely assume that self is of the expected type. This is sane both for the developer and for the type system.

Consequences of the change

I'm not aware of any problematic consequences of this change. However, I may be missing something regarding the expected semantics of the static-syntax for member methods. If so, let's discuss.

The only usage of this pattern that I know of currently is within Array.enso where we dispatch statically the methods defined on Vector with the Array as their self parameter, to re-use the Vector's implementation of these methods. This is a neat trick, but I don't think it's a thing that we really need at all, so I'd suggest to refactor it (an alternative has already been suggested). We can move the common methods to be static methods on Array_Like_Helpers that take a array_like parameter (what used to be self). Then we can change the invocations to be:

-Array.insert self at=self.length item=Nothing = Vector.insert self at item
+Array.insert self at=self.length item=Nothing = Array_Like_Helpers.insert self at item

Tasks

@radeusgd radeusgd added -type-system Category: type system and type inference -compiler labels Jan 18, 2024
@radeusgd
Copy link
Member Author

@JaroslavTulach has mentioned that @wdanilo was heavily involved in the design of the 'static syntax for calling member methods' (I'm not sure what is the official name of this construct). @wdanilo could you have a look at this proposal and let me know if you think it is OK to implement it?

@wdanilo
Copy link
Member

wdanilo commented Jan 18, 2024

Given

type My_Type
    Ctor_A x
    Ctor_B x y

    sum self = case self of
        My_Type.Ctor_A x -> x
        My_Type.Ctor_B x y -> x + y

My_Type.sum : My_Type -> Number, always. Passing any other type to My_Type.sum should throw type error. If we are not throwing type error here, this should be considered a bug in the current implementation, because, as you pointed out, it can easily cause hard to find bugs throwing errors from their internal implementation. Of course, such things can be (should be?) caught by type inferencer, as checking that at runtime might slow things down.

Anyway, yes, self is always typed as Self.

@JaroslavTulach
Copy link
Member

Let's enforce the type of self by checking it during the static-syntax dispatch.

Should the policy have any real effect, it has to be enforced during runtime. E.g. we have to use ReadArgumentCheckNode to verify self in the static version of each method (invocable via the static Type.method syntax). Using ReadArgumentCheckNode will give us automatic conversions of the self argument, however that seems desirable and consistent with behavior of other static methods.

@radeusgd radeusgd self-assigned this Jan 18, 2024
@radeusgd radeusgd moved this from ❓New to 📤 Backlog in Issues Board Jan 18, 2024
@radeusgd radeusgd moved this from 📤 Backlog to 🔧 Implementation in Issues Board Jan 25, 2024
@enso-bot enso-bot bot mentioned this issue Jan 25, 2024
6 tasks
@radeusgd radeusgd moved this from 🔧 Implementation to 👁️ Code review in Issues Board Jan 25, 2024
@enso-bot
Copy link

enso-bot bot commented Jan 25, 2024

Radosław Waśko reports a new STANDUP for today (2024-01-25):

Progress: Implemented the self type check (only in the static invocation path, as it is unnecessary in member invocation path and would be wasteful), and added tests for it. Also fixed #8706 because researching the self type check I found the cause of that. It should be finished by 2024-01-29.

Next Day: Next day I will be working on the #8809 task. Continue S3 writing - work on a Writable_File 'typeclass' that allows us to abstrac the logic for writing to files on various backends. Think how to make this work sensibly with On_Existing_File logic.

@enso-bot
Copy link

enso-bot bot commented Feb 5, 2024

Radosław Waśko reports a new 🔴 DELAY for today (2024-02-05):

Summary: There is 9 days delay in implementation of the Only allow the correct type to be used for self in the static method call syntax (#8805) task.
It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: Did not spend enough time last week on this because I was stumped by S3 file writing ticket and I was trying to finish it before moving on to other tasks.

@enso-bot
Copy link

enso-bot bot commented Feb 5, 2024

Radosław Waśko reports a new STANDUP for today (2024-02-05):

Progress: Fixed remaining tests for S3 file writing. Reviewed all write operations should be able accept S3_File. Moved common implementation of common Vector operations to Array_Like_Helpers. Basic tests are passing. Got annoyed by #8976 - Base tests are failing with it and I cannot find the cause... It should be finished by 2024-02-07.

Next Day: Next day I will be working on the #8833 task. Try finding the cause of Base_Tests failing with the refactor. If time allows move 2 builtins and run benchmarks. Start work on S3 file copy (initial steps already done in #8921).

@enso-bot
Copy link

enso-bot bot commented Feb 7, 2024

Radosław Waśko reports a new STANDUP for yesterday (2024-02-06):

Progress: Found the root cause for the base tests failure. Working on a fix. Working on license review for #8907 It should be finished by 2024-02-07.

Next Day: Next day I will be working on the same task. Try finishing my part of the license review. Try another approach for the fix for #8805. Run benchmarks. Try different approach for common Vector ops and benchmark it also.

@enso-bot
Copy link

enso-bot bot commented Feb 9, 2024

Radosław Waśko reports a new 🔴 DELAY for today (2024-02-09):

Summary: There is 5 days delay in implementation of the Only allow the correct type to be used for self in the static method call syntax (#8805) task.
It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: Unexpected edge case - extension methods defined on a type that is not available in the current scope made the current approach not work. Tried using qualified type names but that seemed to not work because of #8997. Also need a bit more time to run the benchmarks and tried the other approach to Array/Vector sommon methods suggested by James.

@enso-bot
Copy link

enso-bot bot commented Feb 9, 2024

Radosław Waśko reports a new STANDUP for the provided date (2024-02-07):

Progress: Finished the license review. Implemented self-type resolution as a fix for #8805. It should be finished by 2024-02-12.

Next Day: Next day I will be working on the #8833 task. Check and fix remaining test failures in #8805 (~1). Review suggestions on #8809. Start work on #8833.

@enso-bot
Copy link

enso-bot bot commented Feb 12, 2024

Radosław Waśko reports a new STANDUP for the provided date (2024-02-09):

Progress: Fixing edge cases and refactoring self type resolution. Running stdlib benchmarks (engine are failing, waiting for a fix). Applying review suggestions for #8809, created followup tasks. It should be finished by 2024-02-12.

Next Day: Next day I will be working on the #8833 task. Add tests for copying accross backends, improve it.

@mergify mergify bot closed this as completed in #8867 Feb 14, 2024
mergify bot pushed a commit that referenced this issue Feb 14, 2024
@github-project-automation github-project-automation bot moved this from 👁️ Code review to 🟢 Accepted in Issues Board Feb 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
-compiler -type-system Category: type system and type inference
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants