Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Motivation section #11

Open
NeilGirdhar opened this issue Jul 28, 2023 · 13 comments
Open

Motivation section #11

NeilGirdhar opened this issue Jul 28, 2023 · 13 comments

Comments

@NeilGirdhar
Copy link
Collaborator

The motivation is really uncompelling. It describes what intersections do, but not why we need them. I think it would be better to dig through the mypy issues to find various problems that require intersection to solve them and present them under motivation.

For example, this one: python/mypy#9424:

from typing import Optional, TypeVar

U = TypeVar('U', bound=Optional[int])


def f(u: U) -> U:
    if u is None:
        return None
    assert isinstance(u, int)  # removing this line causes it to pass.
    return u

works with working intersections since u will have static type int & U.

@erictraut
Copy link
Collaborator

I agree that the motivation section of the PEP needs to be compelling. Adding support for intersections will require a significant amount of work across many tools in the ecosystem. If the motivation is not sufficiently compelling, it will be difficult to rally support for it and justify spending the time on a complex new feature rather than, for example, fixing existing type checker bugs or adding support for existing typing features (like PEP 695) in mypy.

For the above example, pyright already implements a lightweight form of intersections to handle this case. I refer to the feature as "conditional types", and you can read about it here. Other type checkers could implement this solution (without the need for a new PEP and at a fraction of the implementation cost) if they want to improve their support for type narrowing when used with type variables. If intersections are officially introduced to the type system, I will change pyright's implementation to use real intersections for this functionality, but since I've already found a good solution to this problem, pyright users won't observe any improvement.

I'd love to see a comprehensive list of use cases that motivate the need for intersections.

@NeilGirdhar
Copy link
Collaborator Author

For the above example, pyright already implements a lightweight form of intersections to handle this case.

Yes, it's great! But it is limited. I've definitely run into its limitations. I'll scan through my code for pyright: ignore directives that would be solved with full intersections.

, I will change pyright's implementation to use real intersections for this functionality, but since I've already found a good solution to this problem, pyright users won't observe any improvement.

Awesome.

I think we should highlight some of the benefits to pyright users that already have conditional types. Intersections do have other advantages. They handle conflicting methods a bit differently. If there's a conflicting method on A & B, there's talk of making that an error whereas Pyright's conditional types uses A's definition. That might let errors slip through.

So, in addition to adding motivations of intersections, we should:

  • add a section on existing solutions for rudimentary intersections (custom protocols, Pyright's conditional types), and
  • add motivations that explore why intersections are significantly more useful than these rudimentary intersections.

@erictraut
Copy link
Collaborator

Thinking about this a bit more, I don't think I'd be able to switch from my current implementation of "conditional types" to real intersections. It would break many existing use cases that work with conditional types. That's because the condition is propagated to return types of calls made on conditional types. That wouldn't be the case for intersections. Consider the following:

def func(v: AnyStr) -> AnyStr:
    if isinstance(v, str):
        reveal_type(v)  # str*
        x = v.capitalize()
        reveal_type(x)  # str*
    else:
        reveal_type(v)  # bytes*
        x = v.center(0)
        reveal_type(x)  # bytes*

    reveal_type(x)  # str* | bytes*
    return x  # No error

If this were replaced with a real intersection implementation:

def func(v: AnyStr) -> AnyStr:
    if isinstance(v, str):
        reveal_type(v)  # str & AnyStr
        x = v.capitalize()
        reveal_type(x)  # str
    else:
        reveal_type(v)  # bytes & AnyStr
        x = v.center(0)
        reveal_type(x)  # bytes

    reveal_type(x)  # str | bytes
    return x  # False positive error: Incompatible return type

@NeilGirdhar
Copy link
Collaborator Author

Can you remind me what the * indicates again?

@erictraut
Copy link
Collaborator

See this documentation.

@NeilGirdhar
Copy link
Collaborator Author

NeilGirdhar commented Jul 28, 2023

Okay, I see. The way I see it, the problem is that some information is lost, but not about the variable x, but rather about the variable v. I feel like there should be some way to record that AnyStr's original type was str. Would it be possible to add a directive typing.reveal_bound:

def func(v: AnyStr) -> AnyStr:
    reveal_bound(AnyStr)  # Returns the original bound: (str, bytes) 
    if isinstance(v, str):
        reveal_bound(AnyStr)  # Returns the narrowed type: str
        reveal_type(v)  # str & AnyStr
        x = v.capitalize()  # line B
        reveal_type(x)  # str

All good so far, but now when the user tries to return x, which is supposed to match the type variable, Pyright could backtrack through the function, and at line B (marked), it would see that x's static type is a subtype of AnyStr's static type, and it would be able to resurrect str & AnyStr as its static type. If it can do that for both branches, then the revealed type would be str & AnyStr | bytes & AnyStr, which is a subtype of the return type.

Alternatively, you could do it all in the forward pass, which is I guess what's going on with your * notation, but only report the & AnyStr if it's actually used to satisfy something.

Would that work?

If so, this solution would support a broader range of problems, e.g.:

def func(v: AnyStr) -> AnyStr:
    if isinstance(v, str):
        reveal_type(v)  # str & AnyStr
        x = "abc"
        reveal_type(x)  # Literal['abc'] & AnyStr (by backtracking)
    else:
        reveal_type(v)  # bytes & AnyStr
        x = b"abc"
        reveal_type(x)  # Literal[b'abc'] & AnyStr (by backtracking)

    reveal_type(x)  # Literal['abc'] & AnyStr| Literal[b'abc'] & AnyStr
    return x

@erictraut
Copy link
Collaborator

Sorry, but I don't understand either of your proposals or your code samples.

@DiscordLiz
Copy link
Collaborator

The cases I have for intersections are entirely protocols and mixins. Not sure if that helps, but I think those are really strong motivating cases. I don't think there's going to be much demand for intersections outside of the use by library authors needing to indicate to their users type information, but I'd love to see more uses people have.

@NeilGirdhar
Copy link
Collaborator Author

NeilGirdhar commented Jul 29, 2023

@erictraut I'm trying to solve the problem you showed whereby Pyright uses "conditional types" to track that a variable's type is a subtype of a type variable. This solution solves this problem, but the continuity of these conditional types is limited. Instead of conditional types, I'm proposing a more general solution (see the "broader range of problems" example).

The ideas of this general solution are:

  • The problem is that Pyright is only inferring information about the static type of variables (and intermediate values). I think it needs to infer information about type variables as well.
  • That's why I suggested reveal_bound as a way of viewing what we know about a type variable at any point in the code.
  • Once you can deduce information about bounds, you can then know when static types are subtypes of those bounds.
  • It would be too expensive to track for every static type, which type variables it's a subtype of. Therefore, we wait until there's a type error x is not a subtype of a type variable T to induce the type checker to work backwards checking if that static type is in fact a subtype of T.
  • The backtracking is just a backwards pass through the code, splitting at each branch. Each branch of the backtracking can stop if reveal_type(x) is a subtype of reveal_bound(T).

For example, let's run an ordinary forward pass through this code inferring types of variables and information about type variable bounds:

def func(v: AnyStr) -> AnyStr:
    if isinstance(v, str):
        x = "abc"
        reveal_type(x)  # Literal['abc']
        reveal_bound(AnyStr)  # str
    else:
        x = b"abc"
        reveal_type(x)  # Literal[b'abc']
        reveal_bound(AnyStr)  # bytes

    reveal_bound(AnyStr)  # (str, bytes)
    reveal_type(x)  # Literal['abc'] | Literal[b'abc']
    return x  # Type error: x is not an instance of AnyStr.

And now, given the type error, let's run a backward pass to see what we can learn about x relative to AnyStr:

def func(v: AnyStr) -> AnyStr:
    if isinstance(v, str):
        x = "abc"
        reveal_type(x)  # Literal['abc'] & AnyStr (since x is an instance of AnyStr's bound)
    else:
        x = b"abc"
        reveal_type(x)  # Literal[b'abc'] & AnyStr (as above)

    reveal_type(x)  # Literal['abc'] & AnyStr | Literal[b'abc'] & AnyStr
    return x  # No type error anymore.

Does that make more sense?

@erictraut
Copy link
Collaborator

You're using the word "bound" here in a way that doesn't make sense to me, but I think I know what you mean. Pyright doesn't do "forward passes" and "backward passes", so that represents a misunderstanding of how it works, but again I think I get the gist of what you mean. What you're proposing would be very expensive at analysis time, so it would be a big performance regression for pyright and pylance — probably not something we would entertain. Even if we were to entertain such a change, the existing "conditional types" mechanism would suffice; generalized intersections wouldn't be needed.

In any case, we're a bit in the weeds at this point. Popping up a few levels, I think you now appreciate why generalized intersections are not needed to solve the constrained TypeVar issue that you referenced at the start of this thread.

@randolf-scholz
Copy link

As a very principled motivating example, I'd put forward the multiple inheritance problem that came up in one of the other threads.

Given:

class X:
    def foo(self, x: A) -> B: ...
class Y:
    def foo(self, x: C) -> D: ...

One may be tempted to implement a common subclass of X and Y as

class Z(X, Y):
    @overload
    def foo(self, x: A) -> B: ...
    @overload
    def foo(self, x: C) -> D: ...

However, given how currently type-checkers handle overloads (noting that overloads are not sufficiently specified in PEP484), this is not type safe since a type-checker would identify Z.foo(A&C) as being of type B, which would violate LSP if Z was used as a replacement for Y, since Y.foo(A&C) would expect return type D.

There are 2 ways of resolving this:

① With the help of intersections, we can write

class Z(X, Y):
    @overload
    def foo(self, x: A&C) -> B&D: ...
    @overload
    def foo(self, x: A) -> B: ...
    @overload
    def foo(self, x: C) -> D: ...

Which would make Z type-safe again under the "first-match"-paradigm.

② Overloads need to be re-specced (or properly specced in the first place, since PEP484 is silent on this topic), so that when multiple overloads match a given input, then the intersection type of the return values is considered.

This second option has the potential to break a lot of existing code however.

@NeilGirdhar
Copy link
Collaborator Author

That's a great example. Are you aware of it ever coming up in practice?

@mikeshardmind
Copy link
Collaborator

Tossing this here so it doesn't get lost as something else waiting on intersections in some form. python/typeshed#10523

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants