More consistency in inference rules between type checkers #1315
You may say that type inference is a feature of the type checker. However, I think that from a user's perspective, this is just not the case. Inference should be part of the language, just like the rest of the typing features (which are consistent across type checkers and specified in PEPs).
P.S. If a full specification is difficult or problematic, another option is to at least make a list of differences for users to consult. I think such a list could also help in discussing the situation and achieving more consistency eventually.
Most code bases are type checked with only one type checker. It's quite rare for code bases to use more than one, so I don't think this is a widespread problem.

It would be a massive amount of work to debate the merits of every inference behavior and reach consensus on a common behavior. There are literally hundreds, and probably even thousands, of such individual decisions involved in type inference. Even if we could achieve consensus, it would be very painful for any of the major type checkers to modify their inference behaviors as it would cause massive compatibility issues for those who are using those type checkers for existing code bases. This would cause significant code churn for little or no benefit to those code base owners.

To achieve consistent type evaluation behaviors across all type checkers, you'd need to do much more than agree on inference behaviors. You'd also need to agree on bidirectional (context-based) type inference behaviors, type narrowing behaviors, override resolution behaviors, constraint solver behaviors, and more.

As a maintainer of pyright, I don't feel very motivated to participate in such an exercise when the cost would be so high (we're talking about many hundreds of hours of discussions, debates, presentations, compromises, etc.) that would result in significant pain and little or no benefit for most users.

Perhaps there are a few specific behaviors that we could target that would lessen your pain if you decide to continue to use both mypy and pyright. In my experience, there are several primary sources of inconsistent type evaluation:
I do think it's reasonable to document the differences in behaviors between major type checkers. A while ago, I started to write a document that explains the difference between pyright and mypy behaviors and explains the justification for each of these differences. As you mentioned above, every one of these decisions was made for a well-justified reason. I'll try to find time to make further progress on this document and eventually post it.
Thank you very much for the detailed answer.
Right, but as a library author, I must take these differences into account, since my users might use any of the available type checkers. That is a key point I forgot to mention earlier, so thanks for pointing that out. One example is annotating parameters with Literal: consider the examples I gave here and the API typing debate exacerbated by the inconsistency. Consider also this example from numpy, where they had to modify overload ordering to cater for the overload-selection logic of both mypy and pyright. A similar case came up in typeshed (python/typeshed#8566).
Yes. I understand.
Another weird difference is that mypy only allows redefinitions "within the same block and nesting depth as the original definition", which seems somewhat arbitrary.
That would be super helpful!
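To make the quoted rule concrete, here is a minimal sketch of the two situations it distinguishes. The function names are made up for illustration, and the checker behavior noted in the comments is an assumption based on the quoted rule and on mypy's documented `--allow-redefinition` option. It may differ between versions.

```python
def parse_same_block(raw: str) -> int:
    value = raw.strip()   # first definition: str
    # Redefinition in the same block and at the same nesting depth.
    # Assumed: accepted by mypy only with --allow-redefinition.
    value = int(value)
    return value


def parse_nested(raw: str, strict: bool) -> int:
    value = raw.strip()   # first definition: str
    if strict:
        # Redefinition inside a nested block.
        # Assumed: rejected by mypy even with --allow-redefinition,
        # while pyright narrows the type at each assignment.
        value = int(value)
        return value
    return len(value)


print(parse_same_block(" 42 "))
print(parse_nested(" 7 ", True))
print(parse_nested(" abc ", False))
```

Both functions behave identically at runtime. Only the static verdict is expected to differ, which is why the rule can look arbitrary from a user's point of view.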
Thanks for the additional context. I agree that it's important for library authors to be able to define interface types in a manner that works across type checkers. I think that's a more attainable goal than the more general goal you articulated above. And I think this is a goal worth pursuing. You may already be aware of this, but I implemented a feature in pyright (in the form of the

Looking at the four sources of type evaluation inconsistency that I listed above, item 2 is already slated to be fixed. If you think this is important to have fixed for the mypy 1.0 release, please voice your opinion.

I think that there's a strong argument to be made for the change in item 3. If you're interested in seeing that, please lend your voice to the discussion in the mypy tracker.

Item 1 is a source of much pain for mypy users. It is also the reason for the literals problem described in the issue you linked. Perhaps mypy maintainers would reconsider this behavior if there's sufficient interest expressed by mypy users. Here's the code sequence mentioned in the issue you cited above:

from typing import Literal

def test(x: Literal["x", "y"]):
    pass

x = "x"
test(x)  # error in mypy, ok in pyright

The reason this works fine in pyright is because it infers the type of

As for item 4, I have an idea for how to eliminate the inconsistency without harming the user experience for pylance users. I've filed this work item. I'll investigate whether this is feasible. No promises, but I think it's worth exploring.
I'm happy to hear that
Yes, I've been using it and it's very helpful
Actually, you confused two separate things. This example is not about the assignment vs. declaration difference. It's specifically about Literals. The difference here between mypy and pyright is that mypy never infers Literal (unless it's Final). Otherwise it would infer x as a Literal. To illustrate, even if I separate the declaration from the assignment, the error remains:

from typing import Literal

def test(x: Literal["x", "y"]):
    pass

x: str
x = "x"
test(x)  # error in mypy, ok in pyright
@matangover, I've posted documentation that captures the major behavior differences between mypy and pyright and the justifications for these design choices. Let me know if you think that I've missed any important points.
The doc is very helpful. I learned a lot from reading it and I don't know of anything you've missed. Thanks for this.
I've recently added type annotations to a large library, and have been checking my code using both mypy and pyright. While doing so, I noticed many differences between mypy and pyright in the types they choose to infer. Each type checker has a justification for its choices, but as a user this situation is frustrating, because I rely a lot on inference. As a result, I frequently had to think about "what pyright would do" and "what mypy would do", and scour their issue trackers to understand what's going on: what's a bug and what's a "feature".
I totally understand that each type checker has been developed independently and influenced by different needs and design choices. I also understand that the ecosystem is very much in flux. I have seen authors of type checkers justify their choices - and rightfully so. Nonetheless I think it would be a big benefit to the community to specify type inference rules more fully (PEP?). If typing is seen as part of the Python language (in various PEPs), and type inference is seen as a feature of typing, then that feature should behave consistently.
I assume there would be much work to define inference rules and reach an agreement that works for all type checkers. Also it would probably require considerable work to implement the necessary changes. However, I believe in the long run it's beneficial. Especially since as time goes by, more backwards compatibility concerns will just pile up.
Areas where I have noticed considerable differences include redefinitions (mypy's allow_redefinition is only partially consistent with pyright's default behavior), Literals, union vs join, and overloads. There may be others I can't recall.
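As an illustration of the "union vs join" item, here is a small sketch. The inferred types in the comments are assumptions about typical mypy and pyright behavior and may vary by version, and `count_strings` is a hypothetical helper, not part of any library.

```python
from typing import Union

# Unannotated list with mixed element types. Assumed inference:
#   pyright: list[int | str]   (union of the element types)
#   mypy:    list[object]      (join, i.e. nearest common supertype)
inferred = [1, "one"]

# An explicit declaration pins the element type for both checkers.
declared: list[Union[int, str]] = [1, "one"]


def count_strings(values: list[Union[int, str]]) -> int:
    # Count how many elements are strings.
    return sum(isinstance(v, str) for v in values)


print(count_strings(declared))
# May be reported by a checker that inferred list[object] above,
# because list is invariant in its element type.
print(count_strings(inferred))
```

The two calls do the same thing at runtime. The difference shows up only in whether a checker accepts the unannotated list as an argument.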
I'd like to hear what others think about this topic.