NoReturn return type with singledispatch causing weird errors #11727
Right now this use case doesn't seem to be properly supported. All singledispatch variant functions need to have the same (or similar) return type, I think. In particular, using
Thanks for creating this. I'm interested because of a similar problem I'm having, which doesn't use
I don't think this is reasonably possible. The original implementation of the singledispatch plugin actually did try to do this by changing the type of the singledispatch function to an overloaded function, but that turned out to have several problems:
All of those problems might be fixable with enough special casing, but I don't think it's really worth the effort. The plugin currently just type checks calls to the singledispatch function using just the signature of the main singledispatch function, and then requires that the first argument of every registered implementation is a subtype of the first argument of the main singledispatch function, which is much easier to understand and implement. On master, the error messages I'm getting are:
The first 2 messages seem fairly clear about the return types for the main singledispatch function and all the implementations needing to be the same. The last 2 error messages, however, could be improved. This doesn't seem to be related to singledispatch, because you can get the same result using something like this:

```python
from typing import NoReturn

def foo(data: object) -> NoReturn:
    raise ValueError(f"Unhandled type {type(data)}")

a = foo(7)      # E: Need type annotation for "a"
reveal_type(a)  # N: Revealed type is "<nothing>"
```

I'm not really sure why there's an error message about needing a type annotation for the output of a
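For reference, a minimal sketch of the pattern from the issue title (hypothetical function names; this assumes a `NoReturn` fallback combined with registered implementations that do return values, which is what triggers the mismatched-return-type errors):

```python
from functools import singledispatch
from typing import NoReturn

@singledispatch
def process(data: object) -> NoReturn:
    # Fallback: raise for any type without a registered implementation.
    raise ValueError(f"Unhandled type {type(data)}")

@process.register
def _(data: int) -> int:
    # Registered implementation with a *different* return type than the
    # NoReturn fallback; this is the combination discussed in the issue.
    return data * 2
```

At runtime this behaves as expected (`process(3)` returns `6`, `process("x")` raises), but the checker flags the differing return types.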
Could you be more specific? What errors are you getting from mypy?
Despite the objections, I have to second this as a worthwhile goal for typing in Python. Single dispatch is intended to be an alternative way of specifying function overloading. Mypy has no problem with method overloads, so I think it is a reasonable goal to make single dispatch work.

Part of my motivation for single dispatch is that I'm a big proponent of multiple dispatch when it's applicable. The dream would be for Python to one day use multiple dispatch for comparison operators. This would solve a lot of typing issues, since the type checker cannot easily perform introspection on comparison methods, which internally switch on types. If multiple dispatch were used, then the type switching is moved out of the methods, where it's visible to the type checker. But for that to work, type checkers need to be able to treat registered dispatch methods as overloads. So, would you indulge me in discussing the problems you mentioned?
Right, so you're checking the dispatch declarations. I think the request for treating dispatch declarations as overloads is for type checking callers.
I don't think this needs to be a problem. Couldn't you do exactly what you do with ordinary method overloads? If you haven't seen a dispatch registration, just don't consider it?
This sounds right to me. If someone needs access to the registration for the purpose of type checking, they should import the appropriate module, perhaps behind a
Yeah, this may be a challenge, but I think it's a worthwhile challenge for the future of Python.
Yeah, the purpose of those checks is so that we can type check calls to the singledispatch function more easily. Requiring the first argument of every registered implementation to be a subtype of the first argument of the main singledispatch function means that we can determine the allowed types for the first argument just by looking at the main singledispatch function, whose first argument type will be a supertype of all dispatch types. That means that passing a type that isn't compatible with the main singledispatch function's first argument type definitely won't start being correct, regardless of how many registered implementations you add.
Ordinary overloads don't have this issue. All the overloads are defined right before the implementation, so once we've seen the definition we know about all of the possible overloads and other modules will never add new overloads.
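The subtype rule described above can be sketched with a small example (hypothetical classes, not from the thread). Note that the runtime accepts any registration; it is the mypy plugin's check that rejects incompatible first arguments:

```python
from functools import singledispatch

class Node: ...
class MypyFile(Node): ...

@singledispatch
def fingerprint(n: Node) -> str:
    # Main function: its first argument type (Node) must be a
    # supertype of every registered dispatch type below.
    return "node"

@fingerprint.register
def _(n: MypyFile) -> str:
    # Accepted by the plugin: MypyFile is a subtype of Node.
    return "file"

# A registration like `def _(n: int) -> str: ...` would still work at
# runtime, but the plugin would reject it, because int is not a
# subtype of Node.
```

This is why a call like `fingerprint(42)` can be flagged from the main signature alone, without knowing about any registrations.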
The problem with that is that, if we do that, the order that we type check modules can affect what errors we show. Consider this example:

a.py:
```python
from functools import singledispatch

@singledispatch
def f(a: object, b: object) -> None:
    pass
```

b.py:
```python
from a import f

@f.register
def _(a: str, b: int) -> None:
    pass
```

c.py:
```python
from a import f

f('a', 'b')
```

If we type check b.py before c.py, then we'll get an error for the call in c.py, pointing out that the second argument should be an `int`.

Since the order that modules get type checked is somewhat arbitrary (when one module doesn't depend on the other), I think this would lead to errors that would appear or disappear due to unrelated changes that happen to change the type checking order.

If we wanted to avoid this, I think we could try adding a pass that would make modules with singledispatch functions depend on all modules that have registered implementations for those functions, which would force a.py and b.py in the above example to be in the same SCC. I'm honestly not sure if this would be possible without significant changes, as it seems like it would require semantic analysis to be at least partially done in order to identify registered implementations, even though we can't do semantic analysis until after we've figured out the dependency graph.
I'm not sure what you're saying here. I was thinking of an example like this:

```python
from functools import singledispatch
from typing import Union

class Node: pass
class MypyFile(Node): pass
class Missing: pass

@singledispatch
def f(a: Node) -> None:
    pass

@f.register
def g(a: MypyFile) -> None:
    x: Missing
    f(x)

@f.register
def h(a: Missing) -> None:
    pass
```

The original implementation tried to type check the call to
Could you elaborate on how this would work? Where would you put the type information for each comparison operator/dunder method?
I understand that, but that's a lot less strict than it could be. Consider a comparison operator. The base definition would just be
Yes, but that should be an error.
I don't think it's mypy's responsibility to do this. The caller should explicitly do the import for mypy.
It doesn't work even though it's in the same file? If it's in a different file, the import should make it visible, right?
I wrote this up a while ago, which details the idea.
How common is this pattern? I'd be fine if mypy only works with a singledispatch function defined entirely in one module. We probably want some kind of warning so users doing this know exactly why mypy is not working correctly for them, but I'm not sure it's a common enough use case that it needs to be fully working in this situation.
I think this is somewhat related to #2922, where the same problem of registering subclasses with the superclass can break modularity. I second the idea of using multiple dispatch for binary operators, with multiple dispatch and typing support for it built into Python, as a worthwhile goal. I already have a need for this. I have types where I want to support both concatenation and element-wise addition. Since they both use
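A sketch of that tension (hypothetical `Series` class; this assumes the truncated sentence refers to `__add__`): one operator cannot carry both meanings, so one of the two operations is forced off the operator entirely:

```python
from typing import List

class Series:
    def __init__(self, data: List[int]) -> None:
        self.data = data

    def __add__(self, other: "Series") -> "Series":
        # `+` can only mean one thing per class. Here it is taken by
        # element-wise addition...
        return Series([a + b for a, b in zip(self.data, other.data)])

    def concat(self, other: "Series") -> "Series":
        # ...so concatenation has to live on a named method instead,
        # losing the operator syntax.
        return Series(self.data + other.data)
```

With multiple dispatch on operand types, both behaviors could in principle be selected by the argument types rather than competing for one dunder.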
I don't see how this is a problem. At runtime
As much as I like the benefits that this proposal would add, I think the fundamental problem with your suggestion is the same as the fundamental problem with the singledispatch-as-overloads idea: it leaves a lot of opportunities for spooky action at a distance. With your proposal, every time a module gets imported, it could register an implementation that can change the behavior of any other module by providing a better match for the passed in types than the implementation that module was previously using. That could happen even if the module registering that implementation is several layers deep in your dependency graph, or even if it's in a separate module's dependency graph that happens to be used by an application that's also using your library (which is much worse because you wouldn't run into this until someone happened to try to use those 2 modules at the same time).

Some other examples of unfortunate consequences of this:
Most of the problems above have similarities to problems type checkers would face. A type checker would almost certainly not try searching all modules for all possible registered implementations (due to the performance and complexity implications of that), and would probably not know which registered implementations are actually registered at any given time. The best it would be able to do is pick some subset of registered implementations (all the implementations in a certain file, in the current project, etc.), type check your code using that subset, and hope that the errors you get from implementations not being found are false positives and not false negatives.

The current Python binary operator method resolution system, for all its flaws, manages to avoid all these problems by only ever needing to check 2 places for possible implementations: the definition of the relevant functions on the left type and the right type. (The existence of the reverse versions of those dunder methods admittedly leads to some of the above problems in a limited fashion, because adding an implementation of the regular version on the left type could lead to code relying on using the reverse version on the right type breaking, but that's far more manageable than the problems described here.)

There is a chance that all of these problems are solvable, but I can't really think of a solution off the top of my head. Interestingly enough, according to this, Julia takes a similar approach to your proposal, so maybe they've found a solution to these problems that I'm not seeing.

Side note: Since this is a change that depends on changes to the language itself, maybe it would be better to move this to the python/typing bug tracker?

There seem to be several suggestions to have mypy type check calls to singledispatch functions using only the registered implementations in files imported directly or indirectly by the calling module.
That would definitely be easier than the general case of finding all implementations, but it still comes with some problems of its own:
I think both of those combined mean that implementing this probably isn't really worth the effort. Interestingly, while I was writing this up I realized that the suggestions for possible implementations for the singledispatch-as-overloads idea are essentially the same as #2922 (comment), with all of the same difficulties (this suggestion appears to be option 3).
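The two-place lookup for binary operators described earlier (the left operand's `__add__`, then the right operand's `__radd__`) can be sketched with hypothetical classes:

```python
class Meters:
    def __init__(self, n: int) -> None:
        self.n = n

    def __add__(self, other: object):
        # First place checked: the left operand's __add__.
        if isinstance(other, Meters):
            return Meters(self.n + other.n)
        return NotImplemented

class Feet:
    def __init__(self, n: int) -> None:
        self.n = n

    def __radd__(self, other: object):
        # Second place checked: the right operand's __radd__, tried
        # only after the left side returned NotImplemented.
        if isinstance(other, Meters):
            return Feet(round(other.n * 3.28) + self.n)
        return NotImplemented

total = Meters(2) + Feet(1)  # Meters.__add__ returns NotImplemented,
                             # so Python falls back to Feet.__radd__
```

No matter how many other modules are imported, only these two classes are ever consulted for `Meters(2) + Feet(1)`, which is the locality property the comment contrasts with open-ended dispatch registration.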
mypyc/mypyc#802 (comment) has a list of everywhere edgedb uses singledispatch functions, which does include 2 singledispatch functions with registered implementations outside of the original file they're defined in. That's admittedly a very small sample size (and I'd definitely be interested in more data), but I'd guess registering implementations outside of the main file isn't common, but it probably isn't rare either.
That's admittedly more of a plugin API limitation than a fundamental limitation of the strategy being suggested. The plugin API doesn't offer many guarantees about when the plugin will be called, so mypy tries to call the plugin to type check the call before it calls the plugin on all the registered implementations, which leads to the problem I mentioned earlier. We could probably avoid that by moving the code into mypy itself instead of a plugin, but it would be kind of unfortunate to have to add more special casing in mypy.
I think I probably wasn't clear here. mypy determines the order to type check modules by looking at imports to determine the dependency graph, and then picking an order of modules to check that always checks a module before any of the modules that import it. In that example,
Could you clarify why you can't type the method without a mypy plugin? If the problem here is wanting to be able to accept a type that implements either the forward or reverse method, then I think typeshed's strategy with typing (If I'm misunderstanding your problem, it might be worth asking this question at https://github.com/python/typing/discussions just in case someone there has a better solution than a mypy plugin).
Yes. That flexibility is the point of dispatch. This allows registration by classes unknown to the dispatch function as types accepted by the dispatch function. Consider something like a display function using dispatch. Callers can add their types to the display function's accepted types. I don't think it's as "spooky" as you're making it out to be. Either the caller:
This seems perfectly reasonable to me. Even the error could simply say, "the function F doesn't support an object of type T. Did you forget to import your registration before this call so that MyPy knows about it?"
I think what we're suggesting is option 5. No second pass is needed. It is the caller's responsibility to ensure that the registrations have already been seen by MyPy. As JukkaL points out, this would be easy to implement. I'm not sure it would be that confusing with a reasonable error message when things don't match.
This could be made into a false positive if a
You're basically saying no to single dispatch being fully supported by MyPy, which I think is a shame because plenty of other languages have single dispatch.
It's the user's fault if MyPy's arbitrary module import order causes an error: if module c depends on module b through dispatch registration, then c should import b.

I think part of the misunderstanding between us is that you want any code that runs to pass MyPy's type checking. You're right that this goal would be an unreasonable amount of work. I agree with you there! But it's already the case that users have to spend time helping MyPy to understand code. So what I'm humbly suggesting is that users who want their single dispatch code to be type checked should have to import any dispatch overloads before they use them so that MyPy knows about them (this is JukkaL's option number 5 from the comment you linked). This makes the MyPy implementation much simpler, and I think it addresses most of your issues. What do you think?
@pranavrajpal Your entire response seems to be assuming that any implementation of dispatch at runtime would necessitate a global overload table. That's bad for all the reasons you listed, and I am well aware of them. There is also the possibility that an implementation of dispatch doesn't update a global overload table, but instead retains value-like semantics, so that the overload sets for a given function

```python
########## a.py
@dispatch
def add(i: int, j: int) -> int:
    return i + j

print(add(1, 2))      # 3
print(add("a", "b"))  # TypeError: unsupported parameter type(s) for add: 'str' and 'str'

########## b.py
@dispatch
def add(i: str, j: str) -> str:
    return i + j

print(add("a", "b"))  # 'ab'

# the `multipledispatch` package would actually dispatch to the definition of `add` in module `a` here
# since module `b` is loaded second in module `d`.
# This is bad...
print(add(1, 2))  # TypeError: unsupported parameter type(s) for add: 'int' and 'int'

########## c.py
from a import add

# extends the dispatch from module `a` *without modifying the value of `add` in module a*
@dispatch
def add(i: list[T], j: list[T]) -> list[T]:
    return i + j

add(1, 2)        # 3
add([], [1, 2])  # [1, 2]

########## d.py
import a
import b

# definitions of add in module `a` and module `b` are distinct
a.add(1, 2)      # 3
a.add("a", "b")  # TypeError
b.add(1, 2)      # TypeError
b.add("a", "b")  # 'ab'

# support some way of creating new overloads by "using" definitions from other modules
use(a.add)  # creates a local `add` that is the *value* of `add` from module `a`
use(b.add)  # extends the current value of `add` with the value of `add` from module `b`

# add another dispatch!
@dispatch
def add(i: MyType, j: int) -> MyType:
    ...
```
It is. Consider the above example where multipledispatch misbehaves. The current system can also inject specializations into the dispatch hierarchy that better match the call than the "expected" overload. A couple examples...

```python
########## a.py
@dispatch
def add(i: float, j: float) -> float:
    return i + j

########## b.py
from a import add

# I am only expecting the overloads from module `a` here as stated by my imports
add(1, 2)

def call_add() -> int:
    return add(1, 2)

########## c.py
# no imports, no expectation of the overloads defined in module `a`
@dispatch
def add(i: int, j: int) -> int:
    raise Exception("LOL")

########## d.py
import b
import c  # okay

call_add()  # Exception('LOL')

########## e.py
import c
import b  # Exception('LOL')
```

The writer of module
@ktbarrett For future readers, it would be good to use the ordinary semantics of dispatch in Python:

```python
@singledispatch
def f(...): ...

@f.register  # Explicitly mention that we're overloading f.
def _(...): ...
```

Also, your example violates return type covariance and parameter contravariance. You should have a base definition that accepts
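A hedged sketch of the variance point (hypothetical classes, filling in the shape of the truncated suggestion): the base definition accepts the widest type, and a registered implementation may return a subtype of the base's return type without breaking covariance:

```python
from functools import singledispatch

class Animal: ...
class Dog(Animal): ...

@singledispatch
def describe(a: Animal) -> Animal:
    # Base definition: widest parameter type, widest return type.
    return a

@describe.register
def _(a: Dog) -> Dog:
    # Narrower parameter is the point of dispatch; returning a
    # subtype of the base's return type (Dog <: Animal) keeps
    # return-type covariance intact.
    return a
```

An implementation whose return type was unrelated to the base's (as in the earlier examples) is what draws the variance objection.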
Yes, that's an important feature of dispatch. And I agree that the type checker can't be expected to catch such "injection". I personally don't find this "spooky", especially considering that the code you wrote is pretty unusual. In most cases, various libraries register their own dispatch overloads, and callers use those overloads. So the ordinary case is fairly straightforward for users wanting type checking: ideally, they just need to import the overload registration, and MyPy has all the information it needs to do a reasonable job. There are plenty of ways to fool MyPy with weird code. I don't think that's a good argument against making the ordinary case work really well.
None of my examples are singledispatch. I am talking about the problem in general, singledispatch is no exception.
Sure, but it should only do it for the module that created the specialization and downstream modules, not for unrelated modules that do not expect this specialization to be there. This is the key to the argument. Violating user expectation is never a good idea. If there is an alternative implementation that solves that problem while keeping all the utility, there are no arguments left for the old system besides legacy support. I would not try to further argue that singledispatch works well, just that it works, it exists in the standard library, and mypy should consider supporting it to the best ability it can despite the problems.
It is not, it is simply an example. No one is going to unconditionally throw an exception in a specialization, but the value of the arguments may result in an exception that would not happen with the less specialized method that the user thought they were dispatching to. That's the point. I will agree there is no hope in mypy trying to correctly type check all the possible issues that
Right, yes. And what we want is for MyPy to see these specializations, which it currently just ignores.
What I mean is that in the rare case where someone is specifying an overload using type X that is a subtype of another overload type Y, normally they're saying that such types X are okay. It's very unusual to specify an overload only to block it by making the overload
Right, I think we totally agree then. This is why I like approach number 5. As JukkaL pointed out, it should be easy to implement, and it covers the most common case. The current system ignores the overloads, which makes MyPy much less useful when type checking dispatch.
[This was originally in a comment by @Dreamsorcerer]
...
This issue has mostly talked about the redefinition error, but we should also have singledispatch typing working, e.g. with this example:
You also get errors due to not recognising the overloads:
It would be good for mypy to treat the `@foo.register` like an `@overload`.

I was also getting unreachable code errors after every call to the defined function. However, I was unable to reproduce it with a minimal example, so not sure exactly what is causing that.
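A sketch of what treating `@foo.register` like `@overload` could mean, using a hypothetical `foo` (not the example this comment originally referred to):

```python
from functools import singledispatch
from typing import Union

@singledispatch
def foo(arg: Union[int, str]) -> Union[int, str]:
    # mypy currently checks calls against only this signature, so
    # foo(1) is inferred as Union[int, str], not int.
    raise NotImplementedError

@foo.register
def _(arg: int) -> int:
    return arg + 1

@foo.register
def _(arg: str) -> str:
    return arg.upper()

# Treating the registrations as overloads would mean the checker
# sees them roughly as if they were written as:
#
#   @overload
#   def foo(arg: int) -> int: ...
#   @overload
#   def foo(arg: str) -> str: ...
#
# so that foo(1) narrows to int and foo("a") narrows to str.
```

At runtime the dispatch already behaves this way; the request is only about what the checker infers at call sites.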