Introduce typing.STRICTER_STUBS
#1096
Comments
Interesting idea. How would users opt into the "stricter" stubs? Do you propose that type checkers add a new config setting? I'm not a fan of reusing the |
typing.TYPE_CHECKING == "strict"
typing.STRICT_STUBS
Yes, type checkers would need to introduce a new config setting. I'd propose
Yes, good point. I updated the proposal to |
typing.STRICT_STUBS
typing.STRICTER_STUBS
Here are two other potential solutions to this problem.
Of these three options, I think the |
Why do we need a flag? In the OP's example it seems a pretty good idea to change the typechecker behavior to return |
@gvanrossum The type checker does not know about the return types of @erictraut thanks for bringing up these alternatives. I think the appeal of the flag is that you can set it and forget it, whereas the introduction of a new type would require Python programmers to learn about the type ... and for what? I personally wouldn't want to deal with some functions returning |
Are you sure? It may be worth checking whether that's really common in real code. mypy-primer should help. I like the |
I'm well aware. :-) My actual proposal was more complex. If we have stubs containing e.g.
@overload
def foo(x: int) -> list[int]: ...
@overload
def foo(y: str) -> list[str]: ...
and we have a call like this
def f(flag: bool):
    a = foo(0 if flag else "")
    reveal_type(a)
then the revealed type could be
But in this example the return type is not
from typing import overload
@overload
def foo(a: int) -> list[int]: ...
@overload
def foo(a: str) -> list[str]: ...
def foo(a): pass
def f(flag: bool):
    a: int | str = 0 if flag else ""
    reveal_type(a) # int | str
    b = foo(a)
    reveal_type(b) # list[int] | list[str]
When I tried the OP's example it seems that because the mode expression's type is inferred as
from typing import Literal
def my_open(name: str, write: bool):
    mode: Literal['rb', 'wb'] = 'wb' if write else 'rb'
    with open(name, mode) as f:
        reveal_type(f) # typing.IO[Any]
I still get the fallback. This seems to be due to some bug (?) in the typeshed stubs -- if I add
def my_open(name: str, write: bool):
    mode: Literal['rb', 'wb'] = 'wb' if write else 'rb'
    with open(name, mode, buffering=0) as f:
        reveal_type(f) # io.FileIO
(@JelleZijlstra @srittau Do you think the former result is a bug in the stubs? Those overloads have |
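The overloads under discussion mirror what open does at runtime. A minimal sketch, assuming a throwaway temporary file (the helper name opened_type is made up for illustration), of how the mode and buffering arguments decide which concrete class comes back:

```python
import io
import os
import tempfile


def opened_type(path: str, mode: str, buffering: int = -1) -> type:
    """Open `path` and report the concrete class of the returned file object."""
    with open(path, mode, buffering) as f:
        return type(f)


# Create a scratch file so that the read modes have something to open.
fd, path = tempfile.mkstemp()
os.close(fd)

try:
    read_binary = opened_type(path, "rb")    # buffered binary reader
    write_binary = opened_type(path, "wb")   # buffered binary writer
    unbuffered = opened_type(path, "rb", 0)  # raw file, no buffering layer
    text = opened_type(path, "r")            # text wrapper around a buffer
finally:
    os.remove(path)
```

A stub can only pick between these classes when the checker knows the literal value of mode (and buffering); otherwise it has to fall back to something broader.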
Exactly what I was thinking. #566 ( |
I am unsure how representative 100 projects can be for the far larger Python ecosystem. Anyway I tried it with python/typeshed#7416 and it interestingly enough resulted in some INTERNAL ERRORs for mypy.
When should a function return |
But that's not what's going on in the OP's example (see my message). |
@gvanrossum It's late here and my brain might be a bit mushy. But aren't those orthogonal issues? Also if I remember correctly, there are some issues with Edit: I will try to give more coherent thoughts tomorrow. |
IIRC I just wanted to get to the root of the motivation of the issue as presented by the OP, and came up with some interesting (to me) facts. The plugin for open has (finally) been deleted from the mypy sources (python/mypy#9275) but that was only three weeks ago, so it's probably not yet in the latest release (0.931). The results for my examples are the same on the master branch and with 0.931 though. |
Adding a default value to There's interest in getting mypy to use a meet instead of a join for ternary, e.g. see python/mypy#12056. I don't think anyone is too opposed philosophically. i.e. I agree that OP's specific complaint would be fixed if mypy inferred OP's general complaint still stands. If you tweak the example to take My recollection of past discussions of AnyOf isn't fierce resistance as much as inertia, since the benefits for type checking are fairly marginal (although of course, if substantial numbers of users would use an |
I'm a big advocate of the benefits of static type checking, but I also recognize that the vast majority of Python developers (99%?) don't use type checking. Most Python developers do use language server features like completion suggestions, so the type information in typeshed stubs benefits them in other ways. I need to continually remind myself to think about the needs of these developers. I agree that |
On Tue, Mar 1, 2022 at 11:21 PM Eric Traut ***@***.***> wrote:
I'm a big advocate of the benefits of static type checking, but I also
recognize that the vast majority of Python developers (99%?) don't use type
checking. Most Python developers *do* use language server features like
completion suggestions, so the type information in typeshed stubs benefits
them in other ways. I need to continually remind myself to think about the
needs of these developers.
I agree that AnyOf has marginal value to static type checking scenarios.
It has some value because it allows for optional stricter type checking,
but the real value is for the other Python users who would see improved
completion suggestions. So yes, I am supportive of adding something like
AnyOf.
I also agree that AnyOf has marginal value for static type checking. It
also sounds like a complex feature that would take a lot of effort to
implement, and it would make type checkers that don't provide completion
suggestion functionality harder to maintain for not much benefit.
Completion suggestions are an important use case, however. Assuming Eric's
estimate above is not way off the mark, relatively few users of code
completion use static type checking. It seems sufficient to only support
the proposed functionality in stubs (and possibly PEP 561 compliant
packages). Most users that will benefit from this feature wouldn't be using
AnyOf in their code, after all (since they aren't using static type
checking). I think that we could design the feature in a way that static
type checkers don't have to add any major new functionality, but language
servers would still be able to generate better suggestions.
The earlier STRICTER_STUBS idea might be enough, possibly with a different
name. We'd support specifying a different type for language server use
cases and static type checking use cases. Maybe language server users would
be fine with a union type return type (with no Any items), even if static
type checker users would prefer Any (or a union with Any).
For example, Match.group could be defined like this:
if STRICTER_STUBS:
def group(self, __group: str | int) -> AnyStr | None: ...
else:
def group(self, __group: str | int) -> AnyStr | Any: ...
Now language servers don't need to provide completion suggestions for an
Any return, and static type checkers still wouldn't require callers to
guard against None. A language server could even infer AnyOf[AnyStr, None]
as the return type, by using some simple heuristics to merge the two
signatures.
|
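Why Match.group needs None somewhere in its return type can be seen at runtime. A small sketch using only the standard re module (the pattern and variable names are illustrative, not taken from typeshed): a group that did not take part in the match yields None, which is what the stricter signature would force callers to handle.

```python
import re

# Two alternative groups: only one of them can participate in a match.
match = re.match(r"(a)|(b)", "b")
assert match is not None

first = match.group(1)   # group 1 did not participate in the match
second = match.group(2)  # group 2 matched the input

# With a `str | Any` return type a checker accepts `first.upper()` unguarded;
# with `str | None` it would require a guard like this one.
shout = first.upper() if first is not None else ""
```

Under the non-strict branch the guard is optional as far as the checker is concerned, so the unguarded call would only fail at runtime.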
I still wonder why people think that. I believe that this clearly improves type safety in a lot of cases:
def foo() -> AnyOf[str, bytes]: ...
def bar(x: URL) -> None: ...
f = foo()
bar(f) # will fail
f.schema # will fail
with open(...) as f:
    f.write(pow(x, y)) # will fail
#566 now has links to over 20 other issues or PRs from typeshed, most of which state the type checking could be improved by
Fortunately, as a stopgap, |
I agree with @srittau. I think The disadvantage of The two ideas are in some ways complementary, however. If we had an |
Ok, I like the idea of Because I feel like we don't need a new type for this. It could just be a type checker setting. The code could be:
def get_animal(name) -> Union[Cat, Dog]: ...
Type checkers (and users) can then decide if they want to treat
|
This would actually reduce type safety for users that don't use strict mode. Many functions already return strict unions where it's necessary to check the return value at runtime. It's especially common to see |
This is really the point I don't get. IMO it is not up to an API to decide the level of type safety for the caller. That is solely up to the API user. |
I agree that the name of |
Well I find having both I don't see any problem with special casing
def get_text() -> str | None: ...
"Hello " + get_text() # always error: Unsupported operand types for + ("str" and "None")
def get_name() -> str | bytes | None: ...
"Hello " + get_name() # always error: Unsupported operand types for + ("str" and "None")
name = get_name()
assert name is not None # narrowing Optional[Union[str, bytes]] down to Union[str, bytes]
You would now get (depending on your type checker settings):
To avoid user confusion, I think when using unions as |
On Wed, Mar 2, 2022 at 11:33 AM Sebastian Rittau ***@***.***> wrote:
I also agree that AnyOf has marginal value for static type checking.
I still wonder why people think that. I believe that this clearly improves
type safety in a lot of cases:
def foo() -> AnyOf[str, bytes]: ...
def bar(x: URL) -> None: ...
f = foo()
bar(f) # will fail
f.schema # will fail
with open(...) as f:
    f.write(pow(x, y)) # will fail
#566 now has links to over
20 other issues or PRs from typeshed, most of which state the type checking
could be improved by AnyOf.
Based on quick grepping, typeshed has over 40k functions. If this feature
could improve 40 functions, that's still only about 0.1% of the entire
typeshed. Clearly not all functions are equally useful, so this is not a
very good estimate of the benefit, but I'm not convinced that this would
have a big impact. In several of the cases using X | Any already gives
pretty good type checking coverage, and AnyOf would only be slightly
better. There are also thousands of Any types that could already be given
precise types using existing type system features. Improving those seems
like a better proposition in terms of cost/benefit.
There are a few common functions which are currently problematic, such as
open. For them, I'd prefer to have type-safe alternatives that would give
full type checking coverage, instead of improvements which make things only
incrementally less unsafe. For example, I've previously suggested open_text
and open_binary wrappers for open, and similarly we could have integer-only
and float-only pow functions. These are easy improvements and could be
provided in a small third-party library, for example.
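A rough sketch of what such type-safe wrappers might look like. The names open_text and open_binary come from the suggestion above; int_pow, the signatures, the default encoding, and the error handling are assumptions made for illustration, not an existing library.

```python
from typing import BinaryIO, TextIO, cast


def open_text(path: str, mode: str = "r", encoding: str = "utf-8") -> TextIO:
    """Open a file in text mode only; binary modes are rejected."""
    if "b" in mode:
        raise ValueError(f"open_text() does not accept binary mode {mode!r}")
    return cast(TextIO, open(path, mode, encoding=encoding))


def open_binary(path: str, mode: str = "rb") -> BinaryIO:
    """Open a file in binary mode only; text modes are rejected."""
    if "b" not in mode:
        raise ValueError(f"open_binary() requires a binary mode, got {mode!r}")
    return cast(BinaryIO, open(path, mode))


def int_pow(base: int, exponent: int) -> int:
    """Integer-only power: a negative exponent would produce a float."""
    if exponent < 0:
        raise ValueError("int_pow() requires a non-negative exponent")
    return base ** exponent
```

Each wrapper has a single precise return type, so callers get full checking without overloads or a fallback; the cost is a runtime check and a second API next to the built-in one.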
In contrast, implementing AnyOf in mypy would take several months of
full-time effort, and it would require somebody who has a strong
understanding of type systems and mypy internals. Finding a volunteer would
likely be hard, since the apparent benefits don't seem convincing enough.
Clearly there would be some benefit, but I don't think that it is
proportional to the required effort.
It also sounds like a complex feature that would take a lot of effort to
implement,
Fortunately, as a stopgap, AnyOf could be treated like Any. This provides
the same (lack of) type safety that the current situation offers, but would
allow type checkers and other tooling to use AnyOf to its full benefit.
I expect that this could result in unproductive discussions about whether
to annotate something as "str | Any" or "AnyOf[str, None]", etc. Users of
some tools would prefer the prior while users of other tools would prefer
the latter. Already annotating functions involving unions is tricky, and
this could add another dimension of difficulty that would affect all
contributions. I don't like the idea of fragmenting the type system into
multiple dialects.
|
On Wed, Mar 2, 2022 at 12:29 PM Sebastian Rittau ***@***.***> wrote:
This would actually reduce type safety for users that don't use strict
mode. Many functions already return strict unions where it's necessary to
check the return value at runtime. It's especially common to see X | None.
Making all of these non-strict by default would be a step backwards.
This is an important point. Any new features should be in addition to what
we currently can express to maintain backward compatibility and existing
guarantees. Also having stubs use a different syntax or semantics than
normal code for common things sounds quite confusing.
|
Sorry for the confusion, what I proposed in #1096 (comment) is meant to apply to all type checking (so regular code as well). And I do not think that we would lose any expressiveness, quite the opposite instead. APIs could always just return an expressive |
Ah sorry, I misunderstood your proposal. I don't think that we can change the meaning of union types, and we must remain backward compatible. The existing union types are used by (hundreds of?) thousands of projects and supported by many tools. I think that the current definition of union types is perfectly fine, but they don't cover all the use cases well (in particular, legacy APIs not designed for type checking). |
Thanks @JukkaL that is a very good observation ... that the root cause of the problem is legacy APIs that weren't designed for type checking! With that in mind, at least my concern could be addressed by introducing a
from typing import overload, warning, Literal
@overload
def open(file: str, mode: Literal['rb']) -> io.BufferedReader: ...
@overload
def open(file: str, mode: Literal['wb']) -> io.BufferedWriter: ...
@overload
@warning('cannot statically determine return type because `mode` is unknown')
def open(file: str, mode: str) -> Any: ...
This would also allow deprecated functions and functions that should not be used (but are included in typeshed because their absence would confuse users) to be annotated with warnings, so that type checkers can forward these warnings when the functions (or specific overloads) are used. For example the standard library docs note:
So typeshed could annotate:
class Logger(Filterer):
    @warning('should not be used, use `logging.getLogger` instead')
    def __init__(self, name: str, level: _Level = ...) -> None: ...
I think it would even make sense for the
@warning('you have reached the end of this type stub')
def __getattr__(name: str) -> Any: ...
Because it is very easy to accidentally call such |
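No warning decorator exists in typing; the following is only a runtime stand-in to make the idea concrete. It assumes a made-up attribute name (__type_warning__) and a made-up lookup helper (get_warning) that a tool could use to find the message attached to a callable.

```python
from typing import Any, Callable, Optional, TypeVar

F = TypeVar("F", bound=Callable[..., Any])


def warning(message: str) -> Callable[[F], F]:
    """Hypothetical decorator: attach a tool-visible warning to a callable."""
    def decorator(func: F) -> F:
        # Store the message on the function so that tooling could look it up.
        setattr(func, "__type_warning__", message)
        return func
    return decorator


def get_warning(func: Callable[..., Any]) -> Optional[str]:
    """Return the attached warning message, or None if there is none."""
    return getattr(func, "__type_warning__", None)


@warning("should not be used, use `logging.getLogger` instead")
def make_logger(name: str) -> str:
    # Placeholder body; a real stub would have no implementation at all.
    return f"logger:{name}"
```

The decorator leaves the function's behaviour untouched and only adds metadata, which is the property a stub-level annotation would need.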
@JukkaL makes a good point about I think there are three ways
Interpretations 1 and 3 should be "free" to implement since they are already fully supported by mypy and other type checkers. Interpretation 2 would be expensive, but I think it would be fine for type checkers to ignore this mode. Here's an idea that should be (relatively) cheap to implement. We could leverage PEP 646 to create a typeshed-specific internal class called
_Ts = TypeVarTuple("_Ts")
class _WeakUnion(Generic[*_Ts], Any): ...
@overload
def int_or_float(x: Literal[True]) -> int: ...
@overload
def int_or_float(x: Literal[False]) -> float: ...
@overload
def int_or_float(x: bool) -> _WeakUnion[int, float]: ...
Type checkers that want to interpret Thoughts? |
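For readers who want to try the overload pattern today, here is a runnable sketch of the same shape. It assumes a made-up implementation body and uses a plain Any in the fallback overload, which is the slot where the proposed _WeakUnion[int, float] would go; _WeakUnion itself is not defined here.

```python
from typing import Any, Literal, Union, overload


@overload
def int_or_float(x: Literal[True]) -> int: ...
@overload
def int_or_float(x: Literal[False]) -> float: ...
@overload
def int_or_float(x: bool) -> Any: ...  # fallback: the weak union would go here


def int_or_float(x: bool) -> Union[int, float]:
    # Illustrative implementation: the flag alone decides the result type.
    return 1 if x else 1.0


from_true = int_or_float(True)    # checkers pick the first overload
from_false = int_or_float(False)  # checkers pick the second overload
```

With a literal argument a checker can pick a precise overload; with a plain bool it reaches the fallback, which is where the precision is currently lost.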
Couple of notes:
|
mypy does support subclassing from Any, but emits an error for it in strict mode. It should work fine once you type ignore that error or turn off the option for it. However, mypy doesn't support any of PEP 646 at all. |
As pointed out by @gvanrossum in python/typing#1096
Improves type inference in cases when we know that mode is OpenBinaryMode, but don't know anything more specific:
```
def my_open(name: str, write: bool):
    mode: Literal['rb', 'wb'] = 'wb' if write else 'rb'
    with open(name, mode) as f:
        reveal_type(f) # previously typing.IO[Any], now typing.BinaryIO
```
You may be tempted into thinking this is some limitation of type checkers. mypy does in fact have logic for detecting if we match multiple overloads and union-ing up the return types of matched overloads. The problem is the last overload interferes with this logic. That is, if you remove the fallback overload (prior to this PR), you'd get "Union[io.BufferedReader, io.BufferedWriter]" in the above example.
Afaik the unpack operator in subscript requires Python 3.11 which is a problem because mypy parses the source code with I think we could just do something simple like:
instead, since Edit: Apparently mypy doesn't like arguments given to type aliases (unlike pyright which doesn't complain). |
We can use |
No, the issue is that
>>> from typing import Any
>>> Any[str]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\Alex\AppData\Local\Programs\Python\Python310\lib\typing.py", line 311, in inner
return func(*args, **kwds)
File "C:\Users\Alex\AppData\Local\Programs\Python\Python310\lib\typing.py", line 402, in __getitem__
return self._getitem(self, parameters)
File "C:\Users\Alex\AppData\Local\Programs\Python\Python310\lib\typing.py", line 424, in Any
raise TypeError(f"{self} is not subscriptable")
TypeError: typing.Any is not subscriptable |
This feels a lot like feature flags with current any_of direction we're doing. Specifically it reminds me of this comment. I would be very interested in feature flags for new PEPs in general or features that are useful, but vary across type checkers (recursive types is the only one in mind for me). Main two I'm interested in currently are PEP 646 and recursive types, although I think having a new flag for each new PEP that introduces new types would allow experimentation/stub improvement faster for type checkers that support them earlier. Both would be useful to try in stubs but without feature flags my guess is it'll take a while for 646 given its complexity. If we had feature flags we could do in a stub file in typeshed,
if typing.flags.any_of:
    from typing_extensions import AnyOf
    AnyOf = AnyOf
else:
    AnyOf: Any
(or some other definition) or for recursive types,
BasicJSON = str | float | None
if typing.flags.recursive:
    JSON = Mapping[str, JSON] | Sequence[JSON] | BasicJSON
else:
    JSON = Mapping[str, BasicJSON] | Sequence[BasicJSON] | BasicJSON | Mapping[str, Any] | Sequence[Any]
This way type checkers that understand recursion get a fully accurate type, while other type checkers still have a reasonable approximation. |
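The recursive-alias fallback can be tried out at runtime. In this sketch the flags object is a hypothetical stand-in (no typing.flags namespace exists), and the aliases are spelled with typing.Union and string forward references so that the module also imports on older Python versions.

```python
from typing import Any, Mapping, Sequence, Union, get_args


class _Flags:
    """Hypothetical stand-in for a `typing.flags` namespace."""
    recursive = False  # pretend the active checker lacks recursive aliases


flags = _Flags()

BasicJSON = Union[str, float, None]

if flags.recursive:
    # String forward references let the alias refer to itself.
    JSON = Union[Mapping[str, "JSON"], Sequence["JSON"], BasicJSON]
else:
    # One level of precise nesting, then an Any-based escape hatch.
    JSON = Union[
        Mapping[str, BasicJSON],
        Sequence[BasicJSON],
        BasicJSON,
        Mapping[str, Any],
        Sequence[Any],
    ]

# Nested unions are flattened, so the members can be inspected directly.
members = get_args(JSON)
```

Flipping recursive to True switches the alias to the self-referential form without touching any code that uses JSON.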
This looks promising. This is still missing the ability to specify a non-
class _WeakUnionFallback(Generic[_T, *_Ts], Any): ...
Now type checkers could treat If we don't want to wait until PEP 646 is supported everywhere, perhaps we could just use a union type as the type argument, which would be interpreted as a "weak union"? So instead of writing |
I like your idea of using a union form within I don't see the need for an explicit fallback. Perhaps you could give a concrete example where this would be desirable? I was thinking that the fallback should be based on type checker configuration and capabilities, not specified by the library author. Pyright is both a language server and a type checker, but it has no "language server mode" where it evaluates types differently. The two pieces of functionality (LS and type checker) are intrinsically tied. Having different modes for LS and type checker would be confusing because type errors would be inconsistent with LS features like "hover text" where hovering over an identifier displays its type. I was thinking that pyright would interpret
In both cases, the LS functionality would provide completion suggestions for all of the subtypes in union U. Maybe in the future I would consider replacing the "basic" mode with a true "weak union" implementation, but as you pointed out, that would be a lot of work, and I'm not sure it's worth it. |
def f(m: str) -> None:
    f = open('x', mode=m)
    f.bad_call() # "IO[Any]" has no attribute "bad_call"
    "x" + f # Invalid types "str" and "IO[Any]"
However, tools that properly support
def f(m: str) -> None:
    f = open('x', mode=m)
    s = f.read()
    reveal_type(s) # _WeakUnion[str, bytes] (instead of Any)
Tools that support Here I think that the best compromise for the return type of
In the above example, we could perhaps invent rules so that
Would this result in worse type checking results compared to
Mypy could possibly also support this behavior behind a strictness flag. |
Thanks for the example. It's clear to me now, and I agree there's utility in adding an explicit fallback — at least in some cases. These are cases where the stub can provide a fallback that's more precise than |
Noting that a flag would also provide a way forward for stricter typing of the json loads and dumps functions. I don't see how the proposed unsafe union types could help in that case. For future proofing, assuming that there are many places left where we'll want to tighten typing, perhaps it should be a function rather than a constant?
if typing.feature("strict-json"): ...
Type checkers would implement configuration parameters to enable or disable features. In a large code base with lots of legacy Python, the fact that typeshed uses Any in many places is causing quite a lot of grief. Automating the process of adding type ignores is easy for us, and this enables us to gradually improve explicitly disabled type checking, rather than constantly running into much less visible implicitly disabled type checking, caused by Any. |
The introduction of a function would also allow third party packages to provide such conditional strictness levels. |
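A feature query function along these lines might behave as follows. Everything here is hypothetical: no typing.feature exists, the _ENABLED_FEATURES set stands in for whatever a type checker would read from its configuration, and the JSONResult alias is only an illustration of what a stricter json.loads return type could be.

```python
from typing import Any, FrozenSet, Union

# Hypothetical: a checker would populate this from its own configuration.
_ENABLED_FEATURES: FrozenSet[str] = frozenset({"strict-json"})


def feature(name: str) -> bool:
    """Hypothetical stand-in for a `typing.feature(...)` query."""
    return name in _ENABLED_FEATURES


if feature("strict-json"):
    # The value types that json.loads can produce at the top level.
    JSONResult = Union[dict, list, str, int, float, bool, None]
else:
    JSONResult = Any
```

Because the query takes a string, third-party packages could define their own feature names without any change to typing itself.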
Some functions can return different types depending on passed arguments. For example: open(name, 'rb') returns io.BufferedReader, whereas open(name, 'wb') returns io.BufferedWriter.
The typeshed accurately models this using @typing.overload and typing.Literal. However there is the case that the argument value deciding the return type cannot be determined statically, for example:
In this case typeshed currently just claims that open returns typing.IO[Any], so content ends up having the type Any, resulting in a loss of type safety (e.g. content.startswith('hello') will lead to a runtime error if the file was opened in binary mode, but type checkers won't be able to warn you about this because of Any).
While typeshed could theoretically just change the return type to typing.IO[Union[str, bytes]], that would force all existing code bases that currently rely on Any to type check to update their code, which is of course unacceptable.
When starting a new project I however want the strictest type stubs possible. I explicitly do not want standard library functions to return unsafe values like Any (or the slightly less unsafe AnyOf suggested in #566), when the return types can be modeled by a Union.
I therefore propose the introduction of a new variable typing.STRICTER_STUBS: bool, that's only available during type checking.
Which would allow typeshed to do the following:
Ambiguous return types could then be annotated as e.g. -> typing.IO[AnyOrUnion[str, bytes]].
This would allow users to opt into stricter type stubs, if they so desire, without forcing changes on existing code bases.
CC: @AlexWaygood, @JelleZijlstra, @srittau, @hauntsaninja, @rchen152, @erictraut
P.S. Since I have seen Union return types being dismissed because "the caller needs to use isinstance()", I want to note that this is not true: if the caller wants to trade type safety for performance, they can always just add an explicit Any annotation to circumvent the runtime overhead of isinstance. Union return types force you to either handle potential type errors or explicitly opt out of type safety, which I find strongly preferable to lack of type safety by default.
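The runtime failure that motivates the proposal can be reproduced with a short script. This is a sketch under stated assumptions: read_content is a made-up helper, and a temporary file stands in for real data.

```python
import os
import tempfile


def read_content(name: str, mode: str):
    """Read a whole file; the type of the result depends on the runtime mode."""
    with open(name, mode) as f:
        return f.read()


fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "w") as f:
        f.write("hello world")

    text = read_content(path, "r")   # str
    data = read_content(path, "rb")  # bytes

    text_ok = text.startswith("hello")
    try:
        data.startswith("hello")     # bytes.startswith() rejects a str argument
        binary_failed = False
    except TypeError:
        binary_failed = True
finally:
    os.remove(path)
```

With an Any-based return type a checker accepts both startswith calls; with a Union[str, bytes] return type it would flag the call until the caller narrows the type or opts out explicitly.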