Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

asyncio.gather bad type information (pyright) #1913

Closed
shoffmeister opened this issue Oct 7, 2021 · 8 comments
Closed

asyncio.gather bad type information (pyright) #1913

shoffmeister opened this issue Oct 7, 2021 · 8 comments
Labels
needs investigation Could be an issue - needs investigation reference

Comments

@shoffmeister
Copy link

Issue Type: Bug

asyncio.gather is seen as returning Tuple[], but in reality returns List[]. This results in errors on type-checking.

To Reproduce

Open the script below in Visual Studio Code, with Pylance extension installed, and with setting "python.analysis.typeCheckingMode": "basic" (or more aggressive).

Expected

Pylance (pyright) does flag errors on main1().
Pylance (pyright) does not flag any errors on main2().

Actual

Pylance (pyright) does not flag errors on main1().
Pylance (pyright) does flag errors on main2().

Basically, the type-checker sees asyncio.gather() return Tuple[] where-as, in fact, it returns List[]. The code further down demonstrates that asyncio.gather() indeed returns List[]

Compare the results obtained also against mypy, which works correctly:

image

Alas, I cannot tell at which level things are wrong; my understanding is that pyright/pylance and mypy share the same type information, so I am surprised to see this behaviour.

This is with a virtual environment on Windows, Python 3.9.7 from the Windows Store.

import asyncio
import sys

async def doit(id) -> bool:
    return True

async def main0():
    ret = await asyncio.gather(*[ doit(id) for id in range(1) ])
    return ret

async def main1() -> tuple[bool]:
    ret: tuple[bool] = await asyncio.gather(*[ doit(id) for id in range(1) ])
    return ret

async def main2() -> list[bool]:
    ret: list[bool] = await asyncio.gather(*[ doit(id) for id in range(1) ])
    return ret

x0 = asyncio.run(main0())
x1: tuple[bool] = asyncio.run(main1())
x2: list[bool] = asyncio.run(main2())

if "list" == type(x0).__name__ == type(x1).__name__ == type(x2).__name__:
    print("all lists - good")
    sys.exit(0)
else:
    print("ehh?")
    sys.exit(1)

Extension version: 2021.10.0
VS Code version: Code 1.60.2 (7f6ab5485bbc008386c4386d08766667e155244e, 2021-09-22T12:00:31.514Z)
OS version: Windows_NT x64 10.0.18363
Restricted Mode: No

System Info
Item Value
CPUs Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz (8 x 2904)
GPU Status 2d_canvas: enabled
gpu_compositing: enabled
multiple_raster_threads: enabled_on
oop_rasterization: enabled
opengl: enabled_on
rasterization: enabled
skia_renderer: enabled_on
video_decode: enabled
vulkan: disabled_off
webgl: enabled
webgl2: enabled
Load (avg) undefined
Memory (System) 63.66GB (45.31GB free)
Process Argv --crash-reporter-id 5ddcd13a-5853-4d88-9c59-31e3be66729a
Screen Reader no
VM 44%
A/B Experiments
vsliv368:30146709
vsreu685:30147344
python383cf:30185419
pythonvspyt602:30300191
vspor879:30202332
vspor708:30202333
vspor363:30204092
pythonvspyt639:30300192
pythontb:30283811
pythonvspyt551cf:30345471
pythonptprofiler:30281270
vshan820:30294714
vstes263:30335439
vscorecescf:30358481
pythondataviewer:30285071
pythonvsuse255:30340121
vscod805:30301674
pythonvspyt200:30340761
binariesv615:30325510
vsccppwtct:30378365
pythonvssor306:30344512
bridge0708:30335490
pygetstartedt2:30371810
dockerwalkthru:30377721
bridge0723:30353136
pythonrunftest32:30373476
pythonf5test824:30373475
javagetstartedc:30364665
pythonvspyt187:30373474
pydsgst2:30361792
vsqsis200:30374795
vsaa593cf:30376535

@github-actions github-actions bot added the triage label Oct 7, 2021
@srittau
Copy link

srittau commented Oct 7, 2021

This is a tradeoff chosen by the typeshed project that pyright uses for type information. The source code of the type hints has this comment:

# `gather()` actually returns a list with length equal to the number
# of tasks passed; however, Tuple is used similar to the annotation for
# zip() because typing does not support variadic type variables.  See
# typing PR #1550 for discussion.

The type system is currently not expressive enough to say: "This returns a list with three elements of the following types: ..." As a workaround, gather() is annotated to return a tuple so that the following works:

foo, bar = gather([do_foo(), do_bar()])  # do_foo() returns int, do_bar() returns str

See python/typeshed#1550 and python/typing#193 for discussion.

@shoffmeister
Copy link
Author

To the point response, many thanks!

But then https://mypy.readthedocs.io/en/stable/stubs.html claims "Mypy uses stub files stored in the typeshed repository" - and mypy does not have this challenge (see my screenshot, above).

So, from a feature point of view, I wonder how mypy can get this right?

@srittau
Copy link

srittau commented Oct 7, 2021

I can't verify this at the moment, and possibly the pyright maintainers have more insight here, but there is actually one overload for gather, where it's returning a list: when the number of arguments is too high or unknown. In this particular case, I assume that mypy picks that overload, while pyright picks a different one.

@jakebailey
Copy link
Member

Given the return type says Tuple[bool], this is probably the thing where when we have no matching overload we arbitrarily pick the first one (which has always felt wrong to me, as it's the most specific overload, versus say the last one or some unioning behavior).

@erictraut
Copy link
Contributor

when we have no matching overload we arbitrarily pick the first one

This is not the behavior. If there is no matching overload, we report that there is no matching overload.

@erictraut
Copy link
Contributor

I'll investigate the issue later today. I'm busy at the moment.

@jakebailey
Copy link
Member

My mistake, I didn't see that there wasn't that error above and mixed it up with the case where no overload matches (mypy in this case will return Any, we still pick the first return type, both will say there's no match).

@judej judej added the needs investigation Could be an issue - needs investigation label Oct 7, 2021
@github-actions github-actions bot removed the triage label Oct 7, 2021
@erictraut
Copy link
Contributor

Here's what's happening. In this case, the overload is ambiguous. The call expression is using an unpack operator (*) to unpack a list of values with unknown length. These get mapped to one or more parameters in the overload.

Consider the following:

@overload
def func(a: int) -> Literal[1]: ...
@overload
def func(a: int, b: int) -> Literal[2]: ...
@overload
def func(a: int, b: int, c: int) -> Literal[3]: ...

def test(val: list[int]):
    ret = func(*val)
    reveal_type(ret) # ????

What would you expect the type of ret to be in this case? There's no single "correct" answer. There are arguments for Literal[1], Literal[2] or Literal[3]. One could even argue for Literal[1, 2, 3], although that would probably be the worst choice because it will almost definitely result in false positives.

Both pyright and mypy choose the first overload in the list, and both determine the answer to be Literal[1].

However, if we add the following overload, mypy and pyright differ in their behavior:

@overload
def func(a: int, b: int, c: int, *args: int) -> Literal[4]:
    ...

When we add this overload to the end, mypy now evaluates a type of Literal[4] whereas pyright evaluates Literal[1]. That's what's happening in this use of asyncio.gather.

PEP 484 doesn't provide much guidance in defining how overloads are supposed to work, especially in ambiguous cases like this, so it's not surprising that different type checkers will make different choices when subtle edge cases are involved. I would normally say that pyright should simply adopt the same behavior as mypy in a case like this, but there will likely be some (perhaps significant) performance impact of doing so. That gives me pause. Pyright's current behavior short-circuits a bunch of extra work if it discovers an overload match. This is a very common case during type evaluation. It appears that mypy always evaluates all overloads even if an earlier overload is a legitimate match.

@judej judej closed this as completed Apr 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs investigation Could be an issue - needs investigation reference
Projects
None yet
Development

No branches or pull requests

5 participants