Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document why having a union return type is often a problem #1693

Open
JukkaL opened this issue Jun 10, 2016 · 22 comments
Open

Document why having a union return type is often a problem #1693

JukkaL opened this issue Jun 10, 2016 · 22 comments

Comments

@JukkaL
Copy link
Collaborator

JukkaL commented Jun 10, 2016

Union return types get suggested pretty often, but they usually are not the right thing for library stubs.

@gvanrossum
Copy link
Member

The key reason is that every caller will have to use isinstance() to pick the union apart before its value can be used, right? And that's often unnecessary because there's some reason why the runtime value is always one or the other. Example, if pow() returned Union[int, float], then we couldn't use pow(2, 2) as a sequence index.

@johnthagen
Copy link
Contributor

@gvanrossum

Example, if pow() returned Union[int, float], then we couldn't use pow(2, 2) as a sequence index.

I think it's an implementation detail of how the type checker works and what it considers a type error. I've worked with Python type hinting for quite a while (love it by the way), though using PyCharm rather than mypy (but am looking forward to trying out mypy soon).

This is how PyCharm 2016.4 handles the following Python 3.5 code:

from typing import Union

x = 1  # type: Union[int, float]


def f(y: int) -> None:
    pass

f(x)  # No type error, as 'Union[int, float]' is allowed to be input into 'int' parameter

# Test to show that bare 'float' will raise error.
f(0.5)  # Warning raised: Expected type 'int', got 'float' instead

PyCharm seems to prefer avoiding false positive type errors, which I suppose is more of a design choice. I bring this up because they recently fixed a type hinting issue by switching the inferred type hint to a Union.

I also heard PyCharm is planning to switch to using typeshed in a future release. Should type checkers other than mypy be considered when it impacts the typeshed project?

I don't know the internal plans for PyCharm, only a user, so I'll let @vlasovskikh speak to those details and correct me if needed.

@JukkaL
Copy link
Collaborator Author

JukkaL commented Jun 13, 2016

typeshed is not specific to any tool. As PEP 484 doesn't spell out all details of how type checking should work, tools may make different interpretations and thus different annotations sometimes make sense for different tools. Currently mypy probably (?) is the biggest production user of typeshed stubs, so the stubs may have a tendency of being optimized towards mypy.

PEP 484 doesn't define the precise semantics of union types. Mypy does "strict" union checking which is similar to how union types behave in type theory: an operation is okay for a union type if it is valid for all items of the union. It seems that PyCharm uses "unsafe" (not established terminology) union type checking where an operation is okay if it is valid for some item of an union. @vlasovskikh please let me know if I understood this correctly.

The main motivation for strict unions in mypy is finding None-related type errors, and in particular checking against forgetting to check if an Optional[x] value is None. This is a common source of bugs in Python and other languages.

For a checker that does strict unions Any is the probably best return type annotation for a function whose return type depends on the value (not type) of an argument, since otherwise an isinstance check would be needed after many or most calls. For a checker that uses unsafe unions, a union return type is better. The latter allows the tool to perform some (even if not full) checking without changing existing code.

Here are ideas for resolving this general issue:

  • Leave return types Any as it will work with both tools, though with somewhat limited level of checking for PyCharm. Optionally, tools can special case some of these functions internally and infer a more precise type than provided by a stub. The stubs would generally use least common denominator functionality for things not specified in PEP 484.
  • Support tool-specific conditional definitions in stubs. For example, have something like if typing.TOOL == 'pycharm': ... so that PyCharm can use a more precise return type. This would likely require a PEP 484 update.
  • Update PEP 484 or typeshed documentation to specify how union types should be used, and just follow the convention in the stubs.

@johnthagen
Copy link
Contributor

@JukkaL Thanks for the detailed explanation. Just wanted to highlight, as you've said, the different interpretations being used so that the community as a whole can continue to work together on this.

It seems that PyCharm uses "unsafe" (not established terminology) union type

While PyCharm's handing of Union's is less safe than mypy, I think it's important to remember that using Any for these return types brings with it other issues. So it's more of a trade off situation:

def fun(y: str) -> None:
    y.capitalize()
    ...

x = pow(2, 2)
fun(x)  # type error missed because pow annotated as returning Any

I think another reason I can imagine why PyCharm would favor Union's over Any is that PyCharm is an IDE, and having intelligent autocomplete is super helpful. Can't do that with Any.

@matthiaskramm
Copy link
Contributor

If mypy deals better with Any returns instead of Union returns, would it be possible to have mypy interpret any -> Union[...] in typeshed as -> Any?

(Btw. pytype does "unsafe" Unions as well, for most operations)

@JukkaL
Copy link
Collaborator Author

JukkaL commented Jun 14, 2016

@johnthagen Yes, using an Any return type is clearly a compromise. I also agree that for code completion it makes sense to try to infer a type even if it's not precise. Precise types are important for type checking, and strict checking of unions (in particular, Optional[x]) has been one of the most commonly requested mypy features.

@matthiaskramm I don't like the idea of ignoring some type annotations, as this could be very confusing for users. Also, often a union return type will be fine for mypy. For example, the types in the union may have some common functionality, and sometimes the caller is expected to do a check before using the value.

We are planning to teach mypy to special case the signatures of some functions (#1240), and then there would be more flexibility. It would be less of a usability problem for mypy to infer a more precise type for a return type than the one declared. For example, if pow is called integer arguments, and with a non-negative literal second argument, we could infer the return type to be int instead of float.

Also, PEP 484 suggests that Union[int, float] is redundant and can be replaced with just float, since int is valid when a float is expected (though the wording in the PEP is a little vague).

@vlasovskikh
Copy link
Member

@JukkaL @johnthagen In PyCharm we switched from "strict" to "weak" unions a few years ago after some experiments with None checks. We infer unions as return types of functions. Strict inferred unions confused our users a lot since they required isinstsance checks. In order to be compliant with PEP 484 and Mypy we will go back to "strict" unions for type hints and "weak" unions for inferred types. In PyCharm we need "weak" unions instead of Any because we provide code completion based on this data.

@atagar
Copy link

atagar commented Apr 23, 2020

Could mypy add a configuration flag to chose between strict and lenient union matching? A configuration flag seems like a better solution than to ask mypy users to reduce return type specificity by replacing Union with Any.

@jriddy
Copy link

jriddy commented Apr 23, 2020

Is this really because Union return types (other than possibly the degenerate case of Optional) are a bad idea design-wise?

By which I mean that the return type should usually be inferrrable from the types (or Literal values) of the inputs?

For example, a function that upcases either strings or the ascii-range letters of a bytes object could accept and return and Union:

def upcase(s: Union[str, bytes]) -> Union[str, bytes]: ...

...but the better thing to do is use an overload...

@overload
def upcase(s: str) -> str: ...
@overload
def upcase(s: bytes) -> bytes: ...

def upcase(s): 
    if isinstance(s, str): 
        return s.upper() 
    elif isinstance(s, bytes): 
        return bytes(x - 0x20 if 0x61 <= x <= 0x7A else x for x in s)
    else: 
        raise TypeError("need str or bytes")

If this is the kinda thing we're getting at, I'd be glad to document this.

@atagar
Copy link

atagar commented Apr 23, 2020

Hi jriddy. Apologies, but does python provide an @overload builtin? If so that would be fascinating, but glancing around I found a deferred PEP and PyPI project but as of Python 3.7 I'm not finding that decorator...

atagar@morrigan:~$ python3.7
>>> @overload
... def echo(msg: str) -> None:
...   print('str echo: %s' % msg)
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'overload' is not defined

@JelleZijlstra
Copy link
Member

@shaperilio
Copy link

What is the recommendation to handle cases like this?

def my_sort(items: Sequence[int], also_return_indices: bool) -> Union[List[int], Tuple[List[int], List[int]]]
    sorted_items, indices = ... # bunch of code
    if also_return_indices:
        return sorted_items, indices
    return sorted_items

i.e. I cannot use @overload.

@hauntsaninja
Copy link
Collaborator

Use overload + Literal, should work for most cases.

(Otherwise your fallback is the same as everyone else's, use Any)

@shaperilio
Copy link

Use overload + Literal, should work for most cases.

(Otherwise your fallback is the same as everyone else's, use Any)

Do you have an example or link to docs? Are you saying I need to change also_return_indices to a Literal['true', 'false']?

@PedanticHacker
Copy link

It’s frustrating that annotating the return values of a function as -> bool | int (which is equivalent to -> Union[bool, int]) gets you a mypy error as if the expression has the int return type while the annotated variable has the bool return type (but the return type of the function is actually True, so a bool type). I have annotated the return type as being either a bool or an int, so what the ****?!

I know this is because the bool type is a subtype of the int type. But can’t mypy distinguish between those two types?

Annotating the return type of my function as only -> int (just to make mypy happy) is wrong as it can also return a bool, so True or False, and that would confuse my fellow Python developers.

Mishaps like this is what make me hate annotations still.

pmhahn added a commit to pmhahn/pyenchant that referenced this issue Jun 1, 2023
get_text() -> Union[str, array] depends on what was used when
set_text() was used.
Because of <python/mypy#1693> use Any.
pmhahn added a commit to pmhahn/pyenchant that referenced this issue Jun 1, 2023
get_text() -> Union[str, array] depends on what was used when
set_text() was used.
Because of <python/mypy#1693> use Any.
pmhahn added a commit to pyenchant/pyenchant that referenced this issue Jun 1, 2023
get_text() -> Union[str, array] depends on what was used when
set_text() was used.
Because of <python/mypy#1693> use Any.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests