Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reason for re.Match.group and re.Match.__getitem__ returning AnyStr | Any? #9465

Closed
willfrey opened this issue Jan 6, 2023 · 7 comments
Closed

Comments

@willfrey
Copy link
Contributor

willfrey commented Jan 6, 2023

Is there a reason why the return annotation is AnyStr | Any instead of AnyStr | None in the snippets below?

typeshed/stdlib/re.pyi

Lines 72 to 77 in b1cb9c8

@overload
def group(self, __group: Literal[0] = ...) -> AnyStr: ...
@overload
def group(self, __group: str | int) -> AnyStr | Any: ...
@overload
def group(self, __group1: str | int, __group2: str | int, *groups: str | int) -> tuple[AnyStr | Any, ...]: ...

typeshed/stdlib/re.pyi

Lines 96 to 99 in b1cb9c8

@overload
def __getitem__(self, __key: Literal[0]) -> AnyStr: ...
@overload
def __getitem__(self, __key: int | str) -> AnyStr | Any: ...

If not, I'd like to submit a PR to change the return type hints accordingly.

Thank you!

@AlexWaygood
Copy link
Member

AlexWaygood commented Jan 6, 2023

There's been a lot of discussion of the signature of Match.group() over the years:

Trying to piece together a precise history at this point is quite honestly a little difficult! I think it's likely that the stricter, more correct signature causes an unacceptable number of false positives. But you're welcome to submit a PR and see what the results of mypy_primer are in 2023 :-) (No promises that it will necessarily be merged.)

@AlexWaygood
Copy link
Member

Well, yup, looks like AnyStr | Any is still the best we can do here for now! #9482 (comment)

@AlexWaygood AlexWaygood closed this as not planned Won't fix, can't repro, duplicate, stale Jan 9, 2023
@AlexWaygood
Copy link
Member

(Thanks for the PR, by the way! It's always good to check whether these decisions made years ago still apply. Type checkers are always getting better with these things, and experimenting is definitely encouraged in this repo :)

@willfrey
Copy link
Contributor Author

willfrey commented Jan 9, 2023

Thanks! I get annoyed with Any creeping its way into my code, so I try to stamp that out when I can. Problem is, I still don't have a good intuition of when I should use cast() vs. an explicit type annotation for a variable vs. some sort of assertion to narrow the inferred type, especially as it pertains when something has type Any or some union with Any.

Taking an example right from the re docs,

import re

m = re.match(r"(\w+) (\w+)", "Isaac Newton, physicist")
assert m is not None
first_name = m[1]  # Has type `str | Any`

# Do this?
g1: str = m[1]

# Or this?
g1 = cast(str, m[1])

# Or this?
g1 = m[1]
assert isinstance(g1, str)

The first two make disallow-any-expr directives happy whereas the third one does not.

@AlexWaygood Do you have any personal philosophy of when to use what? If so, I'd love to hear it!

@hauntsaninja
Copy link
Collaborator

The first is more type safe, cast will let you lie

@willfrey
Copy link
Contributor Author

willfrey commented Jan 9, 2023

@hauntsaninja Yeah, agreed about cast. It's the BFG of shutting up the type checker.

@KotlinIsland
Copy link
Contributor

Basedmypy works correctly in these cases:

import re

s: str
if m := re.match("a(b)?(c)", s):
    reveal_type(m.groups())  #  (str | None, str)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants