Document "the Any trick" #11117

Merged
merged 11 commits into from
Dec 18, 2023
52 changes: 52 additions & 0 deletions CONTRIBUTING.md
@@ -570,6 +570,58 @@ It's a "to do" item and should be replaced if possible. `Any` is used when
it's not possible to accurately type an item using the current type system.
It should be used sparingly.

### "The `Any` trick"

Consider the following signature of `re.Match[str].group`:

```python
class Match:
    def group(self, __group: str | int) -> str | Any: ...
```

The `str | Any` seems unnecessary and weird at first.
Because `Any` includes all strings, you would expect `str | Any` to be
equivalent to `Any`, but it is not. To understand the difference,
let's look at what happens when type-checking this simplified example:

Suppose you have a legacy system that for historical reasons has two kinds
of user IDs. Old IDs look like `"legacy_userid_123"` and new IDs look like
`"456_username"`. The function below is supposed to extract the name
`"USERNAME"` from a new ID, and return `None` if you give it a legacy ID.

```python
import re

def parse_name_from_new_id(user_id: str) -> str | None:
    match = re.fullmatch(r"\d+_(.*)", user_id)
    if match is None:
        return None
    name_group = match.group(1)
    return name_group.uper()  # This line is a typo (`uper` --> `upper`)
```

`Match.group` returns `None` when the requested group did not take part in
the match. Regexes are often written so that the group always takes part, or
are applied to strings that have already been validated. But type checkers
don't know this, so complaining about a possible `None` would get very annoying.
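To make the `None` case concrete, here is a minimal sketch of when `.group(1)` is a string and when it is `None`. Both patterns are invented for this illustration; the second one adds an alternative in which group 1 plays no part.

```python
import re

# Hypothetical patterns, used only for this illustration.
GROUP_ALWAYS_PRESENT = re.compile(r"\d+_(.*)")
GROUP_SOMETIMES_ABSENT = re.compile(r"legacy_userid_\d+|\d+_(.*)")

new_match = GROUP_ALWAYS_PRESENT.fullmatch("456_username")
legacy_match = GROUP_SOMETIMES_ABSENT.fullmatch("legacy_userid_123")
assert new_match is not None and legacy_match is not None

# Group 1 is part of every match of the first pattern, so this is a str.
new_group = new_match.group(1)  # "username"

# The legacy alternative matched, so group 1 took no part in the match.
legacy_group = legacy_match.group(1)  # None
```

Whether `None` is possible depends on the pattern itself, which a type checker does not inspect.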

* `-> Any` would mean "please do not complain" to type checkers.
If `name_group` has type `Any`, you will get no error for the `uper` typo.
* `-> str` would mean "will always be a `str`", which is wrong, and would
cause type checkers to emit errors for code like `if name_group is None`.
* `-> str | None` means "you must check for `None`", which is correct but can get
annoying for some common patterns. Checks like `assert name_group is not None`
would need to be added into various places only to satisfy type checkers,
even when it is impossible to actually get a `None` value
(type checkers aren't smart enough to know this).
* `-> str | Any` means "must be prepared to handle a `str`". You will get an
error for `name_group.uper`, because it is not valid when `name_group` is a
`str`. But type checkers are happy with `if name_group is None` checks,
because we're saying it can also be something other than a `str`.

In typeshed we unofficially call returning `Foo | Any` "the Any trick".
We tend to use it whenever something can be `None`,
but requiring users to check for `None` would be more painful than helpful.

## Submitting Changes

Even more excellent than a good bug report is a fix for a bug, or the