Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] PEP 695 type aliases #14357

Open
wants to merge 14 commits into
base: main
Choose a base branch
from
Open

[red-knot] PEP 695 type aliases #14357

wants to merge 14 commits into from

Conversation

sharkdp
Copy link
Contributor

@sharkdp sharkdp commented Nov 15, 2024

Summary

Add support for (non-generic) type aliases. The main motivation behind this was to get rid of panics involving expressions in (generic) type aliases. But it turned out the best way to fix it was to implement (partial) support for type aliases.

type IntOrStr = int | str

reveal_type(IntOrStr)  # revealed: typing.TypeAliasType
reveal_type(IntOrStr.__name__)  # revealed: Literal["IntOrStr"]
reveal_type(IntOrStr.__value__)  # revealed: int | str

x: IntOrStr = 1

reveal_type(x)  # revealed: Literal[1]

def f() -> None:
    reveal_type(x)  # revealed: int | str

Test Plan

  • Updated corpus test allow list to reflect that we don't panic anymore.
  • Added Markdown-based test for type aliases (type_alias.md)

@sharkdp sharkdp added the red-knot Multi-file analysis & type inference label Nov 15, 2024
Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to do a VC call next week if anything remains unclear about how this is all supposed to work!

@sharkdp sharkdp force-pushed the david/type-aliases branch 4 times, most recently from 9f827d6 to b9d931c Compare November 20, 2024 15:05
Copy link
Contributor

github-actions bot commented Nov 20, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

@sharkdp sharkdp force-pushed the david/type-aliases branch 5 times, most recently from e8c92e3 to 06e615d Compare November 20, 2024 21:23
@sharkdp sharkdp marked this pull request as ready for review November 20, 2024 21:24
Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The updated type inference structure looks great!

@@ -120,6 +120,7 @@ impl<'db> ScopeId<'db> {
NodeWithScopeKind::ClassTypeParameters(_)
| NodeWithScopeKind::FunctionTypeParameters(_)
| NodeWithScopeKind::Function(_)
| NodeWithScopeKind::TypeAlias(_)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On closer look here I realize that NodeWithScopeKind::TypeAliasTypeParameters also belongs here, just like ClassTypeParameters and FunctionTypeParameters.

Though it seems like we could also simplify this to match on self.node(db).kind() instead, and then including ScopeKind::Annotation would cover all three of the type-param scopes.

Either option seems fine to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though it seems like we could also simplify this to match on self.node(db).kind() instead, and then including ScopeKind::Annotation would cover all three of the type-param scopes.

If I understand correctly, you would want to change it to:

matches!(
    self.node(db).scope_kind(),
    ScopeKind::Annotation | ScopeKind::Function | ScopeKind::TypeAlias | ScopeKind::Comprehension
)

This would match everything that was matched previously, but it would additionally iunclude NodeWithScopeKind::Lambda(_). I wasn't sure if that's correct, so I just added NodeWithScopeKind::TypeAliasTypeParameters for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, a lambda scope is a function scope, so that would actually be a bugfix.

(For reference, https://github.com/python/cpython/blob/main/Python/symtable.c#L563 is the upstream equivalent of our is_function_like, and it operates on a CPython enum which is precisely parallel to our ScopeKind enum. And we can see at https://github.com/python/cpython/blob/main/Python/symtable.c#L2379 that Lambda nodes do get a FunctionBlock scope.)

crates/red_knot_python_semantic/src/types.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types/infer.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/types/mro.rs Outdated Show resolved Hide resolved
Comment on lines 271 to 272
// fails with salsa cycle panic:
("crates/ruff_python_parser/resources/inline/err/type_alias_invalid_value_expr.py", true, true),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting. I suspect it might go away if we stop trying to infer the type of .value() on a type alias anytime we encounter it. If not I'll want to take a closer look.

Copy link
Contributor Author

@sharkdp sharkdp Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It happens for self-referential type aliases like type NestedInt = int | list[NestedInt]. We form a genuine cycle when inferring the annotation type on the RHS and look up the name NestedInt.

@sharkdp sharkdp changed the title [red-knot] Preliminary support for type aliases [red-knot] PEP 695 type aliases Nov 21, 2024

reveal_type(IntOrStr) # revealed: typing.TypeAliasType
reveal_type(IntOrStr.__name__) # revealed: Literal["IntOrStr"]
reveal_type(IntOrStr.__value__) # revealed: int | str
Copy link
Contributor

@carljm carljm Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, I thought yesterday about recommending a test of the __value__ attribute, and then I didn't because it's actually pretty tricky. But I should have gone ahead and mentioned why it's tricky, to save you some time.

Internally the value type we care about for a type alias is the type of the RHS expression as a type expression. And that type here is the type int | str. But the type of the actual __value__ attribute at runtime is the type of the RHS as a runtime value expression. Which in this case would be an instance of types.UnionType, the result at runtime of evaluating the expression int | str. That's a very different type from the type int | str.

If we want to do this precisely, it's possible, but I think it would require inferring the type of the RHS twice, once as a value expression and once as a type expression. And that gets into some difficulties with our expectation that a given expression has only one inferred type.

So the question is, how valuable is it to have precise typing of the __value__ attribute of a type alias? Tbh I'm not sure; it's possible that with the increasing use of runtime-typing libraries, like Pydantic, we may get user requests for this precision. But existing type checkers don't bother; they just go with the typeshed definition of TypeAliasType, which says the type of __value__ is Any: https://github.com/python/typeshed/blob/main/stdlib/typing.pyi#L1045

For TypeVars, I did try to do precise typing of __bounds__ and __constraints__, mostly so I could write tests like this one, because unlike with a type alias, I didn't even have any way to write a test like the one you wrote down on line 17 here, which correctly asserts for the type int | str. But the way I did it (using to_meta_type to try to convert "type as type expression" to "type as value expression") is hacky and wrong, as I've realized more clearly since. It works only in simple cases (if A is a class, the meta type of A is Literal[A], and the latter is also the runtime type of the expression A), but it doesn't even work for a union case like this. The meta type of A | B is Literal[A] | Literal[B], but that's not the type of the runtime value expression A | B; the latter is an instance of types.UnionType.

All that is to say: I think for this PR we should not implement any special handling for the __value__ attribute of a type alias type, and just let it fallback to typeshed. (And, separately, we should probably remove the special handling I added for __bounds__ and __constraints__ on a TypeVar. I'll push a PR for this.)

type IntOrStr = int | str
type IntOrStrOrBytes = IntOrStr | bytes

reveal_type(IntOrStrOrBytes.__value__) # revealed: int | str | bytes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use the more verbose "annotate a name as IntOrStrOrBytes and check its revealed type from a different scope" approach here, for the reasons discussed above; the actual type of IntOrStrOrBytes.__value__ here would be types.UnionType.


MyIntOrStr = IntOrStr

reveal_type(MyIntOrStr.__value__) # revealed: int | str
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And similarly this assertion is not right.

@@ -120,6 +120,7 @@ impl<'db> ScopeId<'db> {
NodeWithScopeKind::ClassTypeParameters(_)
| NodeWithScopeKind::FunctionTypeParameters(_)
| NodeWithScopeKind::Function(_)
| NodeWithScopeKind::TypeAlias(_)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, a lambda scope is a function scope, so that would actually be a bugfix.

(For reference, https://github.com/python/cpython/blob/main/Python/symtable.c#L563 is the upstream equivalent of our is_function_like, and it operates on a CPython enum which is precisely parallel to our ScopeKind enum. And we can see at https://github.com/python/cpython/blob/main/Python/symtable.c#L2379 that Lambda nodes do get a FunctionBlock scope.)

Comment on lines 451 to 453
Self::Function(_) => ScopeKind::Function,
Self::TypeAlias(_) => ScopeKind::TypeAlias,
Self::Lambda(_) => ScopeKind::Function,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very minor code organization nit: maybe let's not split the two Function scopes (function and lambda) apart here? We could even group them into a single | pattern, like with the annotation and comprehension scopes below

@@ -1954,6 +1968,11 @@ impl<'db> KnownInstanceType<'db> {
.default_ty(db)
.map(|ty| ty.to_meta_type(db))
.unwrap_or_else(|| KnownClass::NoDefaultType.to_instance(db)),
(Self::TypeAliasType(alias), "__name__") => Type::string_literal(db, alias.name(db)),
(Self::TypeAliasType(alias), "__value__") => alias.value_ty(db),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed above, I think we should remove this.

Comment on lines +1973 to +1975
(Self::TypeAliasType(_), "__type_params__") => {
todo_type!("TypeAliasType __type_params__")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also not sure we need a match arm and a Todo here, when it would be correct (and match the behavior of other type checkers) to just allow it to fall back to the typeshed definition of the attribute. We can potentially add more precise per-instance typing for it in the future if there are compelling use cases, but we don't need to consider it a todo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants