[red-knot] Add MRO resolution for classes (take 2) #14027

Merged
merged 26 commits into from
Nov 4, 2024

Conversation

AlexWaygood
Member

@AlexWaygood AlexWaygood commented Oct 31, 2024

A second attempt at implementing MRO resolution (see #13722 for the first attempt). Unlike the first attempt, this version does not attempt to handle union types in a class's bases list (which would potentially mean that you would have to consider multiple possible MROs for any given class).

Summary

A Python class's "Method Resolution Order" ("MRO") is the order in which the Python runtime traverses the superclasses of that class when searching for an attribute (which includes methods) on that class. A type checker must infer a class's MRO accurately if it is to look up the correct types of attributes and methods accessed on that class (or on instances of that class).

For simple classes, the MRO (which is accessible at runtime via the __mro__ attribute on a class) is simple:

>>> object.__mro__
(<class 'object'>,)
>>> class Foo: pass
... 
>>> Foo.__mro__  # classes without explicit bases implicitly inherit from `object`
(<class '__main__.Foo'>, <class 'object'>)
>>> class Bar(Foo): pass
... 
>>> Bar.__mro__
(<class '__main__.Bar'>, <class '__main__.Foo'>, <class 'object'>)

For more complex classes that use multiple inheritance, things can get a bit more complicated, however:

>>> class Foo: pass
... 
>>> class Bar(Foo): pass
... 
>>> class Baz(Foo): pass
... 
>>> class Spam(Bar, Baz): pass
... 
>>> Spam.__mro__  # no class ever appears more than once in an `__mro__`
(<class '__main__.Spam'>, <class '__main__.Bar'>, <class '__main__.Baz'>, <class '__main__.Foo'>, <class 'object'>)

And for some classes, Python realises at class creation time that there is no consistent order in which to position the superclasses, so it cannot create an MRO at all:

>>> class Foo(object, int): pass
... 
Traceback (most recent call last):
  File "<python-input-12>", line 1, in <module>
    class Foo(object, int): pass
TypeError: Cannot create a consistent method resolution order (MRO) for bases object, int
>>> class A: pass
... 
>>> class B: pass
... 
>>> class C(A, B): pass
... 
>>> class D(B, A): pass
... 
>>> class E(C, D): pass
... 
Traceback (most recent call last):
  File "<python-input-17>", line 1, in <module>
    class E(C, D): pass
TypeError: Cannot create a consistent method resolution order (MRO) for bases A, B

The algorithm Python uses at runtime to determine what a class's MRO should be is known as the C3 linearisation algorithm. An in-depth description of the motivations and details of the algorithm can be found in this article in the Python docs. The article is quite old, however, and the algorithm given at the bottom of the page is written in Python 2. As part of working on this PR, I translated the algorithm first into Python 3 (see this gist), and then into Rust (the c3_merge function in mro.rs in this PR). In order for us to correctly infer a class's MRO, we need our own implementation of the C3 linearisation algorithm.
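As a rough illustration of the algorithm described above, here is a minimal Python 3 sketch of C3 linearisation. It is not the gist or the Rust `c3_merge` from this PR; the function names `c3_merge` and `c3_linearise` and the exact error message are assumptions made for the example, and it works on runtime class objects via `__bases__` rather than on inferred types.

```python
def c3_merge(sequences):
    """Merge several linearisations into one list, C3-style.

    Raises TypeError if no consistent ordering exists.
    """
    sequences = [list(seq) for seq in sequences if seq]
    result = []
    while sequences:
        for seq in sequences:
            candidate = seq[0]
            # A head is acceptable only if it does not appear in the tail
            # of any sequence that is still waiting to be merged.
            if not any(candidate in other[1:] for other in sequences):
                break
        else:
            raise TypeError("Cannot create a consistent method resolution order (MRO)")
        result.append(candidate)
        # Drop the chosen head wherever it leads a sequence,
        # and discard sequences that are now exhausted.
        remaining = []
        for seq in sequences:
            if seq[0] is candidate:
                seq = seq[1:]
            if seq:
                remaining.append(seq)
        sequences = remaining
    return result


def c3_linearise(cls):
    """Compute the MRO of a runtime class from its `__bases__`."""
    bases = list(cls.__bases__)
    return [cls] + c3_merge([c3_linearise(base) for base in bases] + [bases])


class Foo: pass
class Bar(Foo): pass
class Baz(Foo): pass
class Spam(Bar, Baz): pass

# The sketch agrees with the runtime for the diamond example above.
assert tuple(c3_linearise(Spam)) == Spam.__mro__
```

Passing the direct bases list as the final sequence to merge is what preserves the local precedence order (`Bar` before `Baz` in the example).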

As well as implementing the C3 linearisation algorithm in Rust, I also had to make some changes to account for the fact that a class might have a dynamic type in its bases: in our current model, we have three dynamic types, which are Unknown, Any and Todo. This PR takes the simple approach of deciding that the MRO of Any is [Any, object], the MRO of Unknown is [Unknown, object], and the MRO of Todo is [Todo, object]; other than that, they are not treated particularly specially by the C3 linearisation algorithm. Other than simplicity, this has a few advantages:

  • It matches the runtime:
    >>> from typing import Any
    >>> Any.__mro__
    (typing.Any, <class 'object'>)
  • It means that they behave just like any other class base in Python: an invariant upheld by all other class bases in Python is that they all inherit from object.

Dynamic types will have to be treated specially when it comes to attribute and method access from these types; however, that is for a separate PR.
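To make the dynamic-type rule concrete, here is a small symbolic sketch in Python. It is only a model, not the code in `mro.rs`: classes are represented by name strings, `bases_of` is a hypothetical mapping from a class name to its explicit bases, and `DYNAMIC_TYPES`, `merge` and `symbolic_mro` are names invented for the example. The only special handling is that a dynamic type's MRO is fixed to `[<dynamic type>, object]`.

```python
# Hypothetical symbolic model: class names map to lists of base-class names.
DYNAMIC_TYPES = {"Any", "Unknown", "Todo"}


def merge(sequences):
    """C3 merge over lists of names; raises TypeError if no order is consistent."""
    sequences = [list(seq) for seq in sequences if seq]
    result = []
    while sequences:
        for seq in sequences:
            head = seq[0]
            if not any(head in other[1:] for other in sequences):
                break
        else:
            raise TypeError("inconsistent MRO")
        result.append(head)
        remaining = []
        for seq in sequences:
            if seq[0] == head:
                seq = seq[1:]
            if seq:
                remaining.append(seq)
        sequences = remaining
    return result


def symbolic_mro(name, bases_of):
    """Return the MRO of `name` as a list of names."""
    if name == "object":
        return ["object"]
    if name in DYNAMIC_TYPES:
        # The one special case: a dynamic type behaves like a class
        # whose only superclass is `object`.
        return [name, "object"]
    # A class with no explicit bases implicitly inherits from `object`.
    bases = bases_of[name] or ["object"]
    return [name] + merge([symbolic_mro(b, bases_of) for b in bases] + [bases])


classes = {"A": [], "B": ["A", "Unknown"], "C": ["Any", "A"]}
print(symbolic_mro("B", classes))
print(symbolic_mro("C", classes))
```

Because the dynamic type goes through the ordinary merge step, it ends up in the resulting MRO at the position its place in the bases list dictates, just like a regular class.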

Implementation strategy

The implementation is encapsulated in a new red_knot_python_semantic submodule, types/mro.rs. ClassType::try_mro attempts to resolve a class's MRO, and returns an error if it cannot; ClassType::mro is a wrapper around ClassType::try_mro that resolves the MRO to [<class in question>, Unknown, builtins.object] if no MRO for the class could be resolved.
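The relationship between the fallible and infallible entry points can be sketched in Python as follows. This is an analogy under stated assumptions, not the Rust API: the free functions `try_mro` and `mro` are hypothetical, classes are identified by name, and the sketch leans on the Python runtime (via `type(...)`) to detect an unresolvable MRO instead of running its own C3 implementation.

```python
def try_mro(name, bases):
    """Return the MRO as a tuple of class names, or raise TypeError.

    Assumption for this sketch: the runtime is asked to build the class,
    so any TypeError at class creation is treated as "no MRO could be resolved".
    """
    cls = type(name, bases, {})
    return tuple(c.__name__ for c in cls.__mro__)


def mro(name, bases):
    """Infallible wrapper: fall back to [<class>, Unknown, object] on failure."""
    try:
        return try_mro(name, bases)
    except TypeError:
        return (name, "Unknown", "object")


class A: pass
class B: pass
class C(A, B): pass
class D(B, A): pass

print(mro("F", (A, B)))  # a resolvable bases list
print(mro("E", (C, D)))  # an unresolvable one, so the fallback is used
```

Callers that only need some MRO to iterate over can use the wrapper, while code that wants to report the failure can call the fallible function and handle the error itself.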

It's necessary for us to emit a diagnostic if we can determine that a particular list of bases will (or could) cause a TypeError to be raised at runtime due to an unresolvable MRO. However, we can't do this while creating the ClassType and storing it in self.types.declarations in infer.rs, as we need to know the bases of the class in order to determine its MRO, and some of the bases may be deferred. This PR therefore iterates over all classes in a certain scope after all types (including deferred types) have been inferred, as part of TypeInferenceBuilder::infer_region_scope. For types that will (or could) raise an exception due to an invalid MRO, we infer the MRO as being [<class in question>, Unknown, object] as well as emitting the diagnostic.

We also emit diagnostics for classes that inherit from bases which would be too complicated for us to resolve an MRO for: anything except a class-literal, Any, Unknown or Todo is rejected.

I deleted the ClassType::bases() method, because:

  • It's easier to calculate the MRO if you work directly from the AST rather than having an intermediate method that converts the slice of AST nodes into an iterator of types.
  • There are no longer any direct uses of ClassType::bases() in types.rs or infer.rs (they all should have been iterating over the MRO all along!).
  • The bases() method was something of a footgun: it only gave you the slice of the class's explicit bases, and ignored the fact that a class will implicitly have object in its bases list at runtime if it has no explicit bases.
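The footgun in the last point can be seen by comparing what the AST records with what the runtime reports. The snippet below uses Python's standard `ast` module purely as an illustration; it says nothing about how red-knot's own AST or the removed `bases()` method were implemented.

```python
import ast

source = "class Foo: pass"

# The AST only records the bases written in the source: none, in this case.
class_def = ast.parse(source).body[0]
explicit_bases = [ast.unparse(base) for base in class_def.bases]

# At runtime, the same class implicitly has `object` in its bases.
namespace = {}
exec(source, namespace)
runtime_bases = namespace["Foo"].__bases__

print(explicit_bases)
print(runtime_bases)
```

Any code that walks only the explicit bases would therefore miss `object` for a class like `Foo`, which is why iterating over the MRO is the safer default.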

Test Plan

Lots of mdtests added. Several tests taken from https://docs.python.org/3/howto/mro.html#python-2-3-mro.

Contributor

github-actions bot commented Nov 1, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

@AlexWaygood AlexWaygood marked this pull request as ready for review November 1, 2024 12:55
@AlexWaygood AlexWaygood force-pushed the alex/simple-mro branch 2 times, most recently from 3e75482 to d1ccf88 Compare November 1, 2024 13:29
Member

@MichaReiser MichaReiser left a comment

This is great and I like the simplification. I would be interested in a few more cycle tests to make sure that indirect cycles are correctly handled too.


codspeed-hq bot commented Nov 1, 2024

CodSpeed Performance Report

Merging #14027 will not alter performance

Comparing alex/simple-mro (a9d379b) with main (88d9bb1)

Summary

✅ 32 untouched benchmarks

Contributor

@carljm carljm left a comment

Not finished reviewing yet, but submitting some comments now rather than waiting because you're actively working on the branch :)

Overall, this is awesome!

Contributor

@carljm carljm left a comment

Looks good to me! Some comments, but nothing blocking besides the cycle handling.

Comment on lines +399 to +403
# TODO: can we avoid emitting the errors for these?
# The classes have cyclic superclasses,
# but are not themselves cyclic...
class Baz(Bar, BarCycle): ... # error: [cyclic-class-def]
class Spam(Baz): ... # error: [cyclic-class-def]
Member Author

I'm pretty unhappy about the cascading errors here. I might work on a followup PR to see if I can avoid them.

Unfortunately a solution doesn't seem trivial (at least from what I can see). Also, this situation should really be very rare (I don't see any plausible reason why anybody would try to make a class inherit from itself, even indirectly, in real code). So I think this is okay, if necessary.

Member

@MichaReiser MichaReiser Nov 4, 2024

I don't think there's a reason why anyone would want to do that, but I remember accidentally doing just that more than once early in my career, because I was unaware that it leads to a cycle.

@MichaReiser
Member

This is great!

@AlexWaygood
Member Author

Thanks both for the excellent review comments!

@AlexWaygood AlexWaygood merged commit df45a0e into main Nov 4, 2024
18 checks passed
@AlexWaygood AlexWaygood deleted the alex/simple-mro branch November 4, 2024 13:31
@carljm
Contributor

carljm commented Nov 4, 2024

🎉

Labels
red-knot Multi-file analysis & type inference