[red-knot] Use the right scope when considering class bases #13766

rtpg · 2024-10-16T01:57:02Z

Summary

PEP 695 generics introduce a scope that covers a class statement's arguments (its bases) and keywords.

class C[T](A[T]):  # the T in A[T] is not from the global scope but from a type-param-specific scope
    ...

When doing inference on the class bases, we have so far been looking up base class expressions in the global scope. That is not an issue without generics, since the extra scope is only created when type parameters are present.

This change instead makes sure that global scope inference stops before going into expressions within this sub-scope. Since there is a separate scope, check_file and friends will still trigger inference on these expressions.

Another change as a part of this is making sure that ClassType looks up its bases in the right scope. I do not believe the way I do the lookup in this change is the most precise way, and would appreciate comments on that fragment.
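
To make the scoping described above concrete, here is a minimal toy model in Rust. It is only a sketch: Scope, ScopeKind, resolve and build_scopes are hypothetical names invented for illustration and are not red-knot's actual data structures. It models class C[T](A[T]) as a module scope, a type-params scope whose parent is the module, and a class body scope whose parent is the type-params scope.

```rust
use std::collections::HashSet;

/// Hypothetical scope kinds for `class C[T](A[T]): ...`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ScopeKind {
    Module,
    ClassTypeParams,
    ClassBody,
}

/// A toy scope: its kind, its parent (by index), and the names it defines.
struct Scope {
    kind: ScopeKind,
    parent: Option<usize>,
    symbols: HashSet<&'static str>,
}

/// Walk outwards from `start` and report which kind of scope defines `name`.
fn resolve(scopes: &[Scope], start: usize, name: &str) -> Option<ScopeKind> {
    let mut current = Some(start);
    while let Some(id) = current {
        let scope = &scopes[id];
        if scope.symbols.contains(name) {
            return Some(scope.kind);
        }
        current = scope.parent;
    }
    None
}

/// Scopes for a module containing `class A` and `class C[T](A[T])`.
fn build_scopes() -> Vec<Scope> {
    vec![
        Scope { kind: ScopeKind::Module, parent: None, symbols: HashSet::from(["A", "C"]) },
        Scope { kind: ScopeKind::ClassTypeParams, parent: Some(0), symbols: HashSet::from(["T"]) },
        Scope { kind: ScopeKind::ClassBody, parent: Some(1), symbols: HashSet::new() },
    ]
}

fn main() {
    let scopes = build_scopes();
    // Looking up the base expression `A[T]` from the module scope cannot see `T`.
    println!("T from module scope: {:?}", resolve(&scopes, 0, "T"));
    // From the type-params scope, `T` is local and `A` is found via the parent.
    println!("T from type-params scope: {:?}", resolve(&scopes, 1, "T"));
    println!("A from type-params scope: {:?}", resolve(&scopes, 1, "A"));
}
```

In this toy model the type-params scope is always the parent of the class body scope, which is the relationship the base lookup has to respect.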

Test Plan

cargo test --package red_knot_python_semantic generics runs the markdown test that previously panicked due to scope lookup issues.

In particular, introducing PEP 695 generics introduces a new scope

rtpg · 2024-10-16T01:59:04Z

crates/red_knot_python_semantic/src/semantic_index/ast_ids.rs

+        match self.expressions_map.get(key) {
+            Some(result) => *result,
+            None => {
+                panic!("Could not find expression ID for {key:?}");


This is very helpful because debug output on ExpressionNodeKey includes the range, which is usually enough in context to identify what text fragment is causing issues.

I put this in on the assumption that it doesn't cost much, but I might be wrong.

rtpg · 2024-10-16T02:02:46Z

crates/red_knot_python_semantic/src/types.rs

+            // we need to use the type param'd scope
+            let type_param_scope = index
+                .node_scope(NodeWithScopeRef::ClassTypeParameters(class_stmt_node))
+                .to_scope_id(db, file);


It's unclear to me what the cost of the semantic_index lookup would be in practice. I just want to know what scope I need to be looking at. While ClassType does have body_scope, that is a separate scope from the type parameter scope.

Would it make sense to add bases_scope to ClassType here to avoid needing this index build up?

I would expect that we could reuse the self.file and self.index because all the expressions in the class's base should be in the same file and from the same index.

However, we probably don't want to call infer_scope_types because inferring the class's bases then suddenly becomes dependent on all types in that other scope, resulting in poor incremental caching.

I'm not quite sure but maybe definition_expression_ty is a better fit?

I don't think we can use definition_expression_ty because, in a type params scope, the class bases aren't "part of a definition" (the class definition is in the outer scope.)

We could mark the class bases as a "standalone expression" (or set of them) so we can query for their types directly, but I don't think this is worth the extra Salsa tracked structs. The "type params scope" is automatically generated and implicit and contains nothing but definitions for type parameters, and the class bases/keywords. There isn't "all types in that scope" to be worried about, because there are no other types in that scope. So I think infer_scope_types is the best option here.

> I would expect that we could reuse the self.file and self.index

We aren't in TypeInferenceBuilder here, so those aren't available.

I think perhaps a simpler way to get the type params scope is to rely on the fact that it must always be the parent scope of the body scope (for a class with type params). We should be able to add a ScopeId::parent method (it would need to use the symbol table, but not the semantic index).

I'm also not opposed to adding bases_scope (or optional type_params_scope) to ClassType.

You can also use index.scopes_by_node when the query already depends on the index anyway (which parent would as well?)

ScopeId::parent would rely on just the symbol table, which is part of the index, but in principle could be backdated independently if it didn't change? Unlikely though; I don't think depending on the index here is a problem.

Ended up storing the bases scope in ClassType (which gets built up in the inference builder so we have access to the index right there). Definitely feels right after having written it up.

rtpg · 2024-10-16T02:04:08Z

crates/red_knot_python_semantic/src/types.rs

+            } else {
+                definition_expression_ty(db, definition, base_expr)
+            }
+        };


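The mapper is pulled out here because if we build up separate closures in different if branches, we hit the "no two closures are alike" issue (each closure expression has its own unique type, so the two branches of an if cannot evaluate to different capturing closures). Hence the type_scope_info Option song and dance.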
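
A minimal sketch of the pattern described in this comment, under stated assumptions: Ty, base_types and the string-based "scope" are hypothetical stand-ins, not the PR's real types. Building one capturing closure per branch of an if and binding the result to a single variable is rejected by rustc because each closure has a distinct anonymous type, so the sketch hoists the decision into an Option and branches inside a single closure instead.

```rust
/// Hypothetical stand-in for an inferred type.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    FromTypeParamScope(String),
    FromDefinition(String),
}

/// Map each base expression to a type using a single closure.
fn base_types(bases: &[&str], type_scope_info: Option<&str>) -> Vec<Ty> {
    // One closure that captures the Option and branches internally.
    // Binding `if cond { |b| ... } else { |b| ... }` to one variable would not
    // type-check when both closures capture state, since their types differ.
    let mapper = |base: &str| match type_scope_info {
        Some(scope) => Ty::FromTypeParamScope(format!("{scope}::{base}")),
        None => Ty::FromDefinition(base.to_string()),
    };
    bases.iter().copied().map(mapper).collect()
}

fn main() {
    // Class with type parameters: bases are looked up in the extra scope.
    println!("{:?}", base_types(&["A[T]"], Some("type_params")));
    // Class without type parameters: bases come from the definition.
    println!("{:?}", base_types(&["A"], None));
}
```

Boxing the closures as Box<dyn Fn(&str) -> Ty> would be another way around the type mismatch, at the cost of an allocation and dynamic dispatch.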

github-actions · 2024-10-16T02:11:00Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+1 -0 violations, +0 -0 fixes in 1 projects; 53 projects unchanged)

pandas-dev/pandas (+1 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ pandas/core/internals/blocks.py:1664:9: F841 Local variable `icond` is assigned to but never used

Changes by rule (1 rules affected)

code	total	+ violation	- violation	+ fix	- fix
F841	1	1	0	0	0

MichaReiser · 2024-10-16T06:41:35Z

crates/red_knot_python_semantic/src/semantic_index/ast_ids.rs

+        let key = &key.into();
+        match self.expressions_map.get(key) {
+            Some(result) => *result,
+            None => {
+                panic!("Could not find expression ID for {key:?}");
+            }
+        }


I like this. You could do

Suggested change

-        let key = &key.into();
-        match self.expressions_map.get(key) {
-            Some(result) => *result,
-            None => {
-                panic!("Could not find expression ID for {key:?}");
-            }
-        }
+        let key = &key.into();
+        *self.expressions_map.get(key).unwrap_or_else(|| {
+            panic!("Could not find expression ID for {key:?}");
+        })

Went with this, thanks for the suggestion!

Great, went with that
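
For readers who want to try the lookup pattern from this thread in isolation, here is a self-contained sketch. ExprKey and AstIds are hypothetical stand-ins for ExpressionNodeKey and the real ID map; only the unwrap_or_else plus panic! shape is taken from the discussion above.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for ExpressionNodeKey; Debug output includes the range.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ExprKey {
    start: u32,
    end: u32,
}

/// Hypothetical stand-in for the per-scope expression ID table.
struct AstIds {
    expressions_map: HashMap<ExprKey, u32>,
}

impl AstIds {
    /// Return the ID for `key`, panicking with the key's Debug output if it is missing.
    fn expression_id(&self, key: ExprKey) -> u32 {
        *self
            .expressions_map
            .get(&key)
            .unwrap_or_else(|| panic!("Could not find expression ID for {key:?}"))
    }
}

fn main() {
    let key = ExprKey { start: 11, end: 15 };
    let ids = AstIds {
        expressions_map: HashMap::from([(key, 7)]),
    };
    println!("id = {}", ids.expression_id(key));
    // Looking up a key that was never registered would panic and print the
    // key's range, which is what makes the message useful for debugging.
}
```

The message is only formatted on the failure path, so the happy path costs the same as a plain map lookup.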

MichaReiser · 2024-10-16T06:46:00Z

crates/red_knot_python_semantic/src/types.rs

+            // we need to use the type param'd scope
+            let type_param_scope = index
+                .node_scope(NodeWithScopeRef::ClassTypeParameters(class_stmt_node))
+                .to_scope_id(db, file);


I would expect that we could reuse the self.file and self.index because all the expressions in the class's base should be in the same file and from the same index.

However, we probably don't want to call infer_scope_types because inferring the class's bases then suddenly becomes dependent on all types in that other scope, resulting in poor incremental caching.

I'm not quite sure but maybe definition_expression_ty is a better fit?

carljm · 2024-10-16T13:12:17Z

crates/red_knot_python_semantic/src/types.rs

+            if has_type_params {
+                // Safety: we calculated the inference if has_type_params
+                let (type_param_scope, inferences) = type_scope_info.unwrap();


Is there a reason to use has_type_params and then type_scope_info.unwrap() rather than just if let Some(...) = type_scope_info { ... }? They rely on the same invariants, but the latter seems simpler.

Hmm... your suggestion is cleaner and this code isn't big, but I dislike hiding the fact that the branching is due to type-parameter-ness.

I believe my concerns can be resolved with a better-thought-out variable name for the scope info. Will ruminate on it overnight.

Changed this up to bases_scope_info. It's still an Option (instead of doing something like checking if bases_scope is equal to some other scope to determine specialized-ness), but good enough for me to feel comfortable with the cleaner if let solution.
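
The two shapes discussed in this thread can be compared side by side. This is a sketch only: BasesScopeInfo and both functions are hypothetical, with a (scope id, label) tuple standing in for the real scope and inference result.

```rust
/// Hypothetical stand-in: (scope id, label for the inference result).
type BasesScopeInfo = (u32, &'static str);

/// Flag plus unwrap: correct only while the flag and the Option stay in sync.
fn lookup_with_flag(info: Option<BasesScopeInfo>, base: &str) -> String {
    let has_type_params = info.is_some();
    if has_type_params {
        // Relies on the invariant that `info` is `Some` whenever the flag is set.
        let (scope, inference) = info.unwrap();
        format!("{inference}@{scope}:{base}")
    } else {
        format!("definition:{base}")
    }
}

/// `if let`: the same branching, with the invariant checked by the compiler.
fn lookup_with_if_let(bases_scope_info: Option<BasesScopeInfo>, base: &str) -> String {
    if let Some((scope, inference)) = bases_scope_info {
        format!("{inference}@{scope}:{base}")
    } else {
        format!("definition:{base}")
    }
}

fn main() {
    for info in [Some((1, "type_params")), None] {
        // Both variants agree on every input.
        assert_eq!(lookup_with_flag(info, "A[T]"), lookup_with_if_let(info, "A[T]"));
        println!("{}", lookup_with_if_let(info, "A[T]"));
    }
}
```

With a descriptive name on the Option, the if let form keeps the reason for the branch visible without a separate boolean.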

carljm

This looks really good, thank you!! Comments aren't anything major, but there are a few things to address, so I'll request changes for now.

carljm · 2024-10-16T13:13:58Z

crates/red_knot_python_semantic/resources/mdtest/generics.md

+reveal_type(MyBox.box_model_number)  # revealed: Literal[695]
+```
+
+## Subclassing


I would like to also add a test that in a stub file (pyi) we can have a generic class that refers to itself in its own bases. We have a similar test already for non-generic classes.

MichaReiser · 2024-10-17T05:43:51Z

crates/red_knot_python_semantic/src/types.rs

+            if let Some((bases_scope, inferences)) = bases_scope_info {
+                // when we have a specialized scope, we'll look up the inference
+                // within that scope
+                inferences.expression_ty(base_expr.scoped_ast_id(db, bases_scope))


Nit: We could consider just calling base_expr.ty(...) rather than repeating the entire ceremony of storing the base specialized scope, looking up the scope, and calling expression_ty. However, it would be slightly less efficient.

Going down that path let me remove all the base scope song and dance at (by my read) not much cost. The resulting if statement is probably harder to read now, though, since there is no longer a clear reason why I am calling .ty (even with the comment). But simpler is simpler, thanks for the suggestion.

MichaReiser · 2024-10-17T05:47:57Z

crates/red_knot_python_semantic/src/types/infer.rs

+        let bases_specialized_scope = type_params.as_ref().map(|_| {
+            self.index
+                .node_scope(NodeWithScopeRef::ClassTypeParameters(class))
+                .to_scope_id(self.db, self.file)
+        });


Nit: I suggest making this a method on ClassType. It doesn't seem worth storing (it's very cheap to recompute), unless we think that it helps reduce the blast radius of file changes (e.g. so that calling (class A).bases from bar.py, where class A is defined in foo.py, doesn't get invalidated when foo.py changes). However, I do think that we might want to start caching bases once we add MRO resolution (which seems expensive).

Thanks to your suggestion on using ty I don't even need this anymore, fortunately.

Use the right scope when considering class bases

d57ab6c

In particular, introducing PEP 695 generics introduces a new scope

rtpg requested review from carljm, MichaReiser and AlexWaygood as code owners October 16, 2024 01:57

rtpg changed the title ~~Use the right scope when considering class bases~~ [red-knot] Use the right scope when considering class bases Oct 16, 2024

MichaReiser added the red-knot Multi-file analysis & type inference label Oct 16, 2024

MichaReiser reviewed Oct 16, 2024

carljm reviewed Oct 16, 2024

carljm requested changes Oct 16, 2024

rtpg added 4 commits October 17, 2024 10:38

Hold onto the bases scope if scope specialization happens

a1e430a

simplify expression_id panic

41fee40

Add cyclical class definition

46cc50b

remove manual map implementation

8601fc8

MichaReiser approved these changes Oct 17, 2024

rtpg and others added 2 commits October 17, 2024 17:10

pull the closure into the use site

1b9e355

Co-authored-by: Micha Reiser <[email protected]>

simplify implementation

42728f8
