Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] Use the right scope when considering class bases #13766

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
41 changes: 41 additions & 0 deletions crates/red_knot_python_semantic/resources/mdtest/generics.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
# PEP 695 Generics

## Class Declarations

Basic PEP 695 generics

```py
class MyBox[T]:
data: T
box_model_number = 695
def __init__(self, data: T):
self.data = data

# TODO not error (should be subscriptable)
box: MyBox[int] = MyBox(5) # error: [non-subscriptable]
# TODO error differently (str and int don't unify)
wrong_innards: MyBox[int] = MyBox("five") # error: [non-subscriptable]
# TODO reveal int
reveal_type(box.data) # revealed: @Todo

reveal_type(MyBox.box_model_number) # revealed: Literal[695]
```

## Subclassing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to also add a test that in a stub file (pyi) we can have a generic class that refers to itself in its own bases. We have a similar test already for non-generic classes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added


```py
class MyBox[T]:
data: T

def __init__(self, data: T):
self.data = data

# TODO not error on the subscripting
class MySecureBox[T](MyBox[T]): # error: [non-subscriptable]
pass

secure_box: MySecureBox[int] = MySecureBox(5)
reveal_type(secure_box) # revealed: MySecureBox
# TODO reveal int
reveal_type(secure_box.data) # revealed: @Todo
```
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,13 @@ pub(crate) struct AstIds {

impl AstIds {
fn expression_id(&self, key: impl Into<ExpressionNodeKey>) -> ScopedExpressionId {
self.expressions_map[&key.into()]
let key = &key.into();
match self.expressions_map.get(key) {
Some(result) => *result,
None => {
panic!("Could not find expression ID for {key:?}");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very helpful because debug output on ExpressionNodeKey includes the range, which is usually enough in context to identify what text fragment is causing issues.

I put this in under the idea that this doesn't cost much but I might be wrong

}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this. You could do

Suggested change
let key = &key.into();
match self.expressions_map.get(key) {
Some(result) => *result,
None => {
panic!("Could not find expression ID for {key:?}");
}
}
let key = ;
match self.expressions_map.get(&key.into()).unwrap_or_else(|| {
panic!("Could not find expression ID for {key:?}");
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went with this, thanks for the suggestion!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, went with that

}

fn use_id(&self, key: impl Into<ExpressionNodeKey>) -> ScopedUseId {
Expand Down
33 changes: 27 additions & 6 deletions crates/red_knot_python_semantic/src/types.rs
Original file line number Diff line number Diff line change
@@ -1,11 +1,11 @@
use infer::TypeInferenceBuilder;
use infer::{TypeInference, TypeInferenceBuilder};
use ruff_db::files::File;
use ruff_python_ast as ast;

use crate::module_resolver::file_to_module;
use crate::semantic_index::ast_ids::HasScopedAstId;
use crate::semantic_index::definition::{Definition, DefinitionKind};
use crate::semantic_index::symbol::{ScopeId, ScopedSymbolId};
use crate::semantic_index::symbol::{NodeWithScopeRef, ScopeId, ScopedSymbolId};
use crate::semantic_index::{
global_scope, semantic_index, symbol_table, use_def_map, BindingWithConstraints,
BindingWithConstraintsIterator, DeclarationsIterator,
Expand Down Expand Up @@ -1416,10 +1416,31 @@ impl<'db> ClassType<'db> {
let DefinitionKind::Class(class_stmt_node) = definition.kind(db) else {
panic!("Class type definition must have DefinitionKind::Class");
};
class_stmt_node
.bases()
.iter()
.map(move |base_expr| definition_expression_ty(db, definition, base_expr))

let has_type_params = class_stmt_node.type_params.is_some();

let type_scope_info: Option<(ScopeId<'db>, &TypeInference<'db>)> = if has_type_params {
let file = definition.file(db);
let index = semantic_index(db, file);
// we need to use the type param'd scope
let type_param_scope = index
.node_scope(NodeWithScopeRef::ClassTypeParameters(class_stmt_node))
.to_scope_id(db, file);
Some((type_param_scope, infer_scope_types(db, type_param_scope)))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear to me what the cost of the semantic_index lookup would be in practice. I just want to know what scope I need to be looking at. While ClassType does have body_scope, that is a separate scope from the type parameter scope.

Would it make sense to add bases_scope to ClassType here to avoid needing this index build up?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect that we could reuse the self.file and self.index because all the expressions in the class's base should be in the same file and from the same index.

However, we probably don't want to call infer_scope_types because inferring the class's bases then suddenly becomes dependent of all types in that other scope, resulting in poor incremental caching.

I'm not quiet sure but maybe definition_expression_ty is a better fit?

Copy link
Contributor

@carljm carljm Oct 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can use definition_expression_ty because, in a type params scope, the class bases aren't "part of a definition" (the class definition is in the outer scope.)

We could mark the class bases as a "standalone expression" (or set of them) so we can query for their types directly, but I don't think this is worth the extra Salsa tracked structs. The "type params scope" is automatically generated and implicit and contains nothing but definitions for type parameters, and the class bases/keywords. There isn't "all types in that scope" to be worried about, because there are no other types in that scope. So I think infer_scope_types is the best option here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect that we could reuse the self.file and self.index

We aren't in TypeInferenceBuilder here, so those aren't available.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think perhaps the simpler way to get the type params scope is that it must always be the parent scope of the body scope (for a class with type params.) We should be able to add a ScopeId::parent method (it would need to use the symbol table, but not the semantic index.)

I'm also not opposed to adding bases_scope (or optional type_params_scope) to ClassType.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also use index.scopes_by_node when the query already depends on the index anyway (which parent would as well?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ScopeId::parent would rely on just the symbol table, which is part of the index, but in principle could be backdated independently if it didn't change? Unlikely though; I don't think depending on the index here is a problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ended up storing the bases scope in ClassType (which gets built up in the inference builder so we have access to the index right there). Definitely feels right after having written it up.

} else {
None
};

let mapper = move |base_expr: &ast::Expr| {
if has_type_params {
// Safety: we calculated the inference if has_type_params
let (type_param_scope, inferences) = type_scope_info.unwrap();
inferences.expression_ty(base_expr.scoped_ast_id(db, type_param_scope))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to use has_type_params and then type_scope_info.unwrap() rather than just if let Some(...) = type_scope_info { ? They rely on the same invariants, but the latter seems simpler.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... your suggestion is cleaner and this code isn't big, but I dislike the hiding of the "the branching is due to type parameter-ness".

I believe my concerns can be resolved with a more well thought out variable name for the scope info. Will ruminate on it overnight

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed this up to bases_scope_info. It's still an Option (instead of doing something like checking if bases_scope is equal to some other scope to determine specialized-ness), but good enough for me to feel comfortable with the cleaner if let solution.

} else {
definition_expression_ty(db, definition, base_expr)
}
};
class_stmt_node.bases().iter().map(mapper)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the mapper is pulled out here because if we build up separate closures in different if branches, we hit the "no two closures are alike" issue. Hence the type_scope_info Option song and dance.

}

/// Returns the class member of this class named `name`.
Expand Down
26 changes: 15 additions & 11 deletions crates/red_knot_python_semantic/src/types/infer.rs
Original file line number Diff line number Diff line change
Expand Up @@ -853,7 +853,7 @@ impl<'db> TypeInferenceBuilder<'db> {
let ast::StmtClassDef {
range: _,
name,
type_params: _,
type_params,
decorator_list,
arguments: _,
body: _,
Expand Down Expand Up @@ -881,17 +881,21 @@ impl<'db> TypeInferenceBuilder<'db> {

self.add_declaration_with_binding(class.into(), definition, class_ty, class_ty);

for keyword in class.keywords() {
self.infer_expression(&keyword.value);
}
// if there are type parameters, then the keywords and bases are within that scope
// and we don't need to run inference here
if type_params.is_none() {
for keyword in class.keywords() {
self.infer_expression(&keyword.value);
}

// Inference of bases deferred in stubs
// TODO also defer stringified generic type parameters
if self.are_all_types_deferred() {
self.types.has_deferred = true;
} else {
for base in class.bases() {
self.infer_expression(base);
// Inference of bases deferred in stubs
// TODO also defer stringified generic type parameters
if self.are_all_types_deferred() {
self.types.has_deferred = true;
} else {
for base in class.bases() {
self.infer_expression(base);
}
}
}
}
Expand Down
Loading