[red-knot] Avoid panicking when hitting failures looking up AST information #13701

rtpg · 2024-10-10T06:59:05Z

This is a set of changes that allow for cargo --bin red_knot to not panic (though report many failures across the repo).

The main idea here is to avoid panicking if, instead, type inference can choose to have "less information". For example, if some entry is not found, we often can return Unknown. In a stable system this can lead to Unknown spreading in a nasty way, but here this could allow for easier post-mortem debugging (for example, querying the salsa DB ad-hoc to figure out why certain expectations didn't hold).

I tried adding tracing logs in places that felt like indications of real failures upstream (in particular partial AST traversal). I'm not well versed in tracing, but in Python land I would probably group all of these into a specific logger so that they can be treated as "road to 1.0" stuff.

In this process I changed some ID lookup methods to return Option. I believe that the goal here is that this wouldn't be needed (as the inference DB should cover "everything"), but the changeset is so small that it feels worthwhile until AST failures are inbound.

Part of the failures I found were "invalid AST but still constructed by Ruff"-style errors.

An example is x, y: int = 1, 2.

red_knot sees an annotated assignment, and so figures out the types of 1 and 2, but then trying to assign 1 and 2 to the left side (as a tuple), but an awkward interaction between annotated assignments and tuple assignments (like x, y = 1, 2) leads to x, y getting defined by 1 and 2. This hits the "shouldn't have more than one definition for an expression" problem.

Ultimately this is a syntax error to begin with, so then we probably shouldn't traverse this. But if we don't traverse it and "try our best" with the above, suddenly x and y don't have anything associated.

I think the "right thing" to do above is to try harder (for example, erase the signature in the type inference treatment). But I think this branch shows that the cost of trying to keep the system running instead doesn't have too high of a cost.

In any case, this branch is hopefully an indicator of what kinds of places in the code are triggering panics when running red_knot against the ruff repo (as artificial of an example that might be).

In most cases we can "give up" on a type inference with "unknown"

MichaReiser · 2024-10-10T07:08:40Z

crates/red_knot_python_semantic/src/semantic_index/builder.rs

+                if self.current_assignment.is_some() {
+                    // This can happen if we have something like x = y := 2
+                    // which is invalid syntax but still is provided in the AST
+                    tracing::warn!("Current assignment is unexpectedly set");


We limit the use of tracing::warn for messages that need the user's attention and they need to address (e.g. when setting up file watching for a directory failed). Here, there's not much a user can do about this and invalid syntax is very common in an LSP. Logging a warning would be very noisy for users. We should use either the debug or possibly even trace level instead.

MichaReiser · 2024-10-10T07:10:25Z

In this process I changed some ID lookup methods to return Option. I believe that the goal here is that this wouldn't be needed (as the inference DB should cover "everything"), but the changeset is so small that it feels worthwhile until AST failures are inbound.

Could you talk me through the benefit of returning an Option in more detail? To me it's not clear how panicking at the call side is much different than panicking in the e.g. expression method directly because the call site is visible in the stack frame.

rtpg · 2024-10-10T07:00:43Z

crates/red_knot_python_semantic/src/semantic_index/builder.rs

@@ -212,7 +213,9 @@ impl<'db> SemanticIndexBuilder<'db> {
        let existing_definition = self
            .definitions_by_node
            .insert(definition_node.key(), definition);
-        debug_assert_eq!(existing_definition, None);
+        if existing_definition.is_some() {
+            tracing::warn!("Existing definition was unexpectedly evicted");


There are a bunch of warnings I added just to try and get things working.

In total there are 228 warnings (of which this one is only hit 5 times), so it's pretty surmountable all things considered

rtpg · 2024-10-10T07:07:25Z

crates/red_knot_python_semantic/src/semantic_index/builder.rs

+                    *node.target,
+                    Expr::Attribute(_) | Expr::Subscript(_) | Expr::Name(_)
+                );
+                if valid_target {


if we visit with an invalid target we get downstream errors on re-defining things. There's likely a way to visit here with an invalid target that doesn't generate issues

rtpg · 2024-10-10T07:08:42Z

crates/red_knot_python_semantic/src/semantic_model.rs

+            lookup.unwrap_or(Type::Unknown)
+        } else {
+            tracing::warn!("Couldn't find expression ID");
+            Type::Unknown


I feel like it would make a lot of sense for there to be something like Type::Unknown(SourcingInformation), where each callsite could embed where that specific one came from.

That way, if reveal_type gives you an Any, you could probably trace back why you got that.

rtpg · 2024-10-10T07:09:23Z

crates/red_knot_python_semantic/src/types/infer.rs

@@ -96,6 +96,27 @@ fn infer_definition_types_cycle_recovery<'db>(
    inference
 }

+/// Cycle recovery for [`infer_deferred_types()`]: for now, just [`Type::Unknown`]
+/// TODO fixpoint iteration
+fn infer_deferred_types_cycle_recovery<'db>(


copied what infer_definition_types did here.

rtpg · 2024-10-10T07:10:08Z

crates/red_knot_python_semantic/src/types/infer.rs

-    }
+    // pub(crate) fn expression_ty(&self, expression: ScopedExpressionId) -> Type<'db> {
+    //     self.expressions[&expression]
+    // }


here I went with leaving only try_expression_ty, other places I Option'd the existing methods. Not sure what pattern makes sense (if any)

rtpg · 2024-10-10T07:10:30Z

crates/red_knot_python_semantic/src/types/infer.rs

+                    tracing::warn!(
+                        "name in a type expression is always 'load' but got: '{:?}'",
+                        name.ctx
+                    );


another place where the failure is around an invalid AST

github-actions · 2024-10-10T07:12:45Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

rtpg · 2024-10-10T07:48:28Z

@MichaReiser the main point of using Option is that almost all of the call sites end up having a good answer for what to do if they couldn’t find the ID (often, in inference, say “well I guess the type is unknown”). And in cases where you don’t have a good answer, there’s still the fallback option of panicking.

The core thing with panicking is you lose all context except the backtrace, right? And stuff gets torn down etc. Meanwhile if you power through as much as possible you can have a partially filled database, and then do something like dump everything that seems missing during a debug session.

I think that pairing this with things like Type::Todo feels like a comfortable strategy up until there’s stability. And panicking due to inference failures means that big reports are “this crashes on my codebase” rather than “this code reports the type as Any”.

But maybe everything I’m seeing is all solvable by a handful of changes (in which case “fail hard” will make sure that the DB doesn’t get filled up with spurious inference failures in the future).

carljm · 2024-10-16T16:10:14Z

This is very useful, thanks for putting it up! I've created #13778 to track the overall issue of error resilience.

I don't think I want to move forward with landing this exact approach. I'd like to take this project in a more incremental fashion, fixing one cause of panics at a time (so we can consider the tradeoffs in each case, and look at relevant code samples), and ensuring we add a test case (not just "this Python file exists in the ruff repo", but an actual test case that runs in CI) for each case that we fix.

But regardless this PR is a great resource and reference point to have on hand for that work.

carljm · 2024-10-16T16:11:30Z

I'm also aware that some issues addressed in this PR might be bugs on valid syntax, not panics on invalid syntax; those should also be addressed in separate PRs.

sharkdp · 2024-11-14T10:14:40Z

I'm tentatively closing this with the same reasoning as in #13710 (comment). Please feel free to comment here if you feel that something from this branch should be integrated into red knot.

rtpg added 5 commits October 10, 2024 16:19

Allow for existing definitions to be evicted

933c4c0

Allow for definition lookups to be falliable

eb8eae2

In most cases we can "give up" on a type inference with "unknown"

Handle two cases of syntax errors causing crashes

e3af8dc

Gracefully recover from a cycle in defered type inference

8ca348d

Fix up tests

c0292e0

MichaReiser reviewed Oct 10, 2024

View reviewed changes

rtpg commented Oct 10, 2024

View reviewed changes

AlexWaygood added the red-knot Multi-file analysis & type inference label Oct 10, 2024

rtpg mentioned this pull request Oct 12, 2024

[red-knot] Do not panic if named expressions show up in assignment position #13711

Merged

carljm mentioned this pull request Oct 16, 2024

[red-knot] support invalid syntax without panics #13778

Closed

sharkdp closed this Nov 14, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[red-knot] Avoid panicking when hitting failures looking up AST information #13701

[red-knot] Avoid panicking when hitting failures looking up AST information #13701

rtpg commented Oct 10, 2024 •

edited

Loading

MichaReiser Oct 10, 2024

MichaReiser commented Oct 10, 2024

rtpg Oct 10, 2024

rtpg Oct 10, 2024

rtpg Oct 10, 2024

rtpg Oct 10, 2024

rtpg Oct 10, 2024

rtpg Oct 10, 2024

github-actions bot commented Oct 10, 2024

rtpg commented Oct 10, 2024

carljm commented Oct 16, 2024

carljm commented Oct 16, 2024

sharkdp commented Nov 14, 2024

[red-knot] Avoid panicking when hitting failures looking up AST information #13701

[red-knot] Avoid panicking when hitting failures looking up AST information #13701

Conversation

rtpg commented Oct 10, 2024 • edited Loading

MichaReiser Oct 10, 2024

Choose a reason for hiding this comment

MichaReiser commented Oct 10, 2024

rtpg Oct 10, 2024

Choose a reason for hiding this comment

rtpg Oct 10, 2024

Choose a reason for hiding this comment

rtpg Oct 10, 2024

Choose a reason for hiding this comment

rtpg Oct 10, 2024

Choose a reason for hiding this comment

rtpg Oct 10, 2024

Choose a reason for hiding this comment

rtpg Oct 10, 2024

Choose a reason for hiding this comment

github-actions bot commented Oct 10, 2024

ruff-ecosystem results

Linter (stable)

Linter (preview)

rtpg commented Oct 10, 2024

carljm commented Oct 16, 2024

carljm commented Oct 16, 2024

sharkdp commented Nov 14, 2024

rtpg commented Oct 10, 2024 •

edited

Loading

`ruff-ecosystem` results