Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cached inference of all definitions in an unpacking #13979

Merged
merged 4 commits into from
Nov 4, 2024
Merged

Conversation

dhruvmanila
Copy link
Member

@dhruvmanila dhruvmanila commented Oct 29, 2024

Summary

This PR adds a new salsa query and an ingredient to resolve all the variables involved in an unpacking assignment like (a, b) = (1, 2) at once. Previously, we'd recursively try to match the correct type for each definition individually which will result in creating duplicate diagnostics.

This PR still doesn't solve the duplicate diagnostics issue because that requires a different solution like using salsa accumulator or de-duplicating the diagnostics manually.

Related: #13773

Test Plan

Make sure that all unpack assignment test cases pass, there are no panics in the corpus tests.

Todo

  • Look at the performance regression

@dhruvmanila dhruvmanila added the red-knot Multi-file analysis & type inference label Oct 29, 2024
Copy link

codspeed-hq bot commented Oct 29, 2024

CodSpeed Performance Report

Merging #13979 will not alter performance

Comparing dhruv/unpack (c039525) with main (012f385)

Summary

✅ 32 untouched benchmarks

@dhruvmanila dhruvmanila force-pushed the dhruv/diagnostics-builder branch 2 times, most recently from 5cbd74e to 257ce5a Compare October 30, 2024 08:10
@dhruvmanila dhruvmanila force-pushed the dhruv/unpack branch 2 times, most recently from 71c4911 to c03f987 Compare October 30, 2024 10:05
Base automatically changed from dhruv/diagnostics-builder to main October 30, 2024 18:50
dhruvmanila added a commit that referenced this pull request Oct 30, 2024
## Summary

This PR creates a new `TypeCheckDiagnosticsBuilder` for the
`TypeCheckDiagnostics` struct. The main motivation behind this is to
separate the helpers required to build the diagnostics from the type
inference builder itself. This allows us to use such helpers outside of
the inference builder like for example in the unpacking logic in
#13979.

## Test Plan

`cargo insta test`
Copy link
Contributor

github-actions bot commented Oct 30, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

@dhruvmanila
Copy link
Member Author

I'm not exactly sure what's causing the performance regression. It might very well be the fact that this PR creates additional ingredients and goes through a new salsa query which adds the overhead. This is because the tomllib contains multiple assignment statements where unpacking is required:

1red_knot_python_semantic::types::unpack::Unpack                             55           55           55

@dhruvmanila dhruvmanila marked this pull request as ready for review November 1, 2024 11:29
@MichaReiser
Copy link
Member

Is it only that there are 55 new unpackings, or are there other ingredients with a different count?

@dhruvmanila
Copy link
Member Author

Is it only that there are 55 new unpackings, or are there other ingredients with a different count?

There are other ingredients with different count:

main

1red_knot_python_semantic::semantic_index::definition::Definition         6_643        6_643        6_643
1red_knot_python_semantic::semantic_index::expression::Expression           553          553          553
1red_knot_python_semantic::semantic_index::symbol::ScopeId                1_978        1_978        1_978
1ruff_db::files::File                                                        59           59           59
1ruff_db::source::SourceText                                                 15           15           15
1                                                                         total     max_live         live

dhruv/unpack

1red_knot_python_semantic::semantic_index::definition::Definition         6_620        6_620        6_620
1red_knot_python_semantic::semantic_index::expression::Expression           549          549          549
1red_knot_python_semantic::semantic_index::symbol::ScopeId                1_974        1_974        1_974
1red_knot_python_semantic::types::unpack::Unpack                             55           55           55
1ruff_db::files::File                                                        59           59           59
1ruff_db::source::SourceText                                                 15           15           15
1                                                                         total     max_live         live

@dhruvmanila
Copy link
Member Author

Oh ok, I got it. I think the Unpack ingredient is not well isolated for each statement.

@dhruvmanila
Copy link
Member Author

Actually, I think it's because I'm creating the ingredient multiple times. I think I need to create it once in the semantic index and store it.

@dhruvmanila dhruvmanila marked this pull request as draft November 1, 2024 13:24
@dhruvmanila
Copy link
Member Author

(I've put this in draft to fix the regression.)

@dhruvmanila dhruvmanila force-pushed the dhruv/unpack branch 2 times, most recently from 1b40c22 to cc123b1 Compare November 4, 2024 07:20
@dhruvmanila dhruvmanila marked this pull request as ready for review November 4, 2024 07:21
@dhruvmanila dhruvmanila merged commit e302c2d into main Nov 4, 2024
20 checks passed
@dhruvmanila dhruvmanila deleted the dhruv/unpack branch November 4, 2024 11:41
Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thank you!! And thanks for starting the process of slimming down infer.rs a little bit :)

Comment on lines +51 to +55
/// The [`Unpack`] ingredient for the current definition that belongs to an unpacking
/// assignment. This is used to correctly map multiple definitions to the *same* unpacking.
/// For example, in `a, b = 1, 2`, both `a` and `b` creates separate definitions but they both
/// belong to the same unpacking.
current_unpack: Option<Unpack<'db>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a small nit, but it seems like maybe this could be part of CurrentAssignment rather than its own separately-maintained state? But maybe I've missed some subtlety why that can't work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I initially started with moving this information via CurrentAssignment -> DefinitionNodeRef -> DefinitionKind but that'll involve adding lifetimes and moving them around because Unpack uses a different lifetime while CurrentAssignment and DefinitionNodeRef uses the AST lifetime. But, now that I look at it again I think it might work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #14101

Comment on lines +1362 to +1363
let unpacked = infer_unpack_types(self.db, unpack);
self.diagnostics.extend(unpacked.diagnostics());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the quick-fix approach to the duplicate diagnostics would be to track on self which unpacks we've queried already, and if we've seen it already, don't merge the diagnostics.

But I think it makes sense to see if accumulators can work, since that's a more general approach that doesn't require us to manually track when we might call a sub-query more than once.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I suggest not to spend more time on deduplicating diagnostics because I plan to look into this soon anyway

carljm added a commit to Glyphack/ruff that referenced this pull request Nov 5, 2024
* main: (39 commits)
  Also remove trailing comma while fixing C409 and C419 (astral-sh#14097)
  Re-enable clippy `useless-format` (astral-sh#14095)
  Derive message formats macro support to string (astral-sh#14093)
  Avoid cloning `Name` when looking up function and class types (astral-sh#14092)
  Replace `format!` without parameters with `.to_string()` (astral-sh#14090)
  [red-knot] Do not panic when encountering string annotations (astral-sh#14091)
  [red-knot] Add MRO resolution for classes (astral-sh#14027)
  [red-knot] Remove `Type::None` (astral-sh#14024)
  Cached inference of all definitions in an unpacking (astral-sh#13979)
  Update dependency uuid to v11 (astral-sh#14084)
  Update Rust crate notify to v7 (astral-sh#14083)
  Update cloudflare/wrangler-action action to v3.11.0 (astral-sh#14080)
  Update dependency mdformat-mkdocs to v3.1.1 (astral-sh#14081)
  Update pre-commit dependencies (astral-sh#14082)
  Update dependency ruff to v0.7.2 (astral-sh#14077)
  Update NPM Development dependencies (astral-sh#14078)
  Update Rust crate thiserror to v1.0.67 (astral-sh#14076)
  Update Rust crate syn to v2.0.87 (astral-sh#14075)
  Update Rust crate serde to v1.0.214 (astral-sh#14074)
  Update Rust crate pep440_rs to v0.7.2 (astral-sh#14073)
  ...
dhruvmanila added a commit that referenced this pull request Nov 6, 2024
## Summary

Related to
#13979 (comment),
this PR removes the `current_unpack` state field from
`SemanticIndexBuilder` and passes the `Unpack` ingredient via the
`CurrentAssignment` -> `DefinitionNodeRef` conversion to finally store
it on `DefintionNodeKind`.

This involves updating the lifetime of `AnyParameterRef` (parameter to
`declare_parameter`) to use the `'db` lifetime. Currently, all AST nodes
stored on various enums are marked with `'a` lifetime but they're always
utilized using the `'db` lifetime.

This also removes the dedicated `'a` lifetime parameter on
`add_definition` which is currently being used in `DefinitionNodeRef`.
As mentioned, all AST nodes live through the `'db` lifetime so we can
remove the `'a` lifetime parameter from that method and use the `'db`
lifetime instead.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants