Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] Respect typeshed's VERSIONS file when resolving stdlib modules #12141

Merged
merged 62 commits into from
Jul 5, 2024

Conversation

AlexWaygood
Copy link
Member

@AlexWaygood AlexWaygood commented Jul 1, 2024

Summary

In the module resolver, we need to treat standard-library paths differently to paths relative to other module-resolver search paths. This is for three reasons:

  • Unlike for other module-resolver paths, only .pyi files are valid relative to standard-library paths
  • When checking whether standard-library paths "exist", we need to take account of typeshed's VERSIONS file as well as simply checking whether the path exists on disk.
  • Although this isn't implemented now, we will also at some point need to implement support for vendored standard-library files as well as standard-library files in a custom typeshed directory that actually exists on disk.

Although it's not implemented here yet, two other kinds of paths will also need special treatment in the module resolver:

  • The module resolver needs to treat site-packages search paths differently, due to the fact that a module foo could resolve to a site-packages/foo path or a site-packages/foo-stubs/foo path
  • If we decide to vendor third-party stubs from typeshed as well as standard-library stubs, these will need to be treated specially as well.

To account for these differences, this PR introduces a new module, crates/red_knot_module_resolver/src/path.rs. The new module has two enums, ModuleResolutionPathBuf and ModuleResolutionPathRef. Each enum has four different variants representing the four different kinds of search paths the module resolver must handle:

The different variants abstract over the fact that for different kinds of paths, there will be different answers to questions such as "Is this a valid join() operation?", "Does this path exist?" and "Does this path represent a directory?".

Test Plan

Unit tests have been added to src/path.rs and src/typeshed/versions.rs. Integration tests have been added to src/resolver.rs.

@AlexWaygood AlexWaygood added the red-knot Multi-file analysis & type inference label Jul 1, 2024
Copy link
Contributor

github-actions bot commented Jul 1, 2024

ruff-ecosystem results

Linter (stable)

ℹ️ ecosystem check encountered linter errors. (no lint changes; 1 project error)

python-trio/trio (error)

Failed to clone python-trio/trio: warning: Could not find remote branch master to clone.
fatal: Remote branch master not found in upstream origin

Linter (preview)

ℹ️ ecosystem check encountered linter errors. (no lint changes; 1 project error)

python-trio/trio (error)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

Failed to clone python-trio/trio: warning: Could not find remote branch master to clone.
fatal: Remote branch master not found in upstream origin

Formatter (stable)

ℹ️ ecosystem check encountered format errors. (no format changes; 1 project error)

python-trio/trio (error)

Failed to clone python-trio/trio: warning: Could not find remote branch master to clone.
fatal: Remote branch master not found in upstream origin

Formatter (preview)

ℹ️ ecosystem check encountered format errors. (no format changes; 1 project error)

python-trio/trio (error)

ruff format --preview

Failed to clone python-trio/trio: warning: Could not find remote branch master to clone.
fatal: Remote branch master not found in upstream origin

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel a bit mixed about having all those different Path versions. It feels like the right level of abstraction but it's a lot of code and it becomes hard to spot how they are different because of all the boilerplate that is identical between all paths. I think we should try to reduce the amount of code. I'm not entirely sure yet how, but I think we should try

I think a simpler design would be to just have an enum with different variants. Very similar to what we have with ModuleSearchPath. I think it would help readability because I can then see in the methods how the different path variants differ.

One thing that I think would already help if we wouldn't need both Path and PathBuf variants. Are there many places where we use Path where it helps to avoid allocating and where we can't use a &PathBuf?

It seems that ModuleSearchPath implements a s similar interface to each Path variant.

crates/red_knot_module_resolver/src/db.rs Outdated Show resolved Hide resolved
crates/red_knot_module_resolver/src/module.rs Outdated Show resolved Hide resolved
crates/red_knot_module_resolver/src/typeshed/versions.rs Outdated Show resolved Hide resolved
crates/red_knot_module_resolver/src/typeshed/versions.rs Outdated Show resolved Hide resolved
crates/red_knot_python_semantic/src/db.rs Outdated Show resolved Hide resolved
@AlexWaygood
Copy link
Member Author

Thanks, agree that it's a lot of code (more than I thought it would be, and more than I'm comfortable with). I'll try to address that.

@AlexWaygood
Copy link
Member Author

d91626c made this PR around 600 lines of code shorter! The path.rs file looks pretty different now 🥳

#[salsa::tracked]
pub(crate) fn parse_typeshed_versions(db: &dyn Db, versions_file: VfsFile) -> TypeshedVersions {
let file_content = source_text(db.upcast(), versions_file);
file_content.parse().unwrap()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unwrapping is obviously the wrong thing to do here... Should I just return a Result from this query? Not sure how to handle this. If the user passes us a custom typeshed directory with an invalid VERSIONS file, we can't infer any information about the standard library

@AlexWaygood
Copy link
Member Author

This still needs a lot more tests, but other than that it's basically where I want it to be now... though there's still a bunch of .unwrap() calls where I should be bubbling up errors instead.

@MichaReiser
Copy link
Member

This still needs a lot more tests, but other than that it's basically where I want it to be now... though there's still a bunch of .unwrap() calls where I should be bubbling up errors instead.

Handling and propagating errors is something that I have yet to fully figure out for Red Knot.

Generally, the idea is that query methods shouldn't fail and instead keep doing something useful to make ruff more error resilient and it simplifies consuming queries (it would be rather annoying if source_text returned a Result because all dependent queries would have to return a Result as well).

I could see this working for source_text where reading a file failed. ' source_text pushes a diagnostic using a Salsa accumulator, returns an empty string, and stores a flag on the result that the read operation failed. For most queries calling source_text, checking the error isn't necessary, an empty string works just fine for them. However, we probably want to check the error flag when calling check_file or format or any other "file-level" operation to short-circuit in this case.

I see two options for how we can handle this with versions:

  • If reading versions fails (or the custom typeshed directory doesn't exist), emit a diagnostic and fall back to the vendored stubs.
  • It's still unclear to me if we handle settings as part of queries or if settings are discovered ahead of time when discovering all the workspace files. If that's the case, then parsing and resolving the settings would happen outside salsa where we could handle the error appropriately.

For now, I think adding a TODO and/or opening an issue to follow up with error handling seems fine to me.

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good!

I think we can simplify it a bit further and I would move some code around but we're on a good path here!

crates/red_knot_module_resolver/src/path.rs Outdated Show resolved Hide resolved
crates/red_knot_module_resolver/src/path.rs Outdated Show resolved Hide resolved
crates/red_knot_module_resolver/src/path.rs Outdated Show resolved Hide resolved
crates/red_knot_module_resolver/src/path.rs Outdated Show resolved Hide resolved
crates/red_knot_module_resolver/src/path.rs Outdated Show resolved Hide resolved
/// Optional (already validated) path to standard-library typeshed stubs.
/// If this is not provided, we will fallback to our vendored typeshed stubs
/// bundled as a zip file in the binary
pub custom_typeshed: Option<FileSystemPathBuf>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should change the settings to Option<(FileSystemPathBuf, TypeshedVersions)>. This way, parsing the typeshed versions happens outside of salsa (solves the error reporting problem).

Copy link
Member Author

@AlexWaygood AlexWaygood Jul 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, but we need to re-parse the VERSIONS file every time it changes if we're in an LSP context. I think a Salsa query is pretty well suited to this, if I understand correctly. The difficulty just comes in bubbling the error out of salsa

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the need for recomputing also applies to the custom_typeshed path?

Salsa will re-run all resolve_module queries when the ModuleResolutionSettings change. The only thing that happens outside (at least for now) is how we react to setting changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user won't pass a direct path to the VERSIONS file on the command line or in a config file. They'll pass a direct path to the custom typeshed directory, and the VERSIONS file is found inside that directory. If they then edit the contents of the VERSIONS file (or delete it entirely), we'll need to react to that change and attempt to re-parse the VERSIONS file in the custom typeshed directory, even though none of the settings passed by the user will have changed in any way

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I'm okay with delaying it. I do think that we want some form of eager validation of the typeshed path and possibly even abort if the path is invalid. But we can look into this when adding support for settings.

Having said that. I would expect that the CLI eagerly validates the custom typeshed path and aborts if the path doesn't exist or the versions file can't be parsed.

crates/red_knot_module_resolver/src/resolver.rs Outdated Show resolved Hide resolved
crates/red_knot_module_resolver/src/resolver.rs Outdated Show resolved Hide resolved
crates/red_knot_module_resolver/src/typeshed/versions.rs Outdated Show resolved Hide resolved
crates/red_knot_module_resolver/src/typeshed/versions.rs Outdated Show resolved Hide resolved
@MichaReiser MichaReiser self-requested a review July 3, 2024 07:50
@AlexWaygood AlexWaygood force-pushed the custom-typeshed-versions branch 2 times, most recently from 7b453a4 to 378d01d Compare July 3, 2024 10:05
Comment on lines -520 to -542
// TODO: Port typeshed test case. Porting isn't possible at the moment because the vendored zip
// is part of the red knot crate
// #[test]
// fn typeshed_zip_created_at_build_time() -> anyhow::Result<()> {
// // The file path here is hardcoded in this crate's `build.rs` script.
// // Luckily this crate will fail to build if this file isn't available at build time.
// const TYPESHED_ZIP_BYTES: &[u8] =
// include_bytes!(concat!(env!("OUT_DIR"), "/zipped_typeshed.zip"));
// assert!(!TYPESHED_ZIP_BYTES.is_empty());
// let mut typeshed_zip_archive = ZipArchive::new(Cursor::new(TYPESHED_ZIP_BYTES))?;
//
// let path_to_functools = Path::new("stdlib").join("functools.pyi");
// let mut functools_module_stub = typeshed_zip_archive
// .by_name(path_to_functools.to_str().unwrap())
// .unwrap();
// assert!(functools_module_stub.is_file());
//
// let mut functools_module_stub_source = String::new();
// functools_module_stub.read_to_string(&mut functools_module_stub_source)?;
//
// assert!(functools_module_stub_source.contains("def update_wrapper("));
// Ok(())
// }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I already copied over this test into typeshed.rs in a prior PR. I thought I'd deleted this commented out version, but I obviously hadn't.)

crates/red_knot_module_resolver/src/path.rs Outdated Show resolved Hide resolved
crates/red_knot_module_resolver/src/path.rs Outdated Show resolved Hide resolved
crates/red_knot_module_resolver/src/path.rs Outdated Show resolved Hide resolved
}

#[derive(Clone, PartialEq, Eq, Hash)]
pub(crate) struct ModuleResolutionPathBuf(ModuleResolutionPathBufInner);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it quite confusing in reading the code, and I think also it is error-prone, that we use the same type to represent both (a) a search path root, and (b) the path to an actual module. I think these are different things that need to obey different constraints (e.g. a search path root should never have any file extension at all) and provide different behaviors (e.g. a search path root shouldn't be mutable; we shouldn't ever push new components to it; many of the other methods provided here are also never used - and should never be used, because they aren't even meaningful - on a search root).

IMO it will be a lot clearer if we explicitly represent these as different types. Right now we only know which is which by context in the calling code; we get no help from the type system in differentiating two things that should never be confused with each other. I don't think it would even be that complex a change; it would just mean that in some cases where we currently "clone" a search root into an initial package path, we'd instead be explicitly constructing one type from the other.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might even make sense that a package/module path is always represented as simply a relative path and an Arc to a search root. Then the "kind" is already encoded in the search root kind, and we can also eliminate all these cases of needing to make sure the passed-in search root kind matches the module path kind.

Copy link
Member

@MichaReiser MichaReiser Jul 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alex's first version used different types for each path variant and it added a ton of boilerplate code (about 1000 lines) that made it very hard to find the "important" bits. I also think that it didn't add much value because the only places where the different path types were used were in the match arms of that given variant. In which case using different types doesn't protect from much. If you screw up the variant name, than it is as easy to screw up the path type as well (and it's a very easy error to spot).

I would think differently if the paths were exposed externally, which they are not.

Whether we should expose the search path root from an actual path differently, I don't know. It's also unclear to me how the code would need to change for that.

Copy link
Contributor

@carljm carljm Jul 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think "use different types for each path variant" is at all related to the suggestion I'm making here. I don't think that would be a good idea.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After offline discussion with @MichaReiser, our thinking is that this might be an improvement, but it might be better to record it in a TODO comment and land the PR so we don't have a big PR outstanding for so long.

I'm ok with this: this is all pretty much encapsulated in a few Salsa queries with simple API, so it's not like we will be writing a bunch of new code directly depending on these path abstractions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree that what you propose sounds cleaner, but I think what I have now is okay, and I'd like to land this now. If I have time next week, I'm happy to come back to this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a TODO

}

#[must_use]
fn relativize_path(&self, absolute_path: &'a FileSystemPath) -> Option<Self> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm finding it quite hard to understand the handling of relative vs absolute paths. Here in this method we create instances of ModuleResolutionPathRef that contain relative paths, but then in the above methods it seems like we expect them to contain absolute paths, because for non-stdlib variants we just ignore the passed-in search path entirely.

I think we should establish some clarity/invariants around which kinds of paths should be absolute and which should be relative.

target_version: TargetVersion,
) -> bool {
match (self, search_path) {
(Self::Extra(path), Self::Extra(_)) => db.file_system().is_directory(path),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't like that we ignore the passed-in search path entirely for non-stdlib variants in so many of these methods. It seems very error-prone. At the very least I'd want to see some debug assertions of consistency, though I'd rather have an API that didn't have redundant information in the first place (e.g. if module paths were enforced as always relative and required to be combined with a search path to be turned into an absolute path.)

crates/red_knot_module_resolver/src/resolver.rs Outdated Show resolved Hide resolved
crates/red_knot_module_resolver/src/resolver.rs Outdated Show resolved Hide resolved
@MichaReiser
Copy link
Member

MichaReiser commented Jul 5, 2024

You can avoid the Arc by adding a few lifetimes:

Index: crates/red_knot_module_resolver/src/resolver.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/crates/red_knot_module_resolver/src/resolver.rs b/crates/red_knot_module_resolver/src/resolver.rs
--- a/crates/red_knot_module_resolver/src/resolver.rs	(revision 4c7a1061a9e11a1b404e1c5cf6c334a2890a73c1)
+++ b/crates/red_knot_module_resolver/src/resolver.rs	(date 1720208092508)
@@ -309,11 +309,11 @@
     None
 }
 
-fn resolve_package<'a, I>(
-    db: &dyn Db,
+fn resolve_package<'a, 'db, I>(
+    db: &'db dyn Db,
     module_search_path: &ModuleResolutionPathBuf,
     components: I,
-    typeshed_versions: &LazyTypeshedVersions,
+    typeshed_versions: &LazyTypeshedVersions<'db>,
     target_version: TargetVersion,
 ) -> Result<ResolvedPackage, PackageKind>
 where
Index: crates/red_knot_module_resolver/src/typeshed/versions.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/crates/red_knot_module_resolver/src/typeshed/versions.rs b/crates/red_knot_module_resolver/src/typeshed/versions.rs
--- a/crates/red_knot_module_resolver/src/typeshed/versions.rs	(revision 4c7a1061a9e11a1b404e1c5cf6c334a2890a73c1)
+++ b/crates/red_knot_module_resolver/src/typeshed/versions.rs	(date 1720207666907)
@@ -14,11 +14,13 @@
 
 use crate::db::Db;
 use crate::module_name::ModuleName;
+use crate::resolver::ResolvedModuleResolutionSettings;
 use crate::supported_py_version::TargetVersion;
+use crate::ModuleResolutionSettings;
 
-pub(crate) struct LazyTypeshedVersions(OnceCell<TypeshedVersions>);
+pub(crate) struct LazyTypeshedVersions<'db>(OnceCell<&'db TypeshedVersions>);
 
-impl LazyTypeshedVersions {
+impl<'db> LazyTypeshedVersions<'db> {
     #[must_use]
     pub(crate) fn new() -> Self {
         Self(OnceCell::new())
@@ -40,7 +42,7 @@
     pub(crate) fn query_module(
         &self,
         module: &ModuleName,
-        db: &dyn Db,
+        db: &'db dyn Db,
         stdlib_root: &FileSystemPath,
         target_version: TargetVersion,
     ) -> TypeshedVersionsQueryResult {
@@ -56,16 +58,14 @@
             // this should invalidate not just the specific module resolution we're currently attempting,
             // but all type inference that depends on any standard-library types.
             // Unwrapping here is not correct...
-            parse_typeshed_versions(db, versions_file)
-                .as_ref()
-                .unwrap()
-                .clone()
+            parse_typeshed_versions(db, versions_file).as_ref().unwrap()
         });
+
         versions.query_module(module, PyVersion::from(target_version))
     }
 }
 
-#[salsa::tracked]
+#[salsa::tracked(return_ref)]
 pub(crate) fn parse_typeshed_versions(
     db: &dyn Db,
     versions_file: VfsFile,
@@ -152,8 +152,8 @@
     }
 }
 
-#[derive(Debug, Clone, PartialEq, Eq)]
-pub(crate) struct TypeshedVersions(Arc<FxHashMap<ModuleName, PyVersionRange>>);
+#[derive(Debug, PartialEq, Eq)]
+pub(crate) struct TypeshedVersions(FxHashMap<ModuleName, PyVersionRange>);
 
 impl TypeshedVersions {
     #[must_use]
@@ -300,7 +300,7 @@
                 reason: TypeshedVersionsParseErrorKind::EmptyVersionsFile,
             })
         } else {
-            Ok(Self(Arc::new(map)))
+            Ok(Self(map))
         }
     }
 }
Index: crates/red_knot_module_resolver/src/path.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/crates/red_knot_module_resolver/src/path.rs b/crates/red_knot_module_resolver/src/path.rs
--- a/crates/red_knot_module_resolver/src/path.rs	(revision 4c7a1061a9e11a1b404e1c5cf6c334a2890a73c1)
+++ b/crates/red_knot_module_resolver/src/path.rs	(date 1720208063481)
@@ -108,11 +108,11 @@
     }
 
     #[must_use]
-    pub(crate) fn is_regular_package(
+    pub(crate) fn is_regular_package<'db>(
         &self,
-        db: &dyn Db,
+        db: &'db dyn Db,
         search_path: &Self,
-        typeshed_versions: &LazyTypeshedVersions,
+        typeshed_versions: &LazyTypeshedVersions<'db>,
         target_version: TargetVersion,
     ) -> bool {
         ModuleResolutionPathRef::from(self).is_regular_package(
@@ -124,11 +124,11 @@
     }
 
     #[must_use]
-    pub(crate) fn is_directory(
+    pub(crate) fn is_directory<'db>(
         &self,
-        db: &dyn Db,
+        db: &'db dyn Db,
         search_path: &Self,
-        typeshed_versions: &LazyTypeshedVersions,
+        typeshed_versions: &LazyTypeshedVersions<'db>,
         target_version: TargetVersion,
     ) -> bool {
         ModuleResolutionPathRef::from(self).is_directory(
@@ -158,11 +158,11 @@
     }
 
     /// Returns `None` if the path doesn't exist, isn't accessible, or if the path points to a directory.
-    pub(crate) fn to_vfs_file(
+    pub(crate) fn to_vfs_file<'db>(
         &self,
-        db: &dyn Db,
+        db: &'db dyn Db,
         search_path: &Self,
-        typeshed_versions: &LazyTypeshedVersions,
+        typeshed_versions: &LazyTypeshedVersions<'db>,
         target_version: TargetVersion,
     ) -> Option<VfsFile> {
         ModuleResolutionPathRef::from(self).to_vfs_file(
@@ -198,12 +198,12 @@
 
 impl<'a> ModuleResolutionPathRefInner<'a> {
     #[must_use]
-    fn query_stdlib_version(
+    fn query_stdlib_version<'db>(
         module_path: &'a FileSystemPath,
-        typeshed_versions: &LazyTypeshedVersions,
+        typeshed_versions: &LazyTypeshedVersions<'db>,
         stdlib_search_path: Self,
         stdlib_root: &FileSystemPath,
-        db: &dyn Db,
+        db: &'db dyn Db,
         target_version: TargetVersion,
     ) -> TypeshedVersionsQueryResult {
         let Some(module_name) = stdlib_search_path
@@ -216,11 +216,11 @@
     }
 
     #[must_use]
-    fn is_directory(
+    fn is_directory<'db>(
         &self,
-        db: &dyn Db,
+        db: &'db dyn Db,
         search_path: Self,
-        typeshed_versions: &LazyTypeshedVersions,
+        typeshed_versions: &LazyTypeshedVersions<'db>,
         target_version: TargetVersion,
     ) -> bool {
         match (self, search_path) {
@@ -241,11 +241,11 @@
     }
 
     #[must_use]
-    fn is_regular_package(
+    fn is_regular_package<'db>(
         &self,
-        db: &dyn Db,
+        db: &'db dyn Db,
         search_path: Self,
-        typeshed_versions: &LazyTypeshedVersions,
+        typeshed_versions: &LazyTypeshedVersions<'db>,
         target_version: TargetVersion,
     ) -> bool {
         fn is_non_stdlib_pkg(path: &FileSystemPath, db: &dyn Db) -> bool {
@@ -274,11 +274,11 @@
         }
     }
 
-    fn to_vfs_file(
+    fn to_vfs_file<'db>(
         self,
-        db: &dyn Db,
+        db: &'db dyn Db,
         search_path: Self,
-        typeshed_versions: &LazyTypeshedVersions,
+        typeshed_versions: &LazyTypeshedVersions<'db>,
         target_version: TargetVersion,
     ) -> Option<VfsFile> {
         match (self, search_path) {
@@ -386,11 +386,11 @@
 
 impl<'a> ModuleResolutionPathRef<'a> {
     #[must_use]
-    pub(crate) fn is_directory(
+    pub(crate) fn is_directory<'db>(
         &self,
-        db: &dyn Db,
+        db: &'db dyn Db,
         search_path: impl Into<Self>,
-        typeshed_versions: &LazyTypeshedVersions,
+        typeshed_versions: &LazyTypeshedVersions<'db>,
         target_version: TargetVersion,
     ) -> bool {
         self.0
@@ -398,11 +398,11 @@
     }
 
     #[must_use]
-    pub(crate) fn is_regular_package(
+    pub(crate) fn is_regular_package<'db>(
         &self,
-        db: &dyn Db,
+        db: &'db dyn Db,
         search_path: impl Into<Self>,
-        typeshed_versions: &LazyTypeshedVersions,
+        typeshed_versions: &LazyTypeshedVersions<'db>,
         target_version: TargetVersion,
     ) -> bool {
         self.0
@@ -410,11 +410,11 @@
     }
 
     #[must_use]
-    pub(crate) fn to_vfs_file(
+    pub(crate) fn to_vfs_file<'db>(
         self,
-        db: &dyn Db,
+        db: &'db dyn Db,
         search_path: impl Into<Self>,
-        typeshed_versions: &LazyTypeshedVersions,
+        typeshed_versions: &LazyTypeshedVersions<'db>,
         target_version: TargetVersion,
     ) -> Option<VfsFile> {
         self.0
@@ -794,7 +794,7 @@
         );
     }
 
-    fn py38_stdlib_test_case() -> (TestDb, ModuleResolutionPathBuf, LazyTypeshedVersions) {
+    fn py38_stdlib_test_case() -> (TestDb, ModuleResolutionPathBuf) {
         let TestCase {
             db,
             custom_typeshed,
@@ -802,12 +802,14 @@
         } = create_resolver_builder().unwrap().build();
         let stdlib_module_path =
             ModuleResolutionPathBuf::stdlib_from_typeshed_root(&custom_typeshed).unwrap();
-        (db, stdlib_module_path, LazyTypeshedVersions::new())
+        (db, stdlib_module_path)
     }
 
     #[test]
     fn mocked_typeshed_existing_regular_stdlib_pkg_py38() {
-        let (db, stdlib_path, versions) = py38_stdlib_test_case();
+        let (db, stdlib_path) = py38_stdlib_test_case();
+
+        let versions = LazyTypeshedVersions::new();
 
         let asyncio_regular_package = stdlib_path.join("asyncio");
         assert!(asyncio_regular_package.is_directory(
@@ -855,7 +857,9 @@
 
     #[test]
     fn mocked_typeshed_existing_namespace_stdlib_pkg_py38() {
-        let (db, stdlib_path, versions) = py38_stdlib_test_case();
+        let (db, stdlib_path) = py38_stdlib_test_case();
+
+        let versions = LazyTypeshedVersions::new();
 
         let xml_namespace_package = stdlib_path.join("xml");
         assert!(xml_namespace_package.is_directory(
@@ -886,7 +890,9 @@
 
     #[test]
     fn mocked_typeshed_single_file_stdlib_module_py38() {
-        let (db, stdlib_path, versions) = py38_stdlib_test_case();
+        let (db, stdlib_path) = py38_stdlib_test_case();
+
+        let versions = LazyTypeshedVersions::new();
 
         let functools_module = stdlib_path.join("functools.pyi");
         assert!(functools_module
@@ -903,9 +909,10 @@
 
     #[test]
     fn mocked_typeshed_nonexistent_regular_stdlib_pkg_py38() {
-        let (db, stdlib_path, versions) = py38_stdlib_test_case();
+        let (db, stdlib_path) = py38_stdlib_test_case();
 
         let collections_regular_package = stdlib_path.join("collections");
+        let versions = LazyTypeshedVersions::new();
         assert_eq!(
             collections_regular_package.to_vfs_file(
                 &db,
@@ -931,7 +938,8 @@
 
     #[test]
     fn mocked_typeshed_nonexistent_namespace_stdlib_pkg_py38() {
-        let (db, stdlib_path, versions) = py38_stdlib_test_case();
+        let (db, stdlib_path) = py38_stdlib_test_case();
+        let versions = LazyTypeshedVersions::new();
 
         let importlib_namespace_package = stdlib_path.join("importlib");
         assert_eq!(
@@ -972,7 +980,8 @@
 
     #[test]
     fn mocked_typeshed_nonexistent_single_file_module_py38() {
-        let (db, stdlib_path, versions) = py38_stdlib_test_case();
+        let (db, stdlib_path) = py38_stdlib_test_case();
+        let versions = LazyTypeshedVersions::new();
 
         let non_existent = stdlib_path.join("doesnt_even_exist");
         assert_eq!(
@@ -988,7 +997,7 @@
         ));
     }
 
-    fn py39_stdlib_test_case() -> (TestDb, ModuleResolutionPathBuf, LazyTypeshedVersions) {
+    fn py39_stdlib_test_case() -> (TestDb, ModuleResolutionPathBuf) {
         let TestCase {
             db,
             custom_typeshed,
@@ -999,12 +1008,13 @@
             .build();
         let stdlib_module_path =
             ModuleResolutionPathBuf::stdlib_from_typeshed_root(&custom_typeshed).unwrap();
-        (db, stdlib_module_path, LazyTypeshedVersions::new())
+        (db, stdlib_module_path)
     }
 
     #[test]
     fn mocked_typeshed_existing_regular_stdlib_pkgs_py39() {
-        let (db, stdlib_path, versions) = py39_stdlib_test_case();
+        let (db, stdlib_path) = py39_stdlib_test_case();
+        let versions = LazyTypeshedVersions::new();
 
         // Since we've set the target version to Py39,
         // `collections` should now exist as a directory, according to VERSIONS...
@@ -1057,7 +1067,9 @@
 
     #[test]
     fn mocked_typeshed_existing_namespace_stdlib_pkg_py39() {
-        let (db, stdlib_path, versions) = py39_stdlib_test_case();
+        let (db, stdlib_path) = py39_stdlib_test_case();
+
+        let versions = LazyTypeshedVersions::new();
 
         // The `importlib` directory now also exists...
         let importlib_namespace_package = stdlib_path.join("importlib");
@@ -1100,7 +1112,8 @@
 
     #[test]
     fn mocked_typeshed_nonexistent_namespace_stdlib_pkg_py39() {
-        let (db, stdlib_path, versions) = py39_stdlib_test_case();
+        let (db, stdlib_path) = py39_stdlib_test_case();
+        let versions = LazyTypeshedVersions::new();
 
         // The `xml` package no longer exists on py39:
         let xml_namespace_package = stdlib_path.join("xml");

You also don't need a new dependency. You can use std::cell::OnceCell

But all the methods that take a db as argument feel a bit awkward because of the many arguments. It makes me wonder if we shouldn't create a stateful ModuleResolver struct that handles a single resolution

struct ModuleResolver<'db> {
	db: &'db Db,
	versions: Option<&'db TypeshedVersions>,
	target_version: PyVersion
}

impl ModuleResolver {
	fn resolve_package(&mut self, module_resolution_path, components) -> Result<ResolvedPackage, PackageKind> {
		...
	}

	fn is_directory(&mut self, module_resolution_path: &ModuleResolutionPath) -> bool {
		...
	}

	fn is_regular_package(&mut self, ...)
}

The methods can all take &self if versions uses a OnceCell, but I also don't mind passing a mut. Anyway, I don't think we have to do this now. I do like @carljm suggestion. Let's revisit this when also making that change.

@AlexWaygood AlexWaygood enabled auto-merge (squash) July 5, 2024 22:39
@AlexWaygood AlexWaygood merged commit a62a432 into main Jul 5, 2024
18 checks passed
@AlexWaygood AlexWaygood deleted the custom-typeshed-versions branch July 5, 2024 22:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants