Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove dependency on pkg_resources from setuptools #1536

Merged
merged 22 commits into from
May 23, 2022

Conversation

jacobtylerwalls
Copy link
Member

@jacobtylerwalls jacobtylerwalls commented May 1, 2022

Steps

  • For new features or bug fixes, add a ChangeLog entry describing what your PR does.
  • Write a good description on what the PR does.

Description

Remove dependence on pkg_resources from setuptools, which is discouraged, and which increases import time.

Type of Changes

Type
🔨 Refactoring

Related Issue

Closes #1103
Closes #1320

  • [unscientific benchmark]: 50% reduction of import time on an M1 mac, 67% reduction on a cloud console

@jacobtylerwalls jacobtylerwalls added topic-performance Maintenance Discussion or action around maintaining astroid or the dev workflow labels May 1, 2022
@jacobtylerwalls jacobtylerwalls added this to the 2.12.0 milestone May 1, 2022
@DanielNoord
Copy link
Collaborator

Haven't looked at the code, but please see #1326 and it's reversal in #1409.

I don't have the exact reasons on the top of my head, but we should be careful here as this already regressed once.

@Pierre-Sassoulas Pierre-Sassoulas added the High effort 🏋 Difficult solution or problem to solve label May 1, 2022
@jacobtylerwalls
Copy link
Member Author

jacobtylerwalls commented May 1, 2022

the exact reasons

Thanks for the heads up! Looks like it was because parent packages were actually being imported. I believe I worked around that (I see one fewer failure in the pylint test suite that leads me to that belief).

There is still this one failure in pylint, though:

_____________________________________________________________________________________________________________ TestRunTC.test_regression_recursive_current_dir _____________________________________________________________________________________________________________

self = <test_self.TestRunTC object at 0x10fa44b20>

def test_regression_recursive_current_dir(self):
    with _test_sys_path():
        # pytest is including directory HERE/regrtest_data to sys.path which causes
        # astroid to believe that directory is a package.
        sys.path = [
            path
            for path in sys.path
            if not os.path.basename(path) == "regrtest_data"
        ]
        with _test_cwd():
            os.chdir(join(HERE, "regrtest_data", "directory"))
          self._test_output(
                ["."],
                expected_output="No such file or directory",
            )

tests/test_self.py:1270:


self = <test_self.TestRunTC object at 0x10fa44b20>, args = ['--rcfile=/Users/jacobwalls/pylint/pylint/testutils/testing_pylintrc', '.'], expected_output = 'No such file or directory'

def _test_output(self, args: list[str], expected_output: str) -> None:
    out = StringIO()
    args = _add_rcfile_default_pylintrc(args)
    self._run_pylint(args, out=out)
    actual_output = self._clean_paths(out.getvalue())
    expected_output = self._clean_paths(expected_output)
  assert expected_output.strip() in actual_output.strip()

E AssertionError: assert 'No such file or directory' in ''
E + where 'No such file or directory' = <built-in method strip of str object at 0x10f9d0800>()
E + where <built-in method strip of str object at 0x10f9d0800> = 'No such file or directory'.strip
E + and '' = <built-in method strip of str object at 0x10ca30030>()
E + where <built-in method strip of str object at 0x10ca30030> = ''.strip

tests/test_self.py:169: AssertionError

@coveralls
Copy link

coveralls commented May 3, 2022

Pull Request Test Coverage Report for Build 2331561683

  • 27 of 27 (100.0%) changed or added relevant lines in 3 files are covered.
  • 3 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+0.06%) to 91.864%

Files with Coverage Reduction New Missed Lines %
astroid/inference_tip.py 1 97.22%
astroid/interpreter/_import/spec.py 2 98.25%
Totals Coverage Status
Change from base Build 2321941977: 0.06%
Covered Lines: 9213
Relevant Lines: 10029

💛 - Coveralls

@jacobtylerwalls jacobtylerwalls added Blocked 🚧 A PR or issue blocked by another PR or issue pylint-tested PRs that don't cause major regressions with pylint labels May 3, 2022
@jacobtylerwalls
Copy link
Member Author

No more failures in pylint. I'd be more comfortable merging after dropping python3.6, but apparently this isn't blocked any more.

@jacobtylerwalls jacobtylerwalls marked this pull request as ready for review May 3, 2022 23:55
@jacobtylerwalls jacobtylerwalls removed the Blocked 🚧 A PR or issue blocked by another PR or issue label May 3, 2022
astroid/interpreter/_import/util.py Show resolved Hide resolved
@@ -128,7 +126,6 @@ def test_implicit_namespace_package(self) -> None:
def test_namespace_package_pth_support(self) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these tests still fail without certain code? I thought these lines were needed to make these packages namespace packages in a hacky way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, if you just make is_namespace() unconditionally return False, these tests fail. My understanding is that the hacks were just for the reliance on the private variables of pkg_resources. (Part of the reason pkg_resources is discouraged now, I take it!)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't gotten around to testing it, but are we sure this still catches namespace packages defined in the pkg_resources way? I know I had made a test package when I initially worked on this and was then able to keep the same tests. See: https://packaging.python.org/en/latest/guides/packaging-namespace-packages/#pkg-resources-style-namespace-packages

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestPackage.zip

The above zip contains a package as defined by that guideline. With the following diff:

diff --git a/astroid/interpreter/_import/spec.py b/astroid/interpreter/_import/spec.py
index 7a1de6cdb..cc963d722 100644
--- a/astroid/interpreter/_import/spec.py
+++ b/astroid/interpreter/_import/spec.py
@@ -328,6 +328,9 @@ def _find_spec_with_path(search_path, modname, module_parts, processed, submodul
         spec = finder.find_module(modname, module_parts, processed, submodule_path)
         if spec is None:
             continue
+        if spec.type == ModuleType.PY_NAMESPACE:
+            print(spec)
+            print(finder)
         return finder, spec
 
     raise ImportError(f"No module named {'.'.join(module_parts)}")

And the following command: pylint ../namespace_package/myotherpackage

main returns:

ModuleSpec(name='myotherpackage', type=<ModuleType.PY_NAMESPACE: 10>, location=None, origin=None, submodule_search_locations=['/Users/daniel/DocumentenLaptop/Programming/Github/namespace_package/myotherpackage'])
<astroid.interpreter._import.spec.PathSpecFinder object at 0x1062e12d0>
ModuleSpec(name='mynamespace', type=<ModuleType.PY_NAMESPACE: 10>, location='', origin='namespace', submodule_search_locations=['/Users/daniel/DocumentenLaptop/Programming/Github/namespace_package/mynamespace'])
<astroid.interpreter._import.spec.ExplicitNamespacePackageFinder object at 0x1062e17b0>
ModuleSpec(name='mynamespace', type=<ModuleType.PY_NAMESPACE: 10>, location='', origin='namespace', submodule_search_locations=['/Users/daniel/DocumentenLaptop/Programming/Github/namespace_package/mynamespace'])
<astroid.interpreter._import.spec.ExplicitNamespacePackageFinder object at 0x1062e18d0>
ModuleSpec(name='mynamespace', type=<ModuleType.PY_NAMESPACE: 10>, location='', origin='namespace', submodule_search_locations=['/Users/daniel/DocumentenLaptop/Programming/Github/namespace_package/mynamespace'])
<astroid.interpreter._import.spec.ExplicitNamespacePackageFinder object at 0x1062e1000>

Whereas this PR returns:

ModuleSpec(name='myotherpackage', type=<ModuleType.PY_NAMESPACE: 10>, location=None, origin=None, submodule_search_locations=['/Users/daniel/DocumentenLaptop/Programming/Github/namespace_package/myotherpackage'])
<astroid.interpreter._import.spec.PathSpecFinder object at 0x1053054b0>

Without the if:

ModuleSpec(name='myotherpackage', type=<ModuleType.PY_NAMESPACE: 10>, location=None, origin=None, submodule_search_locations=['/Users/daniel/DocumentenLaptop/Programming/Github/namespace_package/myotherpackage'])
<astroid.interpreter._import.spec.PathSpecFinder object at 0x10df415a0>
ModuleSpec(name='mynamespace', type=<ModuleType.PKG_DIRECTORY: 3>, location='/Users/daniel/DocumentenLaptop/Programming/Github/namespace_package/mynamespace', origin=None, submodule_search_locations=None)
<astroid.interpreter._import.spec.ImportlibFinder object at 0x10df41570>
ModuleSpec(name='mynamespace', type=<ModuleType.PKG_DIRECTORY: 3>, location='/Users/daniel/DocumentenLaptop/Programming/Github/namespace_package/mynamespace', origin=None, submodule_search_locations=None)
<astroid.interpreter._import.spec.ImportlibFinder object at 0x10df412a0>
ModuleSpec(name='subpackage_a', type=<ModuleType.PKG_DIRECTORY: 3>, location='/Users/daniel/DocumentenLaptop/Programming/Github/namespace_package/mynamespace/subpackage_a', origin=None, submodule_search_locations=None)
<astroid.interpreter._import.spec.ImportlibFinder object at 0x10df40340>
ModuleSpec(name='mynamespace', type=<ModuleType.PKG_DIRECTORY: 3>, location='/Users/daniel/DocumentenLaptop/Programming/Github/namespace_package/mynamespace', origin=None, submodule_search_locations=None)
<astroid.interpreter._import.spec.ImportlibFinder object at 0x10df41c60>
ModuleSpec(name='mynamespace', type=<ModuleType.PKG_DIRECTORY: 3>, location='/Users/daniel/DocumentenLaptop/Programming/Github/namespace_package/mynamespace', origin=None, submodule_search_locations=None)
<astroid.interpreter._import.spec.ImportlibFinder object at 0x10df41e40>
ModuleSpec(name='subpackage_a', type=<ModuleType.PKG_DIRECTORY: 3>, location='/Users/daniel/DocumentenLaptop/Programming/Github/namespace_package/mynamespace/subpackage_a', origin=None, submodule_search_locations=None)
<astroid.interpreter._import.spec.ImportlibFinder object at 0x10df40340>
ModuleSpec(name='mynamespace', type=<ModuleType.PKG_DIRECTORY: 3>, location='/Users/daniel/DocumentenLaptop/Programming/Github/namespace_package/mynamespace', origin=None, submodule_search_locations=None)
<astroid.interpreter._import.spec.ImportlibFinder object at 0x10df417b0>
ModuleSpec(name='subpackage_a', type=<ModuleType.PKG_DIRECTORY: 3>, location='/Users/daniel/DocumentenLaptop/Programming/Github/namespace_package/mynamespace/subpackage_a', origin=None, submodule_search_locations=None)
<astroid.interpreter._import.spec.ImportlibFinder object at 0x10df41e40>
ModuleSpec(name='mynamespace', type=<ModuleType.PKG_DIRECTORY: 3>, location='/Users/daniel/DocumentenLaptop/Programming/Github/namespace_package/mynamespace', origin=None, submodule_search_locations=None)
<astroid.interpreter._import.spec.ImportlibFinder object at 0x10df41e70>

mynamespace is thus interpreted as a different type of package. I don't know if this is actually a bug or if we are actually fixing a bug here, but I would need to do some more investigation before signing off on this. Perhaps you already know why this is different and whether this is a good thing from your own investigations?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this. I see the same output on main vs. the PR. Would you mind checking one more time? I also tried varying my cwd and invoking pylint differently.

I see PKG_DIRECTORY: 3 for mynamespace and PY_NAMESPACE: 10 for myotherpackage on both:

% python3 -m pylint mynamespace --disable=all 
ModuleSpec(name='mynamespace', type=<ModuleType.PKG_DIRECTORY: 3>, location='./mynamespace', origin=None, submodule_search_locations=None)

--------------------------------------------------------------------
Your code has been rated at 10.00/10 (previous run: 0.00/10, +10.00)

% python3 -m pylint myotherpackage --disable=all
ModuleSpec(name='myotherpackage', type=<ModuleType.PY_NAMESPACE: 10>, location=None, origin=None, submodule_search_locations=['/Users/myuser/TestPackage/./myotherpackage'])

--------------------------------------------------------------------
Your code has been rated at 10.00/10 (previous run: 0.00/10, +10.00)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you did say cd /tmp/TestPackage. Weird.

Is misread your question and thought you asked for my config. Well, there you have it 😄

I believe cd into the folder is required for this "test" to work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OH. This is the part I wasn't expecting:

python -m pip install -e .

So the test case has to do with installing the package under site-packages. I wasn't expecting that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I guess that is required for importlib to figure out that mynamespace is indeed a namespace package (although I don't know the specifics).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's possible it's the other way around. Installing it under site-packages with pip is enough to get astroid's ImportlibFinder to find it first rather than ExplicitNamespacePackageFinder. And in that case, if installed with pip, it is a regular package, not a namespace package. So we might have fixed something here if this wasn't happening before.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh hm, that might actually be true. I just assumed changed behaviour was a regression.

I am not completely sure though. So I'll need to play around with this a little bit more to be sure. Should find time for this this week!

@jacobtylerwalls
Copy link
Member Author

As I can't hit the now uncovered lines on my CPython install even on main, I'm assuming the coverage reduction is due to PyPy objects being built more like CPython now?

@jacobtylerwalls jacobtylerwalls added Blocked 🚧 A PR or issue blocked by another PR or issue and removed Blocked 🚧 A PR or issue blocked by another PR or issue labels May 7, 2022
@jacobtylerwalls
Copy link
Member Author

jacobtylerwalls commented May 14, 2022

@DanielNoord I played around with the old-style setuptools packages here:
https://github.com/PyCQA/astroid/blob/826f6efc14bb63531fafe67da26802e6e77005b4/tests/testdata/python3/data/path_pkg_resources_1/package/__init__.py#L1

I did manage to find a behavior difference with main, so I'll push changes in a second, but it doesn't seem to be the behavior change you observed. If this package already in the test suite is not correct somehow, let's try to identify how. 🤔

This is what I observed:

>>> import astroid
>>> import tests.testdata.python3.data.path_pkg_resources_1.package.foo
>>> astroid.interpreter._import.util.is_namespace('tests.testdata.python3.data.path_pkg_resources_1')  # was False with my original PR, True on main
True

@DanielNoord
Copy link
Collaborator

@DanielNoord I played around with the old-style setuptools packages here:

https://github.com/PyCQA/astroid/blob/826f6efc14bb63531fafe67da26802e6e77005b4/tests/testdata/python3/data/path_pkg_resources_1/package/__init__.py#L1

I did manage to find a behavior difference with main, so I'll push changes in a second, but it doesn't seem to be the behavior change you observed. If this package already in the test suite is not correct somehow, let's try to identify how. 🤔

This is what I observed:

>>> import astroid
>>> import tests.testdata.python3.data.path_pkg_resources_1.package.foo
>>> astroid.interpreter._import.util.is_namespace('tests.testdata.python3.data.path_pkg_resources_1')  # was False with my original PR, True on main
True

This was the original test case that my attempt bonked on as well. I'm excited to see what you came up with!

@jacobtylerwalls jacobtylerwalls removed the pylint-tested PRs that don't cause major regressions with pylint label May 14, 2022
Copy link
Collaborator

@DanielNoord DanielNoord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some last nitpicks. I have tested and played around with this some more and I think this might indeed fix all the issues.

Let's go ahead and get this in! 🎉

astroid/interpreter/_import/util.py Outdated Show resolved Hide resolved
astroid/interpreter/_import/util.py Outdated Show resolved Hide resolved
tests/unittest_manager.py Show resolved Hide resolved
tests/unittest_manager.py Show resolved Hide resolved
astroid/interpreter/_import/util.py Outdated Show resolved Hide resolved
Comment on lines 57 to 64
if found_spec.submodule_search_locations is not None:
for search_location in found_spec.submodule_search_locations:
if any(
_is_setuptools_namespace(directory)
for directory in pathlib.Path(search_location).iterdir()
if directory.is_dir()
):
return True
Copy link
Member Author

@jacobtylerwalls jacobtylerwalls May 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aaaand with better distinguishing between builtins and namespace packages, we ... don't even need this anymore? Highlighting in a comment in case we pop the hood on this again.

astroid/interpreter/_import/util.py Outdated Show resolved Hide resolved
astroid/interpreter/_import/util.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@DanielNoord DanielNoord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@jacobtylerwalls jacobtylerwalls changed the title Remove dependence on pkg_resources from setuptools Remove dependency on pkg_resources from setuptools May 16, 2022
@jacobtylerwalls jacobtylerwalls merged commit 5067f08 into pylint-dev:main May 23, 2022
@jacobtylerwalls jacobtylerwalls deleted the pkg_resources branch May 23, 2022 02:07
@DanielNoord
Copy link
Collaborator

DanielNoord commented May 29, 2022

@jacobtylerwalls I'm seeing a crash in one of my pylint primer runs. I haven't investigated, but this was my first guess. See:
https://github.com/DanielNoord/pylint/runs/6641534527?check_suite_focus=true
Seems to happen when lining django? Do you have any idea?

@jacobtylerwalls
Copy link
Member Author

Yep, this PR caused it. I'll take a look.

@DanielNoord
Copy link
Collaborator

@jacobtylerwalls Thanks!

@jacobtylerwalls
Copy link
Member Author

@DanielNoord See #1575, I'm waiting on a response to the CPython issue linked there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
High effort 🏋 Difficult solution or problem to solve Maintenance Discussion or action around maintaining astroid or the dev workflow topic-performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Decrease the time required to import astroid Remove dependency to setuptools and distutil
4 participants