[ENH] Importlib Metadata shows two distributions with same name for editable installs #4170

Open
Jacob-Stevens-Haas opened this issue Jan 3, 2024 · 6 comments

Comments

@Jacob-Stevens-Haas

Jacob-Stevens-Haas commented Jan 3, 2024

setuptools version

setuptools 69.0.3

Python version

3.10.12

OS

Ubuntu

Additional environment information

Applies to both src/ and flat layout

Description

I was trying to identify editable packages installed in my current environment by looking at direct_url.json for a package given by importlib.metadata.distribution(name). It reported that the file didn't exist. Upon further investigation, importlib.metadata.distributions() had two entries for my package: one PathDistribution whose files contain the dist-info in site-packages, and another PathDistribution whose files contain the egg-info built by setuptools in the local directory. distribution(name) only finds the local version. Interestingly, importlib.metadata.packages_distributions() shows that the import package foo has two distribution packages associated with it, both with the same name.

Expected behavior

I would've expected just one distribution package for an editable install, in this case with a single import package associated. At a lower level, I'm not sure it really makes sense to ever have two distributions of the same name installed, and therefore perhaps setuptools should have internally raised an error when distributions finds two of the same name or two import packages with the same name in the same distribution.

How to Reproduce

I've got an example distribution package, foo, with one import package, also named foo:

  1. clone and cd into repo at https://github.com/Jacob-Stevens-Haas/setuptools_test
  2. create and activate a virtual environment (I'm using venv)
  3. pip install -e .
  4. python show_dists.py

This will print the results of distributions(), showing two named "foo", the files in the two matching distributions, and then the packages_distributions() results.

  5. (optional) Playing around with the working directory or switching to a src layout (see the src branch) gives the same result. pip freeze shows just a single distribution package.
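
For readers who don't want to clone the repository, here is a minimal sketch of a check in the same spirit. It is not the actual show_dists.py: the helper name duplicate_distribution_names and its optional search_path argument are assumptions made for illustration, and it relies only on importlib.metadata.distributions().

```python
import sys
from collections import defaultdict
from importlib.metadata import distributions


def duplicate_distribution_names(search_path=None):
    """Map each distribution name found more than once to where it was found.

    Hypothetical helper; not part of setuptools or importlib.metadata.
    """
    path = sys.path if search_path is None else list(search_path)
    locations = defaultdict(list)
    for dist in distributions(path=path):
        name = dist.metadata.get("Name")
        if name is None:  # skip metadata without a Name field
            continue
        # For path-based distributions, locate_file("") resolves to the
        # directory that holds the .dist-info / .egg-info folder.
        locations[name].append(str(dist.locate_file("")))
    return {name: dirs for name, dirs in locations.items() if len(dirs) > 1}


if __name__ == "__main__":
    for name, dirs in duplicate_distribution_names().items():
        print(name, dirs)
```

Run from the project root after pip install -e ., this would be expected to list foo with two locations, matching the output below.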

Output

'foo'
'foo'
'pip'
'setuptools'
[PackagePath('pyproject.toml'),
 PackagePath('foo/__init__.py'),
 PackagePath('foo.egg-info/PKG-INFO'),
 PackagePath('foo.egg-info/SOURCES.txt'),
 PackagePath('foo.egg-info/dependency_links.txt'),
 PackagePath('foo.egg-info/top_level.txt')]
[PackagePath('__editable__.foo-0.1.0.pth'),
 PackagePath('__editable___foo_0_1_0_finder.py'),
 PackagePath('__pycache__/__editable___foo_0_1_0_finder.cpython-310.pyc'),
 PackagePath('foo-0.1.0.dist-info/INSTALLER'),
 PackagePath('foo-0.1.0.dist-info/METADATA'),
 PackagePath('foo-0.1.0.dist-info/RECORD'),
 PackagePath('foo-0.1.0.dist-info/REQUESTED'),
 PackagePath('foo-0.1.0.dist-info/WHEEL'),
 PackagePath('foo-0.1.0.dist-info/direct_url.json'),
 PackagePath('foo-0.1.0.dist-info/top_level.txt')]
{'_distutils_hack': ['setuptools'],
 'debian': ['setuptools'],
 'foo': ['foo', 'foo'],
 'pip': ['pip'],
 'pkg_resources': ['setuptools'],
 'setuptools': ['setuptools']}
@abravalheri
Contributor

Hi @Jacob-Stevens-Haas, thank you very much for opening this discussion.

For the time being, this is a limitation of how setuptools and importlib-metadata interoperate...

The current design of setuptools requires the .egg-info folders as part of the building process and intentionally places them at the root of the repository for flat-layout projects.

We do have a milestone for removing the egg-info, https://github.com/pypa/setuptools/milestone/3, but I don't think that is a goal that can be achieved in the short term.

If this turns out to be problematic for you, please consider the following workarounds while the long-term implementation is not ready:

  • Consider using a src-layout (if I am not mistaken, with src-layout the .egg-info folder will be placed inside the src folder and then not picked up by importlib-metadata).
  • Consider filtering out .egg-info paths from the output of importlib-metadata.
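
A minimal sketch of the second workaround, assuming it is acceptable to treat "has a METADATA file" as the sign of a .dist-info directory (egg-info directories carry PKG-INFO instead). The helper names prefer_dist_info, _normalize and _is_dist_info are hypothetical; only importlib.metadata.distributions() and Distribution.read_text() come from the standard library.

```python
import re
import sys
from importlib.metadata import distributions


def _normalize(name):
    """Normalize a project name (PEP 503 style) so 'Foo_Bar' equals 'foo-bar'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def _is_dist_info(dist):
    # Heuristic (an assumption of this sketch): .dist-info directories carry
    # a METADATA file, while .egg-info directories carry PKG-INFO instead.
    return dist.read_text("METADATA") is not None


def prefer_dist_info(search_path=None):
    """Return one distribution per name, preferring .dist-info over .egg-info."""
    path = sys.path if search_path is None else list(search_path)
    chosen = {}
    for dist in distributions(path=path):
        name = dist.metadata.get("Name")
        if name is None:
            continue
        key = _normalize(name)
        current = chosen.get(key)
        # Keep the first hit, unless it is egg-info and this one is dist-info.
        if current is None or (_is_dist_info(dist) and not _is_dist_info(current)):
            chosen[key] = dist
    return list(chosen.values())
```

Because the filtering happens in the caller's code, it does not change what setuptools itself sees during a build.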

If any member of the community is interested in contributing towards the goal of removing the reliance on .egg-info directories, contributions are always welcomed.

@abravalheri added the "help wanted", "Needs Implementation" (Issues that are ready to be implemented.), "Needs Design Proposal" and "Long Term" labels, and removed the "bug" and "Needs Triage" (Issues that need to be evaluated for severity and status.) labels Jan 3, 2024
@abravalheri abravalheri changed the title [BUG] Importlib Metadata shows two distributions with same name for editable installs [Improvement] Importlib Metadata shows two distributions with same name for editable installs Jan 3, 2024
@abravalheri abravalheri changed the title [Improvement] Importlib Metadata shows two distributions with same name for editable installs [ENH] Importlib Metadata shows two distributions with same name for editable installs Jan 3, 2024
@Jacob-Stevens-Haas
Author

Jacob-Stevens-Haas commented Jan 3, 2024

Thanks for the quick reply! And yeah, that would be a fine workaround. Given that removing egg-info might take a while... would implementing the workaround (ignoring egg-info distributions if there's a same-name dist-info distribution) inside importlib.metadata be reasonable?

  • Consider using a src-layout (if I am not mistaken with src-layout, the .egg-info folder will be placed inside the src folder and then not picked up by importlib-metadata).

I have a branch in the above repo with an src layout, and starting everything from scratch with that layout gives similar results:

'foo'
'pip'
'setuptools'
'foo'
[PackagePath('__editable__.foo-0.1.0.pth'),
 PackagePath('foo-0.1.0.dist-info/INSTALLER'),
 PackagePath('foo-0.1.0.dist-info/METADATA'),
 PackagePath('foo-0.1.0.dist-info/RECORD'),
 PackagePath('foo-0.1.0.dist-info/REQUESTED'),
 PackagePath('foo-0.1.0.dist-info/WHEEL'),
 PackagePath('foo-0.1.0.dist-info/direct_url.json'),
 PackagePath('foo-0.1.0.dist-info/top_level.txt')]
[PackagePath('pyproject.toml'),
 PackagePath('src/foo/__init__.py'),
 PackagePath('src/foo.egg-info/PKG-INFO'),
 PackagePath('src/foo.egg-info/SOURCES.txt'),
 PackagePath('src/foo.egg-info/dependency_links.txt'),
 PackagePath('src/foo.egg-info/top_level.txt')]
{'_distutils_hack': ['setuptools'],
 'debian': ['setuptools'],
 'foo': ['foo', 'foo'],
 'pip': ['pip'],
 'pkg_resources': ['setuptools'],
 'setuptools': ['setuptools']}

Is this because during Python startup, importing site reads __editable__.foo-0.1.0.pth and adds the src directory to sys.path? Interestingly, this changes the order of the packages, as the egg-info is no longer found in "". It thus means that importlib.metadata.distribution("foo") finds the correct package... which is a win for me, but I don't know if this behavior is reliable.

If any member of the community is interested in contributing towards the goal of removing the reliance on .egg-info directories, contributions are always welcomed.

I'd love to, but realistically I'll probably just learn more about setuptools and why removing reliance on egg-info is so daunting... the classic "know enough to be dangerous... but not to be useful" stage.

@abravalheri
Contributor

abravalheri commented Jan 3, 2024

Thanks for the quick reply! And yeah, that would be a fine workaround. Given that removing egg-info might take a while... would implementing the workaround (ignoring egg-info distributions if there's a same-name dist-info distribution) inside importlib.metadata be reasonable?

That is something to be discussed in the importlib.metadata repo, but that would break setuptools 😅 (because the existing design relies on that).

I have a branch in the above repo with an src layout, and starting everything from scratch with that layout gives similar results
Is this because during python startup, importing site reads __editable__.foo-0.1.0.pth and adds the src directory to sys.path?

I see... Yep, that is correct. The src layout will add the src directory as a new entry to sys.path, and then importlib.metadata will catch it. That makes sense, sorry I didn't think about that.

Interestingly, this changes the order of the packages, as the egg-info is no longer found in "". It thus means that importlib.metadata.distribution("foo") finds the correct package... which is a win for me, but I don't know if this behavior is reliable.

That is probably 90% reliable 😅. The "" entry (which corresponds to the current working directory) is added as the first entry in sys.path depending on how you run a Python script, module or REPL. This is the reference (https://docs.python.org/3/using/cmdline.html):

-c <command>
If this option is given, the first element of sys.argv will be "-c" and the current directory will be added to the start of sys.path (allowing modules in that directory to be imported as top level modules).

-m <module-name>
... As with the -c option, the current directory will be added to the start of sys.path.

<script>
If the script name refers directly to a Python file, the directory containing that file is added to the start of sys.path, and the file is executed as the __main__ module.
If the script name refers to a directory or zipfile, the script name is added to the start of sys.path and the __main__.py file in that location is executed as the __main__ module.

-I option can be used to run the script in isolated mode where sys.path contains neither the current directory nor the user’s site-packages directory. All PYTHON* environment variables are ignored, too.
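
The effect of the launch mode can be checked directly. The sketch below starts a child interpreter with -c and prints its first sys.path entry; first_sys_path_entry is a hypothetical helper name. An empty string means the current directory is searched first, which is how a local .egg-info can shadow the dist-info in site-packages.

```python
import subprocess
import sys


def first_sys_path_entry(*flags):
    """Return repr(sys.path[0]) as seen by a child interpreter run with -c."""
    result = subprocess.run(
        [sys.executable, *flags, "-c", "import sys; print(repr(sys.path[0]))"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Plain -c: the current directory is normally prepended as ''.
    print("default :", first_sys_path_entry())
    # Isolated mode: the current directory is not added to sys.path.
    print("isolated:", first_sys_path_entry("-I"))
```

Note that on Python 3.11 and later, the -P option or the PYTHONSAFEPATH environment variable also suppresses the leading entry, so the default case can differ between environments.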

And this is the reference for the .pth file mechanism we use for adding entries to sys.path in the editable install for the src-layout: https://docs.python.org/3/library/site.html.
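
As a rough illustration of that mechanism (not the setuptools implementation), the sketch below writes a one-line .pth file into a scratch directory and asks site.addsitedir() to process it, which is the same processing site applies to site-packages at startup. The file name __editable__.demo.pth and the helper name register_via_pth are made up for the example.

```python
import site
import sys
from pathlib import Path


def register_via_pth(site_dir, source_dir, pth_name="__editable__.demo.pth"):
    """Write a .pth file pointing at source_dir and let `site` process it.

    Returns True if source_dir ended up on sys.path.
    """
    source = str(Path(source_dir).resolve())
    # A .pth line that names an existing directory is appended to sys.path
    # when the directory containing the .pth file is processed by `site`.
    Path(site_dir, pth_name).write_text(source + "\n", encoding="utf-8")
    site.addsitedir(str(site_dir))
    return source in sys.path
```

Once the source directory is on sys.path, any .egg-info folder inside it becomes visible to importlib.metadata, which matches the src-layout output earlier in the thread.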

@Jacob-Stevens-Haas
Author

Jacob-Stevens-Haas commented Jan 3, 2024

Ah, thanks for all that! After a cursory reading, does setuptools create an editable install as a PEP 660 editable wheel? Or does the presence of an egg-info directory locally imply otherwise?

Also, and this might not be the ideal solution, would it be possible to add direct_url.json to the egg-info directory?

@abravalheri
Contributor

abravalheri commented Jan 4, 2024

Ah, thanks for all that! After a cursory reading, does setuptools create an editable install as a PEP 660 editable wheel?

Ideally yes, but that will depend on how pip calls setuptools. pip has its own heuristics to decide when and how to call setuptools, and in some edge cases it will rely on setuptools' deprecated code paths.

Or does the presence of an egg-info directory locally imply otherwise?

The presence of the .egg-info directory is NOT a direct/unequivocal indicator of the installation method that was used. It may be found even when the process described in PEP 660 is employed.

would it be possible to add direct_url.json to the egg-info directory?

The direct_url.json file is an installer concern. It is not something covered by the setuptools codebase/scope. Instead, pip is the tool producing it.
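
Since the file lives only in the dist-info written by the installer, one way to find editable installs without depending on which same-name distribution is found first is to scan every distribution and read direct_url.json where it exists. This is a minimal sketch: the helper names is_editable and editable_distribution_names are hypothetical, and the dir_info / editable keys follow the direct URL specification (PEP 610).

```python
import json
import sys
from importlib.metadata import distributions


def is_editable(dist):
    """True if the distribution's direct_url.json marks it as editable."""
    text = dist.read_text("direct_url.json")
    if text is None:  # e.g. an .egg-info directory, or an install from an index
        return False
    try:
        info = json.loads(text)
    except json.JSONDecodeError:
        return False
    if not isinstance(info, dict):
        return False
    dir_info = info.get("dir_info")
    return isinstance(dir_info, dict) and bool(dir_info.get("editable", False))


def editable_distribution_names(search_path=None):
    """Names of all distributions on the path that are editable installs."""
    path = sys.path if search_path is None else list(search_path)
    names = set()
    for dist in distributions(path=path):
        name = dist.metadata.get("Name")
        if name is not None and is_editable(dist):
            names.add(name)
    return sorted(names)
```

Because every distribution is inspected, an .egg-info entry that appears earlier on sys.path does not hide the dist-info entry that carries the file.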

jeffwidman added a commit to pallets-eco/flask-debugtoolbar that referenced this issue Apr 11, 2024
`pkg_resources` has been deprecated by `setuptools` for quite a while:
https://setuptools.pypa.io/en/latest/pkg_resources.html

It's got some bugs/warts:

* pypa/setuptools#2531
* https://discuss.python.org/t/will-setuptools-remove-pkg-resource-module-in-the-future/27182

So switch to using `importlib` functions which are part of the Python
standard library as of `3.8`.

This is less error-prone, and also removes the need for `setuptools` to
be installed in order for this panel to work.

I realize we technically still support `3.7`, but I thought it was fine
to change this particular panel to require `3.8`, as `3.7` support is
best effort given that it's now EOL'd by the core Python team.

I also removed the relative path location for specific libraries as it
was simply blank for me on Python 3.12... I think showing the location
of the site packages directory should suffice. If someone later wants to
build this out further, they're more than welcome to.

Note that `importlib.metadata.distributions()` does have an outstanding
issue that it reports a local editable install twice, but they plan to
eventually fix that:
* pypa/setuptools#4170
@zjp

zjp commented Jun 25, 2024

In my team's case I've found that as soon as I install a package in editable mode, I get duplicates of every package in my site-packages folder from importlib.metadata.distributions() even if I install that package the normal way again. We're tracking the issue here.
