Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve interop with importlib.metadata #195

Merged
merged 5 commits into from
Sep 29, 2024
Merged

Conversation

abravalheri
Copy link
Contributor

@abravalheri abravalheri commented Apr 30, 2024

In #192, it is reported that backends not only need the ability to import code from backend-path but also depend on other features of importlib (e.g importlib.metadata).

This PR is an attempt to return that ability.

Closes #192

/cc @jaraco


This issue is a bit tricky...

From pypa/pip#11812 it is clear that only manipulating sys.path is not enough to guarantee compliance with the following part of the spec:

The backend code MUST be loaded from one of the directories specified in backend-path

The alternative is maybe to do both? (i.e. manipulate sys.path and add the meta path finder that guarantees that the backed is loaded from the backend-path)? I can prepare a second PR for that if this is the best direction to go.

(But then there is also the risk that another MetaPathFinder that implements find_distributions can take precedence and load metadata from outside backend-path...).

# Delayed import: Python 3.7 does not contain importlib.metadata
from importlib.metadata import DistributionFinder, MetadataPathFinder

context = DistributionFinder.Context(path=self.backend_path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this line be in an if context is None block so that passing a context in works correctly? Or perhaps it should intersect the search path from a passed in context with its own self.backend_path?

Copy link
Contributor Author

@abravalheri abravalheri Apr 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if the passed context will ever contain the backend_path, given that the importlib.metadata does not know about it1.

My intuition is that we can simply discard the given context, and other finders will handle it.

Footnotes

  1. at least for the current state of the implementation, in which backend_path is not added to sys.path.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see, it defaults to sys.path if no other path is passed. And neither of the places you can pass something else - Distribution.discover() or distributions() - are mentioned in the docs. So I think you're probably right that we can get away with ignoring the context argument.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, this looks right to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jaraco is there any other optional part that needs to be implemented from importlib.metadata, importlib.resources point of view?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is all that's needed for metadata. I'm unsure if resources need an explicit hook. I'm 90% sure that because the __spec__ is a PathFinder, it will have resources support.

@takluyver
Copy link
Member

Thanks for looking into this promptly!

I find importlib really hard to make sense of - the different types of finders & loaders, optional methods, deprecated methods... it's too much to keep in my head at once. It sounds like you and @jaraco both reckon that adding this method ought to solve the issue, though. Have you been able to confirm whether it actually fixes things for setuptools' test suite? Or can that only be verified once there's a release on PyPI?

@takluyver takluyver added this to the 1.2 milestone Apr 30, 2024
@@ -89,6 +90,16 @@ def test_intree_backend_loaded_from_correct_backend_path():
assert res == ["intree_backend_called"]


@pytest.mark.skipif(sys.version_info < (3, 8), reason="no importlib.metadata")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may prefer to use pytest.importorskip('importlib.metadata') (in the code).

@jaraco
Copy link
Member

jaraco commented May 1, 2024

I find importlib really hard to make sense of - the different types of finders & loaders, optional methods, deprecated methods... it's too much to keep in my head at once.

I sympathize. I too struggle with it, and I have a latent working model in my head 😬 . I've taken a stab at improving the docs in python/cpython#113187. I should probably get that merged so at least there's a bit more context as imperfect as it is. I've done my best to keep "deprecated" interfaces and concepts encapsulated so the concerns are separate and can be largely ignored. You may find the importlib_metadata implementation to be fresher and potentially less encumbered by legacy concerns.

@@ -106,6 +106,13 @@ def find_spec(self, fullname, _path, _target=None):

return spec

def find_distributions(self, context=None):
# Delayed import: Python 3.7 does not contain importlib.metadata
from importlib.metadata import DistributionFinder, MetadataPathFinder
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The importlib_metadata provides these interfaces going back to Python 2.7 (and maybe 2.6). If this project wishes to support Python 3.7, it should consider using importlib_metadata.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is something on the back of my mind: what would be the best approach here to avoid errors in 3.7 if importlib_metadata is being used, but without adding a new dependency on this project.

Initially I thought to minimise the complexity, but it is true that the initial implementation is error prone.

I added 455b77f to account for that. Not sure if the best is to try first importlib_metadata and fallback to importlib.metadata or the other way around (please let me know if the opposite is better).

@takluyver I understand that it is the approach in this module to keep supporting 3.7, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that adding importlib_metadata adds an additional dependency. It also provides forward-compatibility for features introduced in later Pythons. I do see that in 4.0 (Python 3.10), there was improved alignment with naming standards, so that could affect discovery. For setuptools, I don't think that's a problem.

@takluyver
Copy link
Member

Thanks @jaraco !

Have you been able to run the setuptools with this branch to verify that the problem is solved? Or is it something that can only be properly tested after a release?

@jaraco
Copy link
Member

jaraco commented May 1, 2024

Have you been able to confirm whether it actually fixes things for setuptools' test suite? Or can that only be verified once there's a release on PyPI?

We should definitely do this verification. I can potentially help with this. At the very least, it should be possible to verify by putting a release on TestPyPI if necessary.

@jaraco jaraco self-assigned this May 1, 2024
dev-requirements.txt Outdated Show resolved Hide resolved
@abravalheri
Copy link
Contributor Author

I find importlib really hard to make sense of - the different types of finders & loaders, optional methods, deprecated methods... it's too much to keep in my head at once.

I couldn't agree more 😅.

The complexity of importlib is also what causes the need for a MetaPathFinder. Simply prepending sys.path does not work to provide the guarantees the build-system spec needs because sys.meta_path has "higher precedence" than sys.path. But that is a whole can of worms...

Have you been able to confirm whether it actually fixes things for setuptools' test suite? Or can that only be verified once there's a release on PyPI?

We can probably verify that by pinning pyproject-hooks @ git+https://github.com/abravalheri/pyproject-hooks@issue-192 in tox.ini, I will try to have a look on this at some point this week.

@jaraco jaraco removed their assignment May 1, 2024
@abravalheri
Copy link
Contributor Author

abravalheri commented May 1, 2024

We can probably verify that by pinning pyproject-hooks @ git+https://github.com/abravalheri/pyproject-hooks@issue-192 in tox.ini, I will try to have a look on this at some point this week.

@jaraco, it is not very trivial to reconciliate importlib.metadata/importlib_metadata with setuptools.extern.importlib_metadata.

I hitting the following "dependency-version-conflict" when testing on setuptools:

Command '['/tmp/build-env-aon6abmf/bin/python', '/home/abravalheri/workspace/setuptools/.tox/py/lib/python3.8/site-packages/pyproject_hooks/_in_process/_in_process.py', 'get_requires_for_build_wheel', '/tmp/tmpp401rn2h']' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/home/abravalheri/workspace/setuptools/.tox/py/lib/python3.8/site-packages/pyproject_hooks/_in_process/_in_process.py", line 392, in <module>
    main()
  File "/home/abravalheri/workspace/setuptools/.tox/py/lib/python3.8/site-packages/pyproject_hooks/_in_process/_in_process.py", line 376, in main
    json_out["return_val"] = hook(**hook_input["kwargs"])
  File "/home/abravalheri/workspace/setuptools/.tox/py/lib/python3.8/site-packages/pyproject_hooks/_in_process/_in_process.py", line 146, in get_requires_for_build_wheel
    return hook(config_settings)
  File "/home/abravalheri/workspace/setuptools/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
    return self._get_build_requires(config_settings, requirements=['wheel'])
  File "/home/abravalheri/workspace/setuptools/setuptools/build_meta.py", line 295, in _get_build_requires
    self.run_setup()
  File "/home/abravalheri/workspace/setuptools/setuptools/build_meta.py", line 311, in run_setup
    exec(code, locals())
  File "<string>", line 93, in <module>
  File "/home/abravalheri/workspace/setuptools/setuptools/__init__.py", line 103, in setup
    _install_setup_requires(attrs)
  File "/home/abravalheri/workspace/setuptools/setuptools/__init__.py", line 74, in _install_setup_requires
    dist.parse_config_files(ignore_option_errors=True)
  File "/home/abravalheri/workspace/setuptools/setuptools/dist.py", line 625, in parse_config_files
    self._parse_config_files(filenames=inifiles)
  File "/home/abravalheri/workspace/setuptools/setuptools/dist.py", line 476, in _parse_config_files
    opt = self.warn_dash_deprecation(opt, section)
  File "/home/abravalheri/workspace/setuptools/setuptools/dist.py", line 513, in warn_dash_deprecation
    self._setuptools_commands(),
  File "/home/abravalheri/workspace/setuptools/setuptools/dist.py", line 538, in _setuptools_commands
    return metadata.distribution('setuptools').entry_points.names
AttributeError: 'list' object has no attribute 'names'

That happens because the vendored version of importlib_metadata in setuptools relies on APIs that are implemented recently, isn't it?

@jaraco
Copy link
Member

jaraco commented May 1, 2024

That happens because the vendored version of importlib_metadata in setuptools relies on APIs that are implemented recently, isn't it?

Yes. Probably. It probably implies that this project may need to use importlib_metadata on even later versions of Python (for better forward compatibility), likely 3.9 and earlier.

I was worried about this when we were vendoring importlib_metadata. It's already worrisome enough to support importlib.metadata and importlib_metadata, but introducing _vendor.importlib_metadata adds another layer of complexity. Ideally, I'd like for setuptools to be able to rely on whatever version of "importlib metadata" is available in the environment.

@abravalheri
Copy link
Contributor Author

abravalheri commented May 1, 2024

In abravalheri/setuptools#12 I show that this patch addresses the problem in setuptools (provided a small change for backwards compatibility in setuptools is considered).

@jaraco, I think that is a better approach then imposing a new dependency on importlib_metadata to pyproject-hooks.

@pfmoore
Copy link
Member

pfmoore commented May 1, 2024

+1 on avoiding the dependency here. I'm a fairly minor contributor to this project, so my opinion shouldn't carry too much weight, but I'm rather concerned that the lower levels of the packaging stack are starting to get overburdened with dependencies, both vendored and unvendored, and managing those dependencies is becoming an exercise in itself. Anything we can do to limit that issue is worthwhile, IMO.

Copy link
Member

@jaraco jaraco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @abravalheri for the detailed analysis and testing.

Comment on lines +109 to +116
def find_distributions(self, context=None):
# Delayed import: Python 3.7 does not contain importlib.metadata
# If this method is being called it must be because
# `importlib.metadata`/`importlib_metadata` is available.
try:
from importlib_metadata import DistributionFinder, MetadataPathFinder
except ImportError:
from importlib.metadata import DistributionFinder, MetadataPathFinder
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we only define this if importlib.metadata is available?

I'm mildly concerned about importing importlib_metadata from here since pip would need to modify its vendored copy of pyproject-hooks to not import an installed module (we've had issues due to attempting to do this in the past, usually resulting in breakage of pip because the user somehow ended up with a bad copy of said package in their environment and the user needing to understand the nuances + details of how stuff works to fix their environment).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @pradyunsg here. Pip would need to patch this out.

Can we not simply do a version check? Or even just prefer the stdlib version always? That way, pip (which only supports Python 3.8+) would never pick up importlib_metadata, avoiding any problems. Is there any justification for using importlib_metadata in preference to the stdlib implementation, when the latter is present?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any justification for using importlib_metadata in preference to the stdlib implementation, when the latter is present?

IIRC, importlib.metadata was added to the stdlib, but then had significant API changes in Python 3.10, and importlib_metadata made those changes available to previous Python versions. So various tools using it prefer the backport over the stdlib module.

I also don't really like the idea that this will do different things depending on whether a separate package is installed or not. Maybe we should pretend that importlib.metadata was only added from Python 3.10 and do a version check?

(I remember this backporting causing issues before as well, though I don't recall which project that was in. I think for most stdlib modules doing backports like this would have been fine, but because importlib.metadata defines an interface between third party packages, the packages using it need to agree on whether they use the backport or not. With the benefit of hindsight, I think once it was in the stdlib, backports should have stopped, so everyone just used the stdlib version, even if that meant we had to wait longer to use new APIs.)

Copy link
Contributor Author

@abravalheri abravalheri May 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, unfortunately importlib_metadata is not 100% backward compatible with importlib.metadata.

In general, I find that the following works relatively fine:

  • using importlib_metadata.DistributionFinder, MetadataPathFinder to implement find_distrubitions, but then using importlib.metadata high-level APIs

And the following does not work:

  • using importlib.metadata.DistributionFinder, MetadataPathFinder to implement find_distrubitions, but then using importlib_metadata high-level APIs

(Of course that depends on the exact functionalities being used by all the other packages in the virtual environment)

That is the main reason why I preferred this order.
But as discussed in #195 (comment), I have no personal problem in exchanging the order in the try...except... if that is a better approach. Or even just using importlib.metadata all the time in Python 3.8+. Please let me know and I will proceed accordingly.


Can we not simply do a version check? Or even just prefer the stdlib version always? That way, pip (which only supports Python 3.8+) would never pick up importlib_metadata, avoiding any problems.

I believe that preferring the stdlib version always (or not attempting to use importlib_metadata at all) will have the same limitations.

An anecdote about that:

Think about abravalheri/setuptools#12 and the patch necessary to make it work. What happens there is that the except branch is executed and importlib.metadata is always used.
However setuptools uses its own vendored version of importlib_metadata high-level API which causes the incompatibility documented in python/importlib_metadata#486. The good side is that, if setuptools was not forced to vendor dependencies and use importlib_metadata directly, no patch would be necessary.

In fact any package using importlib_metadata is always subject to these problems as mentioned by Thomas in #195 (comment).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, unfortunately importlib_metadata is not 100% backward compatible with importlib.metadata.

That's a problem for importlib_metadata/importlib.metadata IMO. But it's been discussed elsewhere and I don't want to reiterate old arguments here.

For this PR, the main point is that we do not want a situation where pip's behaviour can change as a result of the user installing importlib_metadata (whether it's a pristine copy, or some patched nightmare that a distro, in their wisdom, has created). That's something we try very hard to avoid in pip, because it has caused significant issues for us in the past - it's at the root of why our policy on debundling is basically "please don't, but if you must, it's your problem to deal with".

Pip doesn't support Python 3.7, so I'm happy with any solution that only uses the stdlib importlib.metadata for Python 3.8+. At worst, it'll be pip patching this library when we vendor it, so nothing has to change here. But I'd expect other users of this library, like build, to have similar issues, and so I think it's worth discussing a solution that isn't pip-specific. But if the conclusion is "we don't want to bother about that", then that's OK - pip will patch it.

Copy link
Contributor Author

@abravalheri abravalheri May 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy to go with importlib.metadata always. That was more similar to my initial implementation.

But I'd expect other users of this library, like build, to have similar issues, and so I think it's worth discussing a solution that isn't pip-specific.

I believe that build also uses the same approach of preferring importlib_metadata and falling back to importlib.metadata (for Python < 3.10.2). In this sense, the current implementation of this PR may be more compatible with build.

Copy link
Contributor Author

@abravalheri abravalheri May 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will open a separate PR with that approach so the maintainers can choose which one they prefer and close the other one: #199

@ofek
Copy link

ofek commented Jun 26, 2024

What are the next steps for this to be merged?

@abravalheri
Copy link
Contributor Author

What are the next steps for this to be merged?

The version without importlib_metadata (only importlib.metadata) #199 is more likely to be merged according to @takluyver.

@ofek
Copy link

ofek commented Jun 26, 2024

I missed that, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Release 1.1: Regression breaks setuptools builds
6 participants