-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve interop with importlib.metadata
#195
Conversation
# Delayed import: Python 3.7 does not contain importlib.metadata | ||
from importlib.metadata import DistributionFinder, MetadataPathFinder | ||
|
||
context = DistributionFinder.Context(path=self.backend_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this line be in an if context is None
block so that passing a context in works correctly? Or perhaps it should intersect the search path from a passed in context with its own self.backend_path
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if the passed context will ever contain the backend_path
, given that the importlib.metadata
does not know about it1.
My intuition is that we can simply discard the given context, and other finders will handle it.
Footnotes
-
at least for the current state of the implementation, in which
backend_path
is not added tosys.path
. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see, it defaults to sys.path
if no other path is passed. And neither of the places you can pass something else - Distribution.discover()
or distributions()
- are mentioned in the docs. So I think you're probably right that we can get away with ignoring the context argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, this looks right to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jaraco is there any other optional part that needs to be implemented from importlib.metadata
, importlib.resources
point of view?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is all that's needed for metadata
. I'm unsure if resources need an explicit hook. I'm 90% sure that because the __spec__
is a PathFinder
, it will have resources support.
Thanks for looking into this promptly! I find importlib really hard to make sense of - the different types of finders & loaders, optional methods, deprecated methods... it's too much to keep in my head at once. It sounds like you and @jaraco both reckon that adding this method ought to solve the issue, though. Have you been able to confirm whether it actually fixes things for setuptools' test suite? Or can that only be verified once there's a release on PyPI? |
tests/test_inplace_hooks.py
Outdated
@@ -89,6 +90,16 @@ def test_intree_backend_loaded_from_correct_backend_path(): | |||
assert res == ["intree_backend_called"] | |||
|
|||
|
|||
@pytest.mark.skipif(sys.version_info < (3, 8), reason="no importlib.metadata") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may prefer to use pytest.importorskip('importlib.metadata')
(in the code).
I sympathize. I too struggle with it, and I have a latent working model in my head 😬 . I've taken a stab at improving the docs in python/cpython#113187. I should probably get that merged so at least there's a bit more context as imperfect as it is. I've done my best to keep "deprecated" interfaces and concepts encapsulated so the concerns are separate and can be largely ignored. You may find the importlib_metadata implementation to be fresher and potentially less encumbered by legacy concerns. |
@@ -106,6 +106,13 @@ def find_spec(self, fullname, _path, _target=None): | |||
|
|||
return spec | |||
|
|||
def find_distributions(self, context=None): | |||
# Delayed import: Python 3.7 does not contain importlib.metadata | |||
from importlib.metadata import DistributionFinder, MetadataPathFinder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The importlib_metadata
provides these interfaces going back to Python 2.7 (and maybe 2.6). If this project wishes to support Python 3.7, it should consider using importlib_metadata
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is something on the back of my mind: what would be the best approach here to avoid errors in 3.7 if importlib_metadata
is being used, but without adding a new dependency on this project.
Initially I thought to minimise the complexity, but it is true that the initial implementation is error prone.
I added 455b77f to account for that. Not sure if the best is to try first importlib_metadata
and fallback to importlib.metadata
or the other way around (please let me know if the opposite is better).
@takluyver I understand that it is the approach in this module to keep supporting 3.7, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that adding importlib_metadata
adds an additional dependency. It also provides forward-compatibility for features introduced in later Pythons. I do see that in 4.0 (Python 3.10), there was improved alignment with naming standards, so that could affect discovery. For setuptools, I don't think that's a problem.
Thanks @jaraco ! Have you been able to run the setuptools with this branch to verify that the problem is solved? Or is it something that can only be properly tested after a release? |
We should definitely do this verification. I can potentially help with this. At the very least, it should be possible to verify by putting a release on TestPyPI if necessary. |
I couldn't agree more 😅. The complexity of importlib is also what causes the need for a
We can probably verify that by pinning |
@jaraco, it is not very trivial to reconciliate I hitting the following "dependency-version-conflict" when testing on setuptools:
That happens because the vendored version of |
Yes. Probably. It probably implies that this project may need to use I was worried about this when we were vendoring |
In abravalheri/setuptools#12 I show that this patch addresses the problem in setuptools (provided a small change for backwards compatibility in setuptools is considered). @jaraco, I think that is a better approach then imposing a new dependency on |
+1 on avoiding the dependency here. I'm a fairly minor contributor to this project, so my opinion shouldn't carry too much weight, but I'm rather concerned that the lower levels of the packaging stack are starting to get overburdened with dependencies, both vendored and unvendored, and managing those dependencies is becoming an exercise in itself. Anything we can do to limit that issue is worthwhile, IMO. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks @abravalheri for the detailed analysis and testing.
def find_distributions(self, context=None): | ||
# Delayed import: Python 3.7 does not contain importlib.metadata | ||
# If this method is being called it must be because | ||
# `importlib.metadata`/`importlib_metadata` is available. | ||
try: | ||
from importlib_metadata import DistributionFinder, MetadataPathFinder | ||
except ImportError: | ||
from importlib.metadata import DistributionFinder, MetadataPathFinder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we only define this if importlib.metadata is available?
I'm mildly concerned about importing importlib_metadata
from here since pip would need to modify its vendored copy of pyproject-hooks to not import an installed module (we've had issues due to attempting to do this in the past, usually resulting in breakage of pip because the user somehow ended up with a bad copy of said package in their environment and the user needing to understand the nuances + details of how stuff works to fix their environment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @pradyunsg here. Pip would need to patch this out.
Can we not simply do a version check? Or even just prefer the stdlib version always? That way, pip (which only supports Python 3.8+) would never pick up importlib_metadata
, avoiding any problems. Is there any justification for using importlib_metadata
in preference to the stdlib implementation, when the latter is present?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any justification for using importlib_metadata in preference to the stdlib implementation, when the latter is present?
IIRC, importlib.metadata
was added to the stdlib, but then had significant API changes in Python 3.10, and importlib_metadata
made those changes available to previous Python versions. So various tools using it prefer the backport over the stdlib module.
I also don't really like the idea that this will do different things depending on whether a separate package is installed or not. Maybe we should pretend that importlib.metadata
was only added from Python 3.10 and do a version check?
(I remember this backporting causing issues before as well, though I don't recall which project that was in. I think for most stdlib modules doing backports like this would have been fine, but because importlib.metadata
defines an interface between third party packages, the packages using it need to agree on whether they use the backport or not. With the benefit of hindsight, I think once it was in the stdlib, backports should have stopped, so everyone just used the stdlib version, even if that meant we had to wait longer to use new APIs.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, unfortunately importlib_metadata
is not 100% backward compatible with importlib.metadata
.
In general, I find that the following works relatively fine:
- using
importlib_metadata.DistributionFinder, MetadataPathFinder
to implementfind_distrubitions
, but then usingimportlib.metadata
high-level APIs
And the following does not work:
- using
importlib.metadata.DistributionFinder, MetadataPathFinder
to implementfind_distrubitions
, but then usingimportlib_metadata
high-level APIs
(Of course that depends on the exact functionalities being used by all the other packages in the virtual environment)
That is the main reason why I preferred this order.
But as discussed in #195 (comment), I have no personal problem in exchanging the order in the try...except...
if that is a better approach. Or even just using importlib.metadata
all the time in Python 3.8+. Please let me know and I will proceed accordingly.
Can we not simply do a version check? Or even just prefer the stdlib version always? That way, pip (which only supports Python 3.8+) would never pick up importlib_metadata, avoiding any problems.
I believe that preferring the stdlib
version always (or not attempting to use importlib_metadata
at all) will have the same limitations.
An anecdote about that:
Think about abravalheri/setuptools#12 and the patch necessary to make it work. What happens there is that the except
branch is executed and importlib.metadata
is always used.
However setuptools uses its own vendored version of importlib_metadata
high-level API which causes the incompatibility documented in python/importlib_metadata#486. The good side is that, if setuptools
was not forced to vendor dependencies and use importlib_metadata
directly, no patch would be necessary.
In fact any package using importlib_metadata
is always subject to these problems as mentioned by Thomas in #195 (comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, unfortunately importlib_metadata is not 100% backward compatible with importlib.metadata.
That's a problem for importlib_metadata
/importlib.metadata
IMO. But it's been discussed elsewhere and I don't want to reiterate old arguments here.
For this PR, the main point is that we do not want a situation where pip's behaviour can change as a result of the user installing importlib_metadata
(whether it's a pristine copy, or some patched nightmare that a distro, in their wisdom, has created). That's something we try very hard to avoid in pip, because it has caused significant issues for us in the past - it's at the root of why our policy on debundling is basically "please don't, but if you must, it's your problem to deal with".
Pip doesn't support Python 3.7, so I'm happy with any solution that only uses the stdlib importlib.metadata
for Python 3.8+. At worst, it'll be pip patching this library when we vendor it, so nothing has to change here. But I'd expect other users of this library, like build
, to have similar issues, and so I think it's worth discussing a solution that isn't pip-specific. But if the conclusion is "we don't want to bother about that", then that's OK - pip will patch it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am happy to go with importlib.metadata
always. That was more similar to my initial implementation.
But I'd expect other users of this library, like build, to have similar issues, and so I think it's worth discussing a solution that isn't pip-specific.
I believe that build
also uses the same approach of preferring importlib_metadata
and falling back to importlib.metadata
(for Python < 3.10.2). In this sense, the current implementation of this PR may be more compatible with build
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will open a separate PR with that approach so the maintainers can choose which one they prefer and close the other one: #199
What are the next steps for this to be merged? |
The version without |
I missed that, thanks! |
In #192, it is reported that backends not only need the ability to
import
code frombackend-path
but also depend on other features ofimportlib
(e.gimportlib.metadata
).This PR is an attempt to return that ability.
Closes #192
/cc @jaraco
This issue is a bit tricky...
From pypa/pip#11812 it is clear that only manipulating
sys.path
is not enough to guarantee compliance with the following part of the spec:The alternative is maybe to do both? (i.e. manipulate
sys.path
and add the meta path finder that guarantees that the backed is loaded from thebackend-path
)? I can prepare a second PR for that if this is the best direction to go.(But then there is also the risk that another
MetaPathFinder
that implementsfind_distributions
can take precedence and load metadata from outsidebackend-path
...).