provide a Distribution class that provides up2date entrypoints and files metadata #23

Open
RonnyPfannschmidt opened this issue Jan 20, 2023 · 23 comments

Comments

@RonnyPfannschmidt

Currently, editable wheels have static data for entrypoints and no data for files.

It would be a great help if there were an EditableDistribution that provides reasonably accurate file lists for the mapped files
as well as entrypoints as read from the pyproject.toml rather than

@RonnyPfannschmidt RonnyPfannschmidt changed the title provide a Distrbution class that provides pu2date entrypoints an files metadata provide a Distribution class that provides up2date entrypoints and files metadata Jan 20, 2023
@pfmoore
Owner

pfmoore commented Jan 20, 2023

I'm not quite sure what you mean here. Entry points aren't anything to do with what this project supports, they should be readable via any standard wheel parser, or once installed by importlib.metadata. As for file lists, isn't that what EditableProject.files() provides?

@RonnyPfannschmidt
Author

This is about providing data to importlib_metadata/importlib_resources

That is, instead of reading entrypoints from dist-info, take them from the source tree.

The same goes for the list of source files when only a path in the .pth file is provided.

@pfmoore
Owner

pfmoore commented Jan 20, 2023

Sorry, I'm still confused. Isn't this the responsibility of the build backend? All this library does is create a .pth file and possibly an import hook that exposes certain files, and tells the backend that those need to be added to the wheel (along with whatever else the backend puts in the wheel). I'm not aware of how you'd "provide data to importlib_*" in the way you suggest - there's no standard for that as far as I know.

Could you maybe describe the API you're looking for, and an example of what you'd expect the library to do, specifically?

Sorry, I feel like either I'm being very dense, or we're misunderstanding each other very badly here.
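
For illustration, the `.pth` mechanism mentioned above can be sketched with the standard library alone. This is not the code of this library; `install_pth` is a hypothetical helper name, and the sketch only shows how a `.pth` file placed in a site directory makes a source directory importable.

```python
import site
from pathlib import Path


def install_pth(site_dir, name, source_dir):
    """Write <name>.pth into site_dir; each line of a .pth file is a path to add.

    Hypothetical helper for illustration only: a real backend would put the
    .pth file into the editable wheel and let the installer place it.
    """
    pth = Path(site_dir) / f"{name}.pth"
    pth.write_text(f"{source_dir}\n", encoding="utf-8")
    return pth


def activate(site_dir):
    """Process the .pth files in site_dir, as the interpreter does at startup."""
    site.addsitedir(str(site_dir))
```

After `install_pth(site_dir, "demo", src_dir)` and `activate(site_dir)`, `src_dir` is appended to `sys.path` as long as it exists. Nothing in this mechanism carries metadata, which is why entry points and file lists have to come from somewhere else.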

@pfmoore
Owner

pfmoore commented Jan 20, 2023

Also, what's the use case here? How would a backend benefit from having this?

@RonnyPfannschmidt
Author

Currently editable wheels supported by editables go stale and provide incorrect files Metadata

As the pep now has standardised Metadata, a editable wheel can act non-stale and provide correct listing of distribution metadata

This would mean that there would be a distribution object available, which fills in the details for importlib.metadata

@pfmoore
Owner

pfmoore commented Jan 20, 2023

Currently editable wheels supported by editables go stale and provide incorrect files Metadata

Can you provide an example? How does a wheel that is only intended to be built, immediately installed, and then discarded, "go stale"? And what "files Metadata" are you talking about? If you mean the direct_url.json file, that's the responsibility of the installer to produce, as stated in PEP 660. If you mean some other metadata, I don't know what PEP/specification you're referring to.

As the pep now has standardised Metadata, a editable wheel can act non-stale and provide correct listing of distribution metadata

As I say, I don't know what PEP and what standardised metadata value(s) you are referring to.

@RonnyPfannschmidt
Author

If the pyproject.toml in the source tree changes, the metadata of the installed editable wheel goes stale.

@pfmoore
Owner

pfmoore commented Jan 20, 2023

Yes. That would need a reinstall. From PEP 660:

Some kind of changes, such as the addition or modification of entry points, or the addition of new dependencies, require a new installation step to become effective. These changes are typically made in build backend configuration files (such as pyproject.toml), so it is consistent with the general user expectation that python source code is imported from the source tree.

I don't know how it would be possible to reflect dynamic changes in the pyproject.toml file into the installed project - especially in a backend-independent way. So I'd suggest that you ask build backends to implement this and see what ideas they have. If there's a way of doing it that isn't too tied to any individual backend, I'd be OK with adding it to this project. Or if you know of a way of doing this, I'd happily accept a PR. But to be honest, I'm not convinced it's even possible.

@RonnyPfannschmidt
Author

All it needs is an extension of the meta path finder with a distribution finder that returns a distribution which looks at pyproject.toml.

As pyproject.toml is standardised, it's even backend independent.

@pfmoore
Owner

pfmoore commented Jan 21, 2023

Ah, I see (finally!) I find the docs for importlib.metadata pretty hard to follow, I didn’t even realise that existed. I have some concerns, for example the fact that I don’t see how to easily integrate this with the backend-provided METADATA file, and dynamic fields, but I’ll take a look.

Presumably this would not work with the .pth file based approach, as that doesn’t use a hook. I don’t know which approach is most common in practice (the only backend that I know of using this library is hatchling and I haven’t checked what they use) but .pth based is the recommended choice. So the benefit seems limited.

@RonnyPfannschmidt
Author

A starting point would be to allow generating basic file lists as well as returning entrypoints from pyproject.toml.

I would recommend against supporting full backend data off the bat, but rather allowing backends to register utilities with the meta path finder to provide it instead.

Those helpers could be partly independent of the backend as well (for example a setuptools_scm based provider for version which could be added by any backend that integrates it).

@pfmoore
Owner

pfmoore commented Jan 21, 2023

I would still like to know what use case(s) you have in mind here. The examples you give feel rather theoretical, without context. For example, given that the most common backends don’t actually use this project, how could people rely on having dynamically changing metadata anyway?

Assuming this gets added, I would probably make this an opt-in feature, so backends would need to say that they wanted it. That’s because this library is intended as a toolkit for backends, not a reference implementation of how editable installs should work. I don’t know how that would impact the use cases you have in mind.

@RonnyPfannschmidt
Author

I'd add support to hatchling myself

Also I have a bootstrapping usecase where I'd just use editables, register the git checkouts of hatchling, hatch-vcs and all of the dependencies in order to

In addition to that, the self bootstrapping of setuptools_scm would be far easier if it had access to its own entrypoints when self-using it to make its own wheels

@pfmoore
Owner

pfmoore commented Jan 21, 2023

OK. Let me look into it.

Also I have a bootstrapping usecase where I'd just use editables, register the git checkouts of hatchling, hatch-vcs and all of the dependencies in order to

Incomplete sentence here? I'm not sure what you mean by "register the git checkouts of (stuff)". Are you using this library to generate the support files and then installing them manually, not by building a wheel using a normal backend? That isn't intended use here. Of course, writing your own custom backend that uses this library is a supported use case, so the line is a little blurred - but it's why I'm comfortable saying that certain things are the responsibility of the backend (because there needs to be a backend involved).

In addition to that, the self bootstrapping of setuptools_scm would be far easier if it had access to its own entrypoints when self-using it to make its own wheels

But this isn't about access to entrypoints, it's about having entrypoints change when the code changes without a reinstall. Do you change your entry points often enough that running pip install -e . whenever you do is a major issue?

@pfmoore
Owner

pfmoore commented Jan 21, 2023

One other consideration is around dependencies. This library currently has no dependencies outside of the stdlib, but adding this without using external libraries would be pointlessly difficult. Yes, the dependencies would probably "only" be the usual suspects like tomli and packaging, and you could argue that those aren't an issue because most (all?) backends will need them anyway. But people have strong opinions around debundling and installation from pure source, and it's an area I'm reluctant to get sucked into. Maybe I could put the dependencies behind a (non-default) extra, though, so that backends wanting the capability have to opt into the implications.

@pfmoore
Owner

pfmoore commented Jan 21, 2023

I've just taken a look at the documentation for importlib_metadata[1]. To provide a DistributionFinder, I'd need to be able to create a Distribution for the package. But I can't do that, because I don't have access to enough information - for example, the files property and the locate_file method need data that only the backend has.

There's also a problem with the API definition in general. I don't know if I'm allowed to return a Distribution subclass from a DistributionFinder - I assume I am, but then I don't know what I should do with classmethods like at and discover. I'm sure this can be addressed by experimentation and, in the worst case, raising issues on the CPython project to ask for clarification. But it seems like it could be an awful lot of work, and potentially rather fragile, for something that ultimately the standard explicitly allows implementations to omit.

I'm not rejecting the idea, but I do want to set expectations here - without a more compelling benefit, this isn't going to be something I'll be able to do quickly, and it still might not even be possible at all. I'm still happy to discuss this, and if you think I'm missing an obvious solution, I'd be happy to look at a PR for it, but otherwise it could be a while before I even have a prototype.

Footnotes

  1. It's extremely annoying that key information on how to use a stdlib interface is published in a 3rd party library's documentation. I can't easily tell, for example, what Python versions the docs apply to. Which affects whether I can continue to support Python 3.7+, for example.

@RonnyPfannschmidt
Author

I do understand the limitations, which is why I recommend supporting only obtainable metadata (like entrypoints), and leaving files up to either simple heuristics or the backend.

Just being able to support entrypoints and standard metadata from pyproject.toml would already make it possible to use editables as a bootstrapping helper for libraries like setuptools_scm, hatch/hatchling & co.

@pfmoore
Owner

pfmoore commented Jan 21, 2023

It's not obvious to me from the docs that it's allowed to implement a "partial" Distribution subclass, which is why I'll need to experiment.
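
As a small experiment along those lines, the sketch below defines a "partial" Distribution that only answers `METADATA`. The class name `MetadataOnlyDistribution` is hypothetical, and the sketch says nothing about whether such a subclass is officially supported; it only shows what the base class does when the other files are reported as missing.

```python
from importlib.metadata import Distribution


class MetadataOnlyDistribution(Distribution):
    """Hypothetical 'partial' Distribution: answers METADATA, nothing else."""

    def __init__(self, name, version):
        self._text = (
            "Metadata-Version: 2.1\n"
            f"Name: {name}\n"
            f"Version: {version}\n"
        )

    def read_text(self, filename):
        # Returning None tells the base class that the file does not exist.
        return self._text if filename == "METADATA" else None

    def locate_file(self, path):
        # File locations are deliberately left to the backend in this sketch.
        raise NotImplementedError("file locations are not provided")


dist = MetadataOnlyDistribution("demo-pkg", "1.0")
```

Here `dist.metadata["Name"]` and `dist.version` work, while `dist.files` is `None`, which is the documented result when neither `RECORD` nor `SOURCES.txt` can be read.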

But you've still not explained why this is so much more convenient than simply reinstalling when you (occasionally) change the entrypoints. That's what's frustrating me - it seems like a lot of work to save you having to run a very simple command rather infrequently, and I know you're not asking lightly, so I feel like I'm missing something important here. If you could give me a pointer to the existing code/process docs that you'd change if this was implemented, that would be a huge help to me. Not the build backend code that would need to be modified to use this feature, but the pip install -e . command (or equivalent) that would be removed.

Also, I'm not clear what you mean by a "bootstrapping helper". It appears that setuptools_scm uses setuptools as its build backend, which doesn't even use editables, so would you not be better asking setuptools to add this feature to their editable install implementation? Again, it feels like I'm missing something important here.

Is this related to pypa/setuptools-scm#642? In that PR, there's a discussion where you say

the classical setuptools one just needs to regen the egg info
modern pakcages need a pip install -e for PEP660

The PR author seems unhappy with this, and I can understand that, but it's unfortunately a limitation of PEP 660 - probably because no-one explained during the PEP discussions that the "traditional" approach had this property of dynamically updating the metadata. I can certainly see that this is a regression in functionality with PEP 660, but while I really don't want to open old wounds here, I think we spent too much time debating "better than the existing method" solutions, and not enough time making sure the proposals offered at least feature parity with what setup.py develop provided. The impression I got was that none of the proposers felt that was a priority compared to "enabling better solutions" 🙁

But that's water under the bridge, for better or worse, and while I'd be fine with a proposal to modify PEP 660 to require editable implementations to provide this (by some agreed, and standardised, means), that would be a standards change.

What I would be willing to do here is to implement a new function in the library, similar to add_to_path, which does whatever setuptools used to do in order to tell the import machinery about the .egg_info directory stored in the project directory. That would let backends implement feature parity with "old style" setuptools. The problem is that I'm pretty sure that setuptools method (.egg-link files?) never got standardised. And the problem with adding a metadata hook is that you're not allowed to omit the metadata from the wheel - so you end up with two differing sources of the metadata, and I haven't (yet) been able to find any documentation in importlib.resources to confirm that's allowed, and how to guarantee that the correct version gets exposed. Given the lack of any usable prior art in this area, I'm willing to look into this as a proof of concept, but I don't want to offer any promises. And of course, that wouldn't help unless backends chose to use the method (and when I say "backends", I mean "hatchling", as no other backends to my knowledge use editables, so for anything else you'd have to ask the backend to implement this directly).

I don't know how much of this is relevant to what you're asking. Hopefully it gives you some insight into why I don't think this issue is as straightforward as you'd hoped. Any clarifications you can still offer would be much appreciated.

@RonnyPfannschmidt
Author

Both bootstrapping and protection against user errors

@RonnyPfannschmidt
Author

Also, the key reason why I rejected the updating in setuptools_scm is precisely that a library like editables, which provides the downstream, was completely unavailable.

Completely creating everything is not a viable solution.

Slowly expanding a base level with dynamic details is much more sustainable.

@pfmoore
Owner

pfmoore commented Jan 21, 2023

A further thought, which I note here so that I don't forget it later:

The RECORD file in the installed wheel (the one in site-packages) must be the one that importlib.metadata clients see, as otherwise uninstall support will be broken (in a particularly bad way, as uninstallers might end up "uninstalling" the source code, i.e., deleting the developer's working copy!)

So it looks like we'd have to ask the backend to pass the path of the local METADATA file, and the entry_points.txt file, and other files will be taken from the installed versions. But importlib.metadata doesn't have APIs that take the file, which means I'd have to implement parsing of the file content myself. And I don't want to do that.
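
One possible shape for that split, as a sketch only: wrap the installed distribution and override `read_text` for a chosen set of files, so that the base class still does the parsing. `OverlayDistribution` is a hypothetical name, and the sketch assumes the backend keeps up-to-date copies of `METADATA` and `entry_points.txt` in some local directory.

```python
from importlib.metadata import Distribution
from pathlib import Path


class OverlayDistribution(Distribution):
    """Hypothetical wrapper: a few files come from the source tree,
    everything else (notably RECORD) from the installed distribution."""

    def __init__(self, installed, local_dir,
                 overrides=("METADATA", "entry_points.txt")):
        self._installed = installed
        self._local_dir = Path(local_dir)
        self._overrides = frozenset(overrides)

    def read_text(self, filename):
        if filename in self._overrides:
            candidate = self._local_dir / filename
            if candidate.is_file():
                return candidate.read_text(encoding="utf-8")
        # RECORD and anything not overridden comes from site-packages,
        # so uninstallers never see paths inside the working copy.
        return self._installed.read_text(filename)

    def locate_file(self, path):
        return self._installed.locate_file(path)
```

For example, `OverlayDistribution(Distribution.at(dist_info_dir), local_dir)` would report entry points from `local_dir` while `files` is still derived from the installed `RECORD`. Whether importlib.metadata guarantees that such an overlay takes precedence over the installed copy is exactly the open question raised above.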

OK, I've raised python/importlib_metadata#427 to ask for an example in the importlib.metadata documentation on how to do this. We'll see what comes of that.

@RonnyPfannschmidt
Author

I did some more research

I was wrong about needing to have Distribution.files lie; importlib.metadata has its own APIs for the details of resource reading.

I may be able to provide an opt-in for this feature within the next month, but I cannot commit to such a timeframe yet.

@RonnyPfannschmidt
Author

I did some more research.

It's feasible based on the APIs provided by importlib_metadata or the stdlib in 3.9+.

To make it safe I'll have to add some tests to importlib_metadata that encapsulate the expectation that distributions provided by editables "override" the standard APIs.
