What metadata/installability checks should Warehouse check uploads for? #194
What I'd love is to see the packaging/build tooling help prevent this before it ever gets up to pypi/warehouse, although it might be nice for Warehouse to do some sanity checking of its own as well. It's a weird issue, since the way this happens is people reading files in setup.py, but it would be nice if this were somehow detected so you get alerted that you are referencing things that won't end up in the dist.
I don't think executing untrusted code on the PyPI server is an option. Can this be done without that?
This is partly what the cheesecake project attempted to do; I like the idea in theory, but it requires a bunch of work and support to make happen in practice.
I guess I note that CPAN does automated testing of packages - I have no idea how they sandbox. VMs would be my guess. Obviously Travis CI and drone.io sandbox code to run tests. A system with VMs or Docker containers could help. That said, I recognize that this requires non-trivial resources, so it's not easy.
How awesome would it be to search PyPI and be able to sort by test coverage?! What if …
What would be even more awesome is if we ran pyroma against uploads and users could filter out projects with low scores. I would personally find that VERY helpful. I would even vote for rejecting uploads that get lower than a minimum score. IIRC pyroma doesn't even require sandboxing.
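As a rough illustration of what gating on a pyroma score could look like, here is a sketch that shells out to pyroma and parses its reported rating. The helper name, the threshold of 8, and the assumption that pyroma's CLI prints a "Final rating: N/10" line are illustrative only, not anything Warehouse actually does.

```python
import re
import subprocess
import sys

MIN_SCORE = 8  # illustrative threshold, not an actual PyPI/Warehouse policy


def rate_with_pyroma(path: str) -> int:
    """Run pyroma against an unpacked sdist or project directory and return its rating.

    Assumes pyroma prints a line such as "Final rating: 8/10"; if the output
    format differs, the regular expression below needs adjusting.
    """
    result = subprocess.run(["pyroma", path], capture_output=True, text=True)
    match = re.search(r"Final rating:\s*(\d+)\s*/\s*10", result.stdout)
    if match is None:
        raise RuntimeError("could not find a rating in pyroma's output")
    return int(match.group(1))


if __name__ == "__main__":
    score = rate_with_pyroma(sys.argv[1])
    print(f"pyroma rating: {score}/10")
    sys.exit(0 if score >= MIN_SCORE else 1)
```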
Example of how CPAN runs tests on packages: We can't let Perl show us up! 😄 ActiveState is active in Python as well as Perl -- maybe they are interested in hosting a service like this, as they do for the Perl community?
@agronholm OK, so looking at pyroma, it has the following checks:
So going through these checks!
What I propose is:
This would be a fantastic feature. I'm only moving this to a future milestone because it is not on the critical path for the immediate goal of switching from the old PyPI to Warehouse.
Is this something people are still interested in? If we had a system in place that allows us to fully install a package, there's lots of other useful information we could extract from it: transitive dependencies, metadata, even tests.
Yes, people are still interested in this, but it's been deferred until the immediate goals have been met and Warehouse has replaced Cheeseshop as the official PyPI.
The best thing is probably to make this a sandboxed standalone project with some kind of API that Warehouse can call. I'm currently working on a POC for this. If anyone is interested, please let me know.
@jayfk could you post to pypa-dev or the distutils-sig list about your proof of concept? Thanks!
Jannis posted to distutils-sig -- thanks @jayfk!
Yep! Thanks for pointing that out, @brainwane!
Actually, Perl has CPANTesters, which lets the community help do the testing across a lot of different platform and version combinations. This is definitely something where Python can learn from Perl.
This is an issue that came up in a discussion of improving pip's automated tests last week; pip would find this feature helpful.
Reminder to folks following this: as @ewdurbin and @pradyunsg and I talked about in a meeting about the pip resolver work, getting this implemented might help us smooth the testing of, and the effects of, the resolver rollout. So if any volunteers could help get this finished and merged, we'd appreciate that! cc @yeraydiazdiaz
@uranusjr @pfmoore @pradyunsg I'd appreciate it if you could review this issue, especially #194 (comment) and the comments below it, which discuss some upload checks we could implement in Warehouse (blocking noncompliant uploads). Which of those checks would be particularly valuable to the pip resolver work? Once we have that list, we can make a checklist so this issue becomes more completable.
One wild wish I've always hoped for is to eliminate dynamic dependencies altogether, and ensure all dists of a given version specify exactly the same set of dependencies. One of the most resource-consuming (both for development and runtime) parts of dependency resolution is the need to download a package matching the host environment, extract it, and potentially build it to get dependencies. This can cost a tremendous amount of time if you're on a platform without wheels (e.g. musl), and it would be a vast improvement if we were able to download (say) a wheel for Windows and know its metadata would match if I build it on a random Linux machine.
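To make the cost contrast concrete: for a wheel that already exists, getting dependencies requires no build step at all, only reading the METADATA file out of the archive, as in this standard-library-only sketch (the function name and example filename are hypothetical). The open question above is whether those Requires-Dist entries could be guaranteed identical across every dist of a given version.

```python
import zipfile
from email.parser import Parser


def wheel_dependencies(wheel_path: str) -> list[str]:
    """Read Requires-Dist entries straight out of a wheel's METADATA file."""
    with zipfile.ZipFile(wheel_path) as wheel:
        # Every wheel contains a *.dist-info/METADATA member.
        metadata_name = next(
            name for name in wheel.namelist()
            if name.endswith(".dist-info/METADATA")
        )
        metadata = Parser().parsestr(wheel.read(metadata_name).decode("utf-8"))
    return metadata.get_all("Requires-Dist") or []


# Hypothetical usage:
# print(wheel_dependencies("somepackage-1.0-py3-none-any.whl"))
```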
It would be great to have a check that the name and version in the metadata match the filename (sdist or wheel), so that we don't have to abort on mismatches. (We'd still have to check for local files and other edge cases, but knowing that the filename is reliable would still be useful.) As @uranusjr mentioned, the only really useful checks from the pip resolver POV are ones which would allow us to minimise downloading of distributions. All we use is name, version and dependencies. So for us:
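A sketch of the name/version-versus-filename check being asked for here, using the filename parsers in recent versions of the packaging library and comparing against the Name and Version fields from the dist's own metadata; the function name and the example call are illustrative.

```python
from packaging.utils import (
    canonicalize_name,
    canonicalize_version,
    parse_sdist_filename,
    parse_wheel_filename,
)


def filename_matches_metadata(filename: str, meta_name: str, meta_version: str) -> bool:
    """Check that an upload's filename agrees with its metadata Name and Version."""
    if filename.endswith(".whl"):
        name, version, _build, _tags = parse_wheel_filename(filename)
    else:
        name, version = parse_sdist_filename(filename)
    return (
        name == canonicalize_name(meta_name)
        and canonicalize_version(version) == canonicalize_version(meta_version)
    )


# Illustrative call:
# filename_matches_metadata("foo-1.0-py3-none-any.whl", "Foo", "1.0")
```

A check like this presumably only needs the metadata that already accompanies the upload, so it would not require executing any packaging code.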
I'd argue this is already something you should be able to assume. If those two things don't line up, I'd just call it an error. We've never made any promises that it would work, and if it does currently work I'd call it an implementation detail that just happened to allow it for a while.
I'm fairly sure that at some point we will supplant the simple API with a better one that will include the needed information, but that is obviously not there yet. That being said, it's probably useful to start enforcing any useful checks now to get better data in the long run.
I'm not sure that this is possible, or even desirable. At least when we talked about PEP 517, @njsmith argued pretty strongly that we need some sort of hook that enables programmatic dependencies, and that we simply could not depend on static-only metadata for sdists.
At the summit last year @crwilcox asked:
@dstufft said, above (in 2014):
@dstufft do you agree with your 2014 self that author email and project URL should not be mandatory? |
@dstufft @pfmoore @pradyunsg @ewdurbin @di @uranusjr @techalchemy OK, I reread the above discussion. It sounds like the metadata/installability checks that make sense for Warehouse to mandate are:
Does this sound right? Are any of these already mandated? |
Sounds reasonable to me. I'd argue that there should be some means of contacting the author, but I concede that "issue tracker on the project webpage" is entirely reasonable for that. So that moves us out of the area of what should be checked, and into the question of a new metadata field.
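For what it's worth, both of the items being weighed here can already be declared by a project; a minimal setuptools sketch with made-up values (a project_urls label is one common way to point people at an issue tracker):

```python
from setuptools import setup

setup(
    name="example-project",           # made-up project, for illustration only
    version="1.0.0",
    author="Jane Developer",
    author_email="jane@example.com",  # the "means of contacting the author"
    project_urls={
        "Issue Tracker": "https://github.com/example/example-project/issues",
    },
)
```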
Recently, a project that I worked on pushed a new version to PyPI and it turned out to be completely broken, because the `setup.py` was referencing a `README.rst` file that was not present in the sdist. It would be awesome if PyPI could do some checking of packages that are uploaded.
To start with, it could create a virtualenv and try pip installing the package and make sure that it exits with status 0. Perhaps it could also try easy_install to make sure that works too.
There are more elaborate things that could be done, like running tests if they are included (many packages don't bundle their tests, though) or trying to validate the RST, but I think just pip installing is a very good first step, as it detects packages that are completely broken.
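A rough sketch of that pip-install smoke test, assuming the uploaded file is available on disk. The helper name, paths, and scratch-virtualenv approach are illustrative only, and, as the comments above point out, installing an sdist runs untrusted code, so anything like this would have to live inside a sandbox.

```python
import subprocess
import sys
import tempfile
import venv
from pathlib import Path


def install_check(dist_path: str) -> bool:
    """Try to pip-install an uploaded dist into a throwaway virtualenv.

    Returns True if pip exits with status 0. For an sdist this executes
    setup.py, so it must only ever run inside a sandbox (VM, container, ...).
    """
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = Path(tmp) / "env"
        venv.create(env_dir, with_pip=True)
        bin_dir = "Scripts" if sys.platform == "win32" else "bin"
        pip = env_dir / bin_dir / "pip"
        result = subprocess.run(
            [str(pip), "install", "--no-cache-dir", dist_path],
            capture_output=True,
        )
        return result.returncode == 0


# Hypothetical usage:
# print(install_check("dist/example-1.0.0.tar.gz"))
```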