Tests section of the guide (which isn't started yet!) #59
I did want to bring up the idea of when to package tests. The current python-package-structure recommends a top-level tests/ alongside the top-level src/, and then mentions how this will usually leave tests out of distributions. It goes on to be explicit about this:
I completely agree with this point. However, tests in sdists are never mentioned, and I believe that an sdist should include all artifacts necessary for the consumer to execute all tests. [1] That brings me to the other recommendation I would like to make: for tests to be runnable by consumers, not only do the test directory and any configuration files (e.g. tox.ini) need to be available, but so does any test data. The guide already has the recommendation:
However, any data that is included should absolutely be packaged up and loaded as importlib resources
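To illustrate the importlib-resources approach mentioned above, here is a minimal sketch; the package name `mypackage` and the `tests/data/` layout are hypothetical examples, not anything the guide prescribes:

```python
# Sketch: load packaged test data with importlib.resources (Python 3.9+).
# "mypackage" and the tests/data layout are illustrative assumptions.
from importlib import resources

def load_fixture(name: str) -> str:
    # files() resolves resources relative to the installed package, so this
    # works the same whether the package was installed from a wheel, an
    # sdist, or is being run from a source checkout.
    return (resources.files("mypackage.tests") / "data" / name).read_text()
```

Because the lookup goes through the import system rather than the filesystem layout of the repository, consumers running the shipped tests get the data wherever the package actually landed on disk.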
Generally agree with this, adding a couple of other points:
I was going to make the same comment about pooch for larger datasets. Another point is that I disagree with the no-tests policy on wheels. For pure Python packages? Sure. For wheels that ship compiled code? Please send me your tests b/c I want to be sure I can reproduce them on my platform!
It will leave tests out of built distributions (wheels, etc.) but not source distributions (sdists), so long as that directory is included in the distribution package manifest (which happens automatically with most modern backends/plugins if it's in source control and not manually excluded, or can be done manually otherwise) and it sits outside the import package directory.
Yes, agreed (particularly as a conda-forge packager)—I initially had a different opinion, but my participation in the linked Python Discourse thread quickly changed my mind.
In general, the modern workflow with most backends includes everything checked in by default, unless specifically excluded. Especially for an sdist, where size is much less critical, it's better to just include data files for tests that are not too horribly large (ideally up to a few MB, but perhaps up to a few tens of MB total). If the data is larger, it should be made as minimal as necessary to properly exercise the package's functionality, or if that is simply not possible, it could be downloaded externally using pooch, etc., adding a pytest marker to those tests that require network access and not running them by default (so downstream distributors can decide whether it makes sense to run them).
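One way to sketch that opt-in marker: the two hooks below are real pytest plugin hooks, but the `network` marker name and `--run-network` flag are illustrative choices, not an established convention:

```python
# conftest.py sketch: skip tests marked "network" unless --run-network is
# passed, so downstream distributors opt in rather than out.
# The marker and flag names are illustrative assumptions.
import pytest

def pytest_addoption(parser):
    parser.addoption(
        "--run-network", action="store_true", default=False,
        help="run tests that need to download data over the network",
    )

def pytest_collection_modifyitems(config, items):
    if config.getoption("--run-network"):
        return  # network tests explicitly requested; leave everything enabled
    skip_network = pytest.mark.skip(reason="needs --run-network")
    for item in items:
        if "network" in item.keywords:
            item.add_marker(skip_network)
```

The custom marker should also be registered (e.g. under `[tool.pytest.ini_options]` `markers` in pyproject.toml) so pytest does not warn about an unknown mark.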
Yup, though if it's test-only data, you could potentially get away with making it (import) package resources in the …
this is a great conversation! thank you all! here is the page / section that i think we are discussing - Here we talk about the nuance of tests and data being included. it sounds like a small pr is in order, but some of the above is already in the text.
what language specifically do we want to finesse on this page? and then we need an entire page / section devoted to tests specifically as well. @NickleDave you might recall we had started to add a lot about tests here, but we want a tests page devoted to that topic. So what specifically shall we modify on this page to make it more clear / accurate and in line with what pypa and the python discourse recommend? FWIW all of the package maintainers for the core scientific python tools (that all have extensions / wrap other languages) wanted to ship tests for exactly the reason @ocefpaf mentioned above. then let's work on a tests page separately that includes more about what goes in sdists vs wheels.
I sort of get it, but why do you want to test bdists with extension modules, but not pure-Python bdists? If you are not confident that the developer was diligent enough to test their releases, wouldn't you test all packages? And if you want to test compiled code, why not just grab the sdist? I suppose because you may not have the toolchain to compile locally, but then the provided binaries might not be testable anyway for any number of reasons; you may very well have to compile the module with different flags to properly test it.
Thanks, that clarifies things (and for that reason maybe should go into the guide). Is this true of every builder covered in the guide? I haven't done this sort of comparison exhaustively, so it might very well be.
Yes, that is the page I have been alluding to. Although the final work may be broad enough to need to update a few pages.
Exactly. Test distribution is already covered, but IMO since only wheels are explicitly mentioned when the guide switches to "package distribution", it's not clear that a separate recommendation is being made for sdists. Also, I don't think the size or type of test should matter - if you decide to distribute tests, it is nice if you distribute all of them. Ideally complex/flaky/non-isolated tests are marked in some way so that the consumer can disable or enable what they want, as @CAM-Gerlach mentioned.
Sidenote, but the standard CI testing approach I generally do and recommend as an upstream developer, pure Python or not, and which aligns with best practices as far as I'm aware, is, in your CI test pipeline (or using a workflow tool like Tox/Nox/etc.), to build a wheel for your project (in turn built from an sdist) and run your tests against the installed wheel.

By contrast, as a downstream redistributor you're going to be rebuilding the project from source anyway, and will have the sdist or the source tree, so you don't need the tests in the wheel either. I could envision certain scenarios where it could be useful, but it seems a bit of an edge case vs. having tests in the sdist.
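That sdist → wheel → test-against-installed-wheel flow can be sketched with Tox; this is an illustrative tox 4 config, with the env name and test path as assumptions:

```ini
# tox.ini sketch (tox 4): build an sdist, build a wheel from it, install
# the wheel, then run the repo's tests/ against the installed package.
[tox]
envlist = py311

[testenv]
package = wheel        ; tox builds the wheel via an sdist first
deps = pytest
commands = pytest tests/ {posargs}
```

Because `tests/` lives outside the import package, the suite necessarily exercises the installed wheel rather than the working-tree copy of the code.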
Assuming you've configured the backend to add the tests (or it does so automatically via VCS), it is a necessary consequence of how the sdist vs. wheel formats work: sdists contain the entire source tree minus anything excluded, while aside from explicitly specified license files and metadata, wheels contain only the actual import package(s) themselves, so anything outside the package directory is excluded (unless added via the legacy data-files mechanism, or added under one of the other …
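As a concrete sketch of that sdist/wheel asymmetry, here is a minimal pyproject.toml assuming hatchling as the backend and a src/ layout; `mypackage` is a hypothetical name:

```toml
# pyproject.toml sketch (hatchling backend, src/ layout assumed).
# tests/ rides along in the sdist because it is tracked in VCS, while the
# wheel only ever contains the import package itself.
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "mypackage"
version = "0.1.0"

[tool.hatch.build.targets.wheel]
packages = ["src/mypackage"]  # only this directory lands in the wheel

# No sdist target config needed: hatchling includes VCS-tracked files
# (tests/, tox.ini, etc.) in the sdist by default.
```

No explicit include/exclude rules are required for the common case; tests outside `src/` are shipped in the sdist and omitted from the wheel automatically.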
The changes that I would propose right now:
Just to note, I meant to mention this before, this is explicitly required by the sdist spec and implemented by all conforming backends, so there's no need to mention this here as it could confuse users or make them think they have to explicitly specify it (as opposed to the other things they should include that may or may not be automatically included depending on their backend).
I wouldn't even go so far as to suggest excluding …
Good to know! I didn't realize this preference made it into any formal specification. It should still be noted though that pyproject.toml is not the end-all-be-all of build files because of the possible
Sure, I am comfortable only recommending excluding …
It is not about trust, but the fact that a passing test for a binary built under certain circumstances may fail when the running platform is slightly different. That would be super rare and unexpected for a pure Python package.
Actually, it's not a mere "preference", but rather the defining feature of the modern sdist format 🙂, i.e. a …
Sure, though it is the responsibility of the backend to ensure the sdist contains any backend-specific files (e.g. …)
Right, though for the sake of pedantry, excluding VCS directories will generally be handled automatically by the backend:
I believe I've taken us somewhat off-topic. We are not building a backend here that has to make some of these decisions, only making recommendations about current tools. I guess we need to know if all the modern build toolchains already do everything I proposed before as a change to the recommendations (including setuptools - likely using an extension). If they do, then we don't really need to call all this out for readers, and what I would probably suggest is to drop the mention of leaving tests out, since that's just the norm. The other question I have is whether we want to make the recommendation that wheels do include tests (some level of smoke checks, I assume?). My vote is very much -1, but it appears @ocefpaf and @lwasser and maybe others believe this is a good practice when compiled code is involved.
I guess the reason to keep tests for compiled code is not well explained here. The thing is, when a scientist installs numpy, for example, from PyPI, the wheel was built under certain circumstances and the tests for those circumstances passed. However, the local machine is slightly different, and the devil is in the details: most of the tests can pass, but some may present a precision difference that may lead to different results and/or test failures. Someone may argue that one should always build their own library when compiled code is involved, but I would argue that installing and running the tests (and hoping everything passes) is easier than starting off by building your own. The topic of not shipping tests is a light-package-vs-completeness question most of the time but, when compiled code is involved, there is a real danger of different results.
Sorry for the confusion—my fault, I was focusing too much on pedantic details. I just meant that there's no need to mention to users that they have to include …

However, as for tests, I'm +1 on including a mention that they should be included in sdists if practical, and +0.5 on ensuring they are excluded from wheels if practical (with the former taking priority, which will happen by default if they are outside the package per the recommended layout).
Couldn't the small proportion of wheel users who want to run the tests just also download the sdist of the package, the source archive of the Git tag, or the repo at that tag, and run the tests against the installed wheel from there? Given the overwhelming majority of wheel users will never run the tests, ISTM that the minor amount of additional inconvenience this requires (if any at all, if they already have the repo cloned somewhere) seems outweighed by the additional bandwidth and install-time costs for most cases. Or is there something I'm missing here?
I'm from a time when the install instructions were always: install, run the tests, use it; but I'm old. Still, at least in my bubble, we are very worried about reproducibility and accurate results across experiments on various platforms, so we always run the tests. One could flip this argument around and say that the dangers of wrong results are not worth the little bandwidth saved by not shipping the tests? I believe the scipy and numpy libs are unlikely to stop shipping tests, and removing a direct call like …

Note that this is my last comment on this thread b/c I don't feel strongly about where this document goes. Just wanted to point out what part of the scientific community thinks about this issue.

PS: maybe it is worth asking the numpy and scipy devs what they think.
Hey @ocefpaf , sorry if I got us too sidetracked into the relative proportions of wheel users who do vs. don't run the tests, and the relative merits thereof. I can certainly see the argument for running the tests against the installed wheel; in fact that's what I (and many others) do for the projects I maintain, and recommend that others do the same. My main question was whether it would not simply work for your use case to run the tests themselves from an unpacked sdist, source archive, or repo checkout, but still against the installed wheel as you want, in cases where tests are not included with the wheel itself (i.e. if the …).

Conversely, if the tests are included in the import package, then they will presumably still be in both the wheel and the sdist, given most backends don't really offer an easy mechanism to include things in one but not the other for things inside the import package itself (you technically can with Setuptools, but I certainly wouldn't call it easy). So it seems to me that either way, you can still run the tests against the installed wheel, even if the tests themselves are outside the package, no? Or maybe I am missing something here?
As an attempt to put a pin in this here is my updated proposal:
Also, I want to make sure that sdists include all files necessary to build wheels. I have run into real difficulties in the past with scientific packages that do not do this. Almost always it is because they are generating files during the build process that are not in revision control (setup.py creating the README dynamically...), or because the build system is reading files that are explicitly excluded (MANIFEST.in skips a file that is read by setup.py). However, I don't think that new projects using new build systems will produce these same quirks, so I am reluctant to confuse our readers with this caution. WDYT?
Looking back at the page, I also wonder if the hierarchy could be reconsidered.
This would make a lot more sense to me:
i hear you! i think the only reason we did it that way was to highlight that we suggest the src/ layout to avoid some of the confusion around choosing, but also that data piece does really fit in another section. im thinking we should work in a google document where we can have lots of comments and input prior to a pr, and we could totally reorg this section. we probably want to create an outline for a large chunk of the guide and then start writing. maybe we can even have some meetings that are focused solely on writing?
some very helpful discussion came up in a recent review here about tests and when to include them and what to do with data, etc.
When we work on our tests section we need to consider and weave this into that section.
cc'ing @NickleDave @aragilar on this - thank you both for this feedback!