Distributing Pygrackle (and Grackle, itself) #220
Comments
Upon reflection, I don't think the Consequently, resolving the hdf5-distribution problem actually has the following solutions:
There are some other crazier solutions too (probably not worth going into). For now I say we ship an SDist and punt on the rest of the hdf5 issues.
Some additional questions:
As I talk this out, I posit we should ship as little as possible in a normal wheel. To experiment with shipping more, we should distribute Grackle and Pygrackle separately on conda-forge.
I'm late to this, and it seems that you have this largely under control. My main comment to make is that I agree with your assertion that I believe that we should ship
(Note: this comment was heavily revised for clarity)
I'm happy to do that. That makes some other decisions easier. As I understand it (and please correct me if I'm wrong), the cythonize functionality needs to know where to find There is a wrinkle here. There are actually 2(ish) ways to build pygrackle with the new build-system, and the location of the Grackle headers depends on which way is used. These approaches include
The problem is that the Grackle headers will be in a different location in each case (we can't just hardcode the location into Pygrackle).
For concreteness, I'm talking about adding functionality that would allow an Extension Module compiled against

    import pygrackle
    #...
    Extension('my-extension-name', #...
              include_dirs=pygrackle.get_include_list())

At the moment, it's not clear what the best way to do this sort of thing is... If I generated a toml or ini file, do you know if this is generally a use-case for

EDIT: it turns out scikit-build-core has some trouble right now with editable installs and
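To make the `get_include_list()` idea a bit more concrete, here is a minimal sketch of what such a helper might look like. Everything about the layout is an assumption for illustration: the thread does not say where the headers end up, so the sketch takes a list of candidate directories (for example, an `include` directory inside the installed package and a directory in a CMake build tree) and keeps only the ones that actually contain the header. The header name `grackle.h` and the parameter names are likewise assumptions, not an existing Pygrackle API.

```python
from pathlib import Path


def get_include_list(candidates, header_name="grackle.h"):
    """Return the candidate directories that contain the Grackle header.

    Hypothetical helper: `candidates` would be filled in by the build
    system, since the header location differs between build approaches.
    """
    found = []
    for candidate in candidates:
        directory = Path(candidate)
        # Keep a directory only if the header is really there, so that a
        # stale or missing location never reaches the compiler flags.
        if (directory / header_name).is_file():
            found.append(str(directory))
    return found
```

A downstream `setup.py` could then pass the returned list as `include_dirs` to its `Extension`, in the same way that the snippet above does.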
@matthewturk I've thought about this some more, and now I think I can articulate my concerns a little better.

Some Examples:

Off the top of my head, I can name 3 popular python packages that include
First, there is
Next, there is
Finally, there is
Relating this to
I think you make a compelling case that this is too complex to ship the
I wanted to record a few thoughts that I've had about this topic over the past few weeks (hopefully this is semi-coherent).

Area I: Distributing binary wheels from PyPI

Distributing binary wheels from PyPI may be a lot more doable than I originally thought! My primary concern has always been with cross-compiling hdf5 (I also have a much better understanding of the cross-compilation process than I used to). I was concerned because the scripts used by h5py and pytables for this purpose are fairly complex (things seemed tricky in the Julia packaging ecosystem as well). With that said, it looks like a lot of effort has gone into making hdf5 version

Area II: What to include in a wheel

In light of the fact that we don't initially plan to ship the cython

Area III: Obstacles to distributing precompiled
Once #182 and #208 are merged, I think we will be in a good position to consider packaging (to let people install things without manually building from source).
This issue primarily exists to get the discussion going.
This issue primarily focuses on packaging/distributing Pygrackle. It also touches on the idea of packaging Grackle, itself. The latter idea is mostly discussed as a hypothetical -- it is primarily mentioned to highlight that some challenges are common to both scenarios (and that we may want to pick solutions that would help resolve the issue in both scenarios).

The biggest challenge in most scenarios is the fact that Grackle requires the hdf5 library. A lesser, common issue is convenient distribution of datafiles (that was acknowledged in #210).
Packaging/Distributing Pygrackle
I think there are 2 primary things we want to do here:
dlopen with this information to open the hdf5 library at runtime and would initialize the function-pointers in that internal struct
- The main disadvantage: I think that dlopen has some weird platform-dependent behavior
dlopen
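As a rough illustration of the runtime-loading idea, the sketch below uses Python's `ctypes` as a stand-in for the C code discussed here (`ctypes.CDLL` calls `dlopen` on POSIX platforms). It looks up the hdf5 shared library, resolves one real hdf5 function (`H5get_libversion`), and stores it in a small container that plays the role of the "internal struct" of function pointers. The names `Hdf5Api`, `load_hdf5`, and `hdf5_version` are hypothetical, and an actual implementation inside Grackle would be written in C and cover many more functions.

```python
import ctypes
import ctypes.util
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Hdf5Api:
    """Hypothetical stand-in for the internal struct of function pointers."""
    get_libversion: Callable


def load_hdf5(path: Optional[str] = None) -> Optional[Hdf5Api]:
    """Open the hdf5 shared library at runtime; return None if unavailable."""
    path = path or ctypes.util.find_library("hdf5")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)  # dlopen on POSIX platforms
        fn = lib.H5get_libversion  # symbol lookup (like dlsym)
    except (OSError, AttributeError):
        return None
    uint_p = ctypes.POINTER(ctypes.c_uint)
    fn.argtypes = [uint_p, uint_p, uint_p]
    fn.restype = ctypes.c_int  # herr_t is a plain int
    return Hdf5Api(get_libversion=fn)


def hdf5_version(api: Hdf5Api) -> Tuple[int, int, int]:
    """Call through the stored function pointer to query the library version."""
    major, minor, release = ctypes.c_uint(), ctypes.c_uint(), ctypes.c_uint()
    status = api.get_libversion(
        ctypes.byref(major), ctypes.byref(minor), ctypes.byref(release)
    )
    if status < 0:
        raise RuntimeError("H5get_libversion reported an error")
    return major.value, minor.value, release.value
```

Returning `None` when the library cannot be found or opened keeps the failure explicit, which matters given the platform-dependent behavior mentioned above.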
Packaging/Distributing Grackle
This is all a much lower priority than shipping Pygrackle, but it's worth briefly mentioning that we could also distribute precompiled Grackle binaries. I've seen this sort of thing done alongside new software releases for other open-source projects on GitHub.
Since we will already be doing a lot of work to ship Grackle as a precompiled binary alongside Pygrackle, I actually think this wouldn't be hard to do. Moreover, the new CMake build-system should be compatible with the CPack tool.
Again, this is a hypothetical low-priority idea. But, there are 2 reasons I bring this up:
if __name__ == "__main__" section that lets the file be installed alongside Grackle as a self-contained, executable script that serves as a command-line tool for fetching data files? [1]

Footnotes

[1] @matthewturk previously suggested using pooch. Using that module probably wouldn't be ideal for helping in the case of distributing Grackle (without Pygrackle). For portability, it might be better to rely upon functionality from the python standard-library, rather than the pooch package. OR since distributing Grackle itself is mostly a hypothetical, maybe we should use pooch and limit ourselves to a subset of pooch-functionality that we could re-implement later in terms of standard library functionality (if/when it becomes an issue).
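For a sense of how small the standard-library route could be, here is a minimal sketch of a fetch-and-verify tool that uses only `urllib.request`, `hashlib`, and `argparse`. The function names, the command-line arguments, and the choice of SHA-256 checksums are all assumptions for illustration; nothing here reflects an agreed-upon design or pooch's actual behavior.

```python
import argparse
import hashlib
import shutil
import urllib.request
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch(url, dest, expected_sha256):
    """Download `url` to `dest` unless a verified copy already exists."""
    dest = Path(dest)
    if dest.is_file() and sha256_of(dest) == expected_sha256:
        return dest  # cached copy is intact, nothing to do
    dest.parent.mkdir(parents=True, exist_ok=True)
    tmp = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(url) as response, open(tmp, "wb") as out:
        shutil.copyfileobj(response, out)
    actual = sha256_of(tmp)
    if actual != expected_sha256:
        tmp.unlink()  # never leave a corrupt file behind
        raise ValueError(f"checksum mismatch for {url}: got {actual}")
    tmp.replace(dest)  # move into place only after verification
    return dest


def main(argv=None):
    """Hypothetical command-line entry point: fetch one file."""
    parser = argparse.ArgumentParser(description="Fetch and verify a data file.")
    parser.add_argument("url")
    parser.add_argument("dest")
    parser.add_argument("sha256")
    args = parser.parse_args(argv)
    print(fetch(args.url, args.dest, args.sha256))
    return 0
```

Ending the file with the usual `if __name__ == "__main__":` guard that calls `sys.exit(main())` would let the same module act both as an importable helper for Pygrackle and as a standalone script installed alongside Grackle.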