-
-
Notifications
You must be signed in to change notification settings - Fork 407
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Swapped the netCDF4 dependency to h5netcdf #2122
Conversation
I think you need to define https://docs.xarray.dev/en/stable/user-guide/io.html We probably should define cc @OriolAbril |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple of comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about the late review. I don't know much about the differences between the two libraries. Can you develop a bit on the differences and advantages of h5netcdf over netcdf4 package? Seeing .legacy
in the import makes me a bit nervous.
arviz/data/inference_data.py
Outdated
data.to_netcdf(filename, mode=mode, group=group, **kwargs) | ||
data.close() | ||
mode = "a" | ||
else: # creates a netcdf file for an empty InferenceData object. | ||
empty_netcdf_file = nc.Dataset(filename, mode="w", format="NETCDF4") | ||
empty_netcdf_file = nc.Dataset(filename, mode="w") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the format always netcdf4 with h4netcdf?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @OriolAbril,
The basic reason for this is to remove the dependency on the netCDF4 C library. Since netCDF4 is a layer on top of HDF5, you could imagine using a basic HDF5 library in python to implement the netCDF4 standard, instead of calling netCDF4 C routines. h5netcdf
does this using the h5py
library.
I find installing the netCDF4 C binaries to be quite painful, especially when I need to support linux, windows, intel OSX, Silicon OSX, etc. On Windows, there are some times issues with binary incompatibilities between the HDF5 libraries installed by h5py
and netCDF4
on conda
Switching to using h5py
as the backend would make the install significantly more straightforward. I think most scientific python users probably already have h5py
installed, so practically I would imagine this wouldn't actually be a new dependency in those cases.
I see where you're coming from with something called legacy
showing up in an import - but I believe that the legacy API you saw allows h5netcdf
to be used as a 'drop-in' replacement for netCDF4
while the 'new' API looks more like h5py
, rather than this being some kind of deprecated support issue.
Is the format always netcdf4 with h4netcdf?
Not sure, but I would guess yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, FWIW xarray
seems to fully support h5netcdf
as a backend. That's an appeal to authority, but it does seem to be ok for them!
Hey, just to let you know the netcdf4 import is currently failing on the latest python image. So on that particular image, I can't install arviz. netcdf4 is always a pain to install, so having a more reliable dependency would be great (and a v0.13.1 release) |
Co-authored-by: Ari Hartikainen <[email protected]>
Co-authored-by: Ari Hartikainen <[email protected]>
I have rebased to fix the git conflicts and done some changes to allow either library as io engine. I think that makes sense as imo long term we shouldn't have either as a dependency and allow users to install whichever they one (or none if they don't use that subset of features). Thanks for your patience, I'll merge in the following days. |
Codecov Report
@@ Coverage Diff @@
## main #2122 +/- ##
==========================================
- Coverage 89.96% 89.92% -0.04%
==========================================
Files 119 119
Lines 12408 12419 +11
==========================================
+ Hits 11163 11168 +5
- Misses 1245 1251 +6
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
https://build.opensuse.org/request/show/1073712 by user dgarcia + dimstar_suse - skip python 3.11 and python 3.8 because some dependencies doesn't support that python versions - Update to 0.15.1 - Fix memory usage and improve efficiency in `from_emcee` ([2215](arviz-devs/arviz#2215)) - Lower pandas version needed ([2217](arviz-devs/arviz#2217)) - 0.15.0 - Adds Savage-Dickey density ratio plot for Bayes factor approximation. ([2037](arviz-devs/arviz#2037), [2152](arviz-devs/arviz#2152)) - Add `CmdStanPySamplingWrapper` and `PyMCSamplingWrapper` classes ([2158](arviz-devs/arviz#2158)) - Changed dependency on netcdf4-python to h5netcdf ([2122](arviz-devs/arviz#2122)) - Fix `reloo` outdate
Description
Addresses #2029 and #2028 by swapping out the netCDF4 dependency to h5netcdf, which should be a lighter dependency, since it avoids needing the netCDF4 C binaries.
Checklist
📚 Documentation preview 📚: https://arviz--2122.org.readthedocs.build/en/2122/