Printing ClimateData object throws KeyError from h5netcdf #210

Closed
fkuehlein opened this issue Dec 21, 2023 · 2 comments · Fixed by #215
Labels
bug maintenance something should be improved or is outdated
@fkuehlein
Collaborator

Discovered this when running the tutorial_ClimateNetworks.ipynb notebook; see the full output below. This is probably not a problem with pyunicorn itself, right? Could someone confirm that this happens for them too, to make sure it's not the result of a corrupted conda environment on my side?


When running the cell where the ClimateData is loaded,

#  Print some information on the data set
print(data)

will return

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File /opt/miniconda3/envs/pyunicorn/lib/python3.10/site-packages/h5netcdf/legacyapi.py:67, in HasAttributesMixin.__getattr__(self, name)
     66 try:
---> 67     return self.attrs[name]
     68 except KeyError:

File /opt/miniconda3/envs/pyunicorn/lib/python3.10/site-packages/h5netcdf/attrs.py:32, in Attributes.__getitem__(self, key)
     31 if self._h5py.__name__ == "h5py":
---> 32     attr = self._h5attrs.get_id(key)
     33 else:

File /opt/miniconda3/envs/pyunicorn/lib/python3.10/site-packages/h5py/_hl/attrs.py:94, in AttributeManager.get_id(self, name)
     92 """Get a low-level AttrID object for the named attribute.
     93 """
---> 94 return h5a.open(self._id, self._e(name))

File h5py/_objects.pyx:54, in h5py._objects.with_phil.wrapper()

File h5py/_objects.pyx:55, in h5py._objects.with_phil.wrapper()

File h5py/h5a.pyx:80, in h5py.h5a.open()

KeyError: "Can't open attribute (can't locate attribute in name index)"

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
Cell In[34], line 7
      1 data = climate.ClimateData.Load(
      2     file_name=DATA_FILENAME, observable_name=OBSERVABLE_NAME,
      3     data_source=DATA_SOURCE, file_type=FILE_TYPE,
      4     window=WINDOW, time_cycle=TIME_CYCLE)
      6 #  Print some information on the data set
----> 7 print(data)

File ~/Desktop/23_H2_PIK/pyunicorn/src/pyunicorn/climate/climate_data.py:103, in ClimateData.__str__(self)
     99 def __str__(self):
    100     """
    101     Returns a string representation.
    102     """
--> 103     return 'ClimateData:\n' + Data.__str__(self)

File pyunicorn/src/pyunicorn/core/data.py:113, in Data.__str__(self)
    111 """Return a string representation of the object."""
    112 if self.file_name:
--> 113     self.print_data_info()
    115 return (f"Data: {self.grid.N} grid points, "
    116         f"{self.grid.n_grid_points} measurements.\n"
    117         f"Geographical boundaries:\n{self.grid.print_boundaries()}")

File pyunicorn/src/pyunicorn/core/data.py:390, in Data.print_data_info(self)
    388 # Open netCDF4 file
    389 f = Dataset(self.file_name, "r")
--> 390 print("File format:", f.file_format)
    391 print("Global attributes:")
    392 for name in f.ncattrs():

File /opt/miniconda3/envs/pyunicorn/lib/python3.10/site-packages/h5netcdf/legacyapi.py:69, in HasAttributesMixin.__getattr__(self, name)
     67     return self.attrs[name]
     68 except KeyError:
---> 69     raise AttributeError(
     70         f"NetCDF: attribute {type(self).__name__}:{name} not found"
     71     )

AttributeError: NetCDF: attribute Dataset:file_format not found
@fkuehlein fkuehlein added the bug label Dec 21, 2023
@fkuehlein fkuehlein added this to the Release 0.7 milestone Dec 21, 2023
@ntfrgl
Member

ntfrgl commented Dec 23, 2023

This specific error goes back to cd8ee00 in the context of #160, which made an untested assumption that netCDF4.Dataset and h5netcdf.legacyapi.Dataset are sufficiently compatible. In fact, the former has more legacy functionality, such as the Dataset.file_format attribute, which in this case is used only to print out basic metadata.
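
The chained KeyError/AttributeError pair in the traceback follows from how the legacy API resolves unknown attribute names. A minimal sketch of that mechanism, using a hypothetical stand-in class `LegacyLikeDataset` (not h5netcdf itself) that mirrors the `__getattr__` lines visible in the traceback:

```python
class LegacyLikeDataset:
    """Hypothetical stand-in mimicking the __getattr__ fallback from the traceback."""

    def __init__(self, attrs):
        # Stored file attributes; "file_format" is deliberately not among them.
        self.attrs = dict(attrs)

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup has failed.
        try:
            return self.attrs[name]
        except KeyError:
            raise AttributeError(
                f"NetCDF: attribute {type(self).__name__}:{name} not found"
            )


f = LegacyLikeDataset({"title": "demo"})
print(f.title)  # found among the stored attributes

try:
    f.file_format  # not a stored attribute, so the lookup fails
except AttributeError as err:
    message = str(err)
    cause = err.__context__  # the original KeyError, as in the traceback
    print(message)
```

Any name that is neither a real Python attribute nor a stored file attribute ends up in this branch, which is presumably why `file_format` surfaces as "attribute not found" rather than as a missing method.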

The broader question is about which API version should be supported going forward. Let's move that discussion, which has been awaiting input from @jdonges, from #12 to here. The easiest solution would be to switch the strict dependency to netCDF4, which had been an optional dependency before the commit above. Alternatively, we could stick with h5netcdf and remove or replace the legacy API usage. The primary advantage of h5netcdf is a smaller installation footprint, because it relies on fewer C libraries.
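
If the second option is chosen, one possible shape for the metadata printout is to treat `file_format` as optional. The following is only a sketch: `describe_file_format` and the two stand-in classes are hypothetical names for illustration and are not part of pyunicorn, netCDF4, or h5netcdf.

```python
def describe_file_format(dataset, default="unknown"):
    """Return dataset.file_format if the backend provides it, else `default`."""
    # getattr with a default swallows the AttributeError raised by backends
    # that lack the attribute, so the metadata printout no longer fails.
    return getattr(dataset, "file_format", default)


class WithFormat:
    """Hypothetical backend object that exposes file_format."""
    file_format = "NETCDF4"


class WithoutFormat:
    """Hypothetical backend object that raises AttributeError for unknown names."""

    def __getattr__(self, name):
        raise AttributeError(f"NetCDF: attribute {type(self).__name__}:{name} not found")


with_format = describe_file_format(WithFormat())
without_format = describe_file_format(WithoutFormat())
print("File format:", with_format)
print("File format:", without_format)
```

This keeps the printout working with either backend, at the cost of reporting a placeholder where the attribute is unavailable.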

@ntfrgl ntfrgl added the maintenance something should be improved or is outdated label Dec 23, 2023
@fkuehlein
Collaborator Author

Alright, I will then make sure to avoid running into this in the tutorial for now.

fkuehlein added a commit to fkuehlein/pyunicorn that referenced this issue Jan 8, 2024
- delete old tutorials in `examples/tutorials`
- move new tutorials from `notebooks` to `examples/tutorials`
- reorder image and data files in `examples/tutorials`
- update to tqdm progressbar in `RecurrenceNetwork.ipynb`
- avoid pik-copan#210 in `ClimateNetworks.ipynb`
- clean up some typos and latex sequences in all tutorials
- largely resolves pik-copan#185
@fkuehlein fkuehlein linked a pull request Jan 24, 2024 that will close this issue