Add intake_xarray_kwargs to ThreddsCatalog #52

Merged: 8 commits, Mar 18, 2022

@andersy005 andersy005 commented Sep 15, 2021

@raybellwaves, this is my attempt at addressing #51. I chose the name intake_xarray_kwargs rather than xarray_kwargs so that it can serve as a catch-all for every option the intake_xarray sources accept (for example, https://github.com/intake/intake-xarray/blob/f1ca02d5c7734bb9de79074fe128ac5e7d598165/intake_xarray/netcdf.py#L48).
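
To illustrate the catch-all idea, here is a minimal sketch of how such a dictionary could be separated into source-level options and options meant for xarray. The helper name `split_intake_xarray_kwargs` is hypothetical and is not taken from this PR or from intake_xarray; the `chunks` and `xarray_kwargs` keys are used only because they appear in the session and traceback in this thread.

```python
def split_intake_xarray_kwargs(intake_xarray_kwargs=None):
    """Hypothetical helper: split the catch-all dict into two parts.

    Returns (source_options, xarray_kwargs), where xarray_kwargs is the
    nested dict intended for xarray and source_options is everything else.
    """
    # Copy so the caller's dictionary is never mutated.
    options = dict(intake_xarray_kwargs or {})
    xarray_kwargs = dict(options.pop("xarray_kwargs", {}))
    return options, xarray_kwargs


source_options, xarray_kwargs = split_intake_xarray_kwargs(
    {"chunks": {"time": 1}, "xarray_kwargs": {"engine": "netcdf4"}}
)
print(source_options)  # options for the intake_xarray source itself
print(xarray_kwargs)   # options nested under 'xarray_kwargs'
```

Keeping the xarray options nested under their own key means new source-level options can be passed later without adding arguments to the catalog.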

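Right now this appears to be broken because of how fsspec interferes with the path passed to xarray's open_dataset: as the traceback below shows, intake_xarray opens the URL with fsspec and hands the resulting object, rather than the URL string, to open_dataset, and the netcdf4 engine rejects anything that is not a string.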

In [1]: import intake

In [2]: cat_url = "https://thredds.ucar.edu/thredds/catalog/grib/NCEP/GFS/Global_0p25deg/GFS_Global_0p25d
   ...: eg_20210913_1800.grib2/catalog.xml"

In [3]: catalog = intake.open_thredds_cat(cat_url, driver="netcdf", intake_xarray_kwargs={'xarray_kwargs'
   ...: : {'engine': "netcdf4"}})

In [4]: catalog = intake.open_thredds_cat(cat_url, driver="netcdf", intake_xarray_kwargs={'xarray_kwargs': {'engine': "netcdf4"}})

In [5]: source = catalog["GFS_Global_0p25deg_20210913_1800.grib2"]
In [7]: source.to_dask()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-75a159db6bd7> in <module>
----> 1 source.to_dask()

~/.mambaforge/envs/intake-thredds-dev/lib/python3.9/site-packages/intake_xarray/base.py in to_dask(self)
     67     def to_dask(self):
     68         """Return xarray object where variables are dask arrays"""
---> 69         return self.read_chunked()
     70 
     71     def close(self):

~/.mambaforge/envs/intake-thredds-dev/lib/python3.9/site-packages/intake_xarray/base.py in read_chunked(self)
     42     def read_chunked(self):
     43         """Return xarray object (which will have chunks)"""
---> 44         self._load_metadata()
     45         return self._ds
     46 

~/.mambaforge/envs/intake-thredds-dev/lib/python3.9/site-packages/intake/source/base.py in _load_metadata(self)
    234         """load metadata only if needed"""
    235         if self._schema is None:
--> 236             self._schema = self._get_schema()
    237             self.dtype = self._schema.dtype
    238             self.shape = self._schema.shape

~/.mambaforge/envs/intake-thredds-dev/lib/python3.9/site-packages/intake_xarray/base.py in _get_schema(self)
     16 
     17         if self._ds is None:
---> 18             self._open_dataset()
     19 
     20             metadata = {

~/.mambaforge/envs/intake-thredds-dev/lib/python3.9/site-packages/intake_xarray/netcdf.py in _open_dataset(self)
     90             url = fsspec.open(self.urlpath, **self.storage_options).open()
     91 
---> 92         self._ds = _open_dataset(url, chunks=self.chunks, **kwargs)
     93 
     94     def _add_path_to_ds(self, ds):

~/.mambaforge/envs/intake-thredds-dev/lib/python3.9/site-packages/xarray/backends/api.py in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, backend_kwargs, *args, **kwargs)
    495 
    496     overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 497     backend_ds = backend.open_dataset(
    498         filename_or_obj,
    499         drop_variables=drop_variables,

~/.mambaforge/envs/intake-thredds-dev/lib/python3.9/site-packages/xarray/backends/netCDF4_.py in open_dataset(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta, group, mode, format, clobber, diskless, persist, lock, autoclose)
    549 
    550         filename_or_obj = _normalize_path(filename_or_obj)
--> 551         store = NetCDF4DataStore.open(
    552             filename_or_obj,
    553             mode=mode,

~/.mambaforge/envs/intake-thredds-dev/lib/python3.9/site-packages/xarray/backends/netCDF4_.py in open(cls, filename, mode, format, group, clobber, diskless, persist, lock, lock_maker, autoclose)
    351 
    352         if not isinstance(filename, str):
--> 353             raise ValueError(
    354                 "can only read bytes or file-like objects "
    355                 "with engine='scipy' or 'h5netcdf'"

ValueError: can only read bytes or file-like objects with engine='scipy' or 'h5netcdf'
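
The traceback suggests one possible way around the error: pass the URL string through unchanged unless the requested engine is one that the error message says can read file-like objects. The following is only a sketch of that decision. The helper `choose_open_target` and the stand-in opener `fake_open` are hypothetical, the URL is a placeholder, and none of this is intake_xarray's actual code.

```python
import io

# Engines that, according to the error message above, accept bytes or
# file-like objects.
FILE_LIKE_ENGINES = {"scipy", "h5netcdf"}


def choose_open_target(urlpath, engine, open_file):
    """Hypothetical helper: decide what to hand to xarray's open_dataset.

    Returns an opened file object only for engines that can read one;
    for any other engine (including 'netcdf4') the URL string is returned.
    """
    if engine in FILE_LIKE_ENGINES:
        return open_file(urlpath)
    return urlpath


def fake_open(urlpath):
    # Stand-in for an fsspec-style opener; returns an in-memory file object.
    return io.BytesIO(b"")


url = "https://example.invalid/thredds/dodsC/some_dataset"  # placeholder URL
netcdf4_target = choose_open_target(url, "netcdf4", fake_open)
h5netcdf_target = choose_open_target(url, "h5netcdf", fake_open)
print(type(netcdf4_target).__name__, type(h5netcdf_target).__name__)
```

In this sketch the netcdf4 case never touches the opener, so the `isinstance(filename, str)` check shown in the traceback would be satisfied.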

@andersy005 andersy005 added the enhancement New feature or request label Sep 15, 2021
@aaronspring

Just fixed that one test. Can this PR be merged?

@aaronspring aaronspring self-requested a review March 17, 2022 23:06
@andersy005

Thank you, @aaronspring! Let's go ahead and merge this as is. If there is any issue, we can address it later.

@andersy005 andersy005 merged commit 5ec2660 into main Mar 18, 2022
@andersy005 andersy005 deleted the xarray-open-kwargs branch March 18, 2022 02:00