multiple zarr files + fsspec.get_mapper #286
Comments
This falls between some concepts:
So indeed, intake-xarray could do this (glob -> list of mappers -> list of xarrays to be joined), or xarray itself could do this, like open_mfdataset. Note that since zarr may lean more on fsspec in the future (zarr-developers/zarr-python#546), it may make sense to discuss this with them and/or xarray.
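The glob -> list of mappers -> list of xarrays route can be sketched roughly as follows. This is only an illustration, not what intake-xarray or xarray actually do: the helper names `expand_zarr_stores` and `open_mzarr` are hypothetical, the stores are assumed to be local directories (so the standard-library `glob` is enough), and the datasets are assumed to be joinable with `xarray.concat` along a single dimension.

```python
import glob
import os


def expand_zarr_stores(pattern):
    """Return the sorted directories matching a glob-like pattern (hypothetical helper)."""
    # Zarr directory stores appear as directories, so plain files are skipped.
    return sorted(p for p in glob.glob(pattern) if os.path.isdir(p))


def open_mzarr(pattern, concat_dim, storage_options=None, **open_kwargs):
    """Open every store matching `pattern` and join them along `concat_dim` (hypothetical helper)."""
    stores = expand_zarr_stores(pattern)
    if not stores:
        raise FileNotFoundError(f"no zarr stores match {pattern!r}")

    # Imported lazily so that the path expansion above works without these packages.
    import fsspec
    import xarray as xr

    # A mapper is rooted at one path, so build one mapper per store.
    mappers = [fsspec.get_mapper(s, **(storage_options or {})) for s in stores]
    datasets = [xr.open_zarr(m, **open_kwargs) for m in mappers]
    # Assumption: the datasets line up along a single dimension.
    return xr.concat(datasets, dim=concat_dim)
```

A call such as `open_mzarr("/data/*/run.zarr", concat_dim="time")` would then take a single glob-like string; the path and the dimension name here are made up. For remote stores the expansion step would have to go through the filesystem's own glob rather than the standard library.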
Thanks @martindurant, this is very helpful. I agree that it would be nice to follow up with zarr developers. One thing:
url = fsspec.open_local(self.url_path, **self.storage_options)
If url is originally a glob, I wrote a
Finally
if "*" in url or isinstance(url, list):
    self._mapper = self.urlpath
else:
    self._mapper = fsspec.get_mapper(self.urlpath, **self.storage_options)
self._ds = _open_zarr(self._mapper, chunks=self.chunks, **kwargs)
with
I have a sequence of zarr files distributed across different nodes that I want to read in parallel, while only providing a string (glob-like) path.
The behavior I want to emulate:
For netcdf-files, we can do this using
where paths is given by
such that
len(glob(paths)) = len(url)
e.g. 5 (5 nc-files distributed across different directories). The url is then used as an argument for xarray.open_mfdataset.
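For reference, a minimal sketch of this netCDF pattern, assuming fsspec, xarray and dask are installed. The function name `open_netcdf_glob` is hypothetical and this is not necessarily the exact code used in intake-xarray; it only relies on `fsspec.open_local` returning a list of local paths when given a glob, which `xarray.open_mfdataset` accepts directly.

```python
def open_netcdf_glob(pattern, storage_options=None, **open_kwargs):
    """Open all netCDF files matching a glob-like path as one dataset (hypothetical helper)."""
    # Imported lazily so that defining the helper does not require the packages.
    import fsspec
    import xarray as xr

    # For a glob-like string, open_local returns one local path per matching file.
    url = fsspec.open_local(pattern, **(storage_options or {}))
    # open_mfdataset takes the list of paths and combines the files lazily.
    return xr.open_mfdataset(url, **open_kwargs)
```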
The problem
Zarr files are opened with a mapper (url=fsspec.get_mapper(paths), with url as an argument to xarray.open_zarr), and a glob-like path does not work as nicely (compactly) as it does with fsspec.open_local() and nc-files. That is, given
(where the zarr stores appear as directories) we get
If you just try, the right hand side is zero, while the LHS > 0.
A solution to the problem is to just pass the glob-like path directly to _open_zarr (with proper modifications to the _open_zarr function, much like xarray.open_mfdataset). I am just wondering if fsspec.get_mapper(paths) can take a glob-like path string and I just haven't figured out yet how...