Is your feature request related to a problem? Please describe.
Currently, when a user wants to write multiple netCDF files in parallel with xarray and dask, they can use the `xr.save_mfdataset()` function. This function works fine in its current state, but the existing API requires that (as sketched below):

- the user generates the file paths themselves
- the user maps each chunk or dataset to a corresponding output file
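For illustration, the current workflow looks roughly like this, assuming a dask-backed dataset split by year (the variable names and output paths are placeholders):

```python
import numpy as np
import pandas as pd
import xarray as xr

# A small dask-backed example dataset (placeholder data).
times = pd.date_range("2000-01-01", periods=730)
ds = xr.Dataset(
    {"tas": (("time",), np.random.rand(times.size))},
    coords={"time": times},
).chunk({"time": 365})

# The user splits the dataset and builds the matching file paths by hand:
years, datasets = zip(*ds.groupby("time.year"))
paths = [f"ds-{year}.nc" for year in years]
xr.save_mfdataset(datasets, paths)
```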
A few months ago, I wrote a blog post showing how to save an xarray dataset backed by dask into multiple netCDF files, and since then I've been meaning to request a new feature to make this process convenient for users.
Describe the solution you'd like
Would it be useful to refactor the existing `xr.save_mfdataset()` so that it automatically saves an xarray object backed by dask arrays to multiple files, without needing to create paths ourselves? Today, this can be achieved via `xr.map_blocks`. In other words, is it possible to have something analogous to `to_zarr(...)` but for netCDF?
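As a rough illustration of that `xr.map_blocks` workaround (the "time" chunking, the date-based file names, and the `write_block` helper are assumptions made for this sketch, not existing API):

```python
import xarray as xr

def write_block(block, prefix="out"):
    # Write one dask block of the dataset to its own netCDF file.
    # xarray first runs this function on a size-0 mock block to infer the
    # result's structure, so skip the write in that case.
    if block.sizes.get("time", 0) > 0:
        start = str(block.time.values[0])[:10]  # assumes datetime-like "time" values
        block.to_netcdf(f"{prefix}-{start}.nc")
    # map_blocks needs an xarray object back; return something small.
    return block["time"]

# ds = ...  # a dask-backed Dataset chunked along "time" (placeholder)
# xr.map_blocks(write_block, ds).compute()  # triggers the parallel writes
```

In practice one would probably run this on a scheduler where concurrent writes to separate files are safe (e.g. dask.distributed or the multiprocessing scheduler).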
It might make sense to split this into a few pieces of functionality:
1. A new helper function that splits an xarray object into separate objects for each chunk, including some representation of the "chunk id". Perhaps `split_chunks()`?
2. A new, higher-level function that combines (1) with the existing `save_mfdataset` to automatically save an xarray object into multiple files. This should probably be a new function rather than a change to the existing `save_mfdataset`, because the API is different. (A rough sketch of both pieces follows this list.)
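To make that concrete, here is a rough sketch of what those two pieces might look like (the names `split_chunks` and `save_mfdataset_auto`, the chunk-id scheme, and the path template are all hypothetical):

```python
import itertools
import xarray as xr

def split_chunks(ds):
    """Hypothetical helper (1): yield (chunk_id, sub-dataset) pairs, one per dask chunk."""
    # Build the index slices that delimit each chunk along every chunked dimension.
    slices = {}
    for dim, sizes in ds.chunks.items():
        bounds, start = [], 0
        for size in sizes:
            bounds.append(slice(start, start + size))
            start += size
        slices[dim] = bounds
    # Every combination of per-dimension slices identifies one chunk.
    for combo in itertools.product(*slices.values()):
        selection = dict(zip(slices.keys(), combo))
        chunk_id = tuple((dim, sel.start) for dim, sel in selection.items())
        yield chunk_id, ds.isel(selection)

def save_mfdataset_auto(ds, directory, prefix="part"):
    """Hypothetical higher-level function (2): generate paths and defer to save_mfdataset."""
    datasets, paths = [], []
    for chunk_id, sub in split_chunks(ds):
        suffix = "-".join(f"{dim}{start}" for dim, start in chunk_id)
        datasets.append(sub)
        paths.append(f"{directory}/{prefix}-{suffix}.nc")
    xr.save_mfdataset(datasets, paths)
```

With something along these lines, the user-facing call could be as simple as `save_mfdataset_auto(ds, "output")`.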