Fix saving chunked datasets with zero length dimensions (#5742)
* Added test_save_emptydim for zarr backends, which fails when chunking

Added a test that fails when saving a dataset to zarr if it contains a
chunked array with a dimension of length zero (Issue #5741)

* Load all variables with zero entries before saving to_zarr

This addresses Issue #5741 and allows `test_save_emptydim` to pass. We
work around `to_zarr` failing on dask arrays with zero-length dimensions
by loading them into numpy arrays first, which works for some reason.

* Updated whats-new.rst with information about fix for #5741

Co-authored-by: Maximilian Roos <[email protected]>
jaicher and max-sixty authored Oct 10, 2021
1 parent fd42449 commit 97887fd
Showing 3 changed files with 18 additions and 0 deletions.
3 changes: 3 additions & 0 deletions doc/whats-new.rst
@@ -68,6 +68,9 @@ Deprecations

Bug fixes
~~~~~~~~~

- Fix ZeroDivisionError from saving dask array with empty dimension (:issue:`5741`).
  By `Joseph K Aicher <https://github.com/jaicher>`_.
- Fixed performance bug where ``cftime`` import attempted within various core operations if ``cftime`` not
  installed (:pull:`5640`).
  By `Luke Sewell <https://github.com/lusewell>`_
5 changes: 5 additions & 0 deletions xarray/backends/api.py
@@ -1322,6 +1322,11 @@ def to_zarr(
    See `Dataset.to_zarr` for full API docs.
    """

    # Load empty arrays to avoid bug saving zero length dimensions (Issue #5741)
    for v in dataset.variables.values():
        if v.size == 0:
            v.load()

    # expand str and Path arguments
    store = _normalize_path(store)
    chunk_store = _normalize_path(chunk_store)
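
The selection rule in the loop added to `to_zarr` can be sketched in isolation. This is only an illustration, not xarray's implementation: `LazyVariable` and `load_empty_variables` are hypothetical stand-ins invented here, and the sketch models just the `size` attribute and `load()` method that the loop relies on. The shapes mirror the `x` and `y` variables used in `test_save_emptydim`.

```python
from math import prod


class LazyVariable:
    """Hypothetical stand-in for a lazily evaluated (e.g. dask-backed) variable."""

    def __init__(self, shape):
        self.shape = shape
        self.loaded = False  # True once the data has been brought into memory

    @property
    def size(self):
        # Total number of elements; zero if any dimension has length zero.
        return prod(self.shape)

    def load(self):
        # Stand-in for materialising the lazy data as an in-memory array.
        self.loaded = True
        return self


def load_empty_variables(variables):
    """Load only the variables that contain no elements, as the patched loop does."""
    for v in variables.values():
        if v.size == 0:
            v.load()
    return variables


# Shapes taken from the test dataset: "x" is (5, 0) and "y" is (5,).
variables = {"x": LazyVariable((5, 0)), "y": LazyVariable((5,))}
load_empty_variables(variables)
```

Under these assumptions only `x` is loaded, since it has no elements; `y` stays lazy, so the workaround adds no eager computation for non-empty arrays.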
10 changes: 10 additions & 0 deletions xarray/tests/test_backends.py
@@ -2186,6 +2186,16 @@ def test_to_zarr_append_compute_false_roundtrip(self):
with self.open(store) as actual:
assert_identical(xr.concat([ds, ds_to_append], dim="time"), actual)

    @pytest.mark.parametrize("chunk", [False, True])
    def test_save_emptydim(self, chunk):
        if chunk and not has_dask:
            pytest.skip("requires dask")
        ds = Dataset({"x": (("a", "b"), np.empty((5, 0))), "y": ("a", [1, 2, 5, 8, 9])})
        if chunk:
            ds = ds.chunk({})  # chunk dataset to save dask array
        with self.roundtrip(ds) as ds_reload:
            assert_identical(ds, ds_reload)

    @pytest.mark.parametrize("consolidated", [False, True])
    @pytest.mark.parametrize("compute", [False, True])
    @pytest.mark.parametrize("use_dask", [False, True])