automatically chunk in groupby binary ops #7683

Closed
4 tasks
dcherian opened this issue Mar 27, 2023 · 0 comments · Fixed by #7684


What happened?

From https://discourse.pangeo.io/t/xarray-unable-to-allocate-memory-how-to-size-up-problem/3233/4

Consider

# ds is a dataset with big dask arrays
mean = ds.groupby("time.day").mean()
mean.to_netcdf()
mean = xr.open_dataset(...)  # reopened from disk, no longer backed by dask arrays

ds.groupby("time.day") - mean

In GroupBy._binary_op

expanded = other.sel({name: group})

we will eagerly construct an expanded copy of other that is the same size as ds.
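To make the size blow-up concrete, here is a minimal sketch on a small in-memory array. The names `da`, `group`, and the array shape are hypothetical stand-ins chosen for illustration; only the `other.sel({name: group})` indexing pattern comes from the line quoted above.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical small stand-in for the large dataset described in the issue.
time = pd.date_range("2000-01-01", periods=365, freq="D")
da = xr.DataArray(
    np.random.default_rng(0).random((365, 4)),
    dims=("time", "x"),
    coords={"time": time},
)

mean = da.groupby("time.day").mean()  # one entry per day of month
group = da["time"].dt.day             # group label for every time step

# Same indexing pattern as the quoted line: one slice of `mean` is
# selected for every element of `group`, so `day` is replaced by `time`.
expanded = mean.sel({"day": group})

print(dict(mean.sizes))
print(dict(expanded.sizes))
print(expanded.nbytes == da.nbytes)
```

With NumPy-backed `mean`, the expanded array is materialised in memory immediately, which is the problem when `da` is a dask array too large to load.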

What did you expect to happen?

I think the only solution is to automatically chunk other (mean in the example above) when ds has dask arrays and other isn't backed by dask arrays. A chunk size of 1 seems sensible.
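Until that happens automatically, a user-side workaround along the same lines might look like the sketch below. It assumes `dask` is installed; the array shape, the chunk sizes, and the names `da`, `mean_chunked`, and `anom` are hypothetical. It is not the change made in #7684, only an illustration of chunking `other` by hand with a chunk size of 1 along the grouped dimension.

```python
import dask.array as dsa
import pandas as pd
import xarray as xr

# Hypothetical dask-backed stand-in for the large dataset.
time = pd.date_range("2000-01-01", periods=365, freq="D")
da = xr.DataArray(
    dsa.ones((365, 4), chunks=(30, 4)),
    dims=("time", "x"),
    coords={"time": time},
)

# NumPy-backed group means, standing in for a result reloaded from disk.
mean = da.groupby("time.day").mean().compute()

# Chunk `mean` along the grouped dimension so that expanding it inside
# the binary op builds a lazy dask graph instead of a full in-memory array.
mean_chunked = mean.chunk({"day": 1})
anom = da.groupby("time.day") - mean_chunked

print(type(anom.data))  # still a dask array, nothing computed yet
```

Calling `anom.compute()` (or writing the result to disk) then evaluates the subtraction chunk by chunk.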

Minimal Complete Verifiable Example

No response

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

dcherian added the bug, needs triage, and topic-groupby labels and removed the needs triage label Mar 27, 2023
dcherian added a commit to dcherian/xarray that referenced this issue Mar 27, 2023
dcherian added a commit that referenced this issue Jul 27, 2023
* Automatically chunk `other` in GroupBy binary ops.

Closes #7683

* Update xarray/core/groupby.py

* Add test

* Update xarray/core/groupby.py