Support duplicate dimensions in .chunk #9099

Merged
merged 7 commits into from
Jun 17, 2024

@mraspaud mraspaud commented Jun 12, 2024

This PR allows duplicate dimensions when chunking an array, provided the chunk sizes are given as a dict.
A typical use case is opening a netCDF file (with chunking) that contains a covariance matrix.
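
To illustrate the idea, here is a minimal pure-Python sketch of how chunk sizes keyed by dimension name could be resolved to axis numbers when a name is repeated. The helper `chunks_by_axis` and the example dimension names are hypothetical and only meant to show the behaviour described above; this is not necessarily how the change in `xarray/namedarray/core.py` is written.

```python
from collections.abc import Hashable, Mapping


def chunks_by_axis(
    dims: tuple[Hashable, ...], chunks: Mapping[Hashable, int]
) -> dict[int, int]:
    """Map chunk sizes keyed by dimension name to sizes keyed by axis number.

    Hypothetical helper for illustration only. Iterating over the axes,
    instead of looking up a single axis per name, means a repeated
    dimension name gets the same chunk size on each of its axes.
    """
    return {axis: chunks[dim] for axis, dim in enumerate(dims) if dim in chunks}


# A covariance-like layout whose first two axes share the name "x"
# (assumed names, not taken from a real file).
dims = ("x", "x", "time")
print(chunks_by_axis(dims, {"x": 2}))  # {0: 2, 1: 2}
```

Dimensions that are absent from the dict (here `"time"`) are simply left out of the result, so their chunking is not changed by this mapping.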

@mraspaud

Any feedback welcome on how I can improve this PR!

Review thread on xarray/tests/test_dask.py (outdated, resolved)
Review thread on xarray/namedarray/core.py (outdated, resolved)
dcherian added 3 commits June 13, 2024 09:09
* main:
  new whats-new section (pydata#9115)
  release v2024.06.0 (pydata#9113)
  release notes for 2024.06.0 (pydata#9092)
  [skip-ci] Try fixing hypothesis CI trigger (pydata#9112)
  Undo custom padding-top. (pydata#9107)
  add remaining core-dev citations [skip-ci][skip-rtd] (pydata#9110)
  Add user survey announcement to docs (pydata#9101)
  skip the `pandas` datetime roundtrip test with `pandas=3.0` (pydata#9104)
  Adds Matt Savoie to CITATION.cff (pydata#9103)
  [skip-ci] Fix skip-ci for hypothesis (pydata#9102)
  open_datatree performance improvement on NetCDF, H5, and Zarr files (pydata#9014)

@dcherian dcherian left a comment

Thanks @mraspaud

@dcherian dcherian added the plan to merge Final call for comments label Jun 13, 2024
@dcherian dcherian changed the title from "Allow duplicate dimensions in chunking" to "Support duplicate dimensions in .chunk" Jun 13, 2024
@dcherian dcherian merged commit be8e17e into pydata:main Jun 17, 2024
28 checks passed
@mraspaud mraspaud deleted the fix-duplicate-dimensions branch June 18, 2024 12:20
dcherian added a commit to dcherian/xarray that referenced this pull request Jun 21, 2024
* main:
  Split out distributed writes in zarr docs (pydata#9132)
  Update zendoo badge link (pydata#9133)
  Support duplicate dimensions in `.chunk` (pydata#9099)
  Bump the actions group with 2 updates (pydata#9130)
  adjust repr tests to account for different platforms (pydata#9127) (pydata#9128)
dcherian added a commit that referenced this pull request Jul 24, 2024
* main: (48 commits)
  Add test for #9155 (#9161)
  Remove mypy exclusions for a couple more libraries (#9160)
  Include numbagg in type checks (#9159)
  Improve zarr chunks docs (#9140)
  groupby: remove some internal use of IndexVariable (#9123)
  Improve `to_zarr` docs (#9139)
  Split out distributed writes in zarr docs (#9132)
  Update zendoo badge link (#9133)
  Support duplicate dimensions in `.chunk` (#9099)
  Bump the actions group with 2 updates (#9130)
  adjust repr tests to account for different platforms (#9127) (#9128)
  Grouper refactor (#9122)
  Update docstring in api.py for open_mfdataset(), clarifying "chunks" argument (#9121)
  Add test for rechunking to a size string (#9117)
  Move Sphinx directives out of `See also` (#8466)
  new whats-new section (#9115)
  release v2024.06.0 (#9113)
  release notes for 2024.06.0 (#9092)
  [skip-ci] Try fixing hypothesis CI trigger (#9112)
  Undo custom padding-top. (#9107)
  ...
Labels
plan to merge Final call for comments
Development

Successfully merging this pull request may close these issues.

Allow .chunk for datasets with duplicated dimension names, e.g. Sentinel-3 OLCI files
2 participants