Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move parallelcompat and chunkmanagers to NamedArray #8319

Merged
merged 72 commits into from
Feb 12, 2024

Conversation

TomNicholas
Copy link
Member

@TomNicholas TomNicholas commented Oct 16, 2023

@dcherian I got to this point before realizing that simply moving parallelcompat.py over isn't what it says in the design doc, which instead talks about

  • Could this functionality be left in Xarray proper for now? Alternative array types like JAX also have some notion of "chunks" for parallel arrays, but the details differ in a number of ways from the Dask/Cubed.
  • Perhaps variable.chunk/load methods should become functions defined in xarray that convert Variable objects. This is easy so long as xarray can reach in and replace .data

I personally think that simply moving parallelcompat makes sense so long as you expect people to use chunked NamedArray objects. I see the chunked arrays as special cases of duck arrays, and my understanding is that NamedArray is supposed to have full support for duckarrays.

cc @andersy005

  • As requested in NamedArray tracking issue #8238
  • Tests added
  • User visible changes (including notable bug fixes) are documented in whats-new.rst
  • New functions/methods are listed in api.rst

@TomNicholas TomNicholas added topic-chunked-arrays Managing different chunked backends, e.g. dask topic-NamedArray Lightweight version of Variable labels Oct 16, 2023
@github-actions github-actions bot added topic-backends topic-indexing topic-zarr Related to zarr storage library topic-arrays related to flexible array support io labels Oct 16, 2023
@andersy005
Copy link
Member

thank you for doing this work, @TomNicholas! moving these modules unblock me.

I personally think that simply moving parallelcompat makes sense so long as you expect people to use chunked NamedArray objects. I see the chunked arrays as special cases of duck arrays, and my understanding is that NamedArray is supposed to have full support for duckarrays.

👍🏽 i concur. and yes, NamedArray will have full support for duckarrays.

@TomNicholas
Copy link
Member Author

Moving utils.either_dict_or_kwargs to namedarray.utils required touching a lot of files to fix!

Also moving the entrypoint seems to have broken the discovery of the DaskManager at least in my local environment, let's see what happens in the CI...

@TomNicholas
Copy link
Member Author

Also moving the entrypoint seems to have broken the discovery of the DaskManager at least in my local environment, let's see what happens in the CI...

This works in the CI! But it still breaks my local environment 🙁 I suspect this means it will break other developer's local environments too. The problem is these zombie entrypoints that I cannot seem to get rid of (even after deleting the __pycache__ directory). There should just be one in this list:

In [1]: from importlib.metadata import entry_points

In [2]: entry_points().get("xarray.chunkmanagers", ())
Out[2]: 
(EntryPoint(name='dask', value='xarray.namedarray.daskmanager:DaskManager', group='xarray.chunkmanagers'),
 EntryPoint(name='dask', value='xarray.core.daskmanager:DaskManager', group='xarray.chunkmanagers'),
 EntryPoint(name='dask', value='xarray.namedarray.daskmanager:DaskManager', group='xarray.chunkmanagers'))

@andersy005 andersy005 requested a review from dcherian February 9, 2024 23:18
@andersy005
Copy link
Member

this is ready for another round of review :)

Copy link
Member Author

@TomNicholas TomNicholas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't approve my own PR, but I approve! 👍

@andersy005 andersy005 enabled auto-merge (squash) February 12, 2024 21:53
@andersy005 andersy005 merged commit d644607 into pydata:main Feb 12, 2024
27 of 28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants