Add to_numpy() and as_numpy() methods #5568
Conversation
Irritating linting error:
@TomNicholas I think that should fix it. But it's become Haskell-esque!
Hello @TomNicholas! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2021-07-21 21:11:19 UTC
The only failure is the same as in #5600, so I think this should be ready to merge.
looks great. Thanks @TomNicholas
xarray/core/variable.py
Outdated
try:
    data = data.to_numpy()
except AttributeError:
Isn't this always failing? Should we avoid it until such a protocol actually exists?
That would be my inclination, too.
I can also imagine some other library implementing `to_numpy()` in a weird way, which could cause issues. At the least, there should be a defensive check of `isinstance(data, np.ndarray)` afterwards.
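A minimal sketch of the defensive pattern being suggested, in plain Python; the helper name `_coerce_to_numpy` and the exact fallback behaviour are illustrative, not the PR's final code:

```python
import numpy as np


def _coerce_to_numpy(data):
    # Hypothetical duck-typed coercion: use to_numpy() if the wrapped array
    # offers one, otherwise fall back to np.asarray().
    if hasattr(data, "to_numpy"):
        data = data.to_numpy()
    else:
        data = np.asarray(data)
    # The defensive check suggested above: guard against a library whose
    # to_numpy() returns something other than a numpy.ndarray.
    if not isinstance(data, np.ndarray):
        raise TypeError(f"expected numpy.ndarray, got {type(data)}")
    return data
```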
@@ -2540,3 +2542,68 @@ def test_clip(var):
        var.mean("z").data[:, :, np.newaxis],
    ),
)


@pytest.mark.parametrize("Var", [Variable, IndexVariable])
you could stick these in `VariableSubclassObjects`? Could also do that later if it's a good idea
I just had a go at that and it broke the pint-only test through interaction with `TestVariableWithDask`, which tries to chunk.
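For context, a minimal sketch of the kind of parametrized test under discussion (the actual tests added in the PR may differ):

```python
import numpy as np
import pytest
from xarray import IndexVariable, Variable


@pytest.mark.parametrize("Var", [Variable, IndexVariable])
def test_to_numpy_returns_ndarray(Var):
    # Build a small 1-d variable of either class and check the coercion.
    var = Var("x", np.arange(3))
    result = var.to_numpy()
    assert isinstance(result, np.ndarray)
    np.testing.assert_array_equal(result, np.arange(3))
```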
@dcherian I think `.to_numpy()` does actually work in the case of an
IndexVariable, because the underlying pandas index has that method. That's
not going to be used yet, but we should perhaps leave it in to prepare for
the flexible indexes?
Other than that I like your other suggestions, please commit and merge as
you see fit!
…On Fri, 16 Jul 2021, 10:44 Deepak Cherian wrote:
@dcherian approved this pull request.
looks great. Thanks @TomNicholas
------------------------------
In xarray/core/dataarray.py
<#5568 (comment)>:
> @@ -623,7 +623,7 @@ def __len__(self) -> int:
@property
def data(self) -> Any:
- """The array's data as a dask or numpy array"""
+ """The array's data as a numpy-like array"""
⬇️ Suggested change
- """The array's data as a numpy-like array"""
+ """
+ The DataArray's data as an array. The underlying array type
+ (e.g. dask, sparse, pint) is preserved.
+
+ See Also
+ --------
+ DataArray.to_numpy
+ DataArray.as_numpy
+ DataArray.values
+ """
------------------------------
In xarray/core/dataarray.py
<#5568 (comment)>:
> @@ -632,13 +632,42 @@ def data(self, value: Any) -> None:
@property
def values(self) -> np.ndarray:
- """The array's data as a numpy.ndarray"""
+ """
+ The array's data as a numpy.ndarray.
+
+ If the array's data is not a numpy.ndarray this will attempt to convert
+ it naively using np.array(), which will raise an error if the array
+ type does not support coercion like this.
⬇️ Suggested change
- type does not support coercion like this.
+ type does not support coercion like this (e.g. cupy).
------------------------------
In xarray/core/dataarray.py
<#5568 (comment)>:
> return self.variable.values
@values.setter
def values(self, value: Any) -> None:
self.variable.values = value
+ def to_numpy(self) -> np.ndarray:
+ """
+ Coerces wrapped data to numpy and returns a numpy.ndarray.
+
+ See also
+ --------
+ DataArray.as_numpy : Same but returns the surrounding DataArray instead.
+ Dataset.as_numpy
⬇️ Suggested change
- Dataset.as_numpy
+ Dataset.as_numpy
+ DataArray.values
+ DataArray.data
------------------------------
In xarray/core/dataarray.py
<#5568 (comment)>:
> +
+ See also
+ --------
+ DataArray.as_numpy : Same but returns the surrounding DataArray instead.
+ Dataset.as_numpy
+ """
+ return self.variable.to_numpy()
+
+ def as_numpy(self: T_DataArray) -> T_DataArray:
+ """
+ Coerces wrapped data and coordinates into numpy arrays, returning a DataArray.
+
+ See also
+ --------
+ DataArray.to_numpy : Same but returns only the data as a numpy.ndarray object.
+ Dataset.as_numpy : Converts all variables in a Dataset.
⬇️ Suggested change
- Dataset.as_numpy : Converts all variables in a Dataset.
+ Dataset.as_numpy : Converts all variables in a Dataset.
+ DataArray.values
+ DataArray.data
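A short usage sketch of how the two new methods differ on a chunked (dask-backed) DataArray; illustrative only, and assumes dask is installed:

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(6).reshape(2, 3), dims=("x", "y")).chunk({"x": 1})

arr = da.to_numpy()    # coerces the wrapped data and returns a plain numpy.ndarray
assert isinstance(arr, np.ndarray)

da_np = da.as_numpy()  # returns a DataArray whose wrapped data is now numpy
assert isinstance(da_np.data, np.ndarray)
```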
------------------------------
In xarray/core/pycompat.py
<#5568 (comment)>:
>
+ self.module = duck_array_module
+ self.version = duck_array_version
+ self.type = duck_array_type
+ self.available = self.version > "0.0.0"
⬇️ Suggested change
- self.available = self.version > "0.0.0"
+ self.available = duck_array_module is not None
v. minor: this seems cleaner?
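To illustrate the point, a simplified stand-in for an optional-dependency wrapper (not xarray's actual `DuckArrayModule`), with availability derived from whether the import succeeded rather than from a version-string comparison:

```python
from importlib import import_module


class OptionalModule:
    """Simplified sketch of wrapping an optional dependency."""

    def __init__(self, name):
        try:
            self.module = import_module(name)
            self.version = getattr(self.module, "__version__", "0.0.0")
        except ImportError:
            self.module = None
            self.version = "0.0.0"
        # The suggested pattern: availability follows from the import itself.
        self.available = self.module is not None
```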
------------------------------
In xarray/core/variable.py
<#5568 (comment)>:
> +dask_array_type = DuckArrayModule("dask").type
+cupy_array_type = DuckArrayModule("cupy").type
+sparse_array_type = DuckArrayModule("sparse").type
import these from pycompat as earlier?
------------------------------
In xarray/core/variable.py
<#5568 (comment)>:
> + try:
+ data = data.to_numpy()
+ except AttributeError:
Isn't this always failing? Should we avoid it until such a protocol
actually exists?
------------------------------
In xarray/core/variable.py
<#5568 (comment)>:
> @@ -1069,6 +1069,32 @@ def chunk(self, chunks={}, name=None, lock=False):
return self._replace(data=data)
+ def to_numpy(self) -> np.ndarray:
+ """Coerces wrapped data to numpy and returns a numpy.ndarray"""
+ # TODO an entrypoint so array libraries can choose coercion method?
+ data = self.data
+ try:
+ data = data.to_numpy()
+ except AttributeError:
+ if isinstance(data, dask_array_type):
+ data = self.compute().data
⬇️ Suggested change
- data = self.compute().data
+ data = data.compute()
to match the rest?
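Putting the snippet and the suggested change together, a hedged sketch of the coercion path (`dask_array_type` mirrors the PR's pycompat helpers; the final implementation may differ):

```python
import numpy as np


def to_numpy_sketch(data, dask_array_type=()):
    try:
        data = data.to_numpy()
    except AttributeError:
        if isinstance(data, dask_array_type):
            # Reviewer's suggestion: compute the dask array directly
            # instead of going back through self.compute().data.
            data = data.compute()
        data = np.asarray(data)  # generic fallback for other array types
    return data
```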
------------------------------
In xarray/tests/test_variable.py
<#5568 (comment)>:
> @@ -2540,3 +2542,68 @@ def test_clip(var):
var.mean("z").data[:, :, np.newaxis],
),
)
+
+
@pytest.mark.parametrize("Var", [Variable, IndexVariable])
you could stick these in VariableSubclassObjects? Could also do that later if it's a good idea
The only error is the fsspec error now.
Co-authored-by: Stephan Hoyer <[email protected]>
Okay thanks @shoyer - I've removed the type check and the attempt to call `to_numpy()`.
* main: (31 commits) Refactor index vs. coordinate variable(s) (pydata#5636) pre-commit: autoupdate hook versions (pydata#5685) Flexible Indexes: Avoid len(index) in map_blocks (pydata#5670) Speed up _mapping_repr (pydata#5661) update the link to `scipy`'s intersphinx file (pydata#5665) Bump styfle/cancel-workflow-action from 0.9.0 to 0.9.1 (pydata#5663) pre-commit: autoupdate hook versions (pydata#5660) fix the binder environment (pydata#5650) Update api.rst (pydata#5639) Kwargs to rasterio open (pydata#5609) Bump codecov/codecov-action from 1 to 2.0.2 (pydata#5633) new blank whats-new for v0.19.1 v0.19.0 release notes (pydata#5632) remove deprecations scheduled for 0.19 (pydata#5630) Make typing-extensions optional (pydata#5624) Plots get labels from pint arrays (pydata#5561) Add to_numpy() and as_numpy() methods (pydata#5568) pin fsspec (pydata#5627) pre-commit: autoupdate hook versions (pydata#5617) Add dataarray scatter with 3d support (pydata#4909) ...
* upstream/main: (31 commits) Refactor index vs. coordinate variable(s) (pydata#5636) pre-commit: autoupdate hook versions (pydata#5685) Flexible Indexes: Avoid len(index) in map_blocks (pydata#5670) Speed up _mapping_repr (pydata#5661) update the link to `scipy`'s intersphinx file (pydata#5665) Bump styfle/cancel-workflow-action from 0.9.0 to 0.9.1 (pydata#5663) pre-commit: autoupdate hook versions (pydata#5660) fix the binder environment (pydata#5650) Update api.rst (pydata#5639) Kwargs to rasterio open (pydata#5609) Bump codecov/codecov-action from 1 to 2.0.2 (pydata#5633) new blank whats-new for v0.19.1 v0.19.0 release notes (pydata#5632) remove deprecations scheduled for 0.19 (pydata#5630) Make typing-extensions optional (pydata#5624) Plots get labels from pint arrays (pydata#5561) Add to_numpy() and as_numpy() methods (pydata#5568) pin fsspec (pydata#5627) pre-commit: autoupdate hook versions (pydata#5617) Add dataarray scatter with 3d support (pydata#4909) ...
* upstream/main: (34 commits) Use same bool validator as other inputs (pydata#5703) conditionally disable bottleneck (pydata#5560) Refactor index vs. coordinate variable(s) (pydata#5636) pre-commit: autoupdate hook versions (pydata#5685) Flexible Indexes: Avoid len(index) in map_blocks (pydata#5670) Speed up _mapping_repr (pydata#5661) update the link to `scipy`'s intersphinx file (pydata#5665) Bump styfle/cancel-workflow-action from 0.9.0 to 0.9.1 (pydata#5663) pre-commit: autoupdate hook versions (pydata#5660) fix the binder environment (pydata#5650) Update api.rst (pydata#5639) Kwargs to rasterio open (pydata#5609) Bump codecov/codecov-action from 1 to 2.0.2 (pydata#5633) new blank whats-new for v0.19.1 v0.19.0 release notes (pydata#5632) remove deprecations scheduled for 0.19 (pydata#5630) Make typing-extensions optional (pydata#5624) Plots get labels from pint arrays (pydata#5561) Add to_numpy() and as_numpy() methods (pydata#5568) pin fsspec (pydata#5627) ...
* upstream/main: (307 commits) Use same bool validator as other inputs (pydata#5703) conditionally disable bottleneck (pydata#5560) Refactor index vs. coordinate variable(s) (pydata#5636) pre-commit: autoupdate hook versions (pydata#5685) Flexible Indexes: Avoid len(index) in map_blocks (pydata#5670) Speed up _mapping_repr (pydata#5661) update the link to `scipy`'s intersphinx file (pydata#5665) Bump styfle/cancel-workflow-action from 0.9.0 to 0.9.1 (pydata#5663) pre-commit: autoupdate hook versions (pydata#5660) fix the binder environment (pydata#5650) Update api.rst (pydata#5639) Kwargs to rasterio open (pydata#5609) Bump codecov/codecov-action from 1 to 2.0.2 (pydata#5633) new blank whats-new for v0.19.1 v0.19.0 release notes (pydata#5632) remove deprecations scheduled for 0.19 (pydata#5630) Make typing-extensions optional (pydata#5624) Plots get labels from pint arrays (pydata#5561) Add to_numpy() and as_numpy() methods (pydata#5568) pin fsspec (pydata#5627) ...
pre-commit run --all-files
whats-new.rst
api.rst