as_shared_dtype converts scalars to 0d numpy arrays if chunked cupy is involved #7721
Comments
where on a chunked cupy array raises a TypeError
Ping @leofang in case you have thoughts? |
Sorry that I missed the ping, Jacob, but I'd need more context for making any suggestions/answers 😅 Is the question about why CuPy wouldn't return scalars? |
The issue is that here: xarray/xarray/core/duck_array_ops.py Lines 193 to 206 in d4db166
numpy.array_api.where only accepts arrays as input.
However, detecting My naive suggestion was to treat |
Thanks, Justus, for expanding on this. It sounds to me like the question is "how do we cast dtypes when multiple array libraries are participating in the same computation?" and I am not sure I am knowledgeable enough to make any comment. From the array API point of view, long ago we decided that this is UB (undefined behavior), meaning it's completely up to each library to decide what to do. You can raise, or come up with a special rule that makes sense to you. It sounds like Xarray has some machinery to deal with this situation, but you'd rather not keep special-casing a certain array library? Am I understanding it right? |
There are two things that happen in The latter could easily be done by using
    import numpy.array_api as np
    np.where(cond, cupy_array, python_scalar)
which (intentionally?) does not work. At the moment, So really, my question is: how do we support python scalars for libraries that only implement Of course, I would prefer removing the special casing for specific libraries, but I wouldn't be opposed to keeping the existing one. I guess as a short-term fix we could just pull As a long-term fix I guess we'd need to revive the stalled nested duck array discussion. |
I was considering this question for SciPy (xref scipy#18286) this week, and I think I'm happy with this strategy:
What that results in is an API that's backwards-compatible for numpy and array-like usage, and much stricter when using other array libraries. That strictness to me is a good thing, because:
|
So, after thinking about this for (quite) some time, it appears that one way or another we need to figure out the appropriate base array type of the nested array (regardless of whether or not we disallow passing python scalars to the xarray API... though since it is a breaking change I don't think we will do that). I've come up with a (recursive) way of extracting the nesting structure in keewis/nested-duck-arrays, which we should be able to use to figure out the leaf array type and keep the current hack until we figure out how to resolve the issue without it. |
Would this be an acceptable, if temporary, fix for #9195? Modified code in
|
I'd go with something like
    import nested_duck_arrays.dask
    import nested_duck_arrays
    ...
    if any(nested_duck_arrays.first_layer(x) is array_type_cupy for x in scalars_or_arrays):
        import cupy as cp
and add We'll need to think about what to do if
    try:
        from nested_duck_arrays import first_layer
    except ImportError:
        def first_layer(x):
            return type(x)
Also, we'll probably want to push the contents of |
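To make the "recursive way of extracting the nesting structure" idea a bit more concrete, here is a rough sketch of how a leaf array type could be found by unwrapping layers one at a time. Everything in it is an assumption for illustration: the classes are stand-ins (not the real cupy, dask, or pint types), and UNWRAPPERS, nesting_structure, and leaf_type are hypothetical names, not the API of keewis/nested-duck-arrays.

```python
class FakeCupyArray:
    """Stand-in for a GPU array type (hypothetical, not cupy.ndarray)."""

    def __init__(self, data):
        self.data = data


class FakeChunked:
    """Stand-in for a dask-like wrapper that exposes its inner array as `_meta`."""

    def __init__(self, meta):
        self._meta = meta


class FakeQuantity:
    """Stand-in for a pint-like wrapper that exposes its inner array as `magnitude`."""

    def __init__(self, magnitude, units):
        self.magnitude = magnitude
        self.units = units


# Hypothetical registry: wrapper type -> function returning the wrapped object.
UNWRAPPERS = {
    FakeChunked: lambda x: x._meta,
    FakeQuantity: lambda x: x.magnitude,
}


def nesting_structure(x):
    """Return the types of all layers, outermost first."""
    layers = []
    while True:
        layers.append(type(x))
        unwrap = UNWRAPPERS.get(type(x))
        if unwrap is None:
            # No registered way to unwrap further: this is the leaf.
            return tuple(layers)
        x = unwrap(x)


def leaf_type(x):
    """Return the innermost (leaf) type of a possibly nested object."""
    return nesting_structure(x)[-1]


nested = FakeQuantity(FakeChunked(FakeCupyArray([1, 2])), "m")
print(nesting_structure(nested))
print(leaf_type(nested) is FakeCupyArray)
```

With something along these lines, the existing cupy special case could compare against the leaf type instead of the outermost type, and plain python scalars simply report their own type.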
I tried to run where with chunked cupy arrays:
this works:
this fails:
this works again:
And other methods like fillna show similar behavior.
I think the reason is that this:
xarray/xarray/core/duck_array_ops.py
Line 195 in d4db166
cupy beneath other layers of duckarrays (most commonly dask, pint, or both). In this specific case we could extend the condition to also match chunked cupy arrays (like arr.cupy.is_cupy does, but using is_duck_dask_array), but this will still break for other duckarray layers or if dask is not involved, and we're also in the process of moving away from special-casing dask. So short of asking cupy to treat 0d arrays like scalars I'm not sure how to fix this.
cc @jacobtomlinson
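As a rough illustration of the point above, the sketch below contrasts a check that only looks at the outermost type with one that is extended to look inside a chunked wrapper. It uses stand-in classes only (assumptions, not the real cupy, dask, or pint types), and top_level_check and chunk_aware_check are hypothetical names rather than xarray functions.

```python
class FakeCupyArray:
    """Stand-in for a GPU array type (hypothetical, not cupy.ndarray)."""

    def __init__(self, data):
        self.data = data


class FakeChunked:
    """Stand-in for a dask-like wrapper; `_meta` holds the inner array."""

    def __init__(self, meta):
        self._meta = meta


class FakeQuantity:
    """Stand-in for a pint-like wrapper; `magnitude` holds the inner object."""

    def __init__(self, magnitude):
        self.magnitude = magnitude


def top_level_check(args):
    """Only detects the GPU type when it is the outermost layer."""
    return any(isinstance(x, FakeCupyArray) for x in args)


def chunk_aware_check(args):
    """Also detects the GPU type directly inside a chunked wrapper."""
    return any(
        isinstance(x, FakeCupyArray)
        or (isinstance(x, FakeChunked) and isinstance(x._meta, FakeCupyArray))
        for x in args
    )


bare = FakeCupyArray([1.0])
chunked = FakeChunked(bare)
quantity = FakeQuantity(chunked)

print(top_level_check([bare, 0.0]))        # bare GPU array is detected
print(top_level_check([chunked, 0.0]))     # chunked GPU array is missed
print(chunk_aware_check([chunked, 0.0]))   # the extended condition catches it
print(chunk_aware_check([quantity, 0.0]))  # another wrapper layer is missed again
```

The last case is the one that worries me: each extra wrapper layer would need its own special case unless the unwrapping is done generically.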