Allow for All-NaN in argmax, argmin #3884
Comments
I think this would be a reasonable change
The main concern here is type stability. Normally the return value of argmin/argmax is an integer index, but a result that can be NaN has to be floating point. My suggestion would be to add an optional keyword argument to opt in to this behaviour.
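A minimal sketch (made-up values, not from the original discussion) of that type-stability concern: integer indices cannot hold NaN, so a result mixing valid indices and NaN would have to change dtype.

```python
import numpy as np
import xarray as xr

da = xr.DataArray([[1.0, 3.0, 2.0],
                   [np.nan, np.nan, np.nan]], dims=("channel", "subject"))

# argmax normally returns an integer index
print(da.isel(channel=0).argmax().dtype)  # an integer dtype, e.g. int64

# A result that mixes indices and NaN can only be represented as float
print(np.array([1, np.nan]).dtype)  # float64
```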
The problem I had when implementing
NumPy implements
@shoyer xarray uses bottleneck for that if it can in
I wouldn’t worry too much about reusing bottleneck here, unless we really think these functions will be the bottleneck in user code :)
Background: In data analysis, it is common to have multidimensional datasets with missing conditions. For example, given a data array of power measurements from a multi-channel recording device with dimensions nb_channel X nb_subjects X [...], some channels may be missing for some subjects, in which case the array contains only NaN for that condition. Functions like Dataset.mean() handle this situation well: they emit a "RuntimeWarning: Mean of empty slice" and set the mean value for the all-NaN slice to NaN, which is what is expected for such a use case. Depending on the use case, the user can filter these warnings to either ignore them or raise them as errors. This is all fine.
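For illustration, a minimal sketch (hypothetical power array, not the original data) of the mean() behaviour described above:

```python
import numpy as np
import xarray as xr

# nb_channel x nb_subjects, with channel 1 entirely missing
power = xr.DataArray([[1.0, 2.0, 3.0],
                      [np.nan, np.nan, np.nan]], dims=("channel", "subject"))

# Emits "RuntimeWarning: Mean of empty slice" and keeps NaN for the all-NaN channel
print(power.mean(dim="subject").values)  # [ 2. nan]
```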
Problem: However, in the case of Dataset.argmax(), there is no such option. The function raises a "ValueError: All-NaN slice encountered" exception. I think it would be better if the behaviour of Dataset.argmax() were modelled on that of Dataset.mean(), so that it raises a warning and sets a NaN value. It seems fair to consider that the index that maximizes an all-NaN slice is NaN.
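As a possible interim workaround (a sketch only, using the same hypothetical power array as above), all-NaN slices can be filled with a sentinel before calling argmax and masked back to NaN afterwards:

```python
import numpy as np
import xarray as xr

power = xr.DataArray([[1.0, 2.0, 3.0],
                      [np.nan, np.nan, np.nan]], dims=("channel", "subject"))

all_nan = power.isnull().all(dim="subject")

# Fill NaNs with -inf so argmax never sees an all-NaN slice, then
# mask the positions that were entirely NaN (this makes the result float)
idx = power.fillna(-np.inf).argmax(dim="subject").where(~all_nan)
print(idx.values)  # [ 2. nan]
```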
The implementation of such a feature may (but does not need to) depend on numpy/numpy#12352
MCVE Code Sample
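The original code sample is not preserved here; a minimal reproduction sketch, assuming a hypothetical power array like the one in the Background section, would be something like:

```python
import numpy as np
import xarray as xr

power = xr.DataArray([[1.0, 2.0, 3.0],
                      [np.nan, np.nan, np.nan]], dims=("channel", "subject"))

# Raises "ValueError: All-NaN slice encountered" because channel 1 is entirely NaN
print(power.argmax(dim="subject").values)
```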
Output
Expected Output
Versions
Output of `xr.show_versions()`
INSTALLED VERSIONS
commit: None
python: 3.7.4 (default, Aug 13 2019, 20:35:49)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.3.0-42-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: en_CA.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.3
xarray: 0.15.0
pandas: 1.0.1
numpy: 1.18.2
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.11.0
distributed: None
matplotlib: 3.1.2
cartopy: None
seaborn: 0.9.0
numbagg: None
setuptools: 45.2.0.post20200210
pip: 20.0.2
conda: 4.8.2
pytest: None
IPython: 7.12.0
sphinx: 2.3.1