test failure with upstream-dev #3409

Closed
dcherian opened this issue Oct 17, 2019 · 12 comments · Fixed by #3537

Comments

@dcherian
Contributor

https://dev.azure.com/xarray/xarray/_build/results?buildId=1101&view=logs

=================================== FAILURES ===================================
_________________________ test_datetime_reduce[False] __________________________

dask = False

    @arm_xfail
    @pytest.mark.parametrize("dask", [False, True])
    def test_datetime_reduce(dask):
        time = np.array(pd.date_range("15/12/1999", periods=11))
        time[8:11] = np.nan
        da = DataArray(np.linspace(0, 365, num=11), dims="time", coords={"time": time})
    
        if dask and has_dask:
            chunks = {"time": 5}
            da = da.chunk(chunks)
    
        actual = da["time"].mean()
>       assert not pd.isnull(actual)
E       AssertionError: assert not True
E        +  where True = <function isna at 0x7f2475449ae8>(<xarray.DataArray 'time' ()>\narray('NaT', dtype='datetime64[ns]'))
E        +    where <function isna at 0x7f2475449ae8> = pd.isnull

xarray/tests/test_duck_array_ops.py:288: AssertionError
__________________________ test_datetime_reduce[True] __________________________

dask = True

    @arm_xfail
    @pytest.mark.parametrize("dask", [False, True])
    def test_datetime_reduce(dask):
        time = np.array(pd.date_range("15/12/1999", periods=11))
        time[8:11] = np.nan
        da = DataArray(np.linspace(0, 365, num=11), dims="time", coords={"time": time})
    
        if dask and has_dask:
            chunks = {"time": 5}
            da = da.chunk(chunks)
    
        actual = da["time"].mean()
>       assert not pd.isnull(actual)
E       AssertionError: assert not True
E        +  where True = <function isna at 0x7f2475449ae8>(<xarray.DataArray 'time' ()>\narray('NaT', dtype='datetime64[ns]'))
E        +    where <function isna at 0x7f2475449ae8> = pd.isnull

xarray/tests/test_duck_array_ops.py:288: AssertionError
=============================== warnings summary 

keewis mentioned this issue Oct 17, 2019 (16 tasks)
dcherian changed the title from "test failure with dask-dev" to "test failure with upstream-dev" Oct 17, 2019
@crusaderky
Contributor

crusaderky commented Oct 20, 2019

This has now spread to all tests without pinned dependencies

@crusaderky
Contributor

crusaderky commented Oct 20, 2019

conda list (py36) before and after the breakage:

< boto3                     1.9.252                    py_0    conda-forge
> boto3                     1.9.253                    py_0    conda-forge
< botocore                  1.12.252                   py_0    conda-forge
> botocore                  1.12.253                   py_0    conda-forge
< expat                     2.2.5             he1b5a44_1003    conda-forge
> expat                     2.2.5             he1b5a44_1004    conda-forge
< libxkbcommon              0.8.4                h516909a_0    conda-forge
> libxkbcommon              0.9.0                hebb1f50_0    conda-forge
< mypy_extensions           0.4.2                    py36_0    conda-forge
> mypy_extensions           0.4.3                    py36_0    conda-forge
< nss                       3.46                 he751ad9_0    conda-forge
> nss                       3.47                 he751ad9_0    conda-forge
< pandas                    0.25.1           py36hb3f55d8_0    conda-forge
> pandas                    0.25.2           py36hb3f55d8_0    conda-forge
< pip                       19.3                     py36_0    conda-forge
> pip                       19.3.1                   py36_0    conda-forge
< pseudonetcdf              3.0.2                      py_0    conda-forge
> pseudonetcdf              3.1.0                      py_0    conda-forge

@shoyer
Member

shoyer commented Oct 20, 2019

This looks like this upstream pandas issue: pandas-dev/pandas#29053

@crusaderky
Contributor

It doesn't add up - the upstream ticket mentions an incompatibility with numpy 1.18. However, our failing tests are all running with numpy 1.17.

@crusaderky
Contributor

Looking more closely at the failing tests, there are actually two problems.
One is with pandas/numpy git tip, and it only affects the upstream-dev test suite.
The other is with pseudonetcdf-3.1, and it affects py36, py37, and upstream-dev.

@crusaderky
Contributor

crusaderky commented Oct 20, 2019

Narrowed down:

>>> import numpy as np
>>> a = np.array(['1999-12-15', 'NaT'], dtype='M8[ns]')
>>> np.min(a)

Output:
numpy 1.17: numpy.datetime64('1999-12-15T00:00:00.000000000')
numpy 1.18: numpy.datetime64('NaT')

triggered by:

offset = min(array)
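
To illustrate why that `min()` call matters: a datetime mean is usually computed by subtracting an offset, averaging the numeric deltas, and adding the offset back, so a NaT offset turns the whole result into NaT. Below is a minimal sketch in plain numpy, not xarray's actual code; the helper name `datetime_mean_skipping_nat` is hypothetical and the nanosecond unit is an assumption.

```python
import numpy as np


def datetime_mean_skipping_nat(values):
    """Hypothetical helper: mean of a datetime64 array, ignoring NaT."""
    values = np.asarray(values, dtype="M8[ns]")
    valid = ~np.isnat(values)
    if not valid.any():
        # Nothing to average: return NaT rather than raising.
        return np.datetime64("NaT", "ns")
    # Take the offset from the valid entries only, so it can never be NaT,
    # regardless of how np.min treats NaT in the installed numpy.
    offset = values[valid].min()
    # Nanoseconds since the offset, as floats, for the valid entries.
    deltas = (values[valid] - offset).astype("int64").astype("float64")
    mean_ns = int(round(deltas.mean()))
    return offset + np.timedelta64(mean_ns, "ns")


a = np.array(["1999-12-15", "1999-12-17", "NaT"], dtype="M8[ns]")
result = datetime_mean_skipping_nat(a)
```

Because the offset is taken after masking, the result does not depend on whether the installed numpy ignores or propagates NaT in `np.min`.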

@shoyer
Member

shoyer commented Oct 21, 2019

I think we have our own version of the same issue that pandas had.

This was referenced Oct 22, 2019
@crusaderky
Contributor

Same problem:

a = xarray.DataArray(['1999-12-15', 'NaT']).astype('M8[ns]')            
a.min(skipna=False)  # np 1.17: 1999-12-15; np 1.18: NaT
a.min(skipna=True)  # np 1.17: crashes; np 1.18: crashes
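
For comparison, here is a rough sketch of what version-independent `skipna` semantics could look like on a plain numpy array. It is an assumption about the desired behaviour, not xarray's implementation, and `datetime_min` is a hypothetical name: `skipna=True` drops NaT before reducing, while `skipna=False` propagates NaT explicitly instead of relying on `np.min`.

```python
import numpy as np


def datetime_min(values, skipna=True):
    """Hypothetical helper: min of a datetime64 array with explicit NaT handling."""
    values = np.asarray(values, dtype="M8[ns]")
    nat = np.datetime64("NaT", "ns")
    mask = np.isnat(values)
    if skipna:
        # Reduce over the non-NaT entries only; all-NaT or empty input gives NaT.
        valid = values[~mask]
        return valid.min() if valid.size else nat
    # skipna=False: any NaT (or an empty array) yields NaT, on every numpy version.
    if mask.any() or values.size == 0:
        return nat
    return values.min()


a = np.array(["1999-12-15", "NaT"], dtype="M8[ns]")
skipped = datetime_min(a, skipna=True)
propagated = datetime_min(a, skipna=False)
```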

@dcherian
Contributor Author

dcherian commented Nov 7, 2019

Looks like numpy will fix this: numpy/numpy#14841

@crusaderky
Contributor

@dcherian I didn't try running the PR code, but I don't think so?
The PR may mean (must test) that nanmin() and nanmax() now work with NaT. However, as highlighted above in #3409 (comment), xarray is invoking min() on an array that contains NaT; numpy 1.17 ignores the NaT values, while 1.18 correctly returns NaT.

Does anybody have the time to test it?
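
One quick way to check this against whatever numpy is installed is a small probe like the one below. It is only a sketch: `nanmin_skips_nat` is a hypothetical name, and the result depends on the numpy version, so no expected output is shown.

```python
import warnings

import numpy as np


def nanmin_skips_nat():
    """Hypothetical probe: does np.nanmin ignore NaT in the installed numpy?"""
    a = np.array(["1999-12-15", "NaT"], dtype="M8[ns]")
    with warnings.catch_warnings():
        # Some versions may warn while reducing over NaT; ignore that here.
        warnings.simplefilter("ignore")
        try:
            return not bool(np.isnat(np.nanmin(a)))
        except (TypeError, ValueError):
            # Treat an unsupported dtype as "does not skip NaT".
            return False


print(np.__version__, "nanmin skips NaT:", nanmin_skips_nat())
```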

@dcherian
Contributor Author

dcherian commented Nov 7, 2019

Sigh yes you're right

@mathause
Collaborator

mathause commented Nov 8, 2019

Unfortunately this does not seem to make np.nanmin and np.nanmax work for datetime arrays (yet), see: numpy/numpy#14841 (comment)

crusaderky pushed a commit to crusaderky/xarray that referenced this issue Nov 15, 2019
crusaderky mentioned this issue Nov 15, 2019 (3 tasks)
crusaderky added a commit that referenced this issue Nov 19, 2019
* Closes #3409

* Unpin versions

* Rewrite unit test for clarity about its real scope

* mean() on dask

* Trivial

* duck_array_ops should never receive xarray.Variable