Fix doctests #4408

Merged
merged 13 commits into from
Sep 11, 2020

keewis
Collaborator

@keewis keewis commented Sep 6, 2020

This is an attempt to fix the examples in our docstrings. With this, the output of:

python -m pytest --doctest-modules xarray --ignore xarray/tests

is

remaining warnings and errors
====================================================================================================== test session starts =======================================================================================================
platform linux -- Python 3.8.5, pytest-6.0.1, py-1.9.0, pluggy-0.13.1
rootdir: ..., configfile: setup.cfg
plugins: cov-2.10.0, env-0.6.2, hypothesis-5.23.11
collected 91 items                                                                                                                                                                                                               

xarray/conventions.py FF                                                                                                                                                                                                   [  2%]
xarray/backends/api.py F                                                                                                                                                                                                   [  3%]
xarray/coding/cftime_offsets.py .                                                                                                                                                                                          [  4%]
xarray/coding/cftimeindex.py ....                                                                                                                                                                                          [  8%]
xarray/coding/strings.py F                                                                                                                                                                                                 [  9%]
xarray/core/accessor_dt.py ...                                                                                                                                                                                             [ 13%]
xarray/core/accessor_str.py .                                                                                                                                                                                              [ 14%]
xarray/core/alignment.py ..                                                                                                                                                                                                [ 16%]
xarray/core/combine.py ..                                                                                                                                                                                                  [ 18%]
xarray/core/common.py ....F......                                                                                                                                                                                          [ 30%]
xarray/core/computation.py .....                                                                                                                                                                                           [ 36%]
xarray/core/dataarray.py .......................                                                                                                                                                                           [ 61%]
xarray/core/dataset.py .....F.............                                                                                                                                                                                 [ 82%]
xarray/core/extensions.py .                                                                                                                                                                                                [ 83%]
xarray/core/groupby.py .                                                                                                                                                                                                   [ 84%]
xarray/core/indexing.py FF                                                                                                                                                                                                 [ 86%]
xarray/core/merge.py .                                                                                                                                                                                                     [ 87%]
xarray/core/nputils.py .                                                                                                                                                                                                   [ 89%]
xarray/core/options.py .                                                                                                                                                                                                   [ 90%]
xarray/core/parallel.py .                                                                                                                                                                                                  [ 91%]
xarray/core/rolling.py ..                                                                                                                                                                                                  [ 93%]
xarray/core/rolling_exp.py .                                                                                                                                                                                               [ 94%]
xarray/core/utils.py .                                                                                                                                                                                                     [ 95%]
xarray/core/variable.py ..                                                                                                                                                                                                 [ 97%]
xarray/plot/utils.py ..                                                                                                                                                                                                    [100%]

============================================================================================================ FAILURES ============================================================================================================
___________________________________________________________________________________________ [doctest] xarray.conventions.BoolTypeArray ___________________________________________________________________________________________
048 Decode arrays on the fly from integer to boolean datatype
049 
050     This is useful for decoding boolean arrays from integer typed netCDF
051     variables.
052 
053     >>> x = np.array([1, 0, 1, 1, 0], dtype="i1")
054 
055     >>> x.dtype
Expected:
    dtype('>i2')
Got:
    dtype('int8')

.../xarray/conventions.py:55: DocTestFailure
_______________________________________________________________________________________ [doctest] xarray.conventions.NativeEndiannessArray _______________________________________________________________________________________
021 
022     >>> x = np.arange(5, dtype=">i2")
023 
024     >>> x.dtype
025     dtype('>i2')
026 
027     >>> NativeEndiannessArray(x).dtype
028     dtype('int16')
029 
030     >>> NativeEndiannessArray(x)[:].dtype
UNEXPECTED EXCEPTION: TypeError("unexpected key type: <class 'slice'>")
Traceback (most recent call last):
  File "~/.conda/envs/xarray/lib/python3.8/doctest.py", line 1336, in __run
    exec(compile(example.source, filename, "single",
  File "<doctest xarray.conventions.NativeEndiannessArray[3]>", line 1, in <module>
  File ".../xarray/conventions.py", line 44, in __getitem__
    return np.asarray(self.array[key], dtype=self.dtype)
  File ".../xarray/core/indexing.py", line 1277, in __getitem__
    array, key = self._indexing_array_and_key(key)
  File ".../xarray/core/indexing.py", line 1269, in _indexing_array_and_key
    raise TypeError("unexpected key type: {}".format(type(key)))
TypeError: unexpected key type: <class 'slice'>
.../xarray/conventions.py:30: UnexpectedException
__________________________________________________________________________________________ [doctest] xarray.backends.api.save_mfdataset __________________________________________________________________________________________
1182     compute : bool
1183         If true compute immediately, otherwise return a
1184         ``dask.delayed.Delayed`` object that can be computed later.
1185 
1186     Examples
1187     --------
1188 
1189     Save a dataset into one netCDF per year of data:
1190 
1191     >>> years, datasets = zip(*ds.groupby("time.year"))
UNEXPECTED EXCEPTION: NameError("name 'ds' is not defined")
Traceback (most recent call last):
  File "~/.conda/envs/xarray/lib/python3.8/doctest.py", line 1336, in __run
    exec(compile(example.source, filename, "single",
  File "<doctest xarray.backends.api.save_mfdataset[0]>", line 1, in <module>
NameError: name 'ds' is not defined
.../xarray/backends/api.py:1191: UnexpectedException
_______________________________________________________________________________________ [doctest] xarray.coding.strings.StackedBytesArray ________________________________________________________________________________________
199 Wrapper around array-like objects to create a new indexable object where
200     values, when accessed, are automatically stacked along the last dimension.
201 
202     >>> StackedBytesArray(np.array(["a", "b", "c"]))[:]
UNEXPECTED EXCEPTION: ValueError("can only use StackedBytesArray if argument has dtype='S1'")
Traceback (most recent call last):
  File "~/.conda/envs/xarray/lib/python3.8/doctest.py", line 1336, in __run
    exec(compile(example.source, filename, "single",
  File "<doctest xarray.coding.strings.StackedBytesArray[0]>", line 1, in <module>
  File ".../xarray/coding/strings.py", line 215, in __init__
    raise ValueError(
ValueError: can only use StackedBytesArray if argument has dtype='S1'
.../xarray/coding/strings.py:202: UnexpectedException
________________________________________________________________________________________ [doctest] xarray.core.common.DataWithCoords.pipe ________________________________________________________________________________________
578         >>> def adder(data, arg):
579         ...     return data + arg
580         ...
581         >>> def div(data, arg):
582         ...     return data / arg
583         ...
584         >>> def sub_mult(data, sub_arg, mult_arg):
585         ...     return (data * mult_arg) - sub_arg
586         ...
587         >>> x.pipe(adder, 2)
Differences (unified diff with -expected +actual):
    @@ -2,6 +2,6 @@
     Dimensions:        (lat: 2, lon: 2)
     Coordinates:
    +  * lon            (lon) int64 150 160
       * lat            (lat) int64 10 20
    -  * lon            (lon) int64 150 160
     Data variables:
         temperature_c  (lat, lon) float64 12.98 16.3 14.06 12.9

.../xarray/core/common.py:587: DocTestFailure
_____________________________________________________________________________________ [doctest] xarray.core.dataset.Dataset.filter_by_attrs ______________________________________________________________________________________
5719         ...     },
5720         ...     coords={
5721         ...         "lon": (["x", "y"], lon),
5722         ...         "lat": (["x", "y"], lat),
5723         ...         "time": pd.date_range("2014-09-06", periods=3),
5724         ...         "reference_time": pd.Timestamp("2014-09-05"),
5725         ...     },
5726         ... )
5727         >>> # Get variables matching a specific standard_name.
5728         >>> ds.filter_by_attrs(standard_name="convective_precipitation_flux")
Differences (unified diff with -expected +actual):
    @@ -2,8 +2,8 @@
     Dimensions:         (time: 3, x: 2, y: 2)
     Coordinates:
    +    lon             (x, y) float64 -99.83 -99.32 -99.79 -99.23
    +  * time            (time) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08
    +    lat             (x, y) float64 42.25 42.21 42.63 42.59
         reference_time  datetime64[ns] 2014-09-05
    -    lat             (x, y) float64 42.25 42.21 42.63 42.59
    -  * time            (time) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08
    -    lon             (x, y) float64 -99.83 -99.32 -99.79 -99.23
     Dimensions without coordinates: x, y
     Data variables:

.../xarray/core/dataset.py:5728: DocTestFailure
____________________________________________________________________________________ [doctest] xarray.core.indexing._decompose_outer_indexer _____________________________________________________________________________________
971     -----
972     This function is used to realize the vectorized indexing for the backend
973     arrays that only support basic or outer indexing.
974 
975     As an example, let us consider to index a few elements from a backend array
976     with a orthogonal indexer ([0, 3, 1], [2, 3, 2]).
977     Even if the backend array only supports basic indexing, it is more
978     efficient to load a subslice of the array than loading the entire array,
979 
980     >>> backend_indexer = BasicIndexer(slice(0, 3), slice(2, 3))
UNEXPECTED EXCEPTION: TypeError('__init__() takes 2 positional arguments but 3 were given')
Traceback (most recent call last):
  File "~/.conda/envs/xarray/lib/python3.8/doctest.py", line 1336, in __run
    exec(compile(example.source, filename, "single",
  File "<doctest xarray.core.indexing._decompose_outer_indexer[0]>", line 1, in <module>
TypeError: __init__() takes 2 positional arguments but 3 were given
.../xarray/core/indexing.py:980: UnexpectedException
__________________________________________________________________________________ [doctest] xarray.core.indexing._decompose_vectorized_indexer __________________________________________________________________________________
893     -----
894     This function is used to realize the vectorized indexing for the backend
895     arrays that only support basic or outer indexing.
896 
897     As an example, let us consider to index a few elements from a backend array
898     with a vectorized indexer ([0, 3, 1], [2, 3, 2]).
899     Even if the backend array only supports outer indexing, it is more
900     efficient to load a subslice of the array than loading the entire array,
901 
902     >>> backend_indexer = OuterIndexer([0, 1, 3], [2, 3])
UNEXPECTED EXCEPTION: TypeError('__init__() takes 2 positional arguments but 3 were given')
Traceback (most recent call last):
  File "~/.conda/envs/xarray/lib/python3.8/doctest.py", line 1336, in __run
    exec(compile(example.source, filename, "single",
  File "<doctest xarray.core.indexing._decompose_vectorized_indexer[0]>", line 1, in <module>
TypeError: __init__() takes 2 positional arguments but 3 were given
.../xarray/core/indexing.py:902: UnexpectedException
======================================================================================================== warnings summary ========================================================================================================
xarray/core/common.py::xarray.core.common.DataWithCoords.resample
xarray/core/common.py::xarray.core.common.DataWithCoords.resample
xarray/core/common.py::xarray.core.common.DataWithCoords.resample
  .../xarray/core/common.py:1230: FutureWarning: 'base' in .resample() and in Grouper() is deprecated.
  The new arguments that you should use are 'offset' or 'origin'.
  
  >>> df.resample(freq="3s", base=2)
  
  becomes:
  
  >>> df.resample(freq="3s", offset="2s")
  
    grouper = pd.Grouper(

xarray/core/dataarray.py::xarray.core.dataarray.DataArray.argmax
xarray/core/dataarray.py::xarray.core.dataarray.DataArray.idxmax
  .../xarray/core/dataarray.py:4090: DeprecationWarning: Behaviour of argmin/argmax with neither dim nor axis argument will change to return a dict of indices of each dimension. To get a single, flat index, please use np.argmin(da.data) or np.argmax(da.data) instead of da.argmin() or da.argmax().
    result = self.variable.argmax(dim, axis, keep_attrs, skipna)

xarray/core/dataarray.py::xarray.core.dataarray.DataArray.argmin
xarray/core/dataarray.py::xarray.core.dataarray.DataArray.idxmin
  .../xarray/core/dataarray.py:3987: DeprecationWarning: Behaviour of argmin/argmax with neither dim nor axis argument will change to return a dict of indices of each dimension. To get a single, flat index, please use np.argmin(da.data) or np.argmax(da.data) instead of da.argmin() or da.argmax().
    result = self.variable.argmin(dim, axis, keep_attrs, skipna)

xarray/core/dataarray.py::xarray.core.dataarray.DataArray.roll
  .../xarray/core/dataarray.py:2974: FutureWarning: roll_coords will be set to False in the future. Explicitly set roll_coords to silence warning.
    ds = self._to_temp_dataset().roll(

xarray/core/dataset.py::xarray.core.dataset.Dataset.roll
  <doctest xarray.core.dataset.Dataset.roll[1]>:1: FutureWarning: roll_coords will be set to False in the future. Explicitly set roll_coords to silence warning.

-- Docs: https://docs.pytest.org/en/stable/warnings.html
==================================================================================================== short test summary info =====================================================================================================
FAILED xarray/conventions.py::xarray.conventions.BoolTypeArray
FAILED xarray/conventions.py::xarray.conventions.NativeEndiannessArray
FAILED xarray/backends/api.py::xarray.backends.api.save_mfdataset
FAILED xarray/coding/strings.py::xarray.coding.strings.StackedBytesArray
FAILED xarray/core/common.py::xarray.core.common.DataWithCoords.pipe
FAILED xarray/core/dataset.py::xarray.core.dataset.Dataset.filter_by_attrs
FAILED xarray/core/indexing.py::xarray.core.indexing._decompose_outer_indexer
FAILED xarray/core/indexing.py::xarray.core.indexing._decompose_vectorized_indexer
============================================================================================ 8 failed, 83 passed, 9 warnings in 3.73s ============================================================================================

where DataWithCoords.pipe and Dataset.filter_by_attrs fail mostly because the order of the coords is not deterministic. Should we sort before formatting the coords?
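The sorting idea can be sketched in plain Python. This is only an illustration under stated assumptions: `format_coords` is a hypothetical helper, not xarray's formatting code, and the coordinates are modelled as an ordinary dict.

```python
def format_coords(coords):
    """Hypothetical formatter: one line per coordinate, sorted by name."""
    # Sorting by key makes the output independent of insertion order.
    return "\n".join(
        f"{name}: {values}" for name, values in sorted(coords.items())
    )


# The same coordinates, inserted in a different order.
first = {"lon": [150, 160], "lat": [10, 20]}
second = {"lat": [10, 20], "lon": [150, 160]}

print(format_coords(first) == format_coords(second))
```

One trade-off of this approach is that the displayed order would no longer reflect the order in which the coordinates were added.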

Does anyone know what ds in save_mfdataset should look like? Also, how do we fix the construction of OuterIndexer and VectorizedIndexer?
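On the indexer question, the `TypeError` in the log (`__init__() takes 2 positional arguments but 3 were given`) suggests that each indexer class accepts a single key argument. A minimal sketch, assuming that argument is a tuple of per-dimension keys; `TupleIndexer` is a hypothetical stand-in and not xarray's class:

```python
class TupleIndexer:
    """Stand-in for an indexer whose constructor takes one tuple of keys."""

    def __init__(self, key):
        if not isinstance(key, tuple):
            raise TypeError(f"key must be a tuple: {key!r}")
        self.tuple = key


try:
    # Two positional arguments: fails the same way as the doctest in the log.
    TupleIndexer(slice(0, 3), slice(2, 3))
except TypeError as err:
    print(err)

# One argument that is a tuple with one entry per dimension.
indexer = TupleIndexer((slice(0, 3), slice(2, 3)))
print(len(indexer.tuple))
```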

Edit: if we merge #4409 before this, we can make sure all doctests pass.

Collaborator

@max-sixty max-sixty left a comment

This is awesome, thanks a lot @keewis !

Is there a tool to do this, or did you have to copy and paste all the results?

Collaborator Author

@keewis keewis left a comment

Is there a tool to do this, or did you have to copy and paste all the results?

that was an entirely manual process. However, it should be possible to write a script using doctest.DocTestParser, doctest.DocTestFinder and doctest.DocTestRunner (and I would be surprised if there's nothing like that already).
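A rough sketch of what such a script might look like, using only the standard library `doctest` module. It is an assumption-laden illustration rather than the tool anyone in this thread used: `actual_outputs` and `RecordingRunner` are made-up names, and the docstring is passed in directly so the example stays self-contained (`doctest.DocTestFinder` would be the piece that locates docstrings in a real module).

```python
import doctest


def actual_outputs(docstring, globs=None, name="example"):
    """Run the examples in `docstring`; return (source, expected, actual) triples."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(docstring, dict(globs or {}), name, None, 0)

    records = []

    class RecordingRunner(doctest.DocTestRunner):
        # Both hooks receive the output the example actually produced.
        def report_success(self, out, test, example, got):
            records.append((example.source, example.want, got))

        def report_failure(self, out, test, example, got):
            records.append((example.source, example.want, got))

    runner = RecordingRunner(verbose=False)
    runner.run(test, out=lambda text: None)  # discard the default report text
    return records


DOCSTRING = """
>>> x = [3, 1, 2]
>>> sorted(x)
[1, 2, 4]
"""

# Report every example whose recorded output differs from the docstring.
for source, want, got in actual_outputs(DOCSTRING):
    if want != got:
        print(f"{source.strip()}: expected {want!r}, got {got!r}")
```

Writing the recorded output back into the source file is left out here; examples that raise an unexpected exception are also not recorded by this sketch.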

@max-sixty
Collaborator

that was an entirely manual process. However, it should be possible to write a script using doctest.DocTestParser, doctest.DocTestFinder and doctest.DocTestRunner (and I would be surprised if there's nothing like that already).

Well, thanks for doing all that!

On the tool, I agree that would be great. Here's something I did in a day that aimed to do it for any value rather than doctests. Would be easier and less complicated for doctests: https://github.com/max-sixty/pytest-accept

@dcherian
Contributor

dcherian commented Sep 9, 2020

shall we merge and deal with the effects of #4409 later?

@keewis
Collaborator Author

keewis commented Sep 10, 2020

I tried to fix the remaining doctests that were unrelated to #4409, so those need a review first (especially the StackedBytesArray and indexing doctests).

@keewis
Collaborator Author

keewis commented Sep 10, 2020

wow. The RTD check fails due to:

<frozen importlib._bootstrap>:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject

@max-sixty
Collaborator

Looks great! I raised an eyebrow about the os.remove but it's limited to a finite set of paths.

Re the warning, we could add an ignore in the doctest prelude?
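For reference, an ignore of this kind could be written with the standard library `warnings` module. This is a sketch only: the message pattern is copied from the warning quoted above, and whether a filter placed in the doctest prelude would reach the code that emits the warning is not something this example can show.

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # `message` is a regular expression matched against the start of the text.
    warnings.filterwarnings(
        "ignore",
        message="numpy.ufunc size changed",
        category=RuntimeWarning,
    )
    # Simulate the warning from the RTD log, plus an unrelated one.
    warnings.warn(
        "numpy.ufunc size changed, may indicate binary incompatibility",
        RuntimeWarning,
    )
    warnings.warn("some other warning", RuntimeWarning)

# Only warnings that were not filtered out are recorded.
print([str(w.message) for w in caught])
```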

@keewis
Collaborator Author

keewis commented Sep 10, 2020

I raised an eyebrow about the os.remove but it's limited to a finite set of paths.

yeah, well, I didn't want to leave files lying around. Not sure if there's a better way to do that. Maybe if there's a way to have pytest create a temporary directory we could use that to construct the paths?

we could add an ignore in the doctest prelude

I doubt that would fix it (maybe I'm missing something), these are triggered by ipython directives (combining.rst, groupby.rst). I'll try rerunning the RTD check.

@keewis keewis closed this Sep 10, 2020
@keewis keewis reopened this Sep 10, 2020
@keewis
Collaborator Author

keewis commented Sep 10, 2020

the main build is also failing so I guess this is unrelated. There is a numpy release PR in the feedstock, maybe that will fix it?

which makes manually removing paths unnecessary
@keewis
Collaborator Author

keewis commented Sep 10, 2020

@max-sixty, I removed the call to os.remove, instead every doctest gets its own temporary directory and by changing to that directory first, any file will be written into it. This seems a bit wasteful because not every doctest needs a temporary directory (actually, the vast majority doesn't), but I couldn't find an option to enable that only for certain tests. The doctests still complete within a few seconds, though.
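The same idea can be illustrated without pytest. The sketch below is a standard library analogue of switching into a per-test temporary directory, not the fixture used in this PR; `in_temporary_directory` and the file name are made up for the example.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def in_temporary_directory():
    """Change into a fresh temporary directory; restore the old one on exit."""
    previous = os.getcwd()
    with tempfile.TemporaryDirectory() as path:
        os.chdir(path)
        try:
            yield path
        finally:
            # Leave the directory before it is deleted.
            os.chdir(previous)


with in_temporary_directory() as path:
    # A relative path, as an example that writes a file would use.
    with open("example.nc", "w") as f:
        f.write("placeholder")
    written = os.path.exists(os.path.join(path, "example.nc"))

# The file landed in the temporary directory, which is gone afterwards.
print(written, os.path.exists(path))
```

Because the directory is removed when the block exits, nothing written through a relative path is left behind, so no explicit `os.remove` is needed.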

@@ -36,3 +36,6 @@ def add_standard_imports(doctest_namespace):

    # always seed numpy.random to make the examples deterministic
    np.random.seed(0)

    # always switch to the temporary directory, so files get written there
    tmpdir.chdir()
Collaborator

Perfect, nice idea

@max-sixty
Collaborator

Great! To the extent the RTD build failure is not from this branch, shall we merge this now?

@max-sixty
Collaborator

Great! To the extent the RTD build failure is not from this branch, shall we merge this now?

(i.e. feel free to merge!)

@keewis
Collaborator Author

keewis commented Sep 11, 2020

the failing hypothesis test looks unrelated, so I'm merging.

Edit: it seems RTD was fixed

@keewis keewis merged commit 23dc2fc into pydata:master Sep 11, 2020
@keewis keewis deleted the fix-doctests branch September 11, 2020 12:34
@max-sixty
Collaborator

Thanks a lot @keewis ! This is a big step forward!

@keewis
Collaborator Author

keewis commented Sep 11, 2020

should we add a CI that checks that we don't regress?

@dcherian
Contributor

Sounds good to me. Thanks for working on this.

@keewis
Collaborator Author

keewis commented Sep 11, 2020

okay, then let's wait for #4409 which should fix the remaining three doctest failures.

@max-sixty
Collaborator

For sure re CI!

Development

Successfully merging this pull request may close these issues.

Should we run tests on docstrings?
3 participants