Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keep the documentation in sync with the code #130

Closed
malmans2 opened this issue Dec 14, 2023 · 3 comments
Closed

Keep the documentation in sync with the code #130

malmans2 opened this issue Dec 14, 2023 · 3 comments
Assignees

Comments

@malmans2
Copy link
Member

malmans2 commented Dec 14, 2023

Hi there,

Great package and awesome documentation!

I have a small suggestion.
I often find hard to keep the documentation in sync with the code. I suggest to store the simple notebooks with no output (e.g., you could add nbstripout to your pre-commits) and make readthedocs executing them.
I imagine the example notebooks are too heavy to do that, so I'm referring to the simple notebooks such as quickstart.

Similarly, I would use doctests in the README rather than simple snippets. If you go that way, you just need to tell pytest to check the doctests in README.md, and you are sure that the README is in sync with the code.
See: https://docs.pytest.org/en/7.3.x/how-to/doctest.html#how-to-run-doctests

JOSS review: openjournals/joss-reviews#6060

@nicrie
Copy link
Contributor

nicrie commented Dec 17, 2023

Hey, thanks a lot for the kind words @malmans2 , and sorry for the late reply, these past few days have been pretty hectic.

I really like the idea of automatically synchronizing the documentation and the code! I've tried to integrate both suggestions, but I'm stuck with the implementation.

Notebook & nbstripout
[to the PR]
Cleaning the notebook output with nbstripout using a pre-commit hook works. However, I don't understand how to instruct ReadTheDocs to execute the notebook itself. It seems that ReadTheDocs doesn't do this automatically and thus fails when building the documentation. @malmans2 do you have any experience with this or know the relevant part in the ReadTheDocs documentation?

README & doctest
[to the PR]
I've started rewriting the README for doctest. Strangely, the examples for Varimax Rotation fail$^{**}$ with doctest, even though pytest shows no problem. I honestly can't make heads or tails of the error message. The origin seems to be the serialize method, here's the relevant part of the doctest error message:

Error Message
=================================== FAILURES ===================================
_____________________________ [doctest] README.md ______________________________
072 >>> scores = eof.scores()  # PCs (temporal patterns)
073 
074 ```
075 
076 **Varimax-rotated EOF analysis**
077 Initiate and fit an `EOFRotator` class to the model to obtain a varimax-rotated EOF analysis
078 
079 ```python
080 >>> rotator = xe.models.EOFRotator(n_modes=3)
081 >>> rotator.fit(eof) # doctest: +ELLIPSIS
UNEXPECTED EXCEPTION: KeyError('Could not find node at mean_')
Traceback (most recent call last):
File "/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/doctest.py", line 1350, in __run
  exec(compile(example.source, filename, "single",
File "<doctest README.md[10]>", line 1, in <module>
File "/home/runner/work/xeofs/xeofs/xeofs/models/eof_rotator.py", line 116, in fit
  self.compute()
File "/home/runner/work/xeofs/xeofs/xeofs/models/_base_model.py", line 380, in compute
  self._deserialize_attrs(dt)
File "/home/runner/work/xeofs/xeofs/xeofs/models/_base_model.py", line 456, in _deserialize_attrs
  deserialized_obj = getattr(self, key).deserialize(dt[key])
File "/home/runner/work/xeofs/xeofs/xeofs/preprocessing/preprocessor.py", line 412, in deserialize
  deserialized = preprocessor.transformer_types()[name].deserialize(
File "/home/runner/work/xeofs/xeofs/xeofs/preprocessing/transformer.py", line 162, in deserialize
  return cls._deserialize(dt)
File "/home/runner/work/xeofs/xeofs/xeofs/preprocessing/transformer.py", line 177, in _deserialize
  data = transformer._deserialize_data_node(key, dt[key])
File "/home/runner/work/xeofs/xeofs/.venv/lib/python3.10/site-packages/datatree/datatree.py", line 876, in __getitem__
  return self._get_item(path)
File "/home/runner/work/xeofs/xeofs/.venv/lib/python3.10/site-packages/datatree/treenode.py", line 448, in _get_item
  raise KeyError(f"Could not find node at {path}")
KeyError: 'Could not find node at mean_'
/home/runner/work/xeofs/xeofs/README.md:81: UnexpectedException

@slevang, as a budding xarray-datatree expert, any idea why doctest is hitting a snag here, even though pytest runs through without a hitch?

Alternatively, I think we could also omit the Quickstart in the README and directly refer to Documentation > Quickstart, since both are essentially redundant.


$^{**}$ Additionally, pytest raises a lot of warnings which I think originate from the test modules in xeofs themselves, not the actual code.

@slevang
Copy link
Contributor

slevang commented Dec 17, 2023

Looks like the whole serialize/deserialize pathway doesn't quite work with Datasets. That's an issue since .compute() now relies on temporarily converting to a DataTree to compute everything at once. I'll try to look at getting this properly supported in the next few days. In the meantime your example works fine on the DataArray:

t2m = xr.tutorial.open_dataset("air_temperature")
# fails
xe.models.EOF(n_modes=10).fit(t2m, "time").compute()
# succeeds
xe.models.EOF(n_modes=10).fit(t2m.air, "time").compute()

@nicrie
Copy link
Contributor

nicrie commented Dec 18, 2023

Thanks again for the quick fix @slevang.
Closing, fixed by #135 and #136

@nicrie nicrie closed this as completed Dec 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants