Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: run doctest in CI for Python API examples #1840

Merged
merged 12 commits into from
Nov 24, 2023

Conversation

marijncv
Copy link
Contributor

@marijncv marijncv commented Nov 11, 2023

Description

Enables running of doctest for python examples in docstrings. Also updates the docstrings where needed to make the test pass.

Related Issue(s)

closes #1783

Documentation

@github-actions github-actions bot added binding/python Issues for the Python package documentation Improvements or additions to documentation labels Nov 11, 2023
Copy link

ACTION NEEDED

delta-rs follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

@marijncv marijncv changed the title Run doctest in CI for Python API examples ci: run doctest in CI for Python API examples Nov 11, 2023
Comment on lines 533 to 534
>>> write_deltalake('path/to/table', df, mode='overwrite') # doctest: +SKIP
>>> write_deltalake('path/to/table', df, mode='append') # doctest: +SKIP
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the library is largely about reading and write from disk, it seems like it might be worth the effort to make this work. I was able to do this in lancedb/lance by configuring all doc tests to run in a temporary directory. This is just a few lines.

https://github.com/lancedb/lance/blob/84e4ce55f869a833c42239bfd6ba6176aa5840d3/python/python/lance/conftest.py#L18

Although the drawback is this might break the other tests that rely on reading specific pre-written tables in the repo. 🤔 I think that's a tradeoff I'd be willing to make. Let me know if you disagree.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great suggestion! It seems to work well in commit (on my machine, not sure about CI machine yet)

@@ -225,16 +223,42 @@ specified by the table configuration ``delta.logRetentionDuration``.
To view the available history, use :meth:`DeltaTable.history`:

.. code-block:: python


>>> from pprint import pprint
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the usage of pprint btw. 👍

Copy link
Collaborator

@wjones127 wjones127 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW this doctest applies to the user guide docs, but I was hoping to also have them run the ones within the docstrings, such as:

```
from deltalake import DeltaTable, write_deltalake
import pyarrow as pa
data = pa.table({"x": [1, 2, 3], "y": [4, 5, 6]})
write_deltalake("tmp", data, partition_by=["x"])
dt = DeltaTable("tmp")
dt.get_add_actions().to_pandas()
```
```
path size_bytes modification_time data_change partition_values num_records null_count min max
0 x=2/0-91820cbf-f698-45fb-886d-5d5f5669530b-0.p... 565 1970-01-20 08:40:08.071 True {'x': 2} 1 {'y': 0} {'y': 5} {'y': 5}
1 x=3/0-91820cbf-f698-45fb-886d-5d5f5669530b-0.p... 565 1970-01-20 08:40:08.071 True {'x': 3} 1 {'y': 0} {'y': 6} {'y': 6}
2 x=1/0-91820cbf-f698-45fb-886d-5d5f5669530b-0.p... 565 1970-01-20 08:40:08.071 True {'x': 1} 1 {'y': 0} {'y': 4} {'y': 4}
```
```
dt.get_add_actions(flatten=True).to_pandas()
```
```
path size_bytes modification_time data_change partition.x num_records null_count.y min.y max.y
0 x=2/0-91820cbf-f698-45fb-886d-5d5f5669530b-0.p... 565 1970-01-20 08:40:08.071 True 2 1 0 5 5
1 x=3/0-91820cbf-f698-45fb-886d-5d5f5669530b-0.p... 565 1970-01-20 08:40:08.071 True 3 1 0 6 6
2 x=1/0-91820cbf-f698-45fb-886d-5d5f5669530b-0.p... 565 1970-01-20 08:40:08.071 True 1 1 0 4 4
```
"""

If you want to limit the scope of the PR to just the user guide for now, that is fine with me. We'll just not close that one issue just yet if that's the case.

@roeap
Copy link
Collaborator

roeap commented Nov 11, 2023

just wondering out loud, as I am not fully aware of the latest state. IIRC, we did merge some code migrating docs to mkdocs/markdown recently, albeit it not being released yet. Was the plan to deprecate the sphinx/rst docs once the switch is made? And would this work for the mkdocs docs as well?

@marijncv marijncv marked this pull request as draft November 13, 2023 07:43
@marijncv marijncv marked this pull request as ready for review November 13, 2023 20:10
@marijncv
Copy link
Contributor Author

BTW this doctest applies to the user guide docs, but I was hoping to also have them run the ones within the docstrings,

Enabled the docstring tests in the later commits. Although perhaps it actually makes more sense to scope this PR to only the docstrings and leave out the sphinx docs (as probably was the intention of #1783). Also given the comment by @roeap and the release of the mkdocs. It would make the conftest.py setup a lot simpler too. Curious what you think!

@roeap
Copy link
Collaborator

roeap commented Nov 19, 2023

Agreed, the sphinx docs by now are officially no longer updated - at least we now display a header that says so 😆.

@github-actions github-actions bot removed the documentation Improvements or additions to documentation label Nov 21, 2023
@roeap roeap enabled auto-merge (squash) November 24, 2023 18:54
@roeap roeap merged commit bf62a82 into delta-io:main Nov 24, 2023
24 checks passed
ion-elgreco pushed a commit to ion-elgreco/delta-rs that referenced this pull request Nov 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
binding/python Issues for the Python package
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Run doctest in CI for Python API examples
3 participants