Papyri

Papyri is a set of tools to build, publish (future functionality - to be done), install and render documentation within IPython and Jupyter.



Papyri allows:

  • bidirectional crosslinking across libraries,
  • navigation,
  • proper reflow of user docstring text,
  • proper handling of inline images (when rendered to HTML),
  • proper math rendering (both in the terminal and in HTML),
  • and more.

Motivation

See some of the reasons behind the project in this blog post.

The key motivation is to provide a set of tools for building better documentation for Python projects:

  • Use an opinionated implementation to enable a better understanding of the structure of your project.
  • Allow automatic cross-links (back and forth) between documentation across Python packages.
  • Use a documentation IR (intermediate representation) to separate building the docs from rendering them in many contexts.

This approach should hopefully allow a conda-forge-like model, where projects upload their IR to a given repository, and a single website (without subdomains) hosts the documentation for multiple projects. The documentation pages can then be built with better cross-links between projects and efficient page rebuilds.

This should also allow displaying user-facing documentation on non-HTML backends (think terminal), or providing documentation in an IDE (Spyder/JupyterLab) without having to iframe it.

Overview Presentation

See also this short presentation given at the CZI EOSS4 meeting in early November 2021.

Screenshots

Navigating astropy's documentation from within IPython. Note that this includes forward references but also backward references (i.e., which pages link to the current page).

Type inference and keyboard navigation in the terminal: directives are properly rendered in the terminal, examples are type-inferred, clicking (or pressing Enter) on highlighted tokens opens said page, and Backspace navigates back.

Since Jupyter Notebook and JupyterLab pages can render HTML, it should be possible to have inline graphs and images when using Jupyter's inline help (to be implemented). In terminals, we replace inline images with a button/link to open them in an external viewer (Quick Look, Evince, Paint, ...).

Papyri has complete information about which pages link to other pages; this allows us to create a local graph of which pages mention each other to find related topics.

Below, you can see the local connectivity graph for numpy.zeros (d3js, draggable, clickable). numpy.zeros links to (or is linked from) all dots present there. In green, we show other numpy functions; in blue, skimage functions; in orange, scipy functions; in red, xarray functions. Arrows between dots indicate pages which link to each other (for example, ndarray is linked from xarray.cos), and dot size represents the popularity of a page.

Math expressions are properly rendered even in the terminal: here, polyfit is shown in IPython with Papyri enabled (left) and disabled (right).

Installation (not fully functional):

Some functionality is not yet available when installing from PyPI. For now you need a dev-install (see next section) to access all features.

You'll need Python 3.8 or newer, otherwise pip will tell you it can't find any matching distribution.

Pip install from PyPI:

$ pip install papyri

Install a given package's documentation:

$ papyri install package_name [package_name [package_name [...]]]

Only numpy 1.20.0, scipy 1.5.0 and xarray 0.17.0 are currently installable and published. For other packages you will need to build the documentation bundles locally, which is a much more involved process.
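
For instance, a minimal sketch of installing the bundles for the currently published packages (exact availability may change between Papyri versions):

$ papyri install numpy scipy xarray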

Run IPython terminal with Papyri as an extension:

$ ipython --ext papyri.ipython

This will augment the ? operator to show better documentation (when the relevant documentation bundles have been installed with papyri install ...).
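
As a rough sketch of what this looks like in practice (the session below is illustrative, not a verbatim transcript, and assumes numpy's documentation bundle has already been installed or ingested):

$ ipython --ext papyri.ipython

In [1]: import numpy as np

In [2]: np.linspace?
# Instead of the plain-text docstring, the enhanced pager shows the
# rendered documentation, with crosslinks and properly formatted math.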

Papyri does not completely build its own docs yet, but you might be able to view a static rendering of them here. It is not yet automatically built, so it might be out of date.

Development install

You may need to get a modified version of numpydoc depending on the stage of development. You will need pip > 21.3 if you want to make editable installs.

# clone this repo
# cd this repo
pip install -e .

Some functionality requires tree_sitter_rst. To build the TreeSitter rst parser:

$ git submodule update --init
$ papyri build-parser

Look at the CI configuration file if these instructions are out of date.

Note that Papyri still uses a custom parser, which will be removed in the future in favor of relying mostly on TreeSitter.

Testing

Install extra development dependencies by running:

$ pip install -r requirements-dev.txt

Run the tests using:

$ pytest

Usage

In the end, there should be roughly three steps:

  • IR generation (package maintainers)
  • IR installation (end user or via pip/conda)
  • IR rendering (usually IDE, CLI/webserver)

IR Generation

This is the step you want to trigger if you are building documentation with Papyri for a library you maintain. As an end user, you will most likely not have to run this step and can instead install pre-published documentation bundles. This step is likely to occur only once per new release of a project.

Look at the TOML files in the examples directory; these provide example configurations for some existing libraries.

$ ls -1 examples/*.toml
examples/IPython.toml
examples/astropy.toml
examples/dask.toml
examples/matplotlib.toml
examples/numpy.toml
examples/papyri.toml
examples/scipy.toml
examples/skimage.toml

Right now these files live in the papyri repository, but they would likely move to the relevant projects' repositories (e.g. as docs/papyri.toml) later on.

Generation is slow on the full numpy/scipy documentation; use --no-infer (see below) for a lower-quality but faster build.

Use papyri gen <path to example file>, for example:

$ papyri gen examples/numpy.toml
$ papyri gen examples/scipy.toml

This will create intermediate documentation files in ~/.papyri/data/<library_name>_<library_version>.
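
For instance, after generating numpy's documentation you might see something along these lines (the folder name is illustrative and depends on the version you generated against):

$ ls ~/.papyri/data/
numpy_1.20.0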

Installation/ingestion

The installation/ingestion of documentation bundles is the step that makes all bundles "aware" of each other, and allows crosslinking/indexing to work.

We'll reserve the terms "install" and "installation" for when you download pre-built documentation bundles from an external source and give only the package name – which is not completely implemented yet.

You can ingest local folders with the following command:

$ papyri ingest ~/.papyri/data/<path to folder generated at previous step>

This will crosslink the newly generated folder with the existing ones. Ingested data can be found in ~/.papyri/ingest/, but you are not supposed to interact with this folder using tools external to Papyri.
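
Continuing the generation example above, and assuming numpy 1.20.0 was the version generated:

$ papyri ingest ~/.papyri/data/numpy_1.20.0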

There are currently a couple of pre-built documentation bundles that can be pre-installed, but they are likely to break with each new version of Papyri. We suggest you use the developer installation and ingestion procedure for now.

Rendering

The last step of the papyri pipeline is to render the docs, or the subset that is of interest to you. This will likely be done by your favorite IDE, probably just in time when you explore documentation. Nonetheless, we've implemented a couple of external renderers to help debug issues.

WARNING:

Many rendering methods currently require Papyri's own docs to be built and ingested first:

$ papyri gen examples/papyri.toml
$ papyri ingest ~/.papyri/data/papyri_0.0.7  # or any current version

Or you can try to install a pre-built (possibly outdated) Papyri doc bundle:

$ papyri install papyri

Standalone HTML rendering

$ papyri render  # render all the HTML pages statically into ~/.papyri/html
$ papyri serve-static  # start an http.server with the proper root to serve the files above
$ papyri serve  # start a server that renders the pages on the fly (nice for debugging or iterating on themes and rendering)

ASCII terminal rendering (experimental)

$ papyri ascii <fully qualified name>  # try to render in the terminal.

For example,

$ papyri ascii numpy.linspace

The papyri browse command uses urwid to provide a browsable interface in the terminal:

$ papyri browse <fully qualified name> # urwid documentation browser.
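
For example, to browse numpy.linspace interactively (assuming its documentation has been generated and ingested):

$ papyri browse numpy.linspace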

Hacking on scraping libraries

papyri gen --no-infer [...] will skip type inference of examples. The --exec option needs to be passed to try to execute examples.
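
As a rough sketch (assuming these flags are combined with a config path as shown earlier; exact behavior may differ between versions):

$ papyri gen --no-infer examples/numpy.toml   # skip type inference of examples
$ papyri gen --exec examples/numpy.toml       # also try to execute examples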

Papyri - Name's meaning

See the legendary Villa of the Papyri, which gets its name from its collection of many papyrus scrolls.

Legacy (MISC/OLD) documentation (Inaccurate):

Generation (papyri gen)

Collects the documentation of a project into a DocBundle -- a number of DocBlobs (currently JSON files) with a defined semantic structure, and some metadata (the version of the project this documentation refers to, and potentially some other blobs).

During generation, a number of normalisation and inference steps can and should happen, for example:

  • Running type inference on the Examples sections of docstrings and storing the results as (token, reference) pairs, so that you can later decide that clicking on np.array in an example brings you to the numpy array documentation, whether or not you are currently in the numpy docs.
  • Parsing "See Also" sections into a well-defined structure.
  • Running examples to generate images for docs with figures (not implemented).
  • Resolving package-local references: for example, when building the numpy docs, "zeros_like" is unambiguous and should be normalized to "numpy.zeros_like"; similarly, ~.pyplot.histogram is normalized to matplotlib.pyplot.histogram as the target with histogram as the text, etc.

The generation step is likely project-specific, as there might be per-project import conventions that should not need to be repeated (import pandas as pd, for example).

Ingestion (papyri ingest)

The ingestion step takes a DocBundle and/or DocBlobs and adds them into a graph of known items; ingestion is critical to efficiently build the collection's graph metadata and understand which items refer to which. This allows the following:

  • Update the list of backreferences to a DocBundle
  • Update forward references metadata to know whether links are valid.

Currently, ingestion loads everything in memory and updates all the bundles in place, but this can likely be done more efficiently.

A lot more can likely be done at larger scale, like detecting whether documentation has changed since a previous version, in order to infer for which versions of a library a given piece of documentation is valid.

There is also likely some curating that might need to be done at that point; for example, numpy.array has an extremely large number of back-references.

Tree-sitter info:

https://tree-sitter.github.io/tree-sitter/creating-parsers

When things don't work!

SqlOperationalError:

  • The DB schema has likely changed; try rm -rf ~/.papyri/ingest/.

Can't build tree-sitter:

If an error occurred trying to build tree-sitter with clang, you likely have a conda environment. Install all the compilers in the current conda env:

conda install compilers
