Merge pull request #461 from JaxGaussianProcesses/fix-doc-build
Update docs build process
thomaspinder authored Aug 15, 2024
2 parents dac4553 + e38d5bb commit 0d21a31
Showing 54 changed files with 218 additions and 170 deletions.
3 changes: 1 addition & 2 deletions .github/workflows/build_docs.yml
@@ -47,7 +47,7 @@ jobs:
- name: Install and configure Poetry
uses: snok/install-poetry@v1
with:
version: 1.2.2
version: 1.5.1
virtualenvs-create: false
virtualenvs-in-project: false
installer-parallel: true
@@ -59,7 +59,6 @@ jobs:
- name: Build the documentation with MKDocs
run: |
cp docs/examples/gpjax.mplstyle .
poetry install --all-extras --with docs
conda install pandoc
poetry run mkdocs build
2 changes: 1 addition & 1 deletion .github/workflows/integration.yml
@@ -29,7 +29,7 @@ jobs:
- name: Install Poetry
uses: snok/[email protected]
with:
version: 1.4.0
version: 1.5.1

# Configure Poetry to use the virtual environment in the project
- name: Setup Poetry
5 changes: 2 additions & 3 deletions .github/workflows/test_docs.yml
@@ -51,14 +51,13 @@ jobs:
- name: Install and configure Poetry
uses: snok/install-poetry@v1
with:
version: 1.2.2
version: 1.5.1
virtualenvs-create: false
virtualenvs-in-project: false
installer-parallel: true

- name: Build the documentation with MKDocs
run: |
cp docs/examples/gpjax.mplstyle .
poetry install --all-extras --with docs
conda install pandoc
poetry run mkdocs build
poetry run python docs/scripts/gen_examples.py && poetry run mkdocs build
11 changes: 7 additions & 4 deletions .github/workflows/tests.yml
@@ -26,10 +26,13 @@ jobs:
python-version: ${{ matrix.python-version }}

# Install Poetry
- name: Install Poetry
uses: snok/install-poetry@v1.3.3
- name: Install and configure Poetry
uses: snok/install-poetry@v1
with:
version: 1.4.0
version: 1.5.1
virtualenvs-create: false
virtualenvs-in-project: false
installer-parallel: true

# Configure Poetry to use the virtual environment in the project
- name: Setup Poetry
@@ -39,7 +42,7 @@ jobs:
# Install the dependencies
- name: Install Package
run: |
poetry install --with tests
poetry install --with dev
- name: Check docstrings
run: |
2 changes: 1 addition & 1 deletion .gitignore
@@ -152,4 +152,4 @@ package-lock.json
node_modules/

docs/api
docs/examples/*.md
docs/_examples
6 changes: 3 additions & 3 deletions docs/index.md
@@ -4,7 +4,7 @@ GPJax is a didactic Gaussian process (GP) library in JAX, supporting GPU
acceleration and just-in-time compilation. We seek to provide a flexible
API to enable researchers to rapidly prototype and develop new ideas.

![Gaussian process posterior.](./_static/GP.svg)
![Gaussian process posterior.](static/GP.svg)


## "Hello, GP!"
@@ -40,15 +40,15 @@ would write on paper, as shown below.

!!! Install

GPJax can be installed via pip. See our [installation guide](https://docs.jaxgaussianprocesses.com/installation/) for further details.
GPJax can be installed via pip. See our [installation guide](installation.md) for further details.

```bash
pip install gpjax
```

!!! New

New to GPs? Then why not check out our [introductory notebook](https://docs.jaxgaussianprocesses.com/examples/intro_to_gps/) that starts from Bayes' theorem and univariate Gaussian distributions.
New to GPs? Then why not check out our [introductory notebook](_examples/intro_to_gps.md) that starts from Bayes' theorem and univariate Gaussian distributions.

!!! Begin

98 changes: 87 additions & 11 deletions docs/scripts/gen_examples.py
@@ -1,20 +1,96 @@
""" Convert python files in "examples" directory to markdown files using jupytext and nbconvert.
There's only a minor inconvenience with how supporting files are handled by nbconvert,
see https://github.com/jupyter/nbconvert/issues/1164. But these will be under a private
directory `_examples` in the docs folder, so it's not a big deal.
"""
from argparse import ArgumentParser
from pathlib import Path
import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed
import shutil

EXCLUDE = ["utils.py"]

EXECUTE = False
EXCLUDE = ["docs/examples/utils.py"]
ALLOW_ERRORS = False

def process_file(file: Path, out_file: Path | None = None, execute: bool = False):
"""Converts a python file to markdown using jupytext and nbconvert."""

for file in Path("docs/").glob("examples/*.py"):
if file.as_posix() in EXCLUDE:
continue
out_dir = out_file.parent
command = f"cd {out_dir.as_posix()} && "

out_file = file.with_suffix(".md")
out_file = out_file.relative_to(out_dir).as_posix()

command = "jupytext --to markdown "
command += f"{'--execute ' if EXECUTE else ''}"
command += f"{'--allow-errors ' if ALLOW_ERRORS else ''}"
command += f"{file} --output {out_file}"
if execute:
command += f"jupytext --to ipynb {file} --output - "
command += (
f"| jupyter nbconvert --to markdown --execute --stdin --output {out_file}"
)
else:
command = f"jupytext --to markdown {file} --output {out_file}"

subprocess.run(command, shell=True, check=False)


def is_modified(file: Path, out_file: Path):
"""Check if the output file is older than the input file."""
return out_file.exists() and out_file.stat().st_mtime < file.stat().st_mtime


def main(args):
# project root directory
wdir = Path(__file__).parents[2]

# output directory
out_dir: Path = args.outdir
out_dir.mkdir(exist_ok=True, parents=True)

# copy directories in "examples" to output directory
for dir in wdir.glob("examples/*"):
if dir.is_dir():
(out_dir / dir.name).mkdir(exist_ok=True, parents=True)
for file in dir.glob("*"):
# copy, not move!
shutil.copy(file, out_dir / dir.name / file.name)

# list of files to be processed
files = [f for f in wdir.glob("examples/*.py") if f.name not in EXCLUDE]

# process only modified files
if args.only_modified:
files = [f for f in files if is_modified(f, out_dir / f"{f.stem}.md")]

print(files)

# process files in parallel
with ThreadPoolExecutor(max_workers=args.max_workers) as executor:
futures = []
for file in files:
out_file = out_dir / f"{file.stem}.md"
futures.append(
executor.submit(
process_file, file, out_file=out_file, execute=args.execute
)
)

for future in as_completed(futures):
try:
future.result()
except Exception as e:
print(f"Error processing file: {e}")


if __name__ == "__main__":
project_root = Path(__file__).parents[2]

parser = ArgumentParser()
parser.add_argument("--max_workers", type=int, default=4)
parser.add_argument("--execute", action="store_true")
parser.add_argument("--only_modified", action="store_true")
parser.add_argument(
"--outdir", type=Path, default=project_root / "docs" / "_examples"
)
args = parser.parse_args()

main(args)
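
Note on `--only_modified`: as written in this diff, `is_modified` returns `False` when the output file does not exist yet, so a newly added example with no generated markdown would apparently be skipped on an incremental run. The following is a minimal sketch of a variant that also treats a missing output as stale. The name `needs_rebuild` is hypothetical and not part of this commit.

```python
from pathlib import Path


def needs_rebuild(src: Path, out: Path) -> bool:
    """Return True when `out` is missing or older than `src`.

    Hypothetical variant of `is_modified` from gen_examples.py:
    a missing output file counts as stale instead of being skipped.
    """
    if not out.exists():
        # Nothing has been generated yet, so the source must be processed.
        return True
    # Otherwise compare modification times, as the original helper does.
    return out.stat().st_mtime < src.stat().st_mtime
```

Under this assumption the filter in `main` would read `files = [f for f in files if needs_rebuild(f, out_dir / f"{f.stem}.md")]`.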
4 changes: 3 additions & 1 deletion docs/scripts/sharp_bits_figure.py
@@ -20,7 +20,9 @@
import matplotlib as mpl
from matplotlib import patches

plt.style.use("../examples/gpjax.mplstyle")
plt.style.use(
"https://raw.githubusercontent.com/JaxGaussianProcesses/GPJax/main/docs/examples/gpjax.mplstyle"
)
cols = mpl.rcParams["axes.prop_cycle"].by_key()["color"]

# %%
16 changes: 7 additions & 9 deletions docs/sharp_bits.md
@@ -60,7 +60,7 @@ learning rate is greater is than 0.03, we would end up with a negative variance
We visualise this issue below where the red cross denotes the invalid lengthscale value
that would be obtained, were we to optimise in the unconstrained parameter space.

![](_static/step_size_figure.svg)
![](static/step_size_figure.svg)

A simple but impractical solution would be to use a tiny learning rate which would
reduce the possibility of stepping outside of the parameter's support. However, this
@@ -70,7 +70,7 @@ subspace of the real-line onto the entire real-line. Here, gradient updates are
applied in the unconstrained parameter space before transforming the value back to the
original support of the parameters. Such a transformation is known as a bijection.

![](_static/bijector_figure.svg)
![](static/bijector_figure.svg)

To help understand this, we show the effect of using a log-exp bijector in the above
figure. We have six points on the positive real line that range from 0.1 to 3 depicted
@@ -81,8 +81,7 @@ value, we apply the inverse of the bijector, which is the exponential function i
case. This gives us back the blue cross.

In GPJax, we supply bijective functions using [Tensorflow Probability](https://www.tensorflow.org/probability/api_docs/python/tfp/substrates/jax/bijectors).
In our [PyTrees doc](examples/pytrees.md) document, we detail how the user can define
their own bijectors and attach them to the parameter(s) of their model.


## Positive-definiteness

@@ -91,8 +90,7 @@ their own bijectors and attach them to the parameter(s) of their model.
### Why is positive-definiteness important?

The Gram matrix of a kernel, a concept that we explore more in our
[kernels notebook](examples/constructing_new_kernels.py) and our [PyTree notebook](examples/pytrees.md), is a
symmetric positive definite matrix. As such, we
[kernels notebook](_examples/constructing_new_kernels.md). As such, we
have a range of tools at our disposal to make subsequent operations on the covariance
matrix faster. One of these tools is the Cholesky factorisation that uniquely decomposes
any symmetric positive-definite matrix $\mathbf{\Sigma}$ by
@@ -158,7 +156,7 @@ for some problems, this amount may need to be increased.
## Slow-to-evaluate

Famously, a regular Gaussian process model (as detailed in
[our regression notebook](examples/regression.py)) will scale cubically in the number of data points.
[our regression notebook](_examples/regression.md)) will scale cubically in the number of data points.
Consequently, if you try to fit your Gaussian process model to a data set containing more
than several thousand data points, then you will likely incur a significant
computational overhead. In such cases, we recommend using Sparse Gaussian processes to
@@ -168,12 +166,12 @@ When the data contains less than around 50000 data points, we recommend using
the collapsed evidence lower bound objective [@titsias2009] to optimise the parameters
of your sparse Gaussian process model. Such a model will scale linearly in the number of
data points and quadratically in the number of inducing points. We demonstrate its use
in [our sparse regression notebook](examples/collapsed_vi.py).
in [our sparse regression notebook](_examples/collapsed_vi.md).

For data sets exceeding 50000 data points, even the sparse Gaussian process outlined
above will become computationally infeasible. In such cases, we recommend using the
uncollapsed evidence lower bound objective [@hensman2013gaussian] that allows stochastic
mini-batch optimisation of the parameters of your sparse Gaussian process model. Such a
model will scale linearly in the batch size and quadratically in the number of inducing
points. We demonstrate its use in
[our sparse stochastic variational inference notebook](examples/uncollapsed_vi.py).
[our sparse stochastic variational inference notebook](_examples/uncollapsed_vi.md).
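
To make the bijection passage in `docs/sharp_bits.md` concrete, here is a minimal sketch of a gradient step taken through a log-exp bijector. It uses plain Python rather than the TensorFlow Probability bijectors that GPJax supplies, and the parameter value, gradient, and learning rate are illustrative assumptions, not values from the documentation.

```python
import math


def to_unconstrained(x: float) -> float:
    """Forward map of a log-exp bijector: positive reals -> all reals."""
    return math.log(x)


def to_constrained(z: float) -> float:
    """Inverse map: all reals -> positive reals."""
    return math.exp(z)


def naive_step(x: float, grad_x: float, lr: float) -> float:
    """Gradient step taken directly on the constrained parameter."""
    return x - lr * grad_x


def bijected_step(x: float, grad_x: float, lr: float) -> float:
    """Gradient step taken in the unconstrained space, then mapped back."""
    z = to_unconstrained(x)
    # Chain rule: dL/dz = dL/dx * dx/dz, and dx/dz = exp(z) = x.
    grad_z = grad_x * x
    return to_constrained(z - lr * grad_z)


# Illustrative values only (assumptions for this sketch).
value, grad, lr = 0.02, 1.0, 0.05

naive = naive_step(value, grad, lr)    # leaves the positive real line
safe = bijected_step(value, grad, lr)  # stays positive by construction
```

Because the exponential is positive for every real input, `bijected_step` cannot return an invalid value regardless of the learning rate, whereas `naive_step` yields a negative result for these inputs.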
File renamed without changes.
File renamed without changes
File renamed without changes
@@ -1,3 +1,3 @@
nav .bd-links a:hover{
color: #B5121B
}
}
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes
File renamed without changes.
File renamed without changes
File renamed without changes
File renamed without changes.
File renamed without changes
File renamed without changes
9 changes: 9 additions & 0 deletions docs/stylesheets/extra.css
@@ -88,7 +88,16 @@ div.doc-contents:not(.first) {
user-select: none;
}

/* Centers all PNG images in markdown files */
img[src$=".png"] {
display: block;
margin-left: auto;
margin-right: auto;
}

/* Maximum space for text block */
/* .md-grid {
max-width: 65%; /* or 100%, if you want to stretch to full-width */
/* }


3 changes: 3 additions & 0 deletions docs/examples/barycentres.py → examples/barycentres.py
@@ -50,9 +50,12 @@


key = jr.key(123)

# set the default style for plotting
plt.style.use(
"https://raw.githubusercontent.com/JaxGaussianProcesses/GPJax/main/docs/examples/gpjax.mplstyle"
)

cols = plt.rcParams["axes.prop_cycle"].by_key()["color"]

# %% [markdown]
File renamed without changes
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
3 changes: 0 additions & 3 deletions docs/examples/gpjax.mplstyle → examples/gpjax.mplstyle
@@ -14,10 +14,7 @@ axes.axisbelow: true

### Fonts
mathtext.fontset: cm
font.family: serif
font.serif: Computer Modern Roman
font.size: 10
text.usetex: True

# Axes ticks
ytick.left: True
File renamed without changes.
File renamed without changes.
File renamed without changes
File renamed without changes.
@@ -25,12 +25,12 @@
# In this section we'll provide a short introduction to likelihoods and why they are
# important. For users who are already familiar with likelihoods, feel free to skip to
# the next section, and for users who would like more information than is provided
# here, please see our [introduction to Gaussian processes notebook](intro_to_gps.py).
# here, please see our [introduction to Gaussian processes notebook](intro_to_gps.md).
#
# ### What is a likelihood?
#
# We adopt the notation of our
# [introduction to Gaussian processes notebook](intro_to_gps.py) where we have a
# [introduction to Gaussian processes notebook](intro_to_gps.md) where we have a
# Gaussian process (GP) $f(\cdot)\sim\mathcal{GP}(m(\cdot), k(\cdot, \cdot))$ and a
# dataset $\mathbf{y} = \{y_n\}_{n=1}^N$ observed at corresponding inputs
# $\mathbf{x} = \{x_n\}_{n=1}^N$. The evaluation of $f$ at $\mathbf{x}$ is denoted by
@@ -128,9 +128,7 @@
gpx.likelihoods.Gaussian(num_datapoints=D.n, obs_stddev=0.5)

# %% [markdown]
# To control other properties of the observation noise such as trainability and value
# constraints, see our [PyTree guide](pytrees.md).
#

# ### Prediction
#
# The `predict` method of a likelihood object transforms the latent distribution of
Expand Down Expand Up @@ -224,7 +222,7 @@
#
# The final method that is associated with a likelihood function in GPJax is the
# expected log-likelihood. This term is evaluated in the
# [stochastic variational Gaussian process](uncollapsed_vi.py) in the ELBO term. For a
# [stochastic variational Gaussian process](uncollapsed_vi.md) in the ELBO term. For a
# variational approximation $q(f)= \mathcal{N}(f\mid m, S)$, the ELBO can be written as
# $$
# \begin{align}
File renamed without changes.
File renamed without changes.