-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
0 parents
commit 06cfadc
Showing
34 changed files
with
8,203 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
.git | ||
venv |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
.cook | ||
.ipynb_checkpoints | ||
*.hpp | ||
*.html | ||
*.ipynb | ||
*.o | ||
*.pkl | ||
*.tmp | ||
__pycache__ | ||
padding/exact | ||
padding/padded | ||
profile/fourier_centered | ||
profile/fourier_non_centered | ||
profile/graph_centered | ||
profile/graph_non_centered | ||
profile/standard_centered | ||
profile/standard_non_centered | ||
trees/trees | ||
tube/tube | ||
venv |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
FROM python:3.10 | ||
WORKDIR /workdir | ||
COPY requirements.txt . | ||
RUN pip install --no-cache-dir -r requirements.txt | ||
RUN python -m cmdstanpy.install_cmdstan --verbose --version 2.33.0 | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
# Reproduction materials for "Scalable Gaussian Process Inference with Stan" | ||
|
||
This repository comprises code and data to reproduce the results and figures in the accompanying manuscript [Scalable Gaussian Process Inference with Stan](https://doi.org/10.48550/arXiv.2301.08836). | ||
|
||
## Setting up a computing environment | ||
|
||
A computing environment can either be set up directly on your machine by following the [steps below](#direct-installation), or you can reproduce the results in a [Docker container](https://en.wikipedia.org/wiki/Docker_(software)) (see [the following section](#containerized-runtime)). | ||
|
||
### Direct installation | ||
|
||
1. Ensure a recent version of Python is installed (>= 3.8). The code was tested with Python 3.8, 3.9, 3.10, and 3.11 on Ubuntu (see [here](https://github.com/onnela-lab/gptools/actions/workflows/main.yml) for details) and 3.10 on macOS. | ||
2. Install the Python requirements by running `pip install -r requirements.txt` from the root directory. | ||
3. Install `cmdstan` by running `python -m cmdstanpy.install_cmdstan --verbose --version=2.33.0`. Other versions of `cmdstan` may also work but have not been tested. | ||
4. Verify the installation by running `python -m gptools.stan` which should print `The include path is ...`. | ||
|
||
### Containerized runtime | ||
|
||
To simplify reproduction, we also provide a `Dockerfile` to build a containerized runtime environment. | ||
|
||
1. Ensure Docker is installed. Docker Desktop 4.22.1 (118664) with Docker Engine 24.0.5 was used to test the following steps. | ||
2. Run `docker build -t gptools .` to build the Docker image. | ||
3. Verify the installation by running `./in-docker.sh python -m gptools.stan` which should print `The include path is ...`. | ||
|
||
For any of the subsequent commands, they can be executed by either running `[the command]` directly or `./in-docker.sh [the command]` to use the containerized runtime. In the following `[./in-docker.sh]` indicates the optional prefix to run commands in a container. | ||
|
||
### Removing compiled Stan programs | ||
|
||
If you switch between containerized and local runtime, you may need to remove compiled Stan programs because binaries are not interoperable. Run `[./in-docker.sh] cook exec rm-compiled` to do so. | ||
|
||
## Running the experiments | ||
|
||
Figures in the manuscript were generated using the containerized runtime, and all runtime estimates below are based on a 2020 Macbook Pro with M1 chip and 16 GB of memory running macOS 13.4 (22F66). All figures can be reproduced by running `[./in-docker.sh] cook exec figures` (see below for details). | ||
|
||
### Applications | ||
|
||
The folders `trees` and `tube` contain code and data to reproduce the two applied examples in the manuscript. Each example takes about five minutes to run. The figures can be reproduced by running `[./in-docker.sh] cook exec tube:fig trees:fig` and will be saved in the corresponding folder as png and pdf files. | ||
|
||
### Profiling experiments | ||
|
||
The folder `profile` contains code to reproduce profiling experiments, and running all experiments can take up to ten hours. The profiling figure can be reproduced by running `[./in-docker.sh] cook exec profile:fig`. If a reduced runtime (but more noisy results) are desired, run `FAST=true cook exec profile:fig` which takes about 90 minutes. | ||
|
||
All experiments are seeded for reproducibility, but profiling experiments are subject to variability due to different hardware and competing processes running on the same machine. Despite seeding, results may also vary depending on the operating system and stdlib implementation. | ||
|
||
### Kernel properties and effect of padding | ||
|
||
The folders `kernels` and `padding` contain code to reproduce figures on the properties of different kernels and their spectral properties as well as the effect of padding on Fourier methods, respectively. The figures can be reproduced by running `[./in-docker.sh] cook exec kernels:fig padding:fig`; the latter takes about ten minutes to generate. | ||
|
||
## Expected results | ||
|
||
- `[./in-docker.sh] cook exec kernels:fig` ![](kernels/kernels.png) | ||
- `[./in-docker.sh] cook exec padding:fig` ![](padding/padding.png) | ||
- `[./in-docker.sh] cook exec profile:fig` ![](profile/profile.png) | ||
- `[./in-docker.sh] cook exec trees:fig` ![](trees/trees.png) | ||
- `[./in-docker.sh] cook exec tube:fig` ![](tube/tube.png) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
docker run --rm -it -e FAST -v `pwd`:/workdir gptools $@ |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
figure.figsize : 6.1, 4.5 | ||
figure.dpi : 144 | ||
font.size : 10 | ||
legend.fontsize: small | ||
image.origin: lower | ||
animation.html: html5 |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
--- | ||
jupytext: | ||
text_representation: | ||
extension: .md | ||
format_name: myst | ||
format_version: 0.13 | ||
jupytext_version: 1.14.4 | ||
kernelspec: | ||
display_name: Python 3 (ipykernel) | ||
language: python | ||
name: python3 | ||
--- | ||
|
||
```{code-cell} ipython3 | ||
from gptools.util.kernels import ExpQuadKernel, MaternKernel | ||
from gptools.util.fft.fft1 import transform_irfft, evaluate_rfft_scale | ||
import matplotlib as mpl | ||
from matplotlib import pyplot as plt | ||
import numpy as np | ||
from pathlib import Path | ||
mpl.style.use("../jss.mplstyle") | ||
``` | ||
|
||
```{code-cell} ipython3 | ||
np.random.seed(9) # Seed picked for good legend positioning. Works for any though. | ||
fig, axes = plt.subplots(2, 2) | ||
length_scale = 0.2 | ||
kernels = { | ||
"squared exp.": lambda period: ExpQuadKernel(1, length_scale, period), | ||
"Matern ³⁄₂": lambda period: MaternKernel(1.5, 1, length_scale, period), | ||
} | ||
x = np.linspace(0, 1, 101, endpoint=False) | ||
z = np.random.normal(0, 1, x.size) | ||
for ax, (key, kernel) in zip(axes[1], kernels.items()): | ||
value = kernel(None).evaluate(0, x[:, None]) | ||
line, = axes[0, 0].plot(x, value, ls="--") | ||
rfft = kernel(1).evaluate_rfft([x.size]) | ||
value = np.fft.irfft(rfft, x.size) | ||
axes[0, 1].plot(rfft, label=key) | ||
axes[0, 0].plot(x, value, color=line.get_color()) | ||
for maxf, ls in [(x.size // 2 + 1, "-"), (5, "--"), (3, ":")]: | ||
rfft_scale = evaluate_rfft_scale(cov_rfft=rfft, size=x.size) | ||
rfft_scale[maxf:] = 0 | ||
f = transform_irfft(z, np.zeros_like(z), rfft_scale=rfft_scale) | ||
ax.plot(x, f, ls=ls, color=line.get_color(), label=fr"$\xi_\max={maxf}$") | ||
ax.set_xlabel("position $x$") | ||
ax.set_ylabel(f"{key} GP $f$") | ||
ax = axes[0, 0] | ||
ax.set_ylabel("kernel $k(0,x)$") | ||
ax.set_xlabel("position $x$") | ||
ax.legend([ | ||
mpl.lines.Line2D([], [], ls="--", color="gray"), | ||
mpl.lines.Line2D([], [], ls="-", color="gray"), | ||
], ["standard", "periodic"], fontsize="small") | ||
ax.text(0.05, 0.05, "(a)", transform=ax.transAxes) | ||
ax.yaxis.set_ticks([0, 0.5, 1]) | ||
ax = axes[0, 1] | ||
ax.set_yscale("log") | ||
ax.set_ylim(1e-5, x.size) | ||
ax.set_xlabel(r"frequency $\xi$") | ||
ax.set_ylabel(r"Fourier kernel $\tilde k=\phi\left(k\right)$") | ||
ax.legend(fontsize="small", loc="center right") | ||
ax.text(0.95, 0.95, "(b)", transform=ax.transAxes, ha="right", va="top") | ||
ax = axes[1, 0] | ||
ax.legend(fontsize="small", loc="lower center") | ||
ax.text(0.95, 0.95, "(c)", transform=ax.transAxes, ha="right", va="top") | ||
ax = axes[1, 1] | ||
ax.legend(fontsize="small", loc="lower center") | ||
ax.sharey(axes[1, 0]) | ||
ax.text(0.95, 0.95, "(d)", transform=ax.transAxes, ha="right", va="top") | ||
for ax in [axes[0, 0], *axes[1]]: | ||
ax.xaxis.set_ticks([0, 0.5, 1]) | ||
fig.tight_layout() | ||
fig.savefig("kernels.pdf", bbox_inches="tight") | ||
fig.savefig("kernels.png", bbox_inches="tight") | ||
``` |
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
data { | ||
int num_observations, observe_first; | ||
int kernel; | ||
array [num_observations] real x; | ||
vector[num_observations] y; | ||
real<lower=0> kappa, length_scale, epsilon, sigma; | ||
} | ||
|
||
transformed data { | ||
cov_matrix[num_observations] cov; | ||
if (kernel == 0) { | ||
cov = gp_exp_quad_cov(x, sigma, length_scale); | ||
} else if (kernel == 1) { | ||
cov = gp_matern32_cov(x, sigma, length_scale); | ||
} else { | ||
reject("kernel=", kernel); | ||
} | ||
cov = add_diag(cov, epsilon); | ||
matrix[num_observations, num_observations] chol = cholesky_decompose(cov); | ||
} | ||
|
||
parameters { | ||
vector[num_observations] z; | ||
} | ||
|
||
transformed parameters { | ||
vector[num_observations] f = chol * z; | ||
} | ||
|
||
model { | ||
z ~ normal(0, 1); | ||
y[:observe_first] ~ normal(f[:observe_first], kappa); | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
functions { | ||
#include gptools/util.stan | ||
#include gptools/fft1.stan | ||
} | ||
|
||
data { | ||
int num_observations, padding, observe_first; | ||
int kernel; | ||
vector[num_observations] y; | ||
real<lower=0> kappa, length_scale, epsilon, sigma; | ||
} | ||
|
||
transformed data { | ||
int num = num_observations + padding; | ||
vector[num %/% 2 + 1] cov_rfft; | ||
if (kernel == 0) { | ||
cov_rfft = gp_periodic_exp_quad_cov_rfft(num, sigma, length_scale, num); | ||
} else if (kernel == 1) { | ||
cov_rfft = gp_periodic_matern_cov_rfft(1.5, num, sigma, length_scale, num); | ||
} | ||
cov_rfft += epsilon; | ||
} | ||
|
||
parameters { | ||
vector[num] z; | ||
} | ||
|
||
transformed parameters { | ||
vector[num] f = gp_inv_rfft(z, zeros_vector(num), cov_rfft); | ||
} | ||
|
||
model { | ||
z ~ normal(0, 1); | ||
y[:observe_first] ~ normal(f[:observe_first], kappa); | ||
} |
Oops, something went wrong.