Merge #145
145: Docsgp r=lm2612 a=lm2612

I've started on the GP emulator docs to include: 

1. Brief introduction to what a Gaussian process is (could definitely be expanded)
2. Implementation: choice of GPJL or SKLJL, how to optimize hyperparameters and how to predict at new points
3. Prediction Type: data or latent variable. This part needs more info.
4. Kernels: How to change kernel, I've started with examples for different kernels in GPJL, not yet added SKLJL options.
5. Learning the noise: how to add option to learn noise, again could be expanded.

Co-authored-by: lm2612 <[email protected]>
bors[bot] and lm2612 authored Jul 15, 2022
2 parents 245b264 + b6fb5b0 commit 3ba3749
Showing 3 changed files with 147 additions and 0 deletions.
1 change: 1 addition & 0 deletions docs/make.jl
@@ -25,6 +25,7 @@ pages = [
"Calibrate" => "calibrate.md",
"Emulate" => "emulate.md",
"Examples" => examples,
"Gaussian Process" => "GaussianProcessEmulator.md",
"Package Design" => design,
"API" => api,
"Glossary" => "glossary.md",
131 changes: 131 additions & 0 deletions docs/src/GaussianProcessEmulator.md
@@ -0,0 +1,131 @@
# Gaussian Process Emulator

One type of `MachineLearningTool` we provide for emulation is a Gaussian process.
Gaussian processes generalize the Gaussian probability distribution from random variables to functions.
They are well suited to statistical emulation because their predictions come with both a mean and a covariance.
To build a Gaussian process, we first define a prior over functions by choosing a covariance function, or kernel.
The kernel describes how similar two outputs `(y_i, y_j)` are expected to be, given the similarity of their inputs `(x_i, x_j)`.
Kernels encode the functional form of these relationships and are defined by hyperparameters, which are usually unknown to the user at the outset.
To obtain the posterior Gaussian process, we condition on data using Bayes' theorem and optimize the hyperparameters of the kernel.
We can then predict a mean function and covariance at new data points.
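Concretely, in the standard zero-mean Gaussian process regression setting with training inputs `X`, training outputs `y`, kernel `k`, and observational noise variance `σ²`, the posterior predictive distribution at a new input `x_*` is Gaussian, with mean and covariance given by (Rasmussen and Williams, 2006):
```math
\begin{aligned}
\mu(x_*) &= k(x_*, X)\left[k(X, X) + \sigma^2 I\right]^{-1} y, \\
\Sigma(x_*) &= k(x_*, x_*) - k(x_*, X)\left[k(X, X) + \sigma^2 I\right]^{-1} k(X, x_*).
\end{aligned}
```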

A useful resource to learn about Gaussian processes is [Rasmussen and Williams (2006)](http://gaussianprocess.org/gpml/).


# User Interface

`CalibrateEmulateSample.jl` allows the Gaussian process emulator to be built using
either [`GaussianProcesses.jl`](https://stor-i.github.io/GaussianProcesses.jl/latest/)
or [`ScikitLearn.jl`](https://scikitlearnjl.readthedocs.io/en/latest/models/#scikitlearn-models).
To use `GaussianProcesses.jl`, define the package type as
```julia
gppackage = Emulators.GPJL()
```

To use `ScikitLearn.jl`, define the package type as
```julia
gppackage = Emulators.SKLJL()
```


Initialize a basic Gaussian Process with
```julia
gauss_proc = GaussianProcess(gppackage)
```

This initializes the prior Gaussian process.
We train the Gaussian process by passing `gauss_proc`, together with the data, into the `Emulator` struct and optimizing the hyperparameters,
as described [here](https://clima.github.io/CalibrateEmulateSample.jl/dev/emulate/#Typical-construction-from-Lorenz_example.jl).
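For example, a minimal sketch of this workflow (assuming `input_output_pairs` holds the training input–output pairs, constructed as in the Emulate documentation):
```julia
using CalibrateEmulateSample.Emulators

gppackage = Emulators.GPJL()
gauss_proc = GaussianProcess(gppackage)

# Combine the prior Gaussian process with the training data,
# then learn the kernel hyperparameters.
emulator = Emulator(gauss_proc, input_output_pairs)
optimize_hyperparameters!(emulator)
```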

# Prediction Type

You can specify the type of prediction when initializing the Gaussian process emulator.
The default is to predict the data, `YType()`, i.e., the latent function plus the observational noise.
To predict the underlying latent function instead, use the `FType()` prediction type:

```julia
gauss_proc = GaussianProcess(
    gppackage;
    prediction_type = FType())
```


# Kernels

The Gaussian process above assumes the default kernel: the Squared Exponential kernel, also called
the [Radial Basis Function (RBF)](https://en.wikipedia.org/wiki/Radial_basis_function_kernel).
A different type of kernel can be specified when the Gaussian process is initialized.
Read more about [kernel options here](https://www.cs.toronto.edu/~duvenaud/cookbook/).


## GPJL
For the `GaussianProcesses.jl` package, there is [a range of kernels to choose from](https://stor-i.github.io/GaussianProcesses.jl/latest/kernels).
For example,
```julia
using GaussianProcesses

# Create a Matern 3/2 kernel; the two arguments are placeholder (log) lengthscale and (log) sd values
my_kernel = GaussianProcesses.Mat32Iso(0., 0.)
gauss_proc = GaussianProcess(
    gppackage;
    kernel = my_kernel)
```
You do not need to provide useful hyperparameter values when you define the kernel, as these are learned in
`optimize_hyperparameters!(emulator)`.

You can also combine kernels, for example by adding them together:
```julia
using GaussianProcesses
kernel_1 = GaussianProcesses.Mat32Iso(0., 0.)  # Matern 3/2 kernel with placeholder (log) lengthscale and sd of 0
kernel_2 = GaussianProcesses.Lin(0.)           # Linear kernel with placeholder (log) lengthscale of 0
my_kernel = kernel_1 + kernel_2                # New additive composite kernel
gauss_proc = GaussianProcess(
    gppackage;
    kernel = my_kernel)
```
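Products of kernels are also supported by `GaussianProcesses.jl`; for example, reusing `kernel_1` and `kernel_2` from above:
```julia
my_kernel = kernel_1 * kernel_2  # Multiplicative (product) composite kernel
gauss_proc = GaussianProcess(
    gppackage;
    kernel = my_kernel)
```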

## SKLJL
Alternatively, if you are using the `ScikitLearn.jl` package, you can [find the list of kernels here](https://scikit-learn.org/stable/modules/classes.html#module-sklearn.gaussian_process).
These kernels require the following preamble:
```julia
using PyCall
using ScikitLearn
const pykernels = PyNULL()
function __init__()
    copy!(pykernels, pyimport("sklearn.gaussian_process.kernels"))
end
```
Then they are accessible, for example, as
```julia
my_kernel = pykernels.RBF(length_scale = 1)
gauss_proc = GaussianProcess(
    gppackage;
    kernel = my_kernel)
```
You can also combine multiple ScikitLearn kernels via linear operations in the same way as above.
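For example, a sketch combining two scikit-learn kernels through the explicit `Sum` kernel (`RBF` and `WhiteKernel` are standard scikit-learn kernels; `Sum(k1, k2)` is equivalent to `k1 + k2` in Python):
```julia
kernel_1 = pykernels.RBF(length_scale = 1)
kernel_2 = pykernels.WhiteKernel(noise_level = 1)
my_kernel = pykernels.Sum(kernel_1, kernel_2)  # Additive combination of the two kernels
gauss_proc = GaussianProcess(
    gppackage;
    kernel = my_kernel)
```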

# Learning the noise

Often it is useful to learn the observational noise in the data by adding a white noise kernel. This is controlled by the
Boolean keyword `noise_learn` when initializing the Gaussian process; the default is `true`.

```julia
gauss_proc = GaussianProcess(
    gppackage;
    noise_learn = true)
```

When `noise_learn` is `true`, an additional white noise kernel is added to the chosen kernel. This white noise acts
across all input values, including those of the training data.
The scale parameters of the white noise kernel are learned in `optimize_hyperparameters!(emulator)`.
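Schematically, the resulting covariance between two inputs `x_i` and `x_j` takes the form
```math
k_{\mathrm{total}}(x_i, x_j) = k(x_i, x_j) + \sigma_n^2 \, \delta(x_i, x_j),
```
where `k` is the chosen kernel, `σ_n` is the learned white-noise standard deviation, and `δ(x_i, x_j)` equals 1 when the two inputs coincide and 0 otherwise.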

You may not need to learn the noise if you already have a good estimate of it from your training data.
When `noise_learn` is `false`, a small regularization noise is instead added for numerical stability.
Its default value is `1e-3`, but it can be set through the optional argument `alg_reg_noise`:

```julia
gauss_proc = GaussianProcess(
    gppackage;
    noise_learn = false,
    alg_reg_noise = 1e-3)
```

15 changes: 15 additions & 0 deletions docs/src/emulate.md
@@ -26,6 +26,21 @@ emulator = Emulator(
```
The optional arguments above relate to the data processing.

### Emulator Training

The emulator is trained when we combine the machine learning tool and the data into the `Emulator` above.
For any machine learning tool, we must also optimize the hyperparameters:
```julia
optimize_hyperparameters!(emulator)
```
In the Lorenz example, this line learns the hyperparameters of the Gaussian process, which depend on the choice of [kernel](https://clima.github.io/CalibrateEmulateSample.jl/dev/GaussianProcessEmulator/#kernels).
Predictions at new inputs can then be made using
```julia
y, cov = Emulators.predict(emulator, new_inputs)
```
This returns both a mean value and a covariance.


## Data processing

Some effects of the following are outlined in a practical setting in the results and appendices of [Howland, Dunbar, Schneider, (2022)](https://doi.org/10.1029/2021MS002735).
