+# Contributing
+Thank you for considering contributing to `CalibrateEmulateSample`! We encourage opening issues and pull requests (PRs).
+## What to contribute?
+- The easiest way to contribute is by using `CalibrateEmulateSample`, identifying
+ problems and opening issues;
+- You can try to tackle an existing [issue](https://github.com/CliMA/CalibrateEmulateSample.jl/issues). It is best to outline your proposed solution in the issue thread before implementing it in a PR;
+- Write an example or tutorial. Other users will likely find your use of `CalibrateEmulateSample` insightful;
+- Improve documentation or comments if you found something hard to use;
+- Implement a new feature if you need it. We strongly encourage opening an issue to make sure the administrators are on board before opening a PR with an unsolicited feature addition.
+## Using `git`
+If you are unfamiliar with `git` and version control, the following guides
+will be helpful:
+- [Atlassian (Bitbucket) `git`
+ tutorials](https://www.atlassian.com/git/tutorials). A set of tips and tricks
+ for getting started with `git`.
+- [GitHub's `git` tutorials](https://try.github.io/). A set of resources from
+ GitHub to learn `git`.
+### Forks and branches
+Create your own fork of `CalibrateEmulateSample` [on
+GitHub](https://github.com/CliMA/CalibrateEmulateSample.jl) and check out your copy:
+$ git clone https://github.com//CalibrateEmulateSample.jl.git
+$ cd CalibrateEmulateSample.jl
+Now you have access to your fork of `CalibrateEmulateSample` through `origin`. Create a branch for your feature; this will hold your contribution:
+$ git checkout -b
+#### Some useful tips
+- When you start working on a new feature branch, make sure you start from
+ main by running: `git checkout main` and `git pull`.
+- Create a new branch from main by using `git checkout -b `.
+### Develop your feature
+Make sure you add tests for your code in `test/` and appropriate documentation in the code and/or
+in `docs/`. Before committing your changes, you can verify their behavior by running the tests, the examples, and building the documentation [locally](https://clima.github.io/CalibrateEmulateSample.jl/dev/installation_instructions/). In addition, make sure your feature follows the formatting guidelines by running
+julia --project=.dev .dev/climaformat.jl .
+from the `CalibrateEmulateSample.jl` directory.
+### Squash and rebase
+When your PR is ready for review, clean up your commit history by squashing
+and make sure your code is current with `CalibrateEmulateSample.jl` main by rebasing. The general rule is that a PR should contain a single commit with a descriptive message.
+To make sure you are up to date with main, you can use the following workflow:
+$ git checkout main
+$ git pull
+$ git checkout
+$ git rebase main
+This may create conflicts with the local branch. Git will list the conflicted files. To resolve the conflicts,
+we have to edit these files manually (e.g. with vim). Each conflict appears between the markers `<<<<<<<`, `=======` and `>>>>>>>`.
+We need to pick the version we want to keep and delete the marker lines. Then stage the resolved files with `git add` and resume with `git rebase --continue`.
+To squash your commits, you can use the following command:
+$ git rebase -i HEAD~n
+where `n` is the number of commits you need to squash into one. Then, follow the instructions in the terminal. For example, to squash 4 commits:
+$ git rebase -i HEAD~4
+will open the following file in (typically) vim:
+ pick 01d1124
+ pick 6340aaa
+ pick ebfd367
+ pick 30e0ccb
+ # Rebase 60709da..30e0ccb onto 60709da
+ #
+ # Commands:
+ # p, pick = use commit
+ # e, edit = use commit, but stop for amending
+ # s, squash = use commit, but meld into previous commit
+ #
+ # If you remove a line here THAT COMMIT WILL BE LOST.
+ # However, if you remove everything, the rebase will be aborted.
+We want to keep the first commit and squash the last 3. We do so by changing `pick` to `squash` on the last three commits, then saving and quitting (`:wq` in vim).
+ pick 01d1124
+ squash 6340aaa
+ squash ebfd367
+ squash 30e0ccb
+ # Rebase 60709da..30e0ccb onto 60709da
+ #
+ # Commands:
+ # p, pick = use commit
+ # e, edit = use commit, but stop for amending
+ # s, squash = use commit, but meld into previous commit
+ #
+ # If you remove a line here THAT COMMIT WILL BE LOST.
+ # However, if you remove everything, the rebase will be aborted.
+Then in the next screen that appears, we can just delete all messages that
+we do not want to show in the commit. After this is done and we are back to
+the console, we have to force push. We need to force push because we rewrote
+the local commit history.
+$ git push -u origin --force
+You can find more information about squashing [here](https://github.com/edx/edx-platform/wiki/How-to-Rebase-a-Pull-Request#squash-your-changes).
+### Unit testing
+Currently a number of checks are run per commit for a given PR.
+- `JuliaFormatter` checks if the PR is formatted with `.dev/climaformat.jl`.
+- `Documentation` rebuilds the documentation for the PR and checks if the docs
+ are consistent and generate valid output.
+- `Unit Tests` run subsets of the unit tests defined in `test/`, using `Pkg.test()`.
+ The tests are run in parallel to ensure that they finish in a reasonable time.
+ The tests run only on the latest commit of a PR branch, and any stale jobs are killed on push.
+ These tests are only run on Linux (Ubuntu LTS).
+Unit tests are run against every new commit for a given PR.
+The status of the unit tests is not checked during the merge
+process, but it acts as a sanity check for developers and reviewers.
+Depending on the content changed in the PR, some CI checks that
+are not necessary will be skipped. For example, documentation-only changes
+do not require the unit tests to be run.
+### The merge process
+We use [`bors`](https://bors.tech/) to manage merging PRs in the `CalibrateEmulateSample` repo.
+If you're a collaborator and have the necessary permissions, you can type
+`bors try` in a comment on a PR to have the integration test suite run on that
+PR, or `bors r+` to try to merge the code. Bors ensures that all integration tests
+for a given PR always pass before merging into `main`. The integration tests currently run example cases in `examples/`. Any breaking changes must also update `examples/`, otherwise bors will fail.
diff --git a/docs/src/index.md b/docs/src/index.md
index 113ef1667..b5ef9b04e 100644
--- a/docs/src/index.md
+++ b/docs/src/index.md
@@ -1,33 +1,51 @@
# CalibrateEmulateSample.jl
-`CalibrateEmulateSample.jl` solves parameter estimation problems using (approximate) Bayesian inversion. It is designed for problems that require running a computer model that is expensive to evaluate, but can also be used for simple models.
+`CalibrateEmulateSample.jl` solves parameter estimation problems using accelerated (and approximate) Bayesian inversion.
-The computer model is supplied by the user – it is a forward model, i.e., it takes certain parameters and produces data that can then be compared with the actual observations. We can think of that model as a parameter-to-data map ``G(u): \mathbb{R}^p \rightarrow \mathbb{R}^d``. For example, ``G`` could be a global climate model or a model that predicts the motion of a robot arm.
+The framework can currently be applied to learn:
+- the joint distribution for a moderate number of parameters (<40),
+- distributions that need not be unimodal.
+It can be used with computer models that:
+- can be noisy or chaotic,
+- are non-differentiable,
+- can only be treated as black-box (interfaced only with parameter files).
+The computer model is supplied by the user, as a parameter-to-data map ``G(u): \mathbb{R}^p \rightarrow \mathbb{R}^d``. For example, ``G`` could be a map from any given parameter configuration ``u`` to a collection of statistics of a dynamical system trajectory.
The data produced by the forward model are compared to observations $y$, which are assumed to be corrupted by additive noise ``\eta``, such that
-y = G(u) + \eta
+y = G(u) + \eta,
where the noise ``\eta`` is drawn from a d-dimensional Gaussian with distribution ``\mathcal{N}(0, \Gamma_y)``.
-Given knowledge of the observations ``y``, the forward model ``G(u): \mathbb{R}^p \rightarrow \mathbb{R}^d``, and some information about the noise level such as its size or distribution (but not its value), the inverse problem we want to solve is to find the unknown parameters ``u``.
+### The inverse problem
+Given an observation ``y``, the computer model ``G``, the observational noise covariance ``\Gamma_y``, and some broad prior information on ``u``, we return a data-informed joint distribution for "``u`` given ``y``".
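+As a sketch of what this distribution looks like under the additive Gaussian noise model above, Bayes' rule gives it up to a normalizing constant. Here ``p(u)`` denotes the prior density on ``u``, a symbol introduced only for this illustration:
+```math
+p(u \mid y) \propto \exp\left(-\frac{1}{2} \left(y - G(u)\right)^\top \Gamma_y^{-1} \left(y - G(u)\right)\right) p(u).
+```
+Each evaluation of the right-hand side requires one run of ``G``, which is why sampling this density directly can be costly when the computer model is expensive.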
-As the name suggests, `CalibrateEmulateSample.jl` breaks this problem into a sequence of three steps: calibration, emulation, and sampling.
-A comprehensive treatment of the calibrate-emulate-sample approach to Bayesian inverse problems can be found in [Cleary et al., 2020](https://arxiv.org/pdf/2001.03689.pdf).
+As the name suggests, `CalibrateEmulateSample.jl` breaks this problem into a sequence of three steps: calibration, emulation, and sampling. A comprehensive treatment of the calibrate-emulate-sample approach to Bayesian inverse problems can be found in [Cleary et al., 2020](https://arxiv.org/pdf/2001.03689.pdf).
+### The three steps of the algorithm:
-In a one-sentence summary, the **calibrate** step of the algorithm consists of an Ensemble Kalman inversion that is used to find good training points for a Gaussian process regression, which in turn is used as a surrogate (**emulator**) of the original forward model ``G`` in the subsequent Markov chain Monte Carlo **sampling** of the posterior distributions of the unknown parameters.
+The **calibrate** step of the algorithm consists of an application of [Ensemble Kalman Processes](https://github.com/CliMA/EnsembleKalmanProcesses.jl), which generates input-output pairs in high density around an optimal parameter ``u^*``. This ``u^*`` will be near a mode of the posterior distribution. (Note: this is the only time we interface with the forward model ``G``.)
+The **emulate** step takes these pairs and trains a statistical surrogate model (Gaussian process), emulating the forward map ``G``.
+The **sample** step uses this surrogate in place of ``G`` in a sampling method (Markov chain Monte Carlo) to sample the posterior distribution of ``u``.
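+A simplified sketch of the density targeted in the sample step, under assumptions that are not part of the text above: ``G_{\mathrm{GP}}(u)`` is illustrative notation for the emulator's mean prediction, ``p(u)`` denotes the prior density, and the emulator's own predictive uncertainty is ignored (a fuller treatment may add it to ``\Gamma_y``):
+```math
+p_{\mathrm{GP}}(u \mid y) \propto \exp\left(-\frac{1}{2} \left(y - G_{\mathrm{GP}}(u)\right)^\top \Gamma_y^{-1} \left(y - G_{\mathrm{GP}}(u)\right)\right) p(u).
+```
+Evaluating this expression needs only the emulator, so the many evaluations required by Markov chain Monte Carlo do not call the forward model ``G``.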
`CalibrateEmulateSample.jl` contains the following modules:
-Module | Purpose
-EnsembleKalmanProcesses.jl | Calibrate – Ensemble Kalman inversion
-GaussianProcessEmulator.jl | Emulate – Gaussian process regression
-MarkovChainMonteCarlo.jl | Sample – Markov chain Monte Carlo
-Observations.jl | Structure to hold observations
-Utilities.jl | Helper functions
+Module | Purpose
+:--- | :---
+CalibrateEmulateSample.jl | Pulls in the [Ensemble Kalman Processes](https://github.com/CliMA/EnsembleKalmanProcesses.jl) package
+Emulator.jl | Emulate: Modular template for emulators
+GaussianProcess.jl | - A Gaussian process emulator
+MarkovChainMonteCarlo.jl | Sample: Modular template for MCMC
+Utilities.jl | Helper functions
**The best way to get started is to have a look at the examples!**
+## Authors
+`CalibrateEmulateSample.jl` is being developed by the [Climate Modeling
\ No newline at end of file
@test length(σ4²) == size(new_inputs, 2)
@test size(σ4²[1]) == (d, d)