Standardize on Julia 1.6.2 #127

Merged
merged 15 commits into from
Jan 20, 2022

tsj5
Collaborator

@tsj5 tsj5 commented Jan 18, 2022

This is a revision of PR #126, which ran into a new problem (a missing environment module) when attempting to fix #125 (Manifest.toml incompatibility between Julia 1.5 and 1.6). See the most recent comment on that PR.

This PR pins the Buildkite Julia version at 1.6.2, for which a module currently exists.

The Manifests in this PR were generated under 1.6.5, and I haven't been able to verify that they're compatible with 1.6.2 (julialang.org didn't retain 1.6.2, and the conda package for 1.6.2 appears to be broken on osx).

bors bot and others added 6 commits January 5, 2022 13:13
120: Modular emulator interface r=odunbar a=odunbar

## Purpose
Reduces the dependence of the current Emulator interface on Gaussian processes. The GP can now be swapped for another statistical emulator.
 
## In the PR
- [x] New general `Emulator` class. This handles all the data manipulation e.g. Normalization, Standardization, Decorrelation
- [x] General interface functions for Emulator, `optimize_hyperparameters!`, `predict`
- [x] New `MachineLearningTool` type
- [x] Moved the Gaussian Processes into a `GaussianProcess <: MachineLearningTool` class
- [x] Example (e.g. `plot_GP`) to demonstrate the new interface
- [x] Unit tests
- [x] New doc strings.

## Additional change
There seem to be ongoing issues with unit testing under Julia 1.5.4, so I have updated the Manifest, Docs.yml and Test.yml to Julia 1.6.X.
 
## Changes to user experience:

Ingredients: 
```julia
gppackage = GPJL()
pred_type = YType()
GPkernel = ...
iopairs = PairedDataContainer(x_data,y_data)
```

### Old interface

Set up a `GaussianProcessEmulator` object
```julia
gp = GaussianProcess(
    iopairs,
    gppackage;
    GPkernel=GPkernel,
    obs_noise_cov=nothing,
    normalized=false,
    noise_learn=true,
    truncate_svd=1.0,
    standardize=false,
    prediction_type=pred_type,
    norm_factor=nothing)
```
Then predict with it.
```julia
μ, σ² = GaussianProcessEmulator.predict(gp, new_inputs)
```
This is short, but it is inherently tied to the Gaussian process framework. It also hides steps such as training, which we may wish to expose. The new interface below is more general: it separates the parameters related to data processing from those related to the specific ML tool.

### New interface
Set up a `GaussianProcess <: MachineLearningTool` object
```julia
gp = GaussianProcess(
    gppackage;
    kernel=GPkernel,
    noise_learn=true,
    prediction_type=pred_type)
```
and then create the general emulator type using `gp`
```julia
em = Emulator(
    gp,
    iopairs,
    obs_noise_cov=nothing,
    normalize_inputs=false,
    standardize_outputs=false,
    truncate_svd=1.0)
```
Train and predict
```julia
Emulators.optimize_hyperparameters!(em)
μ, σ² = Emulators.predict(em, new_inputs)
```
### Adding a new `MachineLearningTool`
Include a new file `NewTool.jl` at the top of `Emulator.jl`.
In this file, define:
1. `struct NewTool <: MachineLearningTool` with constructor `NewTool(...)` to hold ML parameters and models.
2. `function build_models!(NewTool,iopairs)` to build and store ML models. Called in the Emulator constructor.
3. `function optimize_hyperparameters!(NewTool)` to train the stored ML models. Called by the method of the same name in Emulator.
4. `function predict(NewTool,new_inputs)` to predict with the stored ML models. Called by the method of the same name in Emulator.
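
The four steps above could be sketched roughly as follows. This is a minimal, self-contained illustration, not the package's code: `MachineLearningTool` is redefined locally as a stand-in, `LinearTool` and its fields are hypothetical, and `iopairs` is replaced by plain input and output matrices with one column per sample (an assumption made so the sketch runs without the package).

```julia
# Stand-in for the package's abstract type (assumption: redefined here so the
# sketch runs on its own).
abstract type MachineLearningTool end

# 1. Hypothetical tool: one vector of linear coefficients per output dimension.
mutable struct LinearTool <: MachineLearningTool
    models::Vector{Vector{Float64}}
    inputs::Matrix{Float64}
    outputs::Matrix{Float64}
end
LinearTool() = LinearTool(Vector{Vector{Float64}}(), zeros(0, 0), zeros(0, 0))

# 2. Build and store the (still untrained) models.
#    Assumption: plain matrices stand in for `iopairs`.
function build_models!(tool::LinearTool, inputs::AbstractMatrix, outputs::AbstractMatrix)
    tool.inputs = inputs
    tool.outputs = outputs
    n_coeffs = size(inputs, 1) + 1          # one intercept plus one slope per input
    tool.models = [zeros(n_coeffs) for _ in 1:size(outputs, 1)]
    return tool
end

# 3. "Train" the stored models: here, an ordinary least-squares fit.
function optimize_hyperparameters!(tool::LinearTool)
    n_samples = size(tool.inputs, 2)
    design = hcat(ones(n_samples), tool.inputs')   # n_samples × (n_inputs + 1)
    for i in eachindex(tool.models)
        tool.models[i] = design \ tool.outputs[i, :]
    end
    return tool
end

# 4. Predict with the stored models; returns an n_outputs × n_new matrix.
function predict(tool::LinearTool, new_inputs::AbstractMatrix)
    design = hcat(ones(size(new_inputs, 2)), new_inputs')
    coeffs = hcat(tool.models...)                  # (n_inputs + 1) × n_outputs
    return Matrix((design * coeffs)')
end

# Usage with illustrative data lying on the line y = 2x + 1.
x = [0.0 1.0 2.0 3.0]
y = 2.0 .* x .+ 1.0
tool = build_models!(LinearTool(), x, y)
optimize_hyperparameters!(tool)
prediction = predict(tool, [4.0 5.0])
```

In the real package the `Emulator` methods of the same names would forward to these tool-level methods; the sketch leaves that wrapper out.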


Co-authored-by: odunbar <[email protected]>
@tsj5 tsj5 self-assigned this Jan 18, 2022
@jakebolewski
Contributor

bors try

bors bot added a commit that referenced this pull request Jan 18, 2022
@bors
Contributor

bors bot commented Jan 18, 2022

try

Build failed:

@jakebolewski
Contributor

it seems like the examples are broken?

@tsj5
Collaborator Author

tsj5 commented Jan 18, 2022

> it seems like the examples are broken?

Yeah, that's what I've got at this point: this PR is up-to-date with staging, which includes the merge from PR #120. That PR implemented major interface changes, e.g. replacing GaussianProcessEmulator with Emulator.GaussianProcess, but I'm still seeing references to the former class in the examples.

For what it's worth, unit tests pass on staging and on this PR.

else
println(truth.mean)
end
println(truth.mean) # same, regardless of norm_factor
Collaborator Author


We shouldn't have needed to rescale truth.mean, since that's what the emulator is emulating, and in the new code y_mean is correctly scaled by reverse_standardize. Verified that vec(y_mean) ≈ truth.mean in the new code, which means this was being done incorrectly (but consistently) in the existing code on /master.
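
The scaling argument can be illustrated with a small round trip. This is only a sketch: `standardize_sketch` and `reverse_standardize_sketch` are hypothetical stand-ins (not the package's `reverse_standardize`, whose signature is not shown in this thread), the values are made up, and it assumes standardization is a plain division of the outputs by a scalar `norm_factor`.

```julia
# Hypothetical stand-ins; assumption: standardization divides by a scalar factor.
standardize_sketch(y, norm_factor) = y ./ norm_factor
reverse_standardize_sketch(y_std, norm_factor) = y_std .* norm_factor

truth_mean = [1.5, -0.3, 4.0]   # illustrative values in the original units
norm_factor = 10.0

# Internally, the emulator would work with standardized outputs...
y_std = standardize_sketch(truth_mean, norm_factor)

# ...and its predictions are mapped back to the original units afterwards.
y_mean = reverse_standardize_sketch(y_std, norm_factor)

# Once predictions are back in original units, they should be compared with
# the unscaled truth; rescaling truth_mean as well would break the comparison.
@assert vec(y_mean) ≈ truth_mean
@assert !(vec(y_mean) ≈ truth_mean ./ norm_factor)
```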

@tsj5
Collaborator Author

tsj5 commented Jan 20, 2022

The output of the examples now either reproduces /master exactly or corrects known issues in /master (thanks to @odunbar for help in debugging). All unit tests and examples now run on Julia 1.6.5 (I can't get an instance of 1.6.2 running locally).

@jakebolewski, we're ready to rerun buildkite when you get a sec. Thanks!

@jakebolewski
Contributor

bors try

bors bot added a commit that referenced this pull request Jan 20, 2022
@bors
Contributor

bors bot commented Jan 20, 2022

try

Build failed:

@jakebolewski
Contributor

mkdir examples/GaussianProcessEmulator/depot
export JULIA_DEPOT_PATH="$$(pwd)/examples/GaussianProcessEmulator/depot:$JULIA_DEPOT_PATH"
julia --color=yes --project -e '
println("--- Instantiating Project")
using Pkg;
Pkg.instantiate()
Pkg.activate("examples/GaussianProcessEmulator")
Pkg.instantiate()
println("+++ Running Learn Noise")
include("examples/GaussianProcessEmulator/learn_noise.jl")
println("+++ Running PlotGP")
include("examples/GaussianProcessEmulator/plot_GP.jl")'

need to reflect the new example path (also for the artifacts)

@tsj5
Collaborator Author

tsj5 commented Jan 20, 2022

> need to reflect the new example path (also for the artifacts)

Argh, can't believe I missed that! Thanks for catching that mistake, @jakebolewski !

@jakebolewski
Contributor

bors r+

@bors bors bot merged commit 0f6bae9 into CliMA:staging Jan 20, 2022
@bors
Contributor

bors bot commented Jan 20, 2022

Build failed:

@jakebolewski
Contributor

It looks like the load path needs to be adjusted in the buildkite file

@tsj5
Collaborator Author

tsj5 commented Jan 20, 2022

Thanks again @jakebolewski -- this is fixed in new PR #128, since I can't re-open this one.

Verified that the GaussianProcess examples now run when the exact commands in pipeline.yml are invoked from the repo directory. Previously I was running them from the examples' directory with the CES module already loaded in the REPL, which I'm guessing is why I didn't hit the error earlier.
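
A difference like this between a REPL session and CI can be checked from inside Julia. The sketch below relies only on `Base.active_project`, `Base.load_path` and `Base.find_package`; the helper name `report_environment` is hypothetical, and the package name passed to it is an assumption about how the CES module is named in the project.

```julia
# Hypothetical helper: report the active project and whether a package
# can be located from the current load path.
function report_environment(pkg_name::AbstractString)
    project = Base.active_project()                 # active Project.toml path, or nothing
    paths = Base.load_path()                        # expanded LOAD_PATH entries
    location = Base.find_package(String(pkg_name))  # entry-file path, or nothing
    println("Active project: ", project)
    println("Load path:      ", paths)
    println(pkg_name, " resolves to: ",
            location === nothing ? "nothing (not loadable from here)" : location)
    return (project = project, paths = paths, location = location)
end

# Assumption: the CES module is installed under this package name.
report_environment("CalibrateEmulateSample")
```

Running this once from the repo directory with `--project` and once from the examples' directory should show whether the two sessions resolve the package through the same environment.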

@tsj5 tsj5 changed the title [WIP] Standardize on Julia 1.6.2 Standardize on Julia 1.6.2 Apr 29, 2022