Model fitting using jaxopt solvers #364
Conversation
Looks good @frazane! A high-level comment on the notebook updates: I see you've updated the code, but can you ensure that the supporting text is aligned? In the regression notebook, for example, I think there should be changes.
gpjax/fit.py (outdated)

```python
if isinstance(solver, jaxopt.OptaxSolver):
    model = jax.tree_map(lambda x: x.astype(jnp.float64), model)

# Initialise solver state.
solver.fun = _wrap_objective(solver.fun)
solver.__post_init__()  # needed to propagate changes to `fun` attribute

solver_state = solver.init_state(
    model,
    get_batch(train_data, batch_size, key) if batch_size != -1 else train_data,
)
jitted_update = jax.jit(solver.update)
```
Should we have any additional unit tests to check this block is running correctly?
It's a good idea, yes. However, I am not convinced by this piece of code (particularly lines 131-132) to start with, and I am open to suggestions. I wonder if, instead of wrapping the objective function inside `fit` like this, we could specify whether to apply constraints and stop gradients when instantiating the objective. Something like:

nmll = gpx.ConjugateMLL(negative=True, stop_gradients=True, constrain_model=True)
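For concreteness, a rough sketch of what such a flag-based objective could look like (the attribute names, and the `model.constrain()` / `model.stop_gradient()` methods, are assumptions for illustration rather than GPJax's actual implementation):

```python
# Hypothetical sketch of the proposal above; not the code in this PR.
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class FlaggedObjective:
    fn: Callable[[Any, Any], Any]   # the raw objective, e.g. a marginal log-likelihood
    negative: bool = False
    stop_gradients: bool = False
    constrain_model: bool = False

    def __call__(self, model, data):
        # The objective itself decides whether to map parameters to their
        # constrained space and freeze non-trainable leaves, rather than
        # having `fit` wrap the objective behind the user's back.
        if self.constrain_model:
            model = model.constrain()        # assumed model/Module method
        if self.stop_gradients:
            model = model.stop_gradient()    # assumed model/Module method
        value = self.fn(model, data)
        return -value if self.negative else value
```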
Hmm, my concern with this approach is that it demands a lot of the user. If I've passed a bijector or a trainability status to my parameters, then I would not expect to have to explicitly apply this functionality again later on. Thoughts?
I agree with your concerns. For now I'll keep it as is and add a test to check that specific block 👍
I've taken this on so we can merge in @frazane's absence. I've fiddled it so it all works with the new BO code and it seems to pass all the tests locally. I can't work out why it won't pass on the CI though. Please can someone take a quick look at the CI logs and explain them to me @thomaspinder or @daniel-dodd?
Nice work @henrymoss. I think we need to remove the
Have reviewed the code. Other than the aforementioned issue, this PR looks excellent to me - will give an approval ahead of fixing this, feel free to merge as and when the tests pass. Thanks, @henrymoss, @frazane. :)
Force-pushed from eeadba6 to 1d51c54.
@daniel-dodd I actually think something is really wrong with this. I was updating the notebooks to use LBFGS instead of Adam and I couldn't get good performance on the ocean or decision maker notebooks. In the decision maker notebook I was getting weird things, like the plots of the posterior becoming invisible...
Also @daniel-dodd, if you use the new version of Adam (e.g. in the decision maker notebook), model fits seem much worse than with the old version of Adam.
Ah, strange @henrymoss!
@henrymoss fixed it! In the old code, each time we call […]. In the new code, we pass […]
Right then @thomaspinder and @daniel-dodd, I have rejigged this somewhat. When I went to update some of the notebooks to use the jaxopt LBFGS, everything got worse, i.e. it was even worse than Adam! It turns out that the jaxopt LBFGS is its own implementation that lots of people say is a bit naff (similar to the torch and tensorflow versions). I have instead built a wrapper that goes to SciPy (still through jaxopt), but now using SciPy's LBFGS directly. We now have a model fitter that works really, really well, and it has resulted in a lot of the notebooks running much faster. For example, the barycentre model fits take 4 seconds to reach a lower loss than 25 seconds of Adam achieves.

I had to rejig the code a little, because the SciPy optimisers just optimise in one go (rather than step by step like the optax ones). This is actually great, because it means we don't have to specify a […]

I have also strengthened the typing a bit. We now only support the optax and scipy wrappers from jaxopt. Before, the code suggested that we supported all IterativeSolvers from jaxopt, but this was nonsense: we never actually checked this, and the different solvers would likely require weird hacks to get working.
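For anyone following along, this is roughly what "going through jaxopt to SciPy" looks like in isolation (a toy sketch with a dummy objective, not the actual wrapper added in this PR):

```python
import jax.numpy as jnp
import jaxopt

def loss(params):
    # Toy objective standing in for the (negative) marginal log-likelihood.
    return jnp.sum((params - 3.0) ** 2)

# jaxopt.ScipyMinimize defers to scipy.optimize.minimize, so the whole
# optimisation runs in a single call rather than step by step.
solver = jaxopt.ScipyMinimize(fun=loss, method="L-BFGS-B", maxiter=500)
params, state = solver.run(jnp.zeros(5))
```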
Thanks for picking this up @henrymoss. It looks like some of the documentation is failing to build - would you mind looking into this? Once resolved, we should be good to merge.
I can't recreate this error on my machine :( @daniel-dodd reckons it's something to do with Macs...
@henrymoss This is not what I thought earlier. I thought we were passing the same solver through to the posterior, and returning this from […]. Actually, what is happening is side-effect behaviour: when we call […]
JAXopt developer here. JAXopt's LBFGS is getting better and better; we try to fix things as people report issues. The issue is that optimization with a line search in float32 precision is very hard. Using scipy's LBFGS means that float64 precision is used, while JAX uses float32 by default. If people compare JAXopt's LBFGS and scipy's LBFGS, they should compare both with float64. Please do report issues when you encounter them.
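For reference, JAX can be switched to double precision globally, which is what a like-for-like comparison of the two LBFGS implementations would need:

```python
import jax

# JAX defaults to float32; enable 64-bit mode before any arrays are created
# so that both solvers are benchmarked at the same precision.
jax.config.update("jax_enable_x64", True)
```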
Also, if you keep encountering issues in jaxopt, we would be very grateful if you could report them (ideally with a reproducible test 🙏🏼).
I don't know if you have seen this, but Optimistix is a JAX library for nonlinear solvers: root finding, minimisation, fixed points, and least squares. Reasons to use Optimistix rather than JAXopt:
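For context, a minimal sketch of what an equivalent minimisation looks like in Optimistix (API recalled from its documentation; treat the details as indicative rather than exact):

```python
import jax.numpy as jnp
import optimistix as optx

def loss(y, args):
    # Optimistix objectives take (y, args) and return a scalar.
    return jnp.sum((y - 3.0) ** 2)

solver = optx.BFGS(rtol=1e-6, atol=1e-6)
sol = optx.minimise(loss, solver, jnp.zeros(5))
print(sol.value)  # optimised parameters
```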
|
Type of changes

Checklist

- Run `poetry run pre-commit run --all-files --show-diff-on-failure` before committing.

Description

Model training using `fit` can now be called with a much wider choice of optimization algorithms, provided as `jaxopt` solvers. These also include `optax` solvers (which are currently used).
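As a rough illustration of the intended usage (argument names such as `solver=`, and the fact that the objective is handed to the solver's `fun` rather than to `fit` directly, are inferred from the diff above and may differ from the final API):

```python
import jax.numpy as jnp
import jaxopt
import optax
import gpjax as gpx

# A small toy regression dataset.
x = jnp.linspace(0.0, 1.0, 50).reshape(-1, 1)
y = jnp.sin(10.0 * x)
data = gpx.Dataset(X=x, y=y)

# Standard GPJax model construction.
prior = gpx.Prior(mean_function=gpx.mean_functions.Zero(), kernel=gpx.kernels.RBF())
posterior = prior * gpx.Gaussian(num_datapoints=data.n)

# The objective is attached to the solver, which `fit` then drives.
objective = gpx.ConjugateMLL(negative=True)

solver = jaxopt.OptaxSolver(fun=objective, opt=optax.adam(1e-2), maxiter=500)
# or, for a full-batch SciPy LBFGS run:
# solver = jaxopt.ScipyMinimize(fun=objective, method="L-BFGS-B")

trained_model, history = gpx.fit(model=posterior, train_data=data, solver=solver)
```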