159 feat bms return variable number of models #161

TheLemonPig · 2022-12-05T21:45:49Z

Description

Simple attribute added to BMSRegressor which contains all parallel trees. This is for a new experimentalist which compares fit between models. A major nonbreaking bug was spotted and resolved in the process (which will allow BMS to work better with more difficult data).

_Fixes #159_

Type of change:

Bug fix
New feature (non-breaking change which adds functionality)

Features:

Regressor attribute containing multiple models which best fit the data at different temperatures

Questions:

Remarks:

I can't believe the bug wasn't spotted: we weren't using the parallel trees at all. The issue comes from a type inconsistency in mcmc.py. During type setting, we changed a '1' to a '1.0', which caused major issues once the floats were converted to strings for use as dictionary keys.
BMS will work much better now, especially for recovering difficult equations.

benwandrew · 2022-12-06T19:40:15Z

@TheLemonPig good catch! should we be more generally concerned about the convoluted type changes in mcmc.py now?

TheLemonPig · 2022-12-06T20:10:59Z

@benwandrew I am hoping not. Most of it definitely not. But hard to say for sure

hollandjg

This is great!
I've one major suggestion: I think a test case for the new functionality would be really useful.
I've made a start in #165, but I think it would be good to show how you expect people to use this functionality.

hollandjg · 2022-12-06T21:49:17Z

autora/theorist/bms/parallel.py

@@ -119,7 +119,7 @@ def tree_swap(self) -> Tuple[Optional[str], Optional[str]]:
            self.trees[self.Ts[nT2]] = t1
            t1.BT = BT2
            t2.BT = BT1
-            self.t1 = self.trees["1"]
+            self.t1 = self.trees["1.0"]


what does the 1.0 do?

It's probably easier to see by looking at the code, but I'll do my best to describe it. The keys of the dictionary of temperatures are created as numerical values and then recast as strings. Originally the code contained one integer 1 and the rest were floats. When we made the types consistent, we had to change that integer to a float. 1 became 1.0, which produces a different string when cast ("1.0" rather than "1") and so is a distinct key. The code assumes a one-to-one correspondence between temperature values and dictionary keys. The code also hard-codes one tree under 1 and then iteratively creates trees from the temperatures, skipping 1 (which we changed to 1.0). So the parallel machine scientist held one tree for a temperature value of 1 and another for 1.0, treated the key "1" as holding the best model, and ignored "1.0". When comparing trees in tree_swap(), it only chooses among trees whose keys correspond to the stored temperature values, which are recorded as floats, and so it ignores "1". Hence the model selected by BMS is unaffected by tree_swap(). The tree corresponding to "1" does improve over time because it is still affected by mcmc_step(). However, we lose the functionality of having higher temperatures, tree swaps, and the other 95% of the training results.

Tl;dr: when we type set temperatures to be floats, the hooks forced us to change a 1 to a 1.0. Turns out there was some hard coding which relied on it being specifically 1.
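
To make the key mismatch concrete, here is a minimal sketch of the behaviour described above. It is not the actual code in parallel.py or mcmc.py: `build_trees`, the placeholder strings, and the exact form of the skip check are assumptions made purely for illustration. Only the `str(1)` versus `str(1.0)` difference is a plain Python fact.

```python
# Simplified, hypothetical stand-in for the temperature bookkeeping.
# build_trees and the placeholder strings are illustrative, not autora code.

def build_trees(temperatures):
    """Return (keys, trees) for a list of temperatures."""
    keys = [str(t) for t in temperatures]  # numeric values recast as strings
    trees = {"1": "base tree"}             # one tree hard-coded under "1"
    for key in keys:
        if key == "1":                     # assumed skip check on the literal "1"
            continue
        trees[key] = f"tree at T={key}"
    return keys, trees

# Before the type change: one int, the rest floats.
keys_before, trees_before = build_trees([1, 1.2, 1.44])

# After the type change: every temperature is a float.
keys_after, trees_after = build_trees([1.0, 1.2, 1.44])

# Keys that hold a tree but no longer match any stored temperature.
orphaned = set(trees_after) - set(keys_after)

print(str(1), str(1.0))      # the two string forms differ
print(sorted(trees_before))  # one tree per temperature
print(sorted(trees_after))   # an extra "1" entry alongside "1.0"
print(orphaned)              # the tree that swaps can never reach
```

In this sketch the hard-coded "1" entry ends up outside the set of temperature keys once every temperature is a float, which mirrors why looking up `self.trees["1.0"]` instead of `self.trees["1"]` reconnects the selected tree to the ones that take part in swaps.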

musslick

Looks good to me!

TheLemonPig · 2022-12-07T21:07:45Z

> This is great! I've one major suggestion: I think a test case for the new functionality would be really useful. I've made a start in #165, but I think it would be good to show how you expect people to use this functionality.

The attribute was added for the dissimilarity experimentalist. Currently, there is no code elsewhere in BMS that makes use of this attribute. The print statement in #165 is the only use of it that I can think of. We could perhaps add an argument to present results so that it presents the results of all the models.

TheLemonPig · 2022-12-08T15:51:45Z

I have just created an issue to further address this: AutoResearch/autora-theorist-bms#31

hollandjg

There's something broken here when running under python 3.9 – the test case (test_tree_mcmc_stepping) suddenly takes ~30 minutes to run. It doesn't seem to affect python 3.8 or 3.10. We should fix that before we merge, because otherwise it will cause problems with the tests of all our future code.

TheLemonPig added 4 commits December 5, 2022 16:09

resolve MAJOR nonbreaking bug that was preventing the parallel trees from being considered as models. We were only seeing BMS run a single tree since we type set!

46a3f51

add variable to BMSRegressor to store multiple models

375a5d7

update .gitignore with .dat files made by test runs

def8cce

simplify by holding all trees, rather a number you have to specify before fitting

d06d488

TheLemonPig requested a review from musslick as a code owner December 5, 2022 21:45

TheLemonPig linked an issue Dec 5, 2022 that may be closed by this pull request

feat: BMS return variable number of models #159

Closed

musslick requested a review from hollandjg December 5, 2022 22:25

Merge branch 'main' into 159-feat-bms-return-variable-number-of-models

842cf16

test: add testcase for models_

348543e

hollandjg requested changes Dec 6, 2022

TheLemonPig and others added 2 commits December 6, 2022 17:24

removing lines to resolve in another PR

ee06f86

Merge branch '159-feat-bms-return-variable-number-of-models' into 159-feat-bms-return-variable-number-of-models-suggestions-john

f15e15f

musslick approved these changes Dec 7, 2022

Merge branch 'main' into 159-feat-bms-return-variable-number-of-models

aed52ab

This was referenced Dec 7, 2022

Add models_ parameter to DARTS AutoResearch/autora-theorist-darts#5

Open

Tidy up .dat files produced by running tests AutoResearch/autora-theorist-bms#27

Open

Update tests/test_bms_multi_model_output.py

5d80834

hollandjg and others added 4 commits December 8, 2022 08:51

Merge branch '159-feat-bms-return-variable-number-of-models' into 159-feat-bms-return-variable-number-of-models-suggestions-john

6b4d7d3

Merge branch 'main' into 159-feat-bms-return-variable-number-of-models

b3ad2d3

Merge branch '159-feat-bms-return-variable-number-of-models' into 159-feat-bms-return-variable-number-of-models-suggestions-john

4b22a87

Merge pull request #165 from AutoResearch/159-feat-bms-return-variable-number-of-models-suggestions-john
add test case for models_

c235992

TheLemonPig requested a review from hollandjg December 8, 2022 15:23

hollandjg added 2 commits December 15, 2022 15:23

Merge branch 'main' into 159-feat-bms-return-variable-number-of-models

5ccabd2

fix: update path to autora.theorist.bms.Tree

5aa11f2

hollandjg approved these changes Dec 15, 2022

hollandjg requested changes Dec 15, 2022

test: make test results less verbose

387005e

hollandjg approved these changes Dec 15, 2022

TheLemonPig merged commit 8ac65a6 into main Dec 15, 2022

TheLemonPig deleted the 159-feat-bms-return-variable-number-of-models branch December 15, 2022 21:29
