-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
159 feat bms return variable number of models #161
159 feat bms return variable number of models #161
Conversation
…from being considered as models. We were only seeing BMS run a single tree since we type set!
@TheLemonPig good catch! should we be more generally concerned about the convoluted type changes in |
@benwandrew I am hoping not. Most of it definitely not. But hard to say for sure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great!
I've one major suggestion: I think a test case for the new functionality would are really useful.
I've made a start in #165, but I think it would be good to show how you expect people to use this functionality.
@@ -119,7 +119,7 @@ def tree_swap(self) -> Tuple[Optional[str], Optional[str]]: | |||
self.trees[self.Ts[nT2]] = t1 | |||
t1.BT = BT2 | |||
t2.BT = BT1 | |||
self.t1 = self.trees["1"] | |||
self.t1 = self.trees["1.0"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does the 1.0 do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably easier to see looking in the code but I'll do my best to describe it. The keys of the dictionary of temperatures are created as numerical values and then recast as strings. Originally the code contained one integer 1
and the rest floats. When we type casted, we had to change an integer to a float. 1
became 1.0
, which is different when cast as a string and so is a distinct key value. The code assumes a one-to-one correspondence between temperature values and dictionary keys. The code also hard codes one tree with 1
and then iteratively creates trees from the temperatures, but skips 1
(which we changed to 1.0
). So the parallel machine scientist held a tree for a temperature value of 1
and of 1.0
, and considers the the key "1"
to hold the best model and ignores "1.0"
. When comparing trees in tree_swap()
, it only chooses among those that correspond to a temperature value stored in, which are recorded as floats, and so ignores "1"
. Hence the model selected by BMS is unaffected by tree_swap()
. The tree corresponding to "1"
does improve over time because it is still affected by mcmc_step()
. However, we lose the functionality of having higher temperatures, tree swaps, and the other 95% of the training results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tl;dr: when we type set temperatures to be floats, the hooks forced us to change a 1
to a 1.0
. Turns out there was some hard coding which relied on it being specifically 1
.
…-feat-bms-return-variable-number-of-models-suggestions-john
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
The attribute was added for the dissimilarity experimentalist. Currently, there is not code elsewhere in the BMS to make use of this attribute. The print statement in #165 is the only use of it that I can think of. We could add an argument to present results to present the results of all the models perhaps. |
…-feat-bms-return-variable-number-of-models-suggestions-john
…-feat-bms-return-variable-number-of-models-suggestions-john
…e-number-of-models-suggestions-john add test case for models_
I have just created an issue to further address this: AutoResearch/autora-theorist-bms#31 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's something broken here when running under python 3.9 – the test cases (test_tree_mcmc_stepping) suddenly takes ~30 minutest to run. Doesn't seem to affect python 3.8 or 3.10. We should fix that before we merge, because this will cause us problems with the tests of all our future code if we don't.
Description
Simple attribute added to BMSRegressor which contains all parallel trees. This is for a new experimentalist which compares fit between models. A major nonbreaking bug was spotted and resolved in the process (which will allow BMS to work better with more difficult data).
_Fixes #159 _
Type of change:
Features:
Questions:
Remarks: