-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix to bootstrap of free energy surfaces, affecting timing and quantitative results #535
base: master
Are you sure you want to change the base?
Conversation
…changed indices, taking too long.
Couple more changes I will put in to fix uncertainties, don't do anything yet to it . . . |
OK, I think I got the changes in I needed to. |
@mrshirts when you are ready for review go ahead and add whoever you want to review this PR 😄 |
self.u_kn[:, bootstrap_indices], self.N_k, initial_f_k=self.mbar.f_k | ||
) | ||
x_nb = x_n[bootstrap_indices] | ||
# recompute MBAR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was unnecessary - it was running MBAR too many times. This saves approximately 2X time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comments on this for other people.
fall[:, b] = h["f"] - h["f"][j] | ||
df_i = np.std(fall, axis=1) | ||
fall[:, b] = h["f"] - h["f"][j] # subtract out the reference bin | ||
df_i = np.std(fall, ddof=1, axis=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixing the std definition.
@@ -1510,7 +1512,7 @@ def _get_fes_histogram( | |||
fall[i, j, b] = ( | |||
histogram_datas[b]["f"] - histogram_datas[b]["f"].transpose() | |||
) | |||
dfxij_vals = np.std(fall, axis=2) | |||
dfxij_vals = np.std(fall, ddof=1, axis=2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixing std definition
kde = self.kde | ||
kde.fit(x_n, sample_weight=self.w_n) | ||
kde = self.kde # use these new weights for the KDE | ||
w_n = self.w_n |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually can't remember if this was 100% necessary to get updated weights . . .
fall[:, b] = h["f"] - h["f"][j] | ||
df_i = np.std(fall, axis=1) | ||
fall[:, b] = h["f"] - h["f"][j] # subtract out the reference bin | ||
df_i = np.std(fall, ddof=1, axis=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix bootstrap std definition
@@ -1565,9 +1567,15 @@ def _get_fes_kde( | |||
if reference_point == "from-lowest": | |||
fmin = np.min(f_i) | |||
f_i = f_i - fmin | |||
wheremin = np.argmin( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to find the location that this is zeroed at for the actual computation of the std.
elif reference_point == "from-specified": | ||
fmin = -self.kde.score_samples(np.array(fes_reference).reshape(1, -1)) | ||
f_i = f_i - fmin | ||
wheremin = np.argmin( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to find the location that this is zeroed at for the actual computation of the std.
fall[:, b] = -self.kdes[b].score_samples(x) - fmin | ||
df_i = np.std(fall, axis=1) | ||
fall[:, b] = -self.kdes[b].score_samples(x) | ||
fall[:, b] -= fall[wheremin, b] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Zero out at the correct location.
Suggestions for anyone else who should review - or if there's anyone who could take a look? We are looking at some free energy surface problems for OpenFF, so we want to get this through. |
Free energy surface code was calling MBAR after each call to randomizing bootstraps. It does not appear to affect the results, but slows things down by a factor of a little less than 2x (146 vs 87 seconds for one sample run).