Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix to bootstrap of free energy surfaces, affecting timing and quantitative results #535

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

mrshirts
Copy link
Collaborator

Free energy surface code was calling MBAR after each call to randomizing bootstraps. It does not appear to affect the results, but slows things down by a factor of a little less than 2x (146 vs 87 seconds for one sample run).

Copy link

codecov bot commented Aug 26, 2024

Codecov Report

Attention: Patch coverage is 52.94118% with 8 lines in your changes missing coverage. Please review.

Project coverage is 69.70%. Comparing base (85e034c) to head (4ed45db).

Additional details and impacted files

@mrshirts
Copy link
Collaborator Author

Couple more changes I will put in to fix uncertainties, don't do anything yet to it . . .

@mrshirts
Copy link
Collaborator Author

OK, I think I got the changes in I needed to.

@mikemhenry
Copy link
Contributor

@mrshirts when you are ready for review go ahead and add whoever you want to review this PR 😄

@mrshirts mrshirts self-assigned this Aug 26, 2024
@mrshirts mrshirts marked this pull request as draft August 26, 2024 17:02
self.u_kn[:, bootstrap_indices], self.N_k, initial_f_k=self.mbar.f_k
)
x_nb = x_n[bootstrap_indices]
# recompute MBAR.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was unnecessary - it was running MBAR too many times. This saves approximately 2X time.

Copy link
Collaborator Author

@mrshirts mrshirts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments on this for other people.

fall[:, b] = h["f"] - h["f"][j]
df_i = np.std(fall, axis=1)
fall[:, b] = h["f"] - h["f"][j] # subtract out the reference bin
df_i = np.std(fall, ddof=1, axis=1)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing the std definition.

@@ -1510,7 +1512,7 @@ def _get_fes_histogram(
fall[i, j, b] = (
histogram_datas[b]["f"] - histogram_datas[b]["f"].transpose()
)
dfxij_vals = np.std(fall, axis=2)
dfxij_vals = np.std(fall, ddof=1, axis=2)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing std definition

kde = self.kde
kde.fit(x_n, sample_weight=self.w_n)
kde = self.kde # use these new weights for the KDE
w_n = self.w_n
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually can't remember if this was 100% necessary to get updated weights . . .

fall[:, b] = h["f"] - h["f"][j]
df_i = np.std(fall, axis=1)
fall[:, b] = h["f"] - h["f"][j] # subtract out the reference bin
df_i = np.std(fall, ddof=1, axis=1)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix bootstrap std definition

@@ -1565,9 +1567,15 @@ def _get_fes_kde(
if reference_point == "from-lowest":
fmin = np.min(f_i)
f_i = f_i - fmin
wheremin = np.argmin(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to find the location that this is zeroed at for the actual computation of the std.

elif reference_point == "from-specified":
fmin = -self.kde.score_samples(np.array(fes_reference).reshape(1, -1))
f_i = f_i - fmin
wheremin = np.argmin(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to find the location that this is zeroed at for the actual computation of the std.

fall[:, b] = -self.kdes[b].score_samples(x) - fmin
df_i = np.std(fall, axis=1)
fall[:, b] = -self.kdes[b].score_samples(x)
fall[:, b] -= fall[wheremin, b]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Zero out at the correct location.

@mrshirts mrshirts changed the title Fix to bootstrap of free energy surfaces, affecting timing Fix to bootstrap of free energy surfaces, affecting timing and quantitative results Aug 26, 2024
@maxentile maxentile removed their request for review August 26, 2024 17:38
@mrshirts
Copy link
Collaborator Author

Suggestions for anyone else who should review - or if there's anyone who could take a look? We are looking at some free energy surface problems for OpenFF, so we want to get this through.

@mrshirts mrshirts marked this pull request as ready for review August 29, 2024 19:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants