
Speed up the MBAR calculation #357

Merged: 18 commits merged into alchemistry:master on May 17, 2024

Conversation

@xiki-tempula (Collaborator) commented May 15, 2024

I have adapted @mrshirts's advice of using the BAR results as the initial guess for MBAR, which seems to give a 5-fold speed-up.

from alchemtest.gmx import load_ABFE
from alchemlyb.parsing.gmx import extract_u_nk
from alchemlyb.estimators import MBAR
import pandas as pd

df_list = [extract_u_nk(data, 310) for data in load_ABFE()['data']['complex']]
df = pd.concat(df_list)

mbar = MBAR(initial_f_k=None)
%timeit mbar.fit(df)
1.66 s ± 7.42 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

mbar = MBAR(initial_f_k='BAR')
%timeit mbar.fit(df)
248 ms ± 2.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

I also tried to optimise the convergence detection by using the result from the previous MBAR run as the initial guess for the next MBAR run. Compared to using BAR as the initial guess, this approach provides an additional speed-up.

from alchemlyb.convergence import forward_backward_convergence
%timeit forward_backward_convergence(df_list)

Using BAR as the initial guess:
3.58 s ± 35.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Using the previous step as the initial guess:
2.25 s ± 23.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
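Why a warm start helps: MBAR is solved by self-consistent iteration, and any fixed-point solver converges in fewer steps when started near the solution. A toy sketch of the effect (plain Python, not alchemlyb code; the function and numbers are purely illustrative):

```python
import math

def solve(f, x0, tol=1e-10, max_iter=1000):
    """Fixed-point iteration x <- f(x); returns (solution, iteration count)."""
    x = x0
    for i in range(1, max_iter + 1):
        x_new = f(x)
        if abs(x_new - x) < tol:
            return x_new, i
        x = x_new
    return x, max_iter

# Cold start: solve x = cos(x) from scratch.
cold, n_cold = solve(math.cos, 0.0)
# Warm start: solve a slightly perturbed problem, starting from `cold`.
warm, n_warm = solve(lambda x: math.cos(x) + 1e-3, cold)
print(n_cold, n_warm)  # the warm-started solve needs fewer iterations
```

The same logic applies when consecutive MBAR solves share most of their data, as in convergence analysis: each solution is an excellent starting point for the next.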

codecov bot commented May 15, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.83%. Comparing base (55870c8) to head (8f94e37).
Report is 11 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #357   +/-   ##
=======================================
  Coverage   98.82%   98.83%           
=======================================
  Files          28       28           
  Lines        1875     1890   +15     
  Branches      405      409    +4     
=======================================
+ Hits         1853     1868   +15     
  Misses          2        2           
  Partials       20       20           


@orbeckst (Member) left a comment:

Neat idea and cool that it gives a real performance boost.

Definitely needs documentation!

Review thread on src/alchemlyb/convergence/convergence.py (resolved)
with:
-    token: ${{ secrets.CODECOV_TOKEN }}
+    token: ${{ secrets.CODECOV }}
Member:

Is this a change due to a merge or did you smuggle it into this PR??

@xiki-tempula (Collaborator, author):

You could see that the CI failed due to Codecov token issues, so I tried to see if I could fix it.

-    result = estimator_fit(sample)
+    result = estimator_fit.fit(sample)
+    if estimator == "MBAR":
+        estimator_fit.initial_f_k = result.delta_f_.iloc[0, :]
Member:

I didn't fully follow the code path, but how is delta_f_ already initialized with the BAR results?

@xiki-tempula (Collaborator, author):

You first do an MBAR calculation: result = estimator_fit.fit(sample). This populates result.delta_f_ with the current MBAR results. These MBAR results are then propagated back as the initial guess for the next calculation.
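To illustrate why delta_f_.iloc[0, :] is the right object to feed back: the first row of the delta_f_ matrix holds the free-energy differences from state 0 to every other state, which is exactly the shape of vector that initial_f_k expects. A sketch with made-up numbers, not real alchemlyb output:

```python
import pandas as pd

# Hypothetical 3-state delta_f_ matrix: entry (i, j) is f_j - f_i,
# so it is antisymmetric with zeros on the diagonal.
delta_f_ = pd.DataFrame(
    [[0.0, 1.2, 2.5],
     [-1.2, 0.0, 1.3],
     [-2.5, -1.3, 0.0]],
)

# Row 0 gives all f_k relative to state 0: a ready-made initial guess
# for the next MBAR solve.
initial_f_k = delta_f_.iloc[0, :]
print(list(initial_f_k))  # [0.0, 1.2, 2.5]
```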

Review thread on src/alchemlyb/estimators/mbar_.py (resolved)
@xiki-tempula xiki-tempula marked this pull request as ready for review May 16, 2024 09:32
@xiki-tempula (Collaborator, author):

Some Codecov checks are still failing due to the token issue, but otherwise this seems fine. When this PR is merged, do you mind if I do a new 2.2.1 release? I think the speed increase is quite significant.

@orbeckst (Member) left a comment:

Let's make a 2.3.0 release (see my suggested changes; you can just accept the whole bunch) and carefully document the API change (new default).

Please add a test for MBAR(initial_f_k=None) for the old behavior, given that the default changed.

Review thread on src/alchemlyb/estimators/mbar_.py (resolved)
@@ -71,14 +80,19 @@ def __init__(
         self,
         maximum_iterations=10000,
         relative_tolerance=1.0e-7,
-        initial_f_k=None,
+        initial_f_k: np.ndarray | Literal["BAR"] | None = "BAR",
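The new annotation admits three kinds of values: an explicit array, the string "BAR", or None. A hedged sketch of how such an argument might be normalized internally (a hypothetical helper, not alchemlyb's actual implementation; the BAR branch is stubbed out with zeros):

```python
from typing import Literal, Union
import numpy as np

def resolve_initial_f_k(
    initial_f_k: Union[np.ndarray, "Literal['BAR']", None],
    n_states: int,
) -> np.ndarray:
    """Turn the user-facing initial_f_k argument into a concrete vector."""
    if initial_f_k is None:
        # Old default behavior: start the solver from all zeros.
        return np.zeros(n_states)
    if isinstance(initial_f_k, str):
        if initial_f_k != "BAR":
            raise ValueError(f"Unknown initial_f_k: {initial_f_k!r}")
        # Stub: a real implementation would run BAR here and return
        # the accumulated free-energy differences instead of zeros.
        return np.zeros(n_states)
    return np.asarray(initial_f_k, dtype=float)

print(resolve_initial_f_k(None, 3))        # [0. 0. 0.]
print(resolve_initial_f_k([1.0, 2.0], 2))  # [1. 2.]
```

Dispatching on the argument type up front keeps the solver itself oblivious to how the starting vector was obtained.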
Member:

Changing the default is technically speaking an API change so we can't just do it in a patch release.

The conservative approach would be to keep the original default.

However, I think we want to make this speed-up available in a release, so my suggestion is to make a 2.3 release and make it clearer that the default changed (see other comments).

Review threads (resolved):
- src/alchemlyb/estimators/mbar_.py
- src/alchemlyb/estimators/mbar_.py (outdated)
- src/alchemlyb/estimators/mbar_.py (outdated)
- CHANGES (outdated)
- CHANGES
- src/alchemlyb/tests/test_fep_estimators.py
xiki-tempula and others added 6 commits May 17, 2024 08:49
Co-authored-by: Oliver Beckstein <[email protected]>
Co-authored-by: Oliver Beckstein <[email protected]>
@orbeckst orbeckst merged commit 46cc83b into alchemistry:master May 17, 2024
10 of 11 checks passed
xiki-tempula added a commit to xiki-tempula/alchemlyb that referenced this pull request May 18, 2024
@xiki-tempula xiki-tempula deleted the feat_fastMBAR branch May 21, 2024 06:22