Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add R_c (convergence fraction) convergence analysis #239

Merged
merged 32 commits into from
Oct 31, 2022

Conversation

xiki-tempula
Copy link
Collaborator

@xiki-tempula xiki-tempula commented Sep 23, 2022

Fix #104

I have reimplemented the R_c implemented by @VOD555.
I have also implemented the A_c based on my understanding of the paper.

I have done a speed check, for a 32 window simulation where each window has 10,000 data points.

from alchemlyb.convergence import A_c
import time

data = pd.Series(data=range(10000))
data.attrs['temperature'] = 310
data.attrs['energy_unit'] = 'kcal/mol'
a=time.time()
value = A_c([data, ] * 32)
b=time.time()

Takes 0.01 second, so I think the performance is satisfying.

Includes commits from

@codecov
Copy link

codecov bot commented Sep 23, 2022

Codecov Report

Merging #239 (2628740) into master (3809fdf) will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #239      +/-   ##
==========================================
+ Coverage   98.65%   98.68%   +0.02%     
==========================================
  Files          26       26              
  Lines        1711     1750      +39     
  Branches      352      359       +7     
==========================================
+ Hits         1688     1727      +39     
  Misses          3        3              
  Partials       20       20              
Impacted Files Coverage Δ
src/alchemlyb/preprocessing/subsampling.py 100.00% <ø> (ø)
src/alchemlyb/convergence/__init__.py 100.00% <100.00%> (ø)
src/alchemlyb/convergence/convergence.py 100.00% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@xiki-tempula xiki-tempula marked this pull request as ready for review September 26, 2022 13:00
@xiki-tempula
Copy link
Collaborator Author

@orbeckst Do you mind review this PR, please?
Note that the coverage is not as high as we want due to

for i in range(out_length):
    if all(conv[i:]):
        return i / length

This loop checks of the last i points in conv are true. conv is true when the backward and the forward are all with a certain distance to the end point. Given that the last forward point must be the same as the last backward point as they both cover the whole trajectory, conv[-1] is always true and thus, the else part of this loop is never covered, which lowers the coverage.

@orbeckst
Copy link
Member

I might be able to look at it near the end of the week. Sorry, a bit busy.

@VOD555 , could you please review this PR? It's your method after all and you know it best. Although you cannot block or approve in alchemlyb, just state in your review what — if anything — needs to be done. I'll then use your review as a basis. Thank you!

Comment on lines 222 to 223
if all(conv[i:]):
return i / length
Copy link
Member

@orbeckst orbeckst Sep 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commenting on #239 (comment) : How about adding

else: # pragma: no cover
   # This loop checks of the last i points in conv are true. conv is true when the backward 
   # and the forward are all with a certain distance to the end point. Given that the last forward 
   # point must be the same as the last backward point as they both cover the whole trajectory, 
   # conv[-1] is always true and thus, the else part of this loop is never executed.
   raise AssertionError("Reached forbidden code path in convergence.R_c")

to make clear that the else part is never executed. Probably shorten the comment a bit... I just copied and pasted.

conv = ((g_forward>g-diff) & (g_forward<g+diff)) & ((g_backward>g-diff)&(g_backward<g+diff))
for i in range(out_length):
if all(conv[i:]):
return i / length
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to also return g_forward and g_backward so that one can draw the convergence plot with these two series.

@xiki-tempula xiki-tempula marked this pull request as draft October 5, 2022 09:06
@xiki-tempula
Copy link
Collaborator Author

@VOD555 Thanks for the review. I have returned the forward and backward as dataframe now.

@xiki-tempula xiki-tempula marked this pull request as ready for review October 5, 2022 12:25
@orbeckst
Copy link
Member

@xiki-tempula sorry, I'll look at this PR next week; I pretty much burned my "time to work on alchemlyb" on looking at all the other PRs.

@VOD555 , could you please have another look and review? Thank you!!

@orbeckst
Copy link
Member

@xiki-tempula could you please resolve the conflicts with all the other recently merged PRs?

@xiki-tempula xiki-tempula requested a review from orbeckst October 19, 2022 17:59
Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks very good. I have some questions and would also like @VOD555 's input.

For consistency I'd keep input in $k_B T$ but if you have good reasons for absolute energies please say so.

src/alchemlyb/convergence/convergence.py Outdated Show resolved Hide resolved
src/alchemlyb/convergence/convergence.py Outdated Show resolved Hide resolved
docs/convergence.rst Outdated Show resolved Hide resolved

# Final value
g = g_forward[-1]
conv = ((g_forward>g-tol) & (g_forward<g+tol)) & ((g_backward>g-tol)&(g_backward<g+tol))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this faster than abs?

src/alchemlyb/convergence/convergence.py Show resolved Hide resolved
Comment on lines 261 to 265
The output will the float A_c. A_c is a number between 0 and 1 that can be
interpreted as the ratio of the total equilibrated simulation time to the
whole simulation time for a full set of simulations. A_c = 1 means that all
simulation time frames in all windows can be considered equilibrated, while
A_c = 0 indicates that nothing is equilibrated.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use math mode for $A_c$ and equations like $A_c = 1$

:func:`~alchemlyb.preprocessing.subsampling.decorrelate_u_nk` or
:func:`~alchemlyb.preprocessing.subsampling.decorrelate_dhdl`.

The output will the float R_c. R_c = 0 indicates that the system is well
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

math mode $R_c$


Note
----
This function tries to compute R_c from equation 18 from
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function computes $R_c$ ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you really mean R_c or should it be A_c.

I'd also put a SeeAlso to R_c and state that the R_c are calculated for the whole set using R_c().

precision : float
The precision of the output R_c.
diff : float
Tolerance of the convergence check in kcal/mol.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in kT ?

src/alchemlyb/convergence/convergence.py Show resolved Hide resolved
@xiki-tempula xiki-tempula mentioned this pull request Oct 25, 2022
6 tasks
@xiki-tempula xiki-tempula added this to the 1.0.0 milestone Oct 25, 2022
@xiki-tempula
Copy link
Collaborator Author

@orbeckst Thanks very much for the review. Sorry for the delay. I have addressed all the issues.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just minor doc things, otherwise looks good.

@VOD555 do you want to take a last look?

CHANGES Outdated Show resolved Hide resolved
docs/convergence.rst Outdated Show resolved Hide resolved
docs/convergence.rst Outdated Show resolved Hide resolved
docs/convergence.rst Show resolved Hide resolved
src/alchemlyb/convergence/convergence.py Outdated Show resolved Hide resolved
src/alchemlyb/convergence/convergence.py Outdated Show resolved Hide resolved
src/alchemlyb/convergence/convergence.py Show resolved Hide resolved
@xiki-tempula
Copy link
Collaborator Author

@orbeckst Thanks for the review. I have checked that the reference could be resolved.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing all my nitpicks, much appreciated.

docs/convergence.rst Outdated Show resolved Hide resolved
@orbeckst
Copy link
Member

I made one change to the docs (wasn't in your edits) and I am now waiting for CI.

@orbeckst orbeckst self-assigned this Oct 31, 2022
@orbeckst orbeckst merged commit cbdbfcd into alchemistry:master Oct 31, 2022
@orbeckst
Copy link
Member

Thank you @xiki-tempula , a very welcome contribution!

@orbeckst
Copy link
Member

I added @VOD555 as a "co-authored" by to the squash-merge because you said that you used his code as an initial basis and because he participated in the review. I put an extra note in the commit message that this version has your performance improvement included.

@xiki-tempula xiki-tempula deleted the feat_Rc branch October 31, 2022 17:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

add R_c (convergence fraction) convergence analysis
3 participants