Implementation of Parallelization to analysis.hydrogenbonds.hbond_analysis #4718
Added ResultsGroup for hbond and the supported backends
Added the client for HydrogenBondAnalysis
Added client_HydrogenBondAnalysis to the tests
Added to changelog entry about hbond parallelization
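For readers unfamiliar with the pattern behind these commits: a parallel analysis run splits the trajectory frames into chunks, analyses each chunk independently, and merges the partial results in frame order. The sketch below illustrates only that split/apply/combine idea in plain Python. `analyse_frame`, `analyse_chunk`, `split_frames` and `run_parallel` are hypothetical names, not MDAnalysis API, and a thread pool stands in for the worker processes a real backend would use.

```python
from concurrent.futures import ThreadPoolExecutor


def analyse_frame(frame_index):
    """Hypothetical per-frame analysis returning a list of result rows."""
    return [(frame_index, frame_index * frame_index)]


def analyse_chunk(frame_indices):
    """Analyse a group of frames and collect their rows in frame order."""
    rows = []
    for index in frame_indices:
        rows.extend(analyse_frame(index))
    return rows


def split_frames(n_frames, n_workers):
    """Split range(n_frames) into n_workers contiguous, nearly equal chunks."""
    base, extra = divmod(n_frames, n_workers)
    chunks, start = [], 0
    for worker in range(n_workers):
        stop = start + base + (1 if worker < extra else 0)
        chunks.append(range(start, stop))
        start = stop
    return chunks


def run_parallel(n_frames, n_workers):
    """Run analyse_chunk on every chunk and merge the partial results."""
    chunks = split_frames(n_frames, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map yields results in the order of the submitted chunks
        partial_results = list(pool.map(analyse_chunk, chunks))
    merged = []
    for part in partial_results:
        merged.extend(part)
    return merged


serial_rows = analyse_chunk(range(10))
parallel_rows = run_parallel(10, 3)
```

The point of the aggregation step is that the merged output must be identical to the serial output, which is also what the parallelization tests in this PR need to establish for the real class.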
Hello @talagayev! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
Comment last updated at 2024-10-06 15:23:19 UTC
Linter Bot Results: Hi @talagayev! Thanks for making this PR. We linted your code and found the following: Some issues were found with the formatting of your code.
adjusted for PEP
Exciting! Did you happen to benchmark?
Hey @orbeckst, sadly not yet. I wasn't sure the implementation was correct because of the one failing pytest, but I can check the performance of the parallel hbond_analysis :)
I ran some benchmarks on a local system and it performed quite well; I tested it with 3, 5 and 10 CPUs.
The system that needed
Looks reasonable from the times and
These are very encouraging results, it looks like pretty good scaling! Nice. (Now we just have to ensure that the results are also correct...) What kind of systems were you testing, i.e., number of atoms, number of trajectory frames, type (e.g., water only, protein in water, ...)? What machine did you test on (CPU?) and where did you store the trajectory (SSD?)? cc @RMeli @marinegor
@orbeckst I mainly tried to concatenate the MDAnalysisTests files to get something that would be reproducible. For the tests I mentioned I ran the
https://userguide.mdanalysis.org/stable/examples/analysis/hydrogen_bonds/hbonds.html As for the system that I have, it's: I run everything through a
Very nice results @talagayev, thank you for sharing. It's really helpful to have small benchmarks for these PRs, so that we can see that the parallelization indeed speeds things up.
Thanks for sharing indeed, @talagayev! Interesting to see that
Happy to help :) @marinegor I didn't reset the cache, which would influence the tests, and I ran them only once. I could check how it performs over multiple runs and also reset the filesystem cache after the runs. As for the order, I first ran without parallelization and afterwards with it. @marinegor @RMeli for the benchmarks, should I directly use the files that are present in MDAnalysisTests, or what would be the best approach to adding benchmarks for the parallelization?
The reason it was failing was that
I think it is OK to use the provided files and report the results here, so other people can easily run the benchmark on their system too, if they wish. However, if you have real systems you are interested in, it would definitely be good to see the results too. I don't think we have discussed automated benchmarks using Airspeed Velocity (or other frameworks) at this stage; maybe that is something to keep in mind and discuss.
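To make the points about repeated runs and run order concrete, here is a minimal timing harness, offered as a sketch rather than the procedure anyone in this thread actually used. `time_call` and `summarize` are hypothetical helper names. The workload passed in would be a zero-argument callable wrapping the analysis, for example a serial run versus one using `run(backend="multiprocessing", n_workers=N)` in an MDAnalysis release with parallel backend support; the dummy workload below is only a placeholder. Resetting the filesystem cache between runs is system-specific and not handled here.

```python
import statistics
import time


def time_call(func, repeats=3):
    """Call func() `repeats` times and return the wall-clock timings in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(serial_timings, parallel_timings, n_workers):
    """Compare median timings: speedup and parallel efficiency (speedup / workers)."""
    t_serial = statistics.median(serial_timings)
    t_parallel = statistics.median(parallel_timings)
    speedup = t_serial / t_parallel
    return {"speedup": speedup, "efficiency": speedup / n_workers}


# Placeholder workload; replace with a callable that runs the real analysis.
dummy_timings = time_call(lambda: sum(range(1000)), repeats=3)
```

Reporting the median over several repeats, together with the number of workers, makes results from different machines easier to compare than a single run.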
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4718      +/-   ##
===========================================
- Coverage    93.55%   93.53%   -0.02%
===========================================
  Files          173      185     +12
  Lines        21451    22523   +1072
  Branches      3985     3986      +1
===========================================
+ Hits         20068    21067    +999
- Misses         929     1002     +73
  Partials       454      454
☔ View full report in Codecov by Sentry.
Hi @talagayev, I don't think it's actually necessary to run the benchmarks on MDAnalysis test files. Though if you still decide to, here's how I ran my benchmarks earlier: https://gist.github.com/marinegor/17558d1685cd2f24a6de65aa99cf5c9e It also contains the Zenodo link for the trajectory itself.
lgtm!
lgtm, just waiting for CI (after having added the versionchanged)
@yuxuanzhuang @marinegor @talagayev can you briefly explain why the definitions of donors and acceptors needed to be moved from |
@RMeli @marinegor Ah I see, thanks for the information, that makes sense. And thanks @marinegor for the link showing how you ran the benchmarks, I can try to apply it in the future :)
I think in general it's ok to define attributes in either of those, unless you'll need them after you run A general rule here might be "if an attribute is small and you'll need it later, put it in
…lysis (MDAnalysis#4718)
- Fixes MDAnalysis#4664
- Parallelization of the backend support to the class HydrogenBondAnalysis in hbond_analysis.py
- Moved setting up of donors and acceptors from _prepare() to __init__() (needed to make parallel processing work)
- Addition of parallelization tests in test_hydrogenbonds_analysis.py and fixtures in conftest.py
- Updated Changelog
---------
Co-authored-by: Yuxuan Zhuang <[email protected]>
Co-authored-by: Oliver Beckstein <[email protected]>
Fixes #4664
Changes made in this Pull Request:
- Parallelization of the backend support to the class HydrogenBondAnalysis in hbond_analysis.py
- Addition of parallelization tests in test_hydrogenbonds_analysis.py and fixtures in conftest.py
One of the pytests still fails while the remainder pass; I am currently not sure what the reason is.
The failure is due to this:
AttributeError: 'HydrogenBondAnalysis' object has no attribute '_hydrogens'
and appears only in test_no_hydrogens
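One plausible way such an AttributeError can arise under parallel execution is sketched below as a toy model. It assumes that each worker operates on its own copy of the analysis object and that the concluding step runs on the original object. `ToyAnalysis` and its `setup_in_init` flag are hypothetical and do not reproduce the HydrogenBondAnalysis code; `copy.deepcopy` stands in for sending the object to a worker process.

```python
import copy


class ToyAnalysis:
    """Toy stand-in for an analysis class with _prepare/_single_frame/_conclude."""

    def __init__(self, atom_names, setup_in_init):
        self.atom_names = atom_names
        self.setup_in_init = setup_in_init
        self.results = []
        if setup_in_init:
            # Attribute exists on the parent object and on every worker copy.
            self._hydrogens = [n for n in atom_names if n.startswith("H")]

    def _prepare(self):
        if not self.setup_in_init:
            # Attribute is created only on whichever object runs _prepare().
            self._hydrogens = [n for n in self.atom_names if n.startswith("H")]

    def _single_frame(self, frame):
        return (frame, len(self._hydrogens))

    def _conclude(self):
        # Runs on the parent object after the partial results are merged.
        self.n_hydrogens = len(self._hydrogens)

    def run_parallel(self, frames, n_workers=2):
        parts = []
        for worker_id in range(n_workers):
            worker = copy.deepcopy(self)  # each worker gets its own copy
            worker._prepare()
            parts.append([worker._single_frame(f) for f in frames[worker_id::n_workers]])
        self.results = sorted(row for part in parts for row in part)
        self._conclude()
        return self


atoms = ["H1", "O1", "H2"]
works = ToyAnalysis(atoms, setup_in_init=True).run_parallel([0, 1, 2, 3])

try:
    ToyAnalysis(atoms, setup_in_init=False).run_parallel([0, 1, 2, 3])
    failed_with_attribute_error = False
except AttributeError:
    failed_with_attribute_error = True
```

In this toy, defining the attribute in `__init__()` makes it available both to the worker copies and to the parent object, which is the same direction as the fix recorded in the merge commit (moving donor and acceptor setup from `_prepare()` to `__init__()`).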
PR Checklist
Developers certificate of origin
📚 Documentation preview 📚: https://mdanalysis--4718.org.readthedocs.build/en/4718/