Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation of Parallelization to MDAnalysis.analysis.contacts #4820

Open
wants to merge 19 commits into
base: develop
Choose a base branch
from

Conversation

talagayev
Copy link
Member

@talagayev talagayev commented Dec 4, 2024

Fixes #4660

Changes made in this Pull Request:

  • added backends and aggregators to Contacts in analysis.contacts
  • added get_box function in analysis.contacts to fix error, which appeared during parallelization
  • added the client_Contacts in conftest.py
  • added client_Contacts in run() in test_contacts.py

The function get_box was implemented due to this Error occuring in pytests with parallelization:
AttributeError: Can't get local object 'Contacts.__init__.<locals>.<lambda>'

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

Developers certificate of origin


📚 Documentation preview 📚: https://mdanalysis--4820.org.readthedocs.build/en/4820/

added aggregator and supported backends + get_box to fix an error with lambda that appeared for multiprocessing & dask
black formatting
addition of client_Contacts
Addition of client_Contacts to pytests
Changelog addition for contacts parallelization
@pep8speaks
Copy link

pep8speaks commented Dec 4, 2024

Hello @talagayev! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2024-12-15 00:49:24 UTC

added versionchange
import ResultsGroup
Copy link

codecov bot commented Dec 4, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.63%. Comparing base (7f686ca) to head (4ce62d1).

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4820      +/-   ##
===========================================
- Coverage    93.65%   93.63%   -0.03%     
===========================================
  Files          177      189      +12     
  Lines        21767    22840    +1073     
  Branches      3065     3064       -1     
===========================================
+ Hits         20386    21386    +1000     
- Misses         929     1002      +73     
  Partials       452      452              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

package/MDAnalysis/analysis/contacts.py Outdated Show resolved Hide resolved
else:
self._get_box = lambda ts: None
self.pbc = pbc
self._get_box = functools.partial(get_box, pbc=self.pbc)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self._get_box = functools.partial(get_box, pbc=self.pbc)
self._get_box = lambda ts: ts.dimensions if self.pbc else None

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RMeli I tried running it with the suggestion locally and I get

AttributeError: Can't get local object 'Contacts.__init__.<locals>.<lambda>'

It is somehow due to the parallelization having problems with lamba, it also does not like this

        self._get_box = functools.partial(
            lambda ts, pbc: ts.dimensions if pbc else None, pbc=self.pbc
        )

it leads to the same error, so yes I would agree that the function is quite general, so moving
def get_box() to be a function in the Contacts class would be fine? since that would
works with the tests

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't put get_box() elsewhere because it's really a bit clunky and not really that useful except that it's needed here (or do we have other places where we need it??). I'd keep it as local and non-public as possible. Writing it as a lambda would be the nicest way but if that does not work (... good to know and a bit sad ...) then I'd make it a static private class method

@staticmethod
def _get_box_func(ts, pbc):
    """... keep the written docs ... """
    return ts.dimensions if pbc else None

which you can then use for the partial.

Add a comment to self._get_box = functools.partial(self._get_box_func, pbc=self.pbc) why all of this is necessary.


(I assume that following will run into the same issue as lambdas:

def get_box(ts, pbc=self.pbc):
   return ts.dimensions if pbc else None
self._get_box = get_box

as it's also local, right?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @talagayev for the explanation, I didn't know a lambda would not work in this case. I think defining a function instead of a lambda will in our in the same issue, as you suspect @orbeckst, but haven't tried. If we don't want to move the function elsewhere, I agree it should be private and not exposed by the module.

Copy link
Member Author

@talagayev talagayev Dec 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I assume that following will run into the same issue as lambdas:

def get_box(ts, pbc=self.pbc):
   return ts.dimensions if pbc else None
self._get_box = get_box

as it's also local, right?)

Yes, tried it right now locally, leads sadly to the Can't get local object 'Contacts.__init__.<locals>.get_box'

Copy link
Member Author

@talagayev talagayev Dec 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but yes making it more local static method is a more sophisticated way then it is currently, will adjust it then :)

as for it being used anywhere else, not really, it is only used here once

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@orbeckst @RMeli I moved it now as you suggested and would re request the review, to see if it is correct now :)

adjusted fixes, moved get_box into class and made
it more local and less accessable
forgot to add staticmethod, probably reason of failure in tests added it now
@talagayev talagayev requested review from RMeli and orbeckst December 15, 2024 01:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MDAnalysis.analysis.contacts Implement parallelization or mark as unparallelizable
4 participants