Pickling AtomGroups #293
Correction: I guess the …
Can't an AtomGroup access …
And I think the heuristic for Universe could even be some sort of hash.
Yes, that's what I was suggesting for the heuristic, but my point is that rebuilding the …
Hmm yeah, searching locals for the Universe makes most sense... but then if many AtomGroups refer to the same Universe and they want different frames, won't this break things? I.e., don't AtomGroups want a "lock" on the Universe frame?
Hmm... Should they carry such locks? My view of …
Sorry yeah, I'm imagining you want to parallelise by doing AtomGroups individually... But if you then wanted to iterate the trajectory for each AG, won't this go wrong?

```python
for ag in ags_to_do:           # if you pool here..
    for ts in u.trajectory:    # then this will be shared amongst all AGs
        ...                    # process ag
```

So I guess you just need to make sure that nobody changes the frame, so you need to order your loops like this:

```python
for ts in u.trajectory:        # iterate the trajectory outside the parallelism
    for ag in ags_to_do:       # split work here
        ...                    # process ag
```

I think you're right so long as the loops are ordered correctly.
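A minimal sketch of that second ordering in a multiprocessing setting, assuming an existing Universe u and a list ags_to_do of its AtomGroups; the per-AtomGroup work (a centre of geometry here) is just a stand-in, and only plain coordinate arrays are sent to the workers, so nothing in a worker can move the shared frame:

```python
import multiprocessing

import numpy as np

def centre_of_geometry(positions):
    # plain array work: safe to run in a worker process
    return positions.mean(axis=0)

def process_trajectory(u, ags_to_do):
    results = []
    with multiprocessing.Pool() as pool:
        for ts in u.trajectory:          # iterate the trajectory serially, outside the pool
            frame_data = [np.array(ag.positions) for ag in ags_to_do]
            # split the per-frame work across processes; workers only ever
            # see copies of the coordinates, never the Universe itself
            results.append(pool.map(centre_of_geometry, frame_data))
    return results
```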
Just tried out some code and it turns out that what I'm proposing is not going to fly, because even … This means that the following code won't work:

```python
import pickle

import MDAnalysis

syst = MDAnalysis.Universe("conf.gro")
ag = syst.atoms[:10]

pkl = pickle.dumps(ag)    # pickling can generate a hash from filenames and number of atoms,
                          # but can't look through the main module's variables
                          # to get the names of the AtomGroup's Universe
ag2 = pickle.loads(pkl)   # unpickling takes place in the AtomGroup class, and only variables
                          # at that module level are visible through globals();
                          # 'syst' will never be found
```

Introspection, via the … But I just came up with an idea that might work: …
So as Universes are created, they're stored in the MDAnalysis module that has been imported? So as AtomGroups are unpickled they search through the local MDAnalysis scope looking for Universes? Sounds sane enough, but I'm not sure how this is all "meant" to be done.
Exactly. I just came across the …
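A rough sketch of that idea as I read it, with hypothetical names (_LIVE_UNIVERSES, register_universe, _rebuild_atomgroup) and the filename/atom-count heuristic discussed above; it is only an illustration, not the implementation that eventually landed:

```python
# hypothetical module-level registry inside the MDAnalysis package
_LIVE_UNIVERSES = []

def register_universe(universe):
    # would be called from Universe.__init__ in this sketch
    _LIVE_UNIVERSES.append(universe)

def _rebuild_atomgroup(heuristic, indices):
    # module-level reconstructor used on unpickling: search the registry
    # instead of the (unreachable) caller's globals()
    for u in _LIVE_UNIVERSES:
        if (u.trajectory.filename, len(u.atoms)) == heuristic:
            return u.atoms[indices]
    raise RuntimeError("no matching Universe found for pickled AtomGroup")

def _reduce_atomgroup(ag):
    # a picklable recipe of (reconstructor, args) carrying no Universe reference,
    # e.g. what an AtomGroup.__reduce__ method could return
    u = ag.universe
    heuristic = (u.trajectory.filename, len(u.atoms))
    return _rebuild_atomgroup, (heuristic, list(ag.indices))
```

Wiring this up would mean giving AtomGroup a __reduce__ method or registering _reduce_atomgroup with copyreg.pickle, which is beyond this sketch.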
It does work (yay!), with an issue: … A solution is to call … An alternative would be to register a …

```python
# NOTE: DO NOT ADD A __del__() method: it somehow keeps the Universe
# alive during unit tests and the unit tests run out of memory!
#### def __del__(self): <------ do not add this!
```

[orbeckst] For reference, the problem here seems to be that when doing this we'll have …
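A sketch of the weakref variant hinted at here, so that the registry never keeps a Universe alive and no __del__ method is needed for cleanup; the name echoes the _ANCHOR_UNIVERSES container mentioned later in the thread, but the details are assumptions (including that Universe instances are hashable and weak-referenceable):

```python
import weakref

# a weak set: registering a Universe here does not extend its lifetime,
# and it drops out automatically once garbage collected -- no __del__ needed
_ANCHOR_UNIVERSES = weakref.WeakSet()

def anchor_universe(universe):
    _ANCHOR_UNIVERSES.add(universe)

def find_anchored_universe(heuristic):
    for u in _ANCHOR_UNIVERSES:
        if (u.trajectory.filename, len(u.atoms)) == heuristic:
            return u
    raise RuntimeError("no live Universe matches the pickled AtomGroup")
```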
Re: Your feature branch looks good, but you need to merge develop into it so that it has all the Travis (automatic testing) settings. Then if you open a pull request, it'll do lots of cool stuff automagically, e.g.: …
I think I can imagine an application that breaks the auto assignment of a universe when unpickling. Let's say you want to do an all-to-all trajectory calculation. You load two universes with the same topology and filename. If you then pickle/unpickle an … My solution to this would be the following: … In addition, …
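A hedged illustration of the ambiguity described in this comment; the file names are placeholders and the heuristic is the filename/atom-count one from earlier in the thread:

```python
import MDAnalysis

u1 = MDAnalysis.Universe("conf.gro", "traj.xtc")
u2 = MDAnalysis.Universe("conf.gro", "traj.xtc")   # same files, so same heuristic

heuristic = ("traj.xtc", len(u1.atoms))
candidates = [u for u in (u1, u2)
              if (u.trajectory.filename, len(u.atoms)) == heuristic]
# both match, but in an all-to-all analysis u1 and u2 will sit on different
# frames, so silently attaching an unpickled AtomGroup to either is wrong
assert len(candidates) == 2
```

Some explicit, user-supplied tag per Universe (a unique name or ID) is then needed to break the tie, which is essentially what the next comment suggests.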
Maybe I am missing key points after skimming the thread, but it seems that what you want is a unique ID for each Universe so that if you re-instantiate a universe with the same ID you can attach the same AtomGroups again. I might not get the use case that you have in mind, but it somewhat reminds me of issues that @dotsdl is solving with his MDSynthesis persistence framework in a different manner. An implementation might be worthwhile, just to show how it works. As @richardjgowers said, make sure you rebase against the latest develop and then generate a pull request. It will get unit tested by Travis and provides a convenient framework for code discussion.
…verses

* AtomGroups can now be pickled/unpickled (closes #293)
* A weakref system was implemented so that Universes can have a __del__ method without memleaking (closes #297)
* ChainReaders no longer leak (and also no longer have a __del__ method) (closes #312)
* Tests now specifically look for uncollectable objects at the end.

Breaks parallelization, though.
… (PR #2893)

* AtomGroups are now pickled/unpickled without looking for their anchored Universe
* removed core.universe._ANCHOR_UNIVERSES (originally introduced with #293) and related machinery to keep global state. No global state is kept any more.
* update docs for atomgroup pickle
* update CHANGELOG
* update tests
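For context, under the newer approach referenced here (PR #2893, where the AtomGroup is serialised together with its Universe rather than matched against global state), usage is plain pickling; a small sketch, assuming MDAnalysis ≥ 2.0 with the MDAnalysisTests data files installed:

```python
import pickle

import MDAnalysis
from MDAnalysis.tests.datafiles import PSF, DCD   # requires MDAnalysisTests

u = MDAnalysis.Universe(PSF, DCD)
ag = u.atoms[:10]

ag2 = pickle.loads(pickle.dumps(ag))   # no anchor lookup: the AtomGroup
                                       # travels with a copy of its Universe
assert len(ag2) == 10
```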
Universe pickling might be somewhat tricky (#173), and unpickling might be a relatively heavy operation if it involves reloading trajectories.
I've run into the need to pickle Universes mostly as a side effect of actually wanting to pickle AtomGroups, in my case as a return from a multiprocessing.pool.map. My idea was to provide lightweight pickling and unpickling functions that'd do the following:

1. On pickling, store the AtomGroup's atom indices together with a heuristic for Universe identification (total number of atoms and trajectory filename, for instance).
2. On unpickling, go through globals() looking for a Universe that matches the heuristic. Recreate the AtomGroup from the indices. (Raise an error if no suitable Universe is found.)

Search could be sped up by saving with the pickle the Universe names that have the same id as the AtomGroup's, and then looking at those names first when unpickling.

The whole thing feels quite unpythonic, and granted it can be easily worked around by just passing back and forth atom index lists instead of AtomGroups. But it'd make MDAnalysis work better and more natively with other code that requires pickling.

Opinions?
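Since the issue mentions the index-list workaround, here is a hedged sketch of it in the multiprocessing.pool.map setting that motivated the report; the selection string, file names and select_interesting helper are hypothetical, and the API names used are the current ones (select_atoms, indices):

```python
import multiprocessing

import MDAnalysis

def select_interesting(args):
    topology, trajectory, frame = args
    u = MDAnalysis.Universe(topology, trajectory)   # each worker builds its own Universe
    u.trajectory[frame]                             # jump to the assigned frame
    ag = u.select_atoms("name CA")                  # any selection of interest
    return list(ag.indices)                         # plain integers pickle without trouble

if __name__ == "__main__":
    jobs = [("conf.gro", "traj.xtc", frame) for frame in range(10)]
    with multiprocessing.Pool() as pool:
        index_lists = pool.map(select_interesting, jobs)

    u = MDAnalysis.Universe("conf.gro", "traj.xtc")
    # rebuild AtomGroups in the parent process from the returned index lists
    atomgroups = [u.atoms[indices] for indices in index_lists]
```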