__del__ destructor for Universes? #297
Comments
So adding the __del__ destructor does create a memory leak. :) |
So, I think I was able to finally track this down. I always thought it was a problem of having the

import gc

class Universe():
    def __init__(self):
        self.trajectory = Reader()
    def __del__(self):
        print "Universe is dying"

class Reader():
    def __init__(self):
        self.somevar = 3
    def __del__(self):
        print "Trajectory is dying"

u = Universe()
u = None
gc.collect() # This is empty, meaning all got properly collected

Now let's add some more realism to our toy Universe:

import gc
class Universe():
    def __init__(self):
        self.trajectory = Reader()
        self.atom = Atom(self) # Having this or the following lines breaks garbage collection
        self.universe = self
    def __del__(self):
        print "Universe is dying"

class Reader():
    def __init__(self):
        self.somevar = 3
    def __del__(self):
        print "Trajectory is dying"

class Atom():
    def __init__(self, parent):
        self.universe = parent

u = Universe()
u = None
gc.collect()
gc.garbage # [<__main__.Universe instance at 0x1004a91b8>, <__main__.Reader instance at 0x1004a9200>]

As far as I know there's no way to clean this automatically. A user would not only have to set |
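For readers reproducing this on a current interpreter, here is a minimal sketch of the same toy cycle in Python 3. It assumes CPython 3.4 or later, where PEP 442 lets the cycle collector finalize objects that define __del__, so the result differs from the Python 2 behaviour shown above: the Universe survives `u = None` because of the Universe/Atom cycle, but is finalized on the next collection and should not end up in gc.garbage. The `finalized` list is only an illustration device.

```python
import gc

finalized = []  # records which toy objects have run __del__


class Atom:
    def __init__(self, parent):
        self.universe = parent  # strong back-reference: Universe <-> Atom cycle


class Universe:
    def __init__(self):
        self.atom = Atom(self)

    def __del__(self):
        finalized.append("Universe")


gc.disable()                     # keep the automatic collector out of the demo
u = Universe()
u = None                         # refcount never reaches zero because of the cycle
after_unbind = list(finalized)   # snapshot before any collection
gc.collect()                     # the cycle detector finds the unreachable pair
after_collect = list(finalized)  # snapshot after the collection
leftover = list(gc.garbage)      # anything the collector could not free
gc.enable()

print(after_unbind, after_collect, leftover)
```

The point of the two snapshots is to separate "not freed by reference counting" from "never freed at all"; only the latter is the leak discussed in this thread.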
https://docs.python.org/2/library/weakref.html

If we made all the references to Universe (from Atom etc) a Can you try?

class Atom():
    def __init__(self, parent):
        self.universe = weakref.ref(parent) |
Ah, was just writing to propose that! I'll try it out and let you know. |
import gc
import weakref

class Universe():
    def __init__(self):
        self.trajectory = Reader()
        self.atom = Atom(self) # Having this or the following lines breaks garbage collection
        self.universe = weakref.ref(self)
    def __del__(self):
        print "Universe is dying"

class Reader():
    def __init__(self):
        self.somevar = 3
    def __del__(self):
        print "Trajectory is dying"

class Atom():
    def __init__(self, parent):
        self.universe = weakref.ref(parent)

Leads to
So possibly fixed? I think deleting Universe should break everything that refers to Universe... makes sense to me |
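A minimal Python 3 sketch of what the weak back-reference buys, under the assumption that Atom is the only object pointing back at the Universe. The `_universe` attribute, the `universe` property and the `finalized` list are illustrative names, not the actual MDAnalysis implementation. With no cycle, CPython's reference counting can finalize the Universe as soon as the last strong reference goes away, and the Atom then sees None.

```python
import weakref

finalized = []  # records which toy objects have run __del__


class Atom:
    def __init__(self, parent):
        self._universe = weakref.ref(parent)  # weak back-reference, no cycle

    @property
    def universe(self):
        # Calling a weakref returns the referent, or None once it is gone.
        return self._universe()


class Universe:
    def __init__(self):
        self.atom = Atom(self)

    def __del__(self):
        finalized.append("Universe")


u = Universe()
atom = u.atom
alive_before = atom.universe is u  # the weakref still resolves to the Universe
u = None                           # last strong reference dropped
alive_after = atom.universe        # the weakref is now dead

print(alive_before, finalized, alive_after)
```

This is also why anything that outlives the Universe stops working: the Atom no longer keeps its parent alive.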
Well, but now what happens is that if you let the

def first10(top,traj):
    u = Universe(top,traj)
    return u.atoms[:10]

ag = first10("top", "traj")
ag.positions #This works now, but wouldn't anymore with the weakrefs

I see no problem with breaking this. It'll have to be well-documented. What's your take? |
Ah, I didn't think of scopes like that.... I don't think AtomGroups or any other products of Universe are designed to live standalone though, so I think the above code should break. I guess the next step is to implement this onto the real code and see which tests break |
I'm on it. Might take a bit of fine-combing to find all that refers back to the |
I think it should be just Atoms... "git grep '.universe ='" might make it easier to find |
Just to chime in: I am fine with standalone AtomGroups dying; it's fine to say that they have to live in some Universe. FYI, the

# create a list of Atoms, then convert it to an AtomGroup
atoms = [copy.copy(a) for gr in args for a in gr]
for a in atoms:
    a.universe = u

which looks to me like the canonical way to do this kind of thing. |
Ok, all tests are passing. I am against keeping a
Option 2 is more forethoughtful, but has the side-effect of breaking the API in the terms we discussed above ( I'm for option 2, mostly because I don't use |
Push the branch to this repo and make a pull request so we can see the changes |
…verses
AtomGroups can now be pickled/unpickled (closes #293)
A weakref system was implemented so that Universes can have a __del__ method without memleaking (closes #297)
ChainReaders no longer leak (and also no longer have a __del__ method) (closes #312)
Tests now specifically look for uncollectable objects at the end.
Breaks parallelization, though.
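As an illustration of what "look for uncollectable objects" can mean in a test, here is a small sketch. The helper name `find_uncollectable` and the toy `Node` class are hypothetical and are not taken from the MDAnalysis test suite. It clears gc.garbage, runs a callable, forces a collection and returns whatever the collector could not free.

```python
import gc


def find_uncollectable(func):
    """Run func, force a collection, and return what is left in gc.garbage."""
    gc.collect()
    del gc.garbage[:]   # start from a clean slate
    func()
    gc.collect()        # anything unreachable but not freeable lands in gc.garbage
    return list(gc.garbage)


class Node:
    def __init__(self):
        self.me = self  # self-reference, i.e. a one-object cycle


def make_cycle():
    Node()              # created and immediately unreachable


leaked = find_uncollectable(make_cycle)
print(leaked)
```

A plain cycle like this one is collectable, so the returned list should be empty. On Python 2, a cycle containing an object with __del__ would show up in it, which is the situation this issue is about.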
I favour (2):

I don't think we ever promised that

def get_protein(*args):
    return Universe(*args).selectAtoms("protein")

protein = get_protein("some.pdb")
# contrived analysis...
protein_coords = protein.positions
Rg = protein.radiusOfGyration()

The above will possibly fail with

One would have to rewrite it to return the

For this it would be really useful to give a good error message once the proverbial manure hits the fan and a

Otherwise: as @richardjgowers said: push a feature-issue297 branch, which will engage Travis and allow commenting directly on the code. |
1- I'll look into failing gracefully. Since the access to the universe is a managed property in Atom and AtomGroup I think it's as easy as catching it there. 2- Yup, everything should be in that same pull request. Did I forget something again? |
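A rough sketch of what "failing gracefully" in a managed property could look like, in Python 3. The class layout, the choice of RuntimeError and the wording of the message are assumptions made for illustration; the thread does not say which exception or text was actually used.

```python
import weakref


class AtomGroup:
    def __init__(self, universe, indices):
        self._universe = weakref.ref(universe)  # weak back-reference
        self.indices = indices

    @property
    def universe(self):
        u = self._universe()
        if u is None:
            # Hypothetical error type and message, chosen for this sketch only.
            raise RuntimeError(
                "The Universe this AtomGroup belonged to no longer exists; "
                "keep a reference to the Universe while using its AtomGroups."
            )
        return u

    def __getitem__(self, item):
        return AtomGroup(self.universe, self.indices[item])


class Universe:
    def __init__(self, n_atoms):
        self.atoms = AtomGroup(self, list(range(n_atoms)))


def first10():
    u = Universe(100)
    return u.atoms[:10]  # the Universe dies when this function returns


ag = first10()
try:
    ag.universe
    message = None
except RuntimeError as err:
    message = str(err)

print(message)
```

Catching the dead reference in one property means every code path that goes through `universe` reports the same explicit message instead of failing somewhere deeper with an obscure error.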
I can't remember what the problem with this was. As long as we only have the

import gc

class Universe():
    def __init__(self):
        self.trajectory = Reader()
        self.atom = Atom(self)
        self.universe = self

class Reader():
    def __init__(self):
        self.somevar = 3
    def __del__(self):
        print 'Closing'

class Atom():
    def __init__(self, parent):
        self.universe = parent
        self.name = 'Alfred'

u = Universe()
at = u.atom
del u
gc.collect()
print gc.garbage
# at should still work here
print "Hello I'm {}".format(at.name)
print "My daddy is called {}".format(at.universe)
del at
gc.collect()
print gc.garbage |
Following up from #447 : @kain88-de , when we discussed this problem, we thought that the use cases that you're presenting are not very common (see e.g. my summary comment). I agree with you that the current behavior leads to extremely obscure errors and we never put code in place to provide better error messages. We can certainly re-open the discussion and perhaps you see a way by which we can have our cake (picklable AtomGroups, no memory leaks) and eat it, too (#447). Perhaps some of the problems will go away with #363 ? |
Ok, I removed the weakref thing from

@orbeckst I think #363 is just going to be rearranging data structures, not their relationships. So hopefully I'm only responsible for performance there :D

@kain88-de Can you check this allows your use case? I did a little test locally and I think it does

@mnmelo Can you check that this isn't leaking horribly?

import MDAnalysis as mda
import gc

def get_ag():
    u = mda.Universe('adk.psf','adk_dims.dcd')
    return u.atoms[:10]

ag = get_ag()
# Universe should still exist at this point
print ag.positions
del ag
# Universe should now be dead
gc.collect()
print gc.garbage |
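The same check can be tried without the adk test files by using toy stand-ins. The sketch below is Python 3 and assumes the design described in this comment: strong back-references, no __del__ on Universe, and a __del__ only on the Reader, which sits outside the cycle. All class bodies and the `closed` list are illustrative. A weakref is used purely as a probe to observe whether the Universe is still alive.

```python
import gc
import weakref

closed = []  # records that the toy trajectory was "closed"


class Reader:
    def __del__(self):
        closed.append("trajectory")  # stands in for closing the file


class AtomGroup:
    def __init__(self, universe):
        self.universe = universe     # strong reference keeps the Universe alive


class Universe:                      # deliberately has no __del__
    def __init__(self):
        self.trajectory = Reader()
        self.atoms = AtomGroup(self)


def get_ag():
    u = Universe()
    return u.atoms, weakref.ref(u)   # the weakref is only a liveness probe


ag, probe = get_ag()
alive_while_ag_exists = probe() is not None  # Universe outlives get_ag()
del ag
gc.collect()                                 # collect the Universe/AtomGroup cycle
dead_after_collect = probe() is None

print(alive_while_ag_exists, dead_after_collect, closed)
```

Because the Reader does not point back at the Universe, the collector can free the cycle and the Reader's __del__ then runs through ordinary reference counting.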
The title says it all. @richardjgowers mentioned it'd be interesting to implement, but currently that leads to mem leaks as the garbage collector can't clean up the Universe.
See the discussion on #293.
If Universe has a __del__ I end up with two uncollectables: the Universe itself and the trajectory (which also has a __del__). I tried to find any sort of circular reference between the two, but found nothing.
Is it worth pursuing this?