Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C/Python difference: Setting __class__ activates in C, doesn't activate in Python #155

Closed
jamadden opened this issue Mar 1, 2021 · 4 comments · Fixed by #160
Closed
Assignees
Labels

Comments

@jamadden
Copy link
Member

jamadden commented Mar 1, 2021

In the C implementation, setting __class__ of a ghost calls _p_activate(), and uses the class methods defined on the original class.

But in the Python implementation, setting __class__ does not call _p_activate(); consequently, the class methods of the new class are used.

This makes a difference when class methods include those used in the pickle protocol, such as __setstate__.

Here's an example. Consider this code:

from persistent import Persistent
from ZODB.DB import DB
from ZODB.DemoStorage import DemoStorage
import transaction

class A(Persistent):

    def __setstate__(self, state):
        print("Setting state in A")
        Persistent.__setstate__(self, state)

class B(Persistent):

    def __setstate__(self, state):
        print("Setting state in B")
        Persistent.__setstate__(self, state)


db = DB(DemoStorage())
transaction.begin()
conn = db.open()
conn.root()['key'] = A()
transaction.commit()

transaction.begin()
conn2 = db.open()
a = conn2.root()['key']
print("Setting the class")
a.__class__ = B
print("Activating the object")
a._p_activate()
transaction.abort()

When we run it with the C implementation, the object is activated by the line a.__class__ = B and uses the A.__setstate__ implementation to do so.

$ python /tmp/t.py
Setting the class
Setting state in A
Activating the object
Done activating

Using the pure-Python version, however, activation doesn't happen until we call _p_activate() explicitly, and then it uses the Other.__setstate__ implementation.

PURE_PYTHON=1 python /tmp/t.py
Setting the class
Activating the object
Setting state in B
Done activating

Ideally, setting the class of a ghost wouldn't activate the object, I think. But since that's the way the C implementation has worked for some time, probably the Python implementation needs to change to activate the object when the class is assigned to.

@jamadden jamadden added the bug label Mar 1, 2021
@jamadden
Copy link
Member Author

jamadden commented Mar 1, 2021

Probably closely related: The C implementation sets _p_changed when __class__ is assigned to, but the Python implementation doesn't. Adding print("Object is changed?", a._p_changed) right after the assignment shows this:

$ python /tmp/t.py
Setting the class
Setting state in A
Object is changed? True
...

$ PURE_PYTHON=1 python /tmp/t.py
Setting the class
Object is changed? None
...

This is another reason why I prefer the Python behaviour. But I assume there's reasons the C code does what it does (I haven't checked).

@jamadden jamadden self-assigned this Mar 10, 2021
@jamadden
Copy link
Member Author

There don't really seem to be any reasons that the C code does what it does; it could be changed to match Python, but that's a non-starter for backwards compatibility.

I also discovered this same discrepancy applies to __dict__.

It also applies to a few other special attributes: __of__, __setstate__, __del__. I don't think those are actually a problem, though, because as methods, typically found by looking at the type, they are unlikely to be assigned to ghost instance.

jamadden added a commit that referenced this issue Mar 10, 2021
…like C.

By activating the object.

Add tests for this.

In the process of adding tests, I discovered the mock pickle cache used by the Python tests
was returning the wrong thing from new_ghost (not a ghost), so I fixed this and added some extra
tests and assertions to be sure the behaviour matches with the real and C caches.

Fixes #155
@mgedmin
Copy link
Member

mgedmin commented Mar 10, 2021

The C behaviour makes sense to me if you view an object's class as part of its state (and there are some design patterns in Python that (ab)use __class__ this way to implement e.g. state machines).

OTOH, I seem to remember that the way persistent references work is that each object that has a reference to some persistent object stores a tuple of (class name, oid), which makes me question whether it's even a good idea to allow changing __class__ ever?

@jamadden
Copy link
Member Author

The C behaviour makes sense to me if you view an object's class as part of its state (and there are some design patterns in Python that (ab)use class this way to implement e.g. state machines).

True. I did exactly that (state machine based on __class__) once, until I learned that it didn't (consistently) work with ZODB...

OTOH, I seem to remember that the way persistent references work is that each object that has a reference to some persistent object stores a tuple of (class name, oid), which makes me question whether it's even a good idea to allow changing __class__ ever?

… and this is why it doesn't work on ZODB: ZODB stores the __class__ with each persistent reference in order to be able to cheaply create ghost objects with no additional information. Things appear fine until your object gets flushed from the connection's cache and reloaded, and then your change may disappear. (I'm not the only one who has been bitten by this.)

That said, changing the class can still be useful at runtime if you're very careful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants