Hash (`get_object_state`) of `functools.partial` is wrong
#207
Comments
Ah, one other idea: What about using However, that's a bit different to accessing
But could we fix it without breaking existing setups? And should the implementation now be Python version dependent?
Definitely. That's the goal.
I will try a generic implementation that is not Python version dependent. I will also try a generic implementation that does not even depend on checking specifically for
It does seem like we'd always be breaking hashes for the cases that were broken before, no? In those cases could we introduce a flag that turns on the new behavior on-demand (
We should distinguish cases that were obviously wrong before, so wrong that they were basically unusable. I would also argue that the same is true for cases where there was both a
I'm just saying there are members on the teams who are very hesitant about hash breakage and who would rather have some slightly erroneous behavior (which seems to have worked out for them before?) than update to a potentially hash-breaking version. In the end this leads to these users never updating their tooling at all, which is also bad (but on them). 🤷🏼♂️ But yes, it's probably better to introduce a fix than to introduce legacy baggage to cater for these cases.
For anyone who used And for the other change, this actually makes it more consistent across Python versions. Without the change, it is very likely that you would get a different hash in Python <=3.10 than in Python >=3.11 for any objects affected by this. And again, the hash in Python <=3.10 probably would not have covered any of the relevant parts of the objects, so the same problem as for But yes, it's definitely a good idea to test this on some more complex setups to check that no hashes change (or, if something changes, to study exactly what changes and why; maybe there is also some bug in the PR here). (Btw, I tested this for my recent experiments, and there, nothing changes, except for Also, I think we should extend the test cases for the hashes to cover all cases we care about, to make sure those hashes really never change. Currently there is only a very small number of tests on this. (But this is not necessarily a change for this specific PR. I only added one test case for
This fixes the same problems as #207 but now for extract_paths, by using the same shared code (get_object_state).
Current `get_object_state` logic:

I think this is already partially wrong when there is both `__dict__` and `__slots__` (which is valid).

Also note that the behavior changed in Python 3.11: there is now always a default `__getstate__` function (doc), which actually handles `__dict__`/`__slots__` correctly. That means the Sisyphus hashes differ between older Python versions and Python 3.11 and newer (but only for those rare cases where there is both `__dict__` and `__slots__`).

`functools.partial` is such a case:

Note that the `__dict__` usually stays empty. It is only there to allow adding other attributes to the object (which would not be possible without a `__dict__`).

So, for older Python versions, it's wrong.
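To make the `__dict__` plus `__slots__` problem concrete, here is a minimal sketch. The classes `Base` and `Child` and the helpers `naive_state` and `merged_state` are hypothetical names made up for this illustration; they are not the actual Sisyphus code, and `naive_state` only assumes a lookup order that prefers `__dict__` over `__slots__`.

```python
import functools


class Base:
    __slots__ = ("a",)  # instances of Base alone have no __dict__


class Child(Base):
    # No __slots__ declared here, so Child instances also get a __dict__.
    def __init__(self, a, b):
        self.a = a  # stored in the slot inherited from Base
        self.b = b  # stored in the instance __dict__


def naive_state(obj):
    """Hypothetical helper: use __dict__ if present, else fall back to __slots__."""
    if hasattr(obj, "__dict__"):
        return dict(obj.__dict__)
    return {k: getattr(obj, k) for k in obj.__slots__ if hasattr(obj, k)}


def merged_state(obj):
    """Hypothetical helper: collect slot values over the MRO, then merge __dict__."""
    state = {}
    for cls in type(obj).__mro__:
        slots = cls.__dict__.get("__slots__", ())
        if isinstance(slots, str):
            slots = (slots,)
        for name in slots:
            if name in ("__dict__", "__weakref__"):
                continue
            if hasattr(obj, name):
                state[name] = getattr(obj, name)
    state.update(getattr(obj, "__dict__", {}))
    return state


obj = Child(1, 2)
naive = naive_state(obj)    # misses the slot value "a"
merged = merged_state(obj)  # contains both "a" and "b"

p = functools.partial(int, base=2)
partial_dict = dict(p.__dict__)  # empty: func/args/keywords are not stored here
```

With the `__dict__`-first lookup, two `Child` objects that differ only in `a` would end up with the same state, which is the kind of collision described above.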
For Python >=3.11, you might think it's correct.

But it's also wrong, now in a different way! After defining the pure Python implementation of `partial`, the `functools` module tries to import a native implementation to replace it, as an optimization:

This might succeed or not, depending on Python compiler flags (it could be optional to compile this native `_functools` module).

Now, this native `partial` is more tricky. It again defines `__dict__`, but it does not have slots. Instead, the relevant args are stored internally in the native `partial` type. So just accessing `__dict__` here is wrong, but using `__getstate__` is also wrong, as this will just return `__dict__` (it's the default implementation of `__getstate__`).

(I'm not really sure whether that's maybe even a bug in CPython. It was unexpected to me. I reported it here: python/cpython#125094)
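The following sketch shows where the state of a `partial` can be observed from Python code. It only uses the documented attributes `func`, `args` and `keywords`; the variable names are made up for this illustration. What `__getstate__` and `__reduce__` return is deliberately not asserted, since it may depend on the Python version and on whether the native `_functools` module is in use.

```python
import functools

p = functools.partial(int, "101", base=2)

# Documented public attributes that carry the state relevant for hashing.
relevant = (p.func, p.args, p.keywords)

# __dict__ only holds attributes that were added to the object afterwards.
p.note = "extra"
extra = dict(p.__dict__)

# Python >= 3.11 always provides __getstate__; older versions may not
# provide it for partial, so look it up defensively.
getstate = getattr(p, "__getstate__", None)
default_state = getstate() if getstate is not None else None

# __reduce__ includes func/args/keywords, but its layout is an implementation
# detail, so it is only inspected here and not used for hashing.
reduced = p.__reduce__()

result = p()  # calls int("101", base=2)
```

Printing `default_state` and `reduced` on the Python builds you care about is a quick way to check which of the cases described above applies.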
So we cannot use that. And again, our current `get_object_state` is wrong.

Note that `partial.__reduce__` correctly returns all the internal state, in all cases. But what it returns is not consistent across Python versions, so it would not be a good idea to use it for hashing either.

So, what's the solution? Currently the only reasonable thing I can think of for this particular case is to handle this type specifically in `get_object_state`.
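A minimal sketch of what handling this type specifically could look like is below. The function name `get_object_state_sketch`, the dictionary layout it returns, and the simplified generic fallback are assumptions made for this illustration; the real `get_object_state` and the eventual fix may look different.

```python
import functools


def get_object_state_sketch(obj):
    """Hypothetical stand-in, not the actual Sisyphus get_object_state."""
    if isinstance(obj, functools.partial):
        # Read the documented public attributes instead of relying on
        # __dict__ or __getstate__, which miss them for the native type.
        state = {"func": obj.func, "args": obj.args, "keywords": obj.keywords}
        extra = getattr(obj, "__dict__", None)
        if extra:
            # Include user-added attributes only when there are any.
            state["dict"] = dict(extra)
        return state
    # Generic fallback, heavily simplified for this sketch.
    return dict(getattr(obj, "__dict__", {}))


state_a = get_object_state_sketch(functools.partial(int, "7", base=8))
state_b = get_object_state_sketch(functools.partial(int, "7", base=10))
```

Because `func`, `args` and `keywords` are part of the documented `partial` interface, the resulting state should not depend on whether the pure Python or the native implementation is active, and `state_a` and `state_b` differ as expected.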