Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hash (get_object_state) of functools.partial is wrong #207

Closed
albertz opened this issue Oct 8, 2024 · 7 comments · Fixed by #208
Closed

Hash (get_object_state) of functools.partial is wrong #207

albertz opened this issue Oct 8, 2024 · 7 comments · Fixed by #208
Assignees

Comments

@albertz
Copy link
Member

albertz commented Oct 8, 2024

Current get_object_state logic:

    if hasattr(obj, "__getnewargs_ex__"):
        args = obj.__getnewargs_ex__()
    elif hasattr(obj, "__getnewargs__"):
        args = obj.__getnewargs__()
    else:
        args = None

    if hasattr(obj, "__sis_state__"):
        state = obj.__sis_state__()
    elif hasattr(obj, "__getstate__"):
        state = obj.__getstate__()
    elif hasattr(obj, "__dict__"):
        state = obj.__dict__
    elif hasattr(obj, "__slots__"):
        state = {k: getattr(obj, k) for k in obj.__slots__ if hasattr(obj, k)}
    else:
        assert args is not None, "Failed to get object state of: %s" % repr(obj)
        state = None

I think this is already partially wrong when there is both __dict__ and __slots__ (which is valid).

Also note that the behavior changed since Python 3.11: Now there is always a default __getstate__ function (doc), which actually handles __dict__/__slots__ correctly, but that means, the behavior changed for Sisyphus hashes from older Python versions to Python 3.11 and newer (but only for those rare cases where there was both __dict__ and __slots__).

functools.partial is such a case:

class partial:
    __slots__ = "func", "args", "keywords", "__dict__", "__weakref__"
    ...

Note that the __dict__ usually stays empty. It's just there to allow to add other attribs to the object (which would not be possible if there is no __dict__).

So, for older Python versions, it's wrong.

For Python >=3.11, you might think, it's correct.

But it's also wrong, now in a different way! After the pure Python implementation of partial, it tries to import a native implementation to replace it, as an optimization:

try:
    from _functools import partial
except ImportError:
    pass

This might succeed or not, depending on Python compiler flags (it could be optional to compile this native _functools module).

Now, this native partial is more tricky. It again defines __dict__. But it does not have slots. Instead, the relevant args are stored internally in the native partial type. So, just accessing __dict__ here is wrong, but also using __getstate__ is wrong, as this will just return __dict__ (it's the default implementation of __getstate__).

(I'm not really sure whether that's maybe even a bug in CPython. It was unexpected to me. I reported it here: python/cpython#125094)

So we cannot use that. And again, our current get_object_state is wrong.

Note, partial.__reduce__ correctly returns all the internal state, in all cases. But it's also not consistent across Python versions in the way what it returns, so it would also not be a good idea to use this for hashing.

So, what's the solution? Currently the only reasonable thing I can think of for this particular case is to handle this type specifically in get_object_state.

@albertz
Copy link
Member Author

albertz commented Oct 8, 2024

Ah, one other idea: What about using dir? That correctly lists the relevant attribs. (Note: vars does not.)

However, that's a bit different to accessing __dict__ or __slots__ or also what you would get via __getstate__.

@michelwi
Copy link
Contributor

michelwi commented Oct 8, 2024

But could we fix it without breaking existing setups?

And should the implementation now be python version dependent?

@albertz
Copy link
Member Author

albertz commented Oct 8, 2024

But could we fix it without breaking existing setups?

Definitely. That's the goal.

And should the implementation now be python version dependent?

I will try with a generic implementation which is not Python version dependent. I will also try with a generic implementation which even does not depend on checking specifically for partial.

@NeoLegends
Copy link
Contributor

NeoLegends commented Oct 8, 2024

It does seem to be like we'd always breaking hashes for the cases that were broken before, no? In those cases could we introduce a flag that turns on the new behavior on-demand (gs.APPLY_HASH_FIX_FOR_DICT_AND_SLOTS or so), so users can still update their sisyphus and choose to delay the hash breakage until there is enough time to re-run any affected experiments.

@albertz
Copy link
Member Author

albertz commented Oct 8, 2024

We should distinguish cases which were obviously wrong before, so wrong that it was basically unusable. I think functools.partial is such a case. None of the relevant args (func, args, keywords) were ever part of the hash. So fixing that now should not really break anything. So I don't think we need such a flag for this now.

I would also argue, the same is also true for cases where there was both a __dict__ and __slots__. Currently (Python <=3.10) it would have used only the __dict__, and very likely, that was just empty, and does not contain any of the relevant things (which are very likely all in the __slots__), so it was totally broken and basically unusable. So again, I would argue, fixing that should not be a problem, and we don't need a flag for that. (For Python >=3.11, it would already do the correct thing.)

@NeoLegends
Copy link
Contributor

NeoLegends commented Oct 9, 2024

So fixing that now should not really break anything.

I'm just saying there are members on the teams that are very hestitant to hash breakage and that would rather have some slightly erroneous behavior (which seems to have worked out for them before?) rather than update to a potential hash-breaking version. And in the end this leads to these users never updating their tooling at all, which is also bad (but on them). 🤷🏼‍♂️

But yes, it's probably better to introduce a fix rather than to introduce legacy baggage to cater for these cases.

@albertz
Copy link
Member Author

albertz commented Oct 9, 2024

For anyone who used functools.partial before, this was just not slightly erroneous, this was totally wrong. There was just no hash at all for the partial. The only way I can see that they were lucky and it "worked for them" was if they had exactly 1 single experiment with it, and never more than that, which seems rather unlikely. Because once they would have tried to put some different values into the partial, they would have noticed that the hash does not change at all.

And for the other change, this makes it actually now more consistent across Python versions. Without the change, it is very likely that you would get a different hash in Python <=3.10 than in Python >=3.11 for any such objects which are affected by this. And again, the hash in Python <=3.10 probably would not have covered any of the relevant parts of the objects, so the same problem as for partial, once you have more than 1 experiment.

But yes, it's definitely a good idea to test this on some more complex setups, whether no hashes change (or if sth changes, then to study exactly why what changes - maybe there is also some bug in the PR here). (Btw, I tested this for my recent experiments, and there, nothing changes, except for partial, which was just broken without this change.)

Also, I think we should extend the test cases for the hashes, to cover all cases which we care about, to make sure those hashes really never change. Currently there is only a very small number of tests on this. (But this is not necessarily a change for this specific PR. I only added one test case for partial here.)

albertz added a commit that referenced this issue Oct 11, 2024
This fixes the same problems as #207
but now for extract_paths,
by using the same shared code (get_object_state).
albertz added a commit that referenced this issue Oct 21, 2024
This fixes the same problems as #207
but now for extract_paths,
by using the same shared code (get_object_state).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants