-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
extract_paths, use get_object_state #209
Conversation
This fixes the same problems as #207 but now for extract_paths, by using the same shared code (get_object_state).
@NeoLegends @michelwi @curufinwe what is the status here? |
This change leads to a RecursionError in one of our gmm training setups (I did not start any debugging besides triggering the pipeline, sorry) |
Can you tell me for what object this recursion error happens? I checked the AppTek PR pipeline error, but there is no information on the variables, so I cannot tell from that. Maybe you can also enable |
sisyphus/hash.py
Outdated
# so we keep consistent to the behavior of sis_hash_helper. | ||
if obj is None: | ||
return None | ||
if isinstance(obj, (bool, int, float, complex, str)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if obj
is of type np.float
then it is an instance of float
so get_object_state
simply returns it.
But sis_hash_helper(obj)
checks for type(obj) in (int, float, bool, str, complex):
, which is False and therefore we end up in the else case byte_list.append(sis_hash_helper(get_object_state(obj)))
which leads to an infinite recursion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed some change for that which makes this check more consistent. Can you check again?
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note, np.float
is a bad example: This will actually break depending on your Numpy version. In (very old) Numpy versions, yes, np.float
is not float
but derived from float
. However, in newer Numpy versions, np.float
is just an alias to float
.
DeprecationWarning:
np.float
is a deprecated alias for the builtinfloat
. To silence this warning, usefloat
by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, usenp.float64
here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
So it means, if you really used np.float
in your setup, the hash will break when you update Numpy...
A better example is a namedtuple. And here I accidentally also broke some hash now, but this is fixed now. Specifically, due to the type(obj) in (tuple, list)
check, which was False, it also falls back to the get_object_state
logic for namedtuples, so it's important that get_object_state
behaves the same as before for namedtuples. This is what I do now.
AppTek pipeline no longer crashes. |
So it means ok to merge? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This fixes the same problems as #207 but now for
extract_paths
, by using the same shared code (get_object_state
).