extract_paths, use get_object_state #209

Merged
merged 9 commits into from
Oct 21, 2024

albertz
Member

@albertz albertz commented Oct 11, 2024

This fixes the same problems as #207 but now for extract_paths, by using the same shared code (get_object_state).

@albertz albertz marked this pull request as draft October 11, 2024 10:37
@albertz albertz requested a review from NeoLegends October 11, 2024 10:38
@albertz albertz marked this pull request as ready for review October 11, 2024 11:35
@albertz
Member Author

albertz commented Oct 15, 2024

@NeoLegends @michelwi @curufinwe what is the status here?

@michelwi
Contributor

This change leads to a RecursionError in one of our GMM training setups,
cf. https://bitbucket.org/omnifluent/apptek_asr/pull-requests/1268

(I did not start any debugging besides triggering the pipeline, sorry)

@albertz
Member Author

albertz commented Oct 17, 2024

Can you tell me for which object this recursion error happens? I checked the AppTek PR pipeline error, but it contains no information on the variables, so I cannot tell from that. Maybe you can also enable better_exchook (maybe better_exchook.setup_all()) for that, so that I can see it?

sisyphus/hash.py Outdated
    # so we keep consistent to the behavior of sis_hash_helper.
    if obj is None:
        return None
    if isinstance(obj, (bool, int, float, complex, str)):
Contributor

If obj is of type np.float, then it is an instance of float, so get_object_state simply returns it.

But sis_hash_helper(obj) checks for type(obj) in (int, float, bool, str, complex), which is False. Therefore we end up in the else case, byte_list.append(sis_hash_helper(get_object_state(obj))), which leads to an infinite recursion.
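
The mismatch described above can be reproduced in isolation. The following is only a minimal sketch, not the actual Sisyphus code: `MyFloat`, `get_object_state_sketch`, `hash_helper_sketch` and `triggers_recursion` are hypothetical stand-ins that keep just the two checks quoted in this thread (isinstance in one function, an exact type comparison in the other).

```python
class MyFloat(float):
    """Hypothetical float subclass, standing in for a scalar type derived from float."""


def get_object_state_sketch(obj):
    # Simplified stand-in: isinstance also matches subclasses of float,
    # so a MyFloat instance is returned unchanged.
    if obj is None:
        return None
    if isinstance(obj, (bool, int, float, complex, str)):
        return obj
    return vars(obj)


def hash_helper_sketch(obj):
    # Simplified stand-in: the exact type check does NOT match subclasses.
    if obj is None:
        return b"None"
    if type(obj) in (int, float, bool, str, complex):
        return repr(obj).encode()
    # Fallback: hash the object state. If the state is the object itself,
    # this call never terminates on its own.
    return hash_helper_sketch(get_object_state_sketch(obj))


def triggers_recursion(obj):
    # Report whether hashing obj runs into Python's recursion limit.
    try:
        hash_helper_sketch(obj)
    except RecursionError:
        return True
    return False
```

Under these assumptions, a plain float hashes fine, while a `MyFloat` passes the isinstance check, fails the exact type check, and is handed back to the hash helper unchanged until the recursion limit is hit.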

Member Author

I pushed a change for that which makes this check more consistent. Can you check again?

This comment was marked as resolved.

Member Author

Note, np.float is a bad example: This will actually break depending on your Numpy version. In (very old) Numpy versions, yes, np.float is not float but derived from float. However, in newer Numpy versions, np.float is just an alias to float.

DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

So this means that if you really used np.float in your setup, the hash will break when you update Numpy...

A better example is a namedtuple. Here I accidentally broke some hashes as well, but that is fixed now. Specifically, because the type(obj) in (tuple, list) check is False for a namedtuple, sis_hash_helper also falls back to the get_object_state logic for namedtuples, so it is important that get_object_state behaves the same as before for namedtuples. This is what it does now.
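
To illustrate the namedtuple case, here is a small sketch using only the standard library. `Point` is a hypothetical example type; the two checks mirror the exact type comparison quoted above and an isinstance-based alternative.

```python
from collections import namedtuple

# Hypothetical example type, used only for illustration.
Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)

# A namedtuple class is a subclass of tuple, not tuple itself,
# so an exact type comparison does not match it.
exact_match = type(p) in (tuple, list)

# isinstance follows the class hierarchy and does match it.
subclass_match = isinstance(p, (tuple, list))

print(exact_match, subclass_match)
```

This is why a namedtuple skips the tuple/list branch of an exact type check and ends up in whatever fallback handles generic objects.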

@curufinwe
Collaborator

AppTek pipeline no longer crashes.

@albertz
Member Author

albertz commented Oct 21, 2024

So does this mean it is OK to merge?

Collaborator

@curufinwe curufinwe left a comment

LGTM

@albertz albertz merged commit 311ebfd into master Oct 21, 2024
3 checks passed
@albertz albertz deleted the albert-fix-extract-paths branch October 21, 2024 11:13