Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gzip_ng_threaded prevents programs from crashing if a file is open #53

Closed
D-VR opened this issue Sep 10, 2024 · 7 comments · Fixed by #54 or #55
Closed

gzip_ng_threaded prevents programs from crashing if a file is open #53

D-VR opened this issue Sep 10, 2024 · 7 comments · Fixed by #54 or #55

Comments

@D-VR
Copy link

D-VR commented Sep 10, 2024

If you use gzip_ng_threaded and open a file with multiple threads, any Exceptions thrown will cause Python to hang indefinitely.

This seems to be because the thread lock is never released.

Code to trigger the bug:

from zlib_ng import gzip_ng_threaded as gzip

file_ = gzip.open("Throwaway", "wb", threads=2)

raise Exception("weird")

example output (KeyboardInterrupt was used to quit the program):

~/venv/bin/python ~/testing.py 
Traceback (most recent call last):
  File "~/testing.py", line 5, in <module>
    raise Exception("weird")
Exception: weird

Exception ignored in: <module 'threading' from '/usr/lib/python3.12/threading.py'>
Traceback (most recent call last):
  File "/usr/lib/python3.12/threading.py", line 1622, in _shutdown
    lock.acquire()
KeyboardInterrupt: 

Process finished with exit code 1
@rhpvorderman
Copy link
Contributor

Good catch. And nice of you to create a reproducer. I will see if I can get this into automated testing and make a patch.

@rhpvorderman
Copy link
Contributor

I was able to reproduce and produce a patch. This problem occurs when the file is not used with a context manager. As a result it will not get closed automatically and the threads aren't properly cleaned up.
I changed the threads to be daemon threads. That way python will kill them if the spawning thread exits which is the desired behavior if it is to be used the same way as non-threaded gzip.

Very good catch. I am glad you reported it.

@D-VR
Copy link
Author

D-VR commented Sep 11, 2024

Thank you for the quick response and patch!

This is a cool package and I hope there will be more usage of it, more than happy to give a little back!

edit: sorry for closing / opening the issue, realized it wasn't merged yet 😅

@rhpvorderman
Copy link
Contributor

No worries. I will close it in appropriate time. Seems like there is a bug in Python 3.13 that prevents the fix from working. Which is unfortunate.

Also checkout python-isal. It has the same bug (still need to fix that), but works with a different backend library which is even faster.

@D-VR
Copy link
Author

D-VR commented Sep 11, 2024

Tried igzip_threaded and it seems to be marginally slower for my current application (on default settings), probably since I'm using a AMD CPU? (actually good, else I'd need to change more code for my optimization project 😅 )

Definitely cool though, thanks for making these packages!

@rhpvorderman
Copy link
Contributor

rhpvorderman commented Sep 11, 2024

Strange, on my also AMD cpu it is a lot faster (for compression). Unless maybe zlib-ng level 1 is used, but zlib-ng level 1 results in huge files even compared to normal gzip level 1.

You could also tryout xopen. That should choose the most optimal gzip backend, no need to worry about that yourself.

@D-VR
Copy link
Author

D-VR commented Sep 11, 2024

Strange, on my also AMD cpu it is a lot faster (for compression). Unless maybe zlib-ng level 1 is used, but zlib-ng level 1 results in huge files even compared to normal gzip level 1.

Seems to have been slowed by some background process! Ran it again and it's slightly faster
Yeah xopen looks nice, will use that in the future :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants