GzipFile.flush doesn't flush compressor in 3.12 beta #105808

Closed
bdarnell opened this issue Jun 15, 2023 · 3 comments
Labels
3.12 bugs and security fixes release-blocker type-bug An unexpected behavior, bug, or error

Comments

@bdarnell
Contributor

bdarnell commented Jun 15, 2023

Bug report

The change to add buffering to GzipFile.write (gh-89550, #101251) broke the GzipFile.flush method. The flush method previously called self.compress.flush, but now it only flushes the IO objects and not the compressor (as a side effect, the zlib_mode argument is now ignored, although in my case I only use the default). Flushing the compressor is necessary to create synchronization points that can be used to decompress part of the stream.
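To illustrate the synchronization points described above, here is a minimal sketch that uses zlib directly rather than GzipFile. The helper names compress_in_chunks and decompress_incrementally are hypothetical and exist only for this example; it is not the code from the fix.

```python
import zlib


def compress_in_chunks(chunks, flush_mode=zlib.Z_SYNC_FLUSH):
    """Compress chunks into one gzip stream, flushing after each chunk.

    Hypothetical helper for illustration only.
    """
    # wbits = 16 + MAX_WBITS asks zlib to emit a gzip header and trailer.
    compressor = zlib.compressobj(wbits=16 + zlib.MAX_WBITS)
    messages = []
    for chunk in chunks:
        # The flush pushes all pending output and ends on a byte boundary,
        # so a receiver can decode everything sent so far.
        messages.append(compressor.compress(chunk) + compressor.flush(flush_mode))
    # The final flush (Z_FINISH) terminates the stream and writes the trailer.
    messages.append(compressor.flush(zlib.Z_FINISH))
    return messages


def decompress_incrementally(messages):
    """Feed each message to a single decompressor and collect its output."""
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    return [decompressor.decompress(message) for message in messages]


flushed = compress_in_chunks([b"Hello World", b"Goodbye World"])
outputs = decompress_incrementally(flushed)
```

With a sync flush after each chunk, each of the first two messages should decode to its own chunk as soon as it arrives. If the per-chunk flush is skipped, the compressor is free to hold data back until a later call.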

Here is a test script, reduced from Tornado's use of GzipFile (see tornadoweb/tornado#3278 for the way this manifests in Tornado's test suite):

import io
import gzip
import zlib

# Write two chunks to the same compressed stream. In real usage
# I send these chunks as two separate network messages, but in this
# test I just save them to two local variables.
data = io.BytesIO()
gzip_file = gzip.GzipFile(fileobj=data, mode="wb")
gzip_file.write(b"Hello World")
gzip_file.flush()
message1 = data.getvalue()
data.truncate(0)
data.seek(0)
gzip_file.write(b"Goodbye World")
gzip_file.close()
message2 = data.getvalue()

# Decode the two messages. Each one should decode separately,
# but in Python 3.12b2 the compressor was not flushed with
# Z_SYNC_FLUSH, so the first message produces no output on its own
# and both messages are emitted when the second message is added
# to the decompressor's input.
#
# This results in the error
#   AssertionError: [b'', b'Hello WorldGoodbye World']
decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
messages = [decompressor.decompress(message1), decompressor.decompress(message2)]
assert messages == [b"Hello World", b"Goodbye World"], messages
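
The zlib_mode argument mentioned in the report can be exercised in a similar way. The sketch below assumes a Python version where GzipFile.flush forwards its argument to the compressor (3.11 and earlier, or 3.12 beta 3 onward according to this thread). It passes zlib.Z_FULL_FLUSH, which also resets the compressor's history, so the bytes written after the flush can be decoded as a raw deflate stream without the earlier data. The variable names are illustrative.

```python
import gzip
import io
import zlib

data = io.BytesIO()
gzip_file = gzip.GzipFile(fileobj=data, mode="wb")
gzip_file.write(b"Hello World")
# Z_FULL_FLUSH ends the current deflate block on a byte boundary and
# also discards the compressor's history.
gzip_file.flush(zlib.Z_FULL_FLUSH)
offset = len(data.getvalue())
gzip_file.write(b"Goodbye World")
gzip_file.close()

# Everything written after the flush point: deflate data followed by
# the 8-byte gzip trailer (CRC32 and uncompressed length).
tail = data.getvalue()[offset:]

# Negative wbits selects a raw deflate stream with no header.
raw = zlib.decompressobj(-zlib.MAX_WBITS)
recovered = raw.decompress(tail)
trailer = raw.unused_data
```

Under those assumptions, recovered should equal b"Goodbye World" and trailer should hold the 8 trailer bytes. If flush ignores zlib_mode, as described for 3.12b2, the split point is not a flush boundary and this decoding is not expected to work.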

Your environment

  • CPython versions tested on: The bug is present in 3.12b2; the above script passes on 3.11 and earlier
  • Operating system and architecture: macOS and Linux

Linked PRs

@bdarnell bdarnell added the type-bug An unexpected behavior, bug, or error label Jun 15, 2023
bdarnell added a commit to bdarnell/tornado that referenced this issue Jun 15, 2023
Current betas have a bug in GzipFile we can't easily work around.
python/cpython#105808
Yhg1s added a commit to Yhg1s/cpython that referenced this issue Jun 19, 2023
GzipFile.flush() to not flush the compressor (nor pass along the zip_mode
argument).
@Yhg1s
Member

Yhg1s commented Jun 19, 2023

Thanks! #105910 should fix the issue.

@gpshead gpshead added the 3.12 bugs and security fixes label Jun 19, 2023
gpshead pushed a commit that referenced this issue Jun 19, 2023
Fix a regression introduced in GH-101251, causing GzipFile.flush() to
not flush the compressor (nor pass along the zip_mode argument).
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jun 19, 2023
…onGH-105910)

Fix a regression introduced in pythonGH-101251, causing GzipFile.flush() to
not flush the compressor (nor pass along the zip_mode argument).
(cherry picked from commit 1858db7)

Co-authored-by: T. Wouters
Yhg1s added a commit that referenced this issue Jun 19, 2023
#105920)

GH-105808: Fix a regression introduced in GH-101251 (GH-105910)

Fix a regression introduced in GH-101251, causing GzipFile.flush() to
not flush the compressor (nor pass along the zip_mode argument).
(cherry picked from commit 1858db7)

Co-authored-by: T. Wouters
@gpshead
Member

gpshead commented Jun 19, 2023

thanks for the report and PR!

@gpshead gpshead closed this as completed Jun 19, 2023
bdarnell added a commit to bdarnell/tornado that referenced this issue Jun 22, 2023
Now that python/cpython#105808 is fixed in beta 3.
@bdarnell
Contributor Author

Thanks for the fix! I can confirm that things are working again with 3.12 beta 3: tornadoweb/tornado#3288
