Linewise iteration over compressed input #13
Comments
You can instantiate an The proper solution is for python-zstandard to provide an API to obtain an object that conforms to the |
Thank you for adding it to the TODO, as I am especially looking for performance.
|
That |
Also, it is possible to write a wrapper class until the API is added to python-zstandard. See https://github.com/python/cpython/blob/3.6/Lib/_compression.py and https://github.com/python/cpython/blob/3.6/Lib/gzip.py for how the Python standard library does it. |
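A minimal sketch of such a wrapper, assuming the decompressed data is available as an iterator of bytes chunks (for example, the chunk iteration mentioned in the original question). The names `ChunkReader` and `iter_lines` are hypothetical and not part of python-zstandard; the idea, as in the standard library's gzip module, is to implement `io.RawIOBase` so that `io.BufferedReader` and `io.TextIOWrapper` can do the line splitting.

```python
import io


class ChunkReader(io.RawIOBase):
    """Hypothetical adapter: expose an iterator of bytes chunks as a raw stream."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._pending = b""  # bytes received but not yet handed to the caller

    def readable(self):
        return True

    def readinto(self, buffer):
        # Refill from the iterator until there is data or the input is exhausted.
        while not self._pending:
            try:
                self._pending = next(self._chunks)
            except StopIteration:
                return 0  # zero bytes read signals EOF to io.BufferedReader
        n = min(len(buffer), len(self._pending))
        buffer[:n] = self._pending[:n]
        self._pending = self._pending[n:]
        return n


def iter_lines(chunks, encoding="utf-8"):
    """Yield decoded lines from an iterable of bytes chunks."""
    raw = ChunkReader(chunks)
    with io.TextIOWrapper(io.BufferedReader(raw), encoding=encoding) as text:
        yield from text
```

Chunk boundaries do not have to line up with line boundaries: `iter_lines([b"al", b"pha\nbe", b"ta\n"])` yields `"alpha\n"` and then `"beta\n"`.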
I started working on this today. I'm intent on it being in the next release, which will be 0.9. No ETA for that release, however. I view a |
…sion Like we just did for compression. This is a precursor to #13.
I just pushed the beginnings of a new stream API. It doesn't yet support |
Hi,
The general concept looks like this:
    import zstd
    import io
    path = '/tmp/foo.zst'
    with open(path, 'rb') as fh:
        dctx = zstd.ZstdDecompressor()
        with dctx.stream_reader(fh) as reader:
            wrap = io.TextIOWrapper(io.BufferedReader(reader), encoding='utf8')
            print(wrap.readline())
However this fails with |
An "open" method would be a big help. Ultimately I want to write my own open function that will automatically use the right compressor or decompressor based on the file name and file signature. Something that behaves the way all of the other open functions do would be a big help. |
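One way such an open function could look, as a rough sketch: it inspects only the file signature (a file name check is left out), and the name `open_auto` is hypothetical. The zstd branch assumes a recent `zstandard` release that provides `zstandard.open()`; the other branches use only the standard library.

```python
import bz2
import gzip
import lzma

# Magic numbers found at the start of each format's files.
_GZIP_MAGIC = b"\x1f\x8b"
_BZ2_MAGIC = b"BZh"
_XZ_MAGIC = b"\xfd7zXZ\x00"
_ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"


def open_auto(path, encoding="utf-8"):
    """Open a possibly compressed file as text, choosing the codec by signature."""
    with open(path, "rb") as probe:
        head = probe.read(6)  # long enough for the longest signature above
    if head.startswith(_GZIP_MAGIC):
        return gzip.open(path, "rt", encoding=encoding)
    if head.startswith(_BZ2_MAGIC):
        return bz2.open(path, "rt", encoding=encoding)
    if head.startswith(_XZ_MAGIC):
        return lzma.open(path, "rt", encoding=encoding)
    if head.startswith(_ZSTD_MAGIC):
        # Third-party dependency, imported lazily so other formats work without it.
        # Assumes a zstandard release that ships zstandard.open().
        import zstandard
        return zstandard.open(path, "rt", encoding=encoding)
    # No known signature: treat the file as plain text.
    return open(path, "r", encoding=encoding)
```

Every branch returns a text-mode file object, so callers can write `for line in open_auto(path):` regardless of the compression format.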
The problem appears to be that |
Following up on this, |
I tried to use this pattern exactly today:
and received an error during the readline call: This was with zstandard 0.10.1 on pypy3 |
If you feel differently, please let me know. |
Now that io.RawIOBase is implemented, we can properly chain a ZstdDecompressionReader to an io.BufferedReader and io.TextIOWrapper to achieve buffering and line-based reading. This closes #13.
The documentation was removed in commit d613a5f. Was that intentional? If so, why? |
If the documentation was removed, it was accidental. I believe the documentation now exists at https://github.com/indygreg/python-zstandard/blob/main/zstandard/backend_cffi.py#L3048? |
@indygreg I was looking for https://python-zstandard.readthedocs.io/en/latest/search.html?q=readlines&check_keywords=yes&area=default which found nothing. Seems like the generated documentation is partially broken: https://python-zstandard.readthedocs.io/en/latest/decompressor.html |
Yeah, the Sphinx docs on readthedocs aren't working. I'll try to get that fixed in the next few hours. |
It should be working now. |
For anyone who finds this after me, the correct way to do this (copied from the link above) is:
|
Currently it's only possible to iterate over chunks:
How can I iterate line by line, as is possible with gzip.open()?
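For comparison, the gzip behaviour the question refers to can be sketched as follows. The file name and contents are made up for illustration; the point is that `gzip.open()` in text mode returns a file object that can be iterated line by line.

```python
import gzip
import os
import tempfile

# Create a small gzip file to read back (illustrative data only).
path = os.path.join(tempfile.mkdtemp(), "example.txt.gz")
with gzip.open(path, "wt", encoding="utf-8") as fh:
    fh.write("first\nsecond\n")

# Text mode ("rt") gives an iterable of decoded lines.
with gzip.open(path, "rt", encoding="utf-8") as fh:
    lines = [line.rstrip("\n") for line in fh]
```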