Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent zip compression and decompression memory usage warnings #178

Closed
marcospb19 opened this issue Nov 10, 2021 · 4 comments · Fixed by #217
Closed

Inconsistent zip compression and decompression memory usage warnings #178

marcospb19 opened this issue Nov 10, 2021 · 4 comments · Fixed by #217
Labels
bug Something isn't working good first issue Good for newcomers high priority
Milestone

Comments

@marcospb19
Copy link
Member

image

The limitation of compressing zip archives with other streams only occurs when using multiple formats, like:

archive.zip.gz, or archive.zip.xz.xz.xz.xz.

.zip requires io::Seek, and it can only be provided in fs::File and Vec<u8>, so we can either compress directly to the final file, or load into a Vec, but not pipe to encoding streams.

Loading the whole file to a Vec might exhaust memory capacity, that's why we need to give those warnings in specific cases.


While filing this, I found out the decompression command has another warning with different formatting:

image

These messages formats should be consistent.

@marcospb19
Copy link
Member Author

marcospb19 commented Nov 10, 2021

Warning parts is related to #177.

EDIT: warning!() macro is already available for usage since #154.

@marcospb19 marcospb19 added bug Something isn't working high priority labels Nov 10, 2021
@sigmaSd
Copy link
Contributor

sigmaSd commented Nov 10, 2021

Apperently its possible to decompress zip as stream https://github.com/uktrade/stream-unzip

This should be easy to port to rust, but I have no idea how correct is this implementation

@marcospb19
Copy link
Member Author

marcospb19 commented Nov 10, 2021

I'm still a bit skeptical, I think this is an unsolvable problem.

zip has this "metadata" part at the start of the archive that needs to be updated for each file that's added to it, so we cannot send the start of the archive while we don't finish it.

Meaning you add to a stream like tar, unless you have the "metadata" part ready at the start.

But we cannot solve the metadata upfront, because we'd need to load everything to memory and preprocess it, that wouldn't be better than decompressing/compressing everything to a Vec.

EDIT: the "stream" in this other project has other meaning, because they are not using multiple decoders on top of zip, we are already able to read it directly to fs::File because it implements Seek, the problem is .zip.gz (.zip.X really, because none of those impl io::Seek).

@Crypto-Spartan
Copy link
Contributor

Crypto-Spartan commented Dec 7, 2021

Created a PR to address the main issue. This is not a fix to stream compress/decompress zip archives.

Edit: moved commits to new branch and submitted a new PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers high priority
Projects
None yet
3 participants