Move to zip.js from JSZip? Or implement JSZip within webworker? #881
Also of note: zip.js doesn't seem to offer a stream for output. Does this mean we should consider putting our existing JSZip code in a web worker instead?
Looks like zip.js uses FileWriter, which supports streams. Either way, using a worker seems like a good way to stop the render process locking. Failing that, maybe we need to rethink how assets are transferred?
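If we do go the worker route, here is a minimal sketch of our existing JSZip code running off the main thread (the worker file name and the `saveEntry` handler are hypothetical; note this still buffers the whole archive in worker memory, so it addresses jank but not memory pressure):

```js
// unzip-worker.js (hypothetical file name) — runs JSZip off the main thread
importScripts('jszip.min.js');

self.onmessage = async (event) => {
  // event.data is the raw archive as an ArrayBuffer, transferred from the page
  const zip = await JSZip.loadAsync(event.data);
  for (const name of Object.keys(zip.files)) {
    const entry = zip.files[name];
    if (entry.dir) continue;
    const bytes = await entry.async('uint8array');
    // Transfer each entry's buffer back to the page without copying
    self.postMessage({ name, bytes }, [bytes.buffer]);
  }
  self.postMessage({ done: true });
};
```

```js
// Main thread: hand the archive to the worker and collect entries as they arrive
const worker = new Worker('unzip-worker.js');
worker.onmessage = ({ data }) => {
  if (data.done) worker.terminate();
  else saveEntry(data.name, data.bytes); // saveEntry is app-specific (hypothetical)
};
worker.postMessage(zipArrayBuffer, [zipArrayBuffer]);
```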
Another option here is stronger support for remote URL assets, and/or embedding things like the YouTube video player. Video assets are likely to be the largest files we deal with.
We are emulating FileWriter streams on Cordova, which I think is a compromise based on what we could achieve with the API. The solution might be to find a Cordova zip plugin and handle it that way; it means forking the code, but we already had to do that for the emulation.
Some more research on this issue:
Further options are native compression/decompression, which may have already landed in Chrome: https://github.com/WICG/compression/blob/master/explainer.md (not yet landed in Safari), and WebAssembly: https://github.com/drbh/wasm-flate and https://github.com/nika-begiashvili/libarchivejs
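For reference, the native API from that explainer only exposes raw gzip/deflate streams (parsing the zip container would still be on us), and it needs feature detection given the Safari gap. A sketch:

```js
// Decompress a gzip stream with the native Compression Streams API
// (Chrome-only at the time of writing — feature-detect before relying on it)
async function fetchGzipped(url) {
  if (typeof DecompressionStream === 'undefined') {
    throw new Error('Compression Streams API not supported');
  }
  const response = await fetch(url);
  const decompressed = response.body.pipeThrough(new DecompressionStream('gzip'));
  return new Response(decompressed).arrayBuffer();
}
```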
I am using zip.js for very big files; it is the only JS zip library I have found that can seek to the end of the zip to read the file directory without having to read the whole zip file. Even stream-based implementations cannot do that. So the maximum memory requirement is the size of the buffer, not the size of the zip archive, nor of the file you want to extract (unless you extract to memory). It uses a configurable buffer size (you can edit it in the code), and you just need to implement the Reader and Writer interfaces for whatever you want to read from and write to. I wrote chunked ArrayBuffer Readers and Writers for IndexedDB, allowing me to unzip gigabytes of data to and from IndexedDB reliably. It may be 'abandoned', or it may just 'work' and not need any updates. It has worked without any issues for me in production code for years, and it lets me handle files that are too big for JSZip.
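For anyone following along, a hedged sketch of what such a custom Reader looks like against zip.js's legacy Reader interface (method names per the legacy zip.js docs; check your version, as later APIs are promise-based). `readChunk` is a hypothetical helper that resolves a Uint8Array from wherever the archive bytes live, e.g. IndexedDB:

```js
// Custom zip.js Reader backed by chunked storage
function ChunkedReader(totalSize) {
  this.size = totalSize;

  this.init = function (callback /*, onerror */) {
    callback(); // nothing to set up beyond knowing the archive size
  };

  this.readUint8Array = function (index, length, callback, onerror) {
    // readChunk(index, length) -> Promise<Uint8Array> (hypothetical helper)
    readChunk(index, length).then(callback, onerror);
  };
}
ChunkedReader.prototype = new zip.Reader();
```

zip.js then drives this via `zip.createReader(new ChunkedReader(size), ...)`, pulling one buffer at a time.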
@keean - thanks for weighing in on this! I really appreciate your input. The issue we are trying to address here is both memory use (which, as you say, can partially be managed by manipulating the buffer, using streams for backpressure, and creating adapters for the input and output 'write' mechanism) and also the UI jank caused by unzipping in a single-threaded context. The latter is probably the higher priority for us. Do you have any insight into unzipping in a different process, via web worker? We are struggling with buffered reading specifically on Cordova. As you can see from Stuk/jszip#555, JSZip actually does inadvertently load large amounts of data into memory in some circumstances, which can cause our app to be killed by iOS in particular. We need to stream both zip reads and writes, and we only ever want to read/write the entire archive (never just the directory listing, for example). Any thoughts on this?
Yes, zip.js runs the pako compressor/decompressor in a web worker, so you don't need to do anything special; this is automatic (it can be disabled for single-threaded use as well). I find it works on iOS to decompress multi-gigabyte zip files. It's also pretty fast, as pako is written in asm.js and designed to be as fast as possible in JS. I find using stream backpressure very unreliable and 'removed' from the actual control needed. With the buffering technique used in zip.js, memory control is very precise, and you can reduce memory requirements to the size of a single buffer if you implement your Reader and Writer correctly.

What zip.js does is quite simple: it reads a buffer from your 'Reader', processes it through the pako decompression using a web worker, and then passes the buffer to the 'Writer'. If your Reader and Writer can operate on the same buffer size as zip.js, this never uses more than one buffer-size of memory to process the whole file. The only issue is that to decompress a file in the zip, you must read the directory at the end first; there is no way around this, because it contains all the metadata about the file. What you might be able to do is this:
But this all seems a bit backwards to me. I would need to understand what you are actually trying to do, but instead of streaming I would recommend:
Now the client can get the directory first, and then download and decompress each file in the archive to a local file.
@keean - Thanks again for providing your insight into this issue. Very much appreciated. It sounds like your approach will definitely work for us.
Currently, unzipping large protocol files hangs the render thread on Cordova (which is single-threaded) when using JSZip.
zip.js implements concurrent threaded unzip using web workers, which are supported on all platforms.
Additionally, unzipping large protocol files causes app crashes on Cordova (particularly iOS), believed to be due to memory constraints. Moving to zip.js may help alleviate that issue.
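A hedged sketch of what adopting zip.js could look like for us (configuration, reader, and writer names per the legacy zip.js API; `protocolZipBlob`, the worker scripts path, and `saveAsset` are assumptions):

```js
// zip.js offloads inflate/deflate (pako) to web workers by default;
// point it at its worker scripts and read the central directory first.
zip.useWebWorkers = true;              // the default; set false for single-threaded use
zip.workerScriptsPath = 'lib/zipjs/';  // hypothetical path to z-worker.js etc.

zip.createReader(new zip.BlobReader(protocolZipBlob), (reader) => {
  reader.getEntries((entries) => {
    // Only the directory has been read so far, so memory use stays bounded.
    entries.forEach((entry) => {
      if (!entry.directory) {
        entry.getData(new zip.BlobWriter(), (blob) => saveAsset(entry.filename, blob));
      }
    });
  });
}, (error) => console.error(error));
```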