Generating huge files on client #394

Open
Pacerier opened this issue Feb 5, 2017 · 6 comments

Comments

@Pacerier

Pacerier commented Feb 5, 2017

How do you pipe huge files on the client?

I'm cutting files using a loop of:

var reader = new FileReader();
reader.readAsText(file.slice(pos, pos + x));

but where does the resulting output string go? How do I pipe that output string into the .pipe of:

zip
.generateNodeStream({type:'nodebuffer',streamFiles:true})
.pipe(fs.createWriteStream('out.zip'))
.on('finish', function () {
    // JSZip generates a readable stream with a "end" event,
    // but is piped here in a writable stream which emits a "finish" event.
    console.log("out.zip written.");
});

Aside from that, I'm getting the error "nodestream is not supported by this platform".

@Pacerier Pacerier changed the title Piping huge files on client Generating huge files on client Feb 5, 2017
@jimmywarting
Contributor

Screw-FileReader is a cross-browser solution for turning a blob into a web stream,
something JSZip is considering adding: #345

Another solution is to turn the blob into a stream using Response (part of the fetch API) and let it do most of the work for you:

var stream = new Response(blob).body

The advantage of web streams is that you don't have to ship the whole node stream + node buffer libs to the browser...
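
For illustration, a minimal sketch of consuming such a web stream, assuming a Blob/File named blob obtained elsewhere; the reader here is the standard WHATWG streams API, not part of JSZip:

// Turn a Blob/File into a WHATWG ReadableStream without FileReader.
var stream = new Response(blob).body;
var reader = stream.getReader();

function readChunk() {
    return reader.read().then(function (result) {
        if (result.done) {
            console.log('stream finished');
            return;
        }
        // result.value is a Uint8Array holding the next chunk of bytes
        console.log('got chunk of ' + result.value.length + ' bytes');
        return readChunk();
    });
}

readChunk();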

@jimmywarting
Contributor

Either way, this might also interest you: #343

@Pacerier
Author

Pacerier commented Feb 5, 2017

This code:

zip
.generateInternalStream({type:"uint8array"})
.on('data', function (data, metadata) {
    console.log('data');
    console.log(data);
})
.on('error', function (e) {
    console.log('error');
})
.on('end', function () {
    console.log('end');
})
.resume();

controls the output, but how do I control the input?

My input isn't this:

	var zip = new JSZip();
	zip.file("Hello.txt", "Hello World\n");
	var img = zip.folder("images");

It is this:

// ..
window.webkitRequestFileSystem(window.PERSISTENT, huge_number, function(fs) {
	fs.root.getFile('asdfgh', {create: false, exclusive: true}, function(fe) {
		fe.file(function(file) {
			// ..

			var my_input = file; // this file is my input // warning: huge file alert!

So I managed to cut the huge file up into small pieces using a loop of file.slice(pos, pos + x).

However, how do I insert those small pieces into the JSZip object?
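
For reference, a rough sketch of the sequential slicing loop described above, with a hypothetical CHUNK_SIZE and readAsArrayBuffer instead of readAsText so binary data survives; how to hand each piece to JSZip without buffering the whole file is exactly the open question here:

var CHUNK_SIZE = 1024 * 1024; // hypothetical 1 MiB slice size
var pos = 0;

function readNextSlice(file) {
    if (pos >= file.size) {
        console.log('done reading ' + file.size + ' bytes');
        return;
    }
    var reader = new FileReader();
    reader.onload = function (e) {
        var chunk = e.target.result; // ArrayBuffer for this slice
        // ...this is where each piece would need to be fed to JSZip...
        pos += CHUNK_SIZE;
        readNextSlice(file);
    };
    reader.readAsArrayBuffer(file.slice(pos, pos + CHUNK_SIZE));
}

readNextSlice(my_input);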

@amagliul

Hi @Pacerier and @jimmywarting,
My apologies for resurrecting this old thread, but did either of you ever figure out an option for this?

Goal: Allow a user to upload multiple files/folders of any size, via drag-and-drop, from their local file system into their browser.

Proposed Solution: Create a zip file of the desired files/folders and chunk that up into multiple POSTs to the server. The server then assembles these chunks into a single zip file that can be extracted.

I have gotten this to work, and it works very well with a small enough set of items, but ....

Problem: zip.file() will end up reading all of the file data into an arraybuffer in memory, as it prepares each file (https://github.com/Stuk/jszip/blob/master/dist/jszip.js#L3471)

This ends up ballooning the browser's memory use. A single file over 4GB will break things completely. A large set of smaller items will use memory equal to the total size of the files (8GB of files = 8GB of memory use, for example).

Is there a way to avoid this? The ability of generateInternalStream to stream with streamFiles: true is great, but it doesn't seem very useful when the files added to the zip are all loaded into memory beforehand.
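
For context, a rough sketch of the chunked-upload idea, using generateInternalStream as shown earlier in the thread; the /upload-chunk endpoint and its index/done parameters are hypothetical, and zip.file() still buffers each input file beforehand, which is the problem described above:

var chunkIndex = 0;

zip
.generateInternalStream({type: 'uint8array', streamFiles: true})
.on('data', function (data, metadata) {
    // data is a Uint8Array piece of the zip; metadata.percent reports progress.
    // A real implementation would pause() the stream and resume() it once each
    // POST resolves, so chunks arrive at the server in order.
    fetch('/upload-chunk?index=' + chunkIndex++, {
        method: 'POST',
        body: data
    });
})
.on('end', function () {
    fetch('/upload-chunk?done=1', {method: 'POST'});
})
.on('error', function (e) {
    console.error(e);
})
.resume();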

@jimmywarting
Contributor

jimmywarting commented Oct 10, 2018

@amagliul I made a PR to solve this very issue: #555
But nothing has happened here in a year.

@amagliul

Thanks @jimmywarting! This looks like exactly what I figured should be happening. They should definitely merge this in.
