RE: API to predict final zipfile size #71

Closed
sntran opened this issue Dec 27, 2021 · 2 comments

Comments

@sntran

sntran commented Dec 27, 2021

This is a follow-up to #1.

I see that the final zipfile size is available in the callback passed to .end. However, I would like to know that size well before adding any files, so I can pass it along the pipeline before creating the archive.

Would it be possible to add another API that returns the predicted size for a list of inputs whose sizes are provided?

For example:

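const fs = require("fs");
const path = require("path");

const dir = "my/folder";

// yazl.size is the API being proposed here; it does not exist yet.
yazl.size(
  fs.readdirSync(dir, { withFileTypes: true }).map(dirEntry => {
    return {
      ...dirEntry,
      // Resolve against the folder; a bare name would be looked up in the current directory.
      size: fs.statSync(path.join(dir, dirEntry.name)).size,
    };
  }),
  {
    compress: false,
  },
);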

Basically, it takes a list of file-like entries, with required name and size properties, and returns the final zipfile size.

It's fine to return -1 when the size can't be determined, but when it can be (for example, when the size option is passed along with the stream), the size should be computed. It would be nicer for yazl to expose this number, so that the calculation matches the way yazl actually builds the archive.

sntran changed the title from "Predicts the" to "RE: API to predict final zipfile size" on Dec 27, 2021
@pklapperich

Input file size isn't sufficient to predict the size of the output file. A file that's random data or already compressed will often not compress at all and sometimes the resulting zip file is larger than the inputs. Text files usually compress a lot. The only time you'd get correct predictions is when you create a zip file with compression disabled.
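To illustrate the compression-disabled case, here is a rough sketch of how the size of a stored-only archive follows from names and sizes alone. `storedZipSize` is a hypothetical helper, not a yazl API. It assumes the plainest ZIP layout: no data descriptors, no extra fields, no comments, and no ZIP64. A writer that adds any of those (yazl may, depending on how entries are added) will produce a larger file, so the library's own figure should be preferred.

```javascript
// Fixed record sizes defined by the ZIP format, in bytes.
const LOCAL_HEADER = 30;        // local file header, before the file name
const CENTRAL_HEADER = 46;      // central directory header, before the file name
const END_OF_CENTRAL_DIR = 22;  // end of central directory record, no comment

// Hypothetical helper (not part of yazl). Assumes stored (uncompressed) entries,
// no data descriptors, no extra fields, no comments, and no ZIP64.
function storedZipSize(entries) {
  let total = END_OF_CENTRAL_DIR;
  for (const { name, size } of entries) {
    const nameBytes = Buffer.byteLength(name, "utf8");
    total += LOCAL_HEADER + nameBytes + size; // local header + name + raw data
    total += CENTRAL_HEADER + nameBytes;      // central directory entry + name
  }
  return total;
}

console.log(
  storedZipSize([
    { name: "a.txt", size: 5 },
    { name: "b/c.bin", size: 1000 },
  ])
); // 1203
```

With compression enabled, the data term is unknown until the bytes have actually been deflated, which is why no such formula exists for that case.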

I'm not sure, but I think browsers will truncate if you tell the browser "it's 10,000 bytes" but then actually give it 10,005 bytes. I'm not sure that anything bad happens if the download terminates early, so maybe it's safe to "predict" that there's no compression; the user will just think the download is going to take a lot longer than it does, until it ends abruptly.

This package doesn't appear to be maintained.

@thejoshwolfe
Owner

Hi @sntran. Sorry for the delayed response.

The use case you're describing sounds like what finalSizeCallback is already designed for. I added a paragraph to the documentation describing the use case of emitting the Content-Length header in a web server before piping any of the archive contents.

I think the name finalSizeCallback is causing a lot of confusion. It sounds like it would be called when everything is done, when really it's called once all the entries are queued up and their metadata is loaded. That was probably a poor naming choice on my part.
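The Content-Length use case could be sketched roughly as follows. `sendZip` is a hypothetical helper, not something yazl provides. It assumes `zipfile` is a yazl ZipFile whose entries have already been added with `compress: false` (and with sizes known), and that `res` is a Node.js HTTP response. Only `end` with a callback and `outputStream` are taken from yazl.

```javascript
// Hypothetical helper, not part of yazl. `zipfile` is assumed to be a yazl
// ZipFile with all entries already added; `res` an http.ServerResponse.
function sendZip(zipfile, res) {
  // The callback fires once every entry is queued and its metadata is loaded,
  // which is before any archive bytes have been consumed from outputStream.
  zipfile.end((finalSize) => {
    // -1 means the size could not be determined ahead of time
    // (for example, when compression is enabled).
    if (finalSize !== -1) {
      res.setHeader("Content-Length", finalSize);
    }
    res.setHeader("Content-Type", "application/zip");
    zipfile.outputStream.pipe(res);
  });
}
```

Piping only starts inside the callback so that the headers are set before the first byte of the archive goes out.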


3 participants