
Provide functionality for calculating MD5 hashes of files #48

Open
janko opened this issue Apr 25, 2018 · 3 comments

janko commented Apr 25, 2018

First of all, thanks a lot for creating this very useful library! 🙏

I recently needed to calculate an MD5 hash of a File object, and while I saw the section in the README showing how to do that, I really didn't like how much custom code it involves.

I was wondering: could this functionality maybe be part of the library? For comparison, Ruby has a Digest::MD5 class which supports calculating a hash from a single string, incremental hashing in chunks, and calculating a hash from a file on disk:

Digest::MD5.hexdigest("string")
# or
md5 = Digest::MD5.new
md5.update("chunk1")
md5.update("chunk2")
md5.hexdigest
# or
Digest::MD5.file("/path/to/file").hexdigest

It took me quite a while to find a JavaScript library which simplifies reading a File object in chunks – chunked-file-reader – and it appears to work correctly (I get the same MD5 hash as with the snippet in the README here). So I came up with the following function:

function fileMD5 (file) {
  return new Promise(function (resolve, reject) {
    var spark  = new SparkMD5.ArrayBuffer(),
        reader = new ChunkedFileReader();

    // Feed each chunk of the file into the incremental hasher.
    reader.subscribe('chunk', function (e) {
      spark.append(e.chunk);
    });

    // Once the whole file has been read, finalize the hash.
    reader.subscribe('end', function (e) {
      var rawHash    = spark.end(true); // raw binary digest
      var base64Hash = btoa(rawHash);   // base64-encode it

      resolve(base64Hash);
    });

    reader.readChunks(file);
  });
}

Since it took me a while to come up with this solution, I was wondering if it made sense to have that built into spark-md5.


janko commented Apr 25, 2018

If not, I think it would be nice to show this example in the README, so that people are more willing to copy-paste it into their projects.


satazor commented Apr 26, 2018

Hello @janko-m. This library was primarily made to be used in browser-like environments. While it works in Node, using the native crypto module will be much faster.

Having a method to read files and calculate their hash would have to account for how files are actually read in each environment: browser-like or Node. Because of that, I don't think it makes much sense to have it built in.

I'm willing to improve the README to make it clear how to calculate the hash of a file in both browser and node environments. Could you make a PR to add examples of both environments? The current example is for a browser environment.

Does that make sense?


janko commented Apr 27, 2018

Hey @satazor, thanks for the quick answer.

Having a method to read files and calculate their hash would have to account for how files are actually read in each environment: browser-like or Node. Because of that, I don't think it makes much sense to have it built in.

I was under the impression that this library was already considered "browser-only", because, as you said, for Node there is already the crypto module (and hasha, which uses it). So I'm not sure I fully understand: if functionality for calculating a hash from a JavaScript File object were added, why would that mean it also has to support Node? If that's really the case, then I agree it wouldn't make much sense.

I'm willing to improve the README to make it clear how to calculate the hash of a file in both browser and node environments. Could you make a PR to add examples of both environments? The current example is for a browser environment.

My intention was only to simplify the browser example that's already there, since I only have experience using spark-md5 in the browser. Great, I'll send the PR then 👍

janko added a commit to janko/js-spark-md5 that referenced this issue Apr 27, 2018
The chunked-file-reader comes with the functionality of reading a file
in chunks, so we can simplify the file example a lot by offloading this
logic to that package. I think this will make it much more approachable
for people wanting to reuse that code.

The chunked-file-reader package uses `readAsArrayBuffer()`, and we cannot
use it for tests that use `readAsBinaryString()`.

Also, chunked-file-reader always uses `File.prototype.slice`, but I think that's OK now, since `blob.mozSlice()` is only needed for Firefox 12 and earlier. I don't know in which version Safari started supporting `File.prototype.slice`, but I tested that it works on Safari 11, which is the current latest version.

Closes satazor#48