use a byte sequence instead of a buffer #40

Closed
3 tasks done
jimmywarting opened this issue Jun 4, 2020 · 0 comments · Fixed by #44
jimmywarting commented Jun 4, 2020

Imagine how you would go about solving this:

const File = require('fetch-file') // creates a File-like IDL wrapper around fs
const Blob = require('fetch-blob')

const file = File.from('./package.json')
new Blob([ file ])

The Blob constructor spec says:

  1. Let bytes be an empty sequence of bytes. (it's not like a long ArrayBuffer holding all data)
  2. For each element in parts:
    1. ...
    2. ...
    3. If element is a Blob, append the bytes it represents to bytes.

Step 2.3 doesn't say to read the blob and copy its content into a new instance. Besides, fetch-blob can't read the content of the File synchronously, so it has no way to handle such blob parts in the constructor. The slicing method is also wrong: it should not slice the Buffer like fetch-blob does today, it should instead keep the byte sequences and record a new offset for where it should start reading from. This also means that arrayBuffer, text and stream are wrong too. fetch-blob should instead read each blobPart individually at a later stage, so it can handle a File; the constructor should not read the file content.
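Roughly, a constructor along these lines could keep the parts around and defer all reading. This is a minimal sketch, not fetch-blob's actual internals; the PARTS/SIZE symbols and the part handling are assumptions, and string parts are omitted for brevity:

const PARTS = Symbol('parts')
const SIZE = Symbol('size')

class Blob {
  constructor (blobParts = []) {
    // Only store references here; never read a part's content in the
    // constructor, since a File-backed part can only be read asynchronously.
    this[PARTS] = [...blobParts]
    this[SIZE] = this[PARTS].reduce((total, part) =>
      total + (typeof part.size === 'number' ? part.size : part.byteLength), 0)
  }

  get size () { return this[SIZE] }

  async arrayBuffer () {
    // Read each blobPart individually at this later stage and concatenate.
    const result = new Uint8Array(this[SIZE])
    let offset = 0
    for (const part of this[PARTS]) {
      const chunk = typeof part.arrayBuffer === 'function'
        ? new Uint8Array(await part.arrayBuffer()) // Blob/File-like part
        : ArrayBuffer.isView(part)
          ? new Uint8Array(part.buffer, part.byteOffset, part.byteLength)
          : new Uint8Array(part) // plain ArrayBuffer
      result.set(chunk, offset)
      offset += chunk.byteLength
    }
    return result.buffer
  }

  async text () {
    return new TextDecoder().decode(await this.arrayBuffer())
  }
}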


When you take 2x2 GiB of files from an <input type="file" multiple> and concatenate them with b = new Blob([fileA, fileB]), you don't end up with 4 GiB in RAM; you get a new blob container with two references pointing to where each snapshot is located. And if you slice it with b.slice(0, -500), you get one reference to fileA and another reference to fileB with an offset marking what it should represent: [[start to end of fileA], [start to end minus 500 of fileB]]

image taken from: https://docs.google.com/presentation/d/1MOm-8kacXAon1L2tF6VthesNjXgx0fp5AP17L7XDPSM/edit

Originally posted by @jimmywarting in node-fetch/node-fetch#835 (comment)

Things that need fixing:

  • this[BUFFER] should be this[PARTS] since you can't concatenate all parts
  • read methods should loop over all parts and concatenate them into one stream/text/arraybuffer, taking the internal offset into consideration
  • slicing becomes more complicated since you are now dealing with parts instead of a buffer
    • what really should be done is just copying all parts (cloning the blob) and changing the internal this[OFFSET] from where it should start reading (see the sketch after the example below)
var b1 = new Blob([new Uint8Array(1000)])
var b2 = b1.slice(500) // uses the same parts with an offset=500
// b2 should not take up 500 more bytes of RAM...
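A slice over parts could then look something like this. It's a hand-written sketch, not fetch-blob's real code: sliceParts is a hypothetical helper, and start/end are assumed to already be normalized to absolute positions (negative indices resolved):

function sliceParts (parts, start, end) {
  const sliced = []
  let position = 0
  for (const part of parts) {
    const size = typeof part.size === 'number' ? part.size : part.byteLength
    if (position >= end) break
    if (position + size > start) {
      // Keep a reference to the visible sub-range of this part instead of
      // copying bytes: Blob#slice is itself lazy, and subarray is a view.
      const relativeStart = Math.max(start - position, 0)
      const relativeEnd = Math.min(end - position, size)
      sliced.push(typeof part.subarray === 'function'
        ? part.subarray(relativeStart, relativeEnd) // typed-array view, no copy
        : part.slice(relativeStart, relativeEnd))   // Blob/File-like, lazy slice
    }
    position += size
  }
  return sliced
}

With that in place, b2 above ends up as a 500-byte subarray view onto the same underlying ArrayBuffer that b1 references, so no additional bytes are allocated.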