New File System API #11018

Open
ry opened this issue Jun 17, 2021 · 25 comments
Labels
runtime Relates to code in the runtime crate suggestion suggestions for new features (yet to be agreed) web related to Web APIs

Comments

@ry
Member

ry commented Jun 17, 2021

This API would live in parallel with the existing low-level file system API provided by Deno.open, Deno.write, Deno.read, Deno.readDir, ...

Writing to a file:

const fileHandle = await Deno.getFileHandle("/tmp/foo.txt"); // Returns FileSystemFileHandle 
const writable = await fileHandle.createWritable();
const writer = writable.getWriter();
await writer.write(new TextEncoder().encode("Hello World"));
await writer.close();

Also directory listing:

const directoryHandle = await Deno.getDirectoryHandle("/etc"); // Returns FileSystemDirectoryHandle 
for await (let [name, handle] of directoryHandle) {
  // ...
}
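
A directory handle only lists its direct children, so a recursive listing has to descend into each sub-directory handle itself. The following is a minimal sketch, assuming handles follow the File System Access shape used in the loop above (a `kind` property and async iteration yielding `[name, handle]` pairs). `walk`, `listFiles` and the in-memory `mockDir` helper are hypothetical names, introduced only so the example runs without a real file system:

```javascript
// Hypothetical in-memory stand-in for a FileSystemDirectoryHandle.
// Nested plain objects are directories; strings are file contents.
function mockDir(tree) {
  return {
    kind: "directory",
    async *[Symbol.asyncIterator]() {
      for (const [name, value] of Object.entries(tree)) {
        const handle = typeof value === "string"
          ? { kind: "file", name }
          : mockDir(value);
        yield [name, handle];
      }
    },
  };
}

// Recursively yields the relative path of every file below `dirHandle`.
async function* walk(dirHandle, prefix = "") {
  for await (const [name, handle] of dirHandle) {
    const path = prefix + name;
    if (handle.kind === "directory") {
      yield* walk(handle, path + "/");
    } else {
      yield path;
    }
  }
}

// Collects all paths into an array, e.g. for logging.
async function listFiles(dirHandle) {
  const paths = [];
  for await (const path of walk(dirHandle)) paths.push(path);
  return paths;
}
```

With a real `FileSystemDirectoryHandle`, the same `walk` function should apply unchanged, since it only relies on `kind` and async iteration.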

https://wicg.github.io/file-system-access/

Or maybe we have a Deno name-spaced directoryHandle corresponding to the cwd.

const fileHandle = await Deno.currentDirectory.getFileHandle("README.md");

// maybe root directory too?
const passwdHandle = await Deno.rootDirectory.getFileHandle("etc/passwd");
@ry ry added the suggestion suggestions for new features (yet to be agreed) label Jun 17, 2021
@lucacasonato lucacasonato added runtime Relates to code in the runtime crate web related to Web APIs labels Jun 17, 2021
@lucacasonato
Member

This is blocked on #10969

@sebastienfilion
Contributor

The example creates a blob and does nothing with...

const blob = new Blob(["Hello World"]);
const fileHandle = await Deno.getFileHandle("/tmp/foo.txt"); // Returns FileSystemFileHandle 
const writable = await fileHandle.createWritable();
const writer = writable.getWriter();
await writer.write(new TextEncoder().encode("Hello World"));
await writer.close();

@jimmywarting
Contributor

jimmywarting commented Jun 17, 2021

Or maybe we have a Deno name-spaced directoryHandle corresponding to the cwd.

Maybe this could return the cwd (or a complete sandboxed path?):

navigator.storage.getDirectory()

It's your own sandboxed directory that you can read from and write to without any permission prompts.


The getFile(path, [mimetype]) method I created in #10969 would do pretty much the same thing as Deno.getFileHandle("/tmp/foo.txt").then(fileHandle => fileHandle.getFile())

@josephrocca
Contributor

josephrocca commented Jun 19, 2021

The Deno.rootDirectory approach seems best. But if the developer didn't give root read/write access, then it'd probably make sense to also give handles for the parameters of --allow-read=... and --allow-write=... via something like Deno.readableDirectories and Deno.writableDirectories? (Edit: Oh, but now I see that in ry's example, Deno's getFileHandle could maybe take a file path instead of just a name - the proposed Deno.rootDirectory and Deno.currentDirectory make more sense to me now)

As I mentioned in #2456 (comment), it would be great if code like this worked in Deno:

let dir = await globalThis.showDirectoryPicker();
let file = await dir.getFileHandle("hello.jpg", { create: true });
await blobToSave.stream().pipeTo(await file.createWritable());

In the browser, the above code triggers a directory picker (like a normal save-as dialogue) and then two permission prompts (one for reading, and one for writing). I'm not sure if there's a common UX/UI for directory-picking in the terminal (other than pasting a file path), but I like ncdu's simple/obvious UI - up/down to move between files and folders in a directory, and right/left to move into and out of a folder.

For UX/UI reference, here's the dialogue after selecting a directory called "inpaint" in the directory picker:

image (25)

And here's the dialogue after attempting to write content to that directory:
image (26)

I haven't used permission requests in Deno, but I imagine the UX flow could be similar? (Unless the program already has permission to read/write to that directory).

I like that the file/directory picker is separated from actually asking for permission to read/write to the selected path, because it makes it harder to accidentally give unintended permissions to a program.

@kt3k
Member

kt3k commented Jun 28, 2021

Mozilla seems to have marked the File System Access API as 'harmful' recently (mozilla/standards-positions#545), though they say there's a subset of the API they're quite enthusiastic about.

@lucacasonato
Member

I think it is just the method of opening files, because they don't consider the permission prompts strict enough. The actual interface seems to be relatively uncontroversial.

@crowlKats
Member

crowlKats commented Jul 6, 2021

So I started implementing this, and it's not a great API for the case of proper fs access, especially interaction with directories. removeEntry isn't great: you can't really delete the dir the directory handle is in, traversing directories is horrible, and you can't go up from a directory handle. Paths aren't allowed anywhere in the API, just file names; one is supposed to traverse dirs by async iteration or by the name of the dir, and the whole API is designed around this concept. This and some other things are just convoluted to use. Sure, some of these things can be worked around, or we could ignore parts of the spec, but it seems like it will be a horrible user experience in Deno. This API is not suitable for what we want to replace.
In my opinion this is a step in the wrong direction. Yes, we can use web streams for fs-based things, but this API is not the solution. We should look into an alternative solution, even if it is not a standard spec. Using web APIs is what Deno strives for, even if they aren't the best user experience, but for something as common as fs interaction, it is key that it has a good user experience and is easy to use. The overall structure of this API isn't horrible; it's just that the usage of the methods, the finer structure, and the intention behind it are what make it somewhat problematic.
That all said, I do think we should look a bit more into this API, as my statements might be wrong to some degree.

@josephrocca
Contributor

the usage of the methods and finer structure and intention behind it is what makes it somewhat problematic.

@crowlKats This is interesting - do you have an insight into why the API's intention differs from what Deno would want? A lot of smart people worked on the API, and it has gone through several iterations, so presumably a bunch of thought has gone into the API design, and so I'm curious if you have any thoughts on whether this API is badly designed for the browser use-case too? Or do you think there is some key difference between the two environments (Deno vs browser) that makes it fine for the browser use case, but bad for Deno?

Paths aren't allowed anywhere in the API, just file names; one is supposed to traverse dirs by async iteration or by the name of the dir, and the whole API is designed around this concept. This and some other things are just convoluted to use.

Is this perhaps just a general feature of a low-level file system API? I.e. do these concepts of directory handles and file names map more closely to OS filesystem APIs than the paths stuff you mentioned? I don't have a lot of experience with file system stuff, so I'm mostly just guessing here.

you can't really delete the dir the directory handle is in, traversing directories is horrible, and you can't go up from a directory handle

I think the API is still in an MVP/V1 stage - see issues here: V2 milestone. Amongst those V2 issues there is a discussion around accessing the parent directory from a directory handle, for example.

@crowlKats
Member

do you have an insight into why the API's intention differs from what Deno would want? A lot of smart people worked on the API, and it has gone through several iterations, so presumably a bunch of thought has gone into the API design, and so I'm curious if you have any thoughts on whether this API is badly designed for the browser use-case too? Or do you think there is some key difference between the two environments (Deno vs browser) that makes it fine for the browser use case, but bad for Deno?

The API is fine for its intended use; in a browser you aren't going to do what one would do in Deno when interacting with the file system. Or rather, for a small set of directories this API is fine, but the moment you want to do something with multiple nested directories, it gets quite painful.

Is this perhaps just a general feature of a low-level file system API? I.e. do these concepts of directory handles and file names map more closely to OS filesystem APIs than the paths stuff you mentioned? I don't have a lot of experience with file system stuff, so I'm mostly just guessing here.

What i mean is:
We want to open file foo/bar/hello/world.txt. To do that you first have to open the foo dir, then from the foo dir handle the bar dir handle, and then the hello dir handle, and then you can open the world.txt handle. You can't directly open the path, you have to traverse each directory to reach what you want.
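
The traversal described here can be hidden behind a small helper. This is only a sketch, assuming directory handles expose `getDirectoryHandle(name)` and `getFileHandle(name)` as in the File System Access spec. `resolveFileHandle` and the in-memory `mockDir` are hypothetical names used for illustration, and the mock throws a plain `Error` where a real implementation would reject with a `DOMException`:

```javascript
// Hypothetical in-memory directory handle exposing the two lookup methods
// of FileSystemDirectoryHandle. Nested objects are directories, strings are files.
function mockDir(tree) {
  const lookup = (name, wantDir) => {
    const value = tree[name];
    const isDir = typeof value === "object" && value !== null;
    if (value === undefined || isDir !== wantDir) {
      throw new Error(`"${name}" not found`);
    }
    return value;
  };
  return {
    kind: "directory",
    async getDirectoryHandle(name) {
      return mockDir(lookup(name, true));
    },
    async getFileHandle(name) {
      return { kind: "file", name, contents: lookup(name, false) };
    },
  };
}

// Walks one directory handle per path segment, then opens the file.
async function resolveFileHandle(root, path) {
  const segments = path.split("/").filter((s) => s !== "");
  const fileName = segments.pop();
  if (fileName === undefined) throw new TypeError("empty path");
  let dir = root;
  for (const segment of segments) {
    dir = await dir.getDirectoryHandle(segment);
  }
  return dir.getFileHandle(fileName);
}
```

Each segment costs one awaited call, so a path with n directories needs n + 1 round trips before the file handle is available.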

I think the API is still in an MVP/V1 stage - see issues here: V2 milestone. Amongst those V2 issues there is a discussion around accessing the parent directory from a directory handle, for example.

Oh thanks for pointing that out. I see there are some problems being discussed that I mentioned. Though even that doesn't really fix everything to make this viable in my opinion, given my other points.

@lucacasonato
Member

We want to open file foo/bar/hello/world.txt. To do that you first have to open the foo dir, then from the foo dir handle the bar dir handle, and then the hello dir handle, and then you can open the world.txt handle. You can't directly open the path, you have to traverse each directory to reach what you want.

For this, ry and I suggested Deno.getFileHandle. It takes a path and returns a FileSystemFileHandle.

@jimmywarting
Contributor

jimmywarting commented Jul 6, 2021

I have also started working on a PR for the native file system API. I think it's great in that it's safer than the alternatives.
You could hand a fileHandle to a third-party parser, with or without write access, to parse a file or write a converted one without it knowing the file's full path.
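
One way to picture that capability-style hand-off is a wrapper that forwards only the read side of a handle. This is a sketch under stated assumptions: the handle has a `name` and a `getFile()` method as in the File System Access spec, while `readOnlyView` and `countLines` are hypothetical names that are not part of any proposal in this thread:

```javascript
// Exposes only the name and read access of `handle`; the wrapper has no
// createWritable method and carries no path information.
function readOnlyView(handle) {
  return Object.freeze({
    kind: "file",
    name: handle.name,
    getFile: () => handle.getFile(),
  });
}

// Stand-in for third-party code: it can read the file it is given,
// but it cannot write to it or learn where it lives on disk.
async function countLines(fileLike) {
  const text = await (await fileLike.getFile()).text();
  return text === "" ? 0 : text.split("\n").length;
}
```

The caller keeps the original handle, and with it the ability to write, while the third party only ever sees the frozen view.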

I have already done most of the work, adapting most of https://github.com/jimmywarting/native-file-system-adapter to fit Deno's ecosystem, and already have some of my own tests passing.

As for removing a directory via its own handle: WICG/file-system-access#283

@lucacasonato
Member

What concerns me more than directory traversal is that all writes unconditionally happen via a temporary file. The underlying file is updated atomically on close of the FileSystemWritableFileStream by means of a file move. I am on the fence about how I feel about this.
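
To make the write semantics concrete, here is a small in-memory model of "write to a temporary file, move it into place on close". It is a sketch, not the spec's algorithm: `store` is a hypothetical `Map` standing in for the disk, `createSwapWritable` and the `.swap` suffix are made-up names, and the swap entry starts empty, which I believe matches the default (`keepExistingData: false`) behaviour:

```javascript
// In-memory model of a writable that buffers into a swap entry and only
// replaces the target when it is closed.
function createSwapWritable(store, path) {
  const swapPath = path + ".swap"; // hypothetical temporary name
  store.set(swapPath, "");
  let closed = false;
  return {
    write(text) {
      if (closed) throw new TypeError("stream is closed");
      store.set(swapPath, store.get(swapPath) + text);
    },
    // Commit: the swap entry replaces the target in a single step.
    close() {
      if (closed) return;
      closed = true;
      store.set(path, store.get(swapPath));
      store.delete(swapPath);
    },
    // Abort: discard the swap entry, leaving the target untouched.
    abort() {
      if (closed) return;
      closed = true;
      store.delete(swapPath);
    },
  };
}
```

Readers of the target never observe a half-written file, but every write session pays for a second file and a move, even for a one-byte change.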

@jimmywarting
Contributor

Yeah, it's a concern many people have about the API, and copying/moving and atomic writes are heavily discussed in their issues.

@lucacasonato
Member

Next problem: FileSystemFileHandle.getFile makes an on-disk copy of the entire file to get an immutable view of it. This is unacceptable, even for a high-level FS API. I am not in favour of this API as is anymore.

@jimmywarting
Contributor

jimmywarting commented Jul 6, 2021

Nah, it does not do a disk copy; it only stats the file for its size and last-modified date.
But yeah, it's immutable: if the last-modified date has changed when you call File.stream(), it should throw an error. That is what the browser does.
If you need a fresh view, you have to call getFile again.

My idea was that FileSystemFileHandle.getFile would return a BlobReferences backed by a file on the disk.

@lucacasonato
Member

lucacasonato commented Jul 6, 2021

@jimmywarting That is not safe though.

  1. The on disk file can vanish. The File would then point to a non existent file.
  2. The file might shrink or grow. In this case the File.size getter is wrong and would not match (await File.arrayBuffer()).byteLength.

@jimmywarting
Contributor

jimmywarting commented Jul 6, 2021

The on disk file can vanish. The File would then point to a non existent file.
The file might shrink or grow. In this case the File.size getter is wrong and would not match (await File.arrayBuffer()).byteLength.

That can also happen in the browser with <input type=file>.
You will still have a File instance assigned to some variable. Its modified date and size will always stay the same as when you first got the file. file.size is not a dynamic getter that always returns the current size.

If the file is removed, or its name, size or modified date has changed by the time you call FileReader.readAs___, file.stream() or file.arrayBuffer(), the call throws an error. File.arrayBuffer() would not be able to resolve (as expected).

They do not copy the file: WICG/file-system-access#101 (comment)


If you get a file from <input type=file> and then remove it from the disk, you will notice that you can no longer call file.text().
The same thing happens with a file handle in the browser: showOpenFilePicker().then(h => h[0].getFile())

@lucacasonato
Member

This is a huge issue though, as it is racy:

  1. The user calls FileSystemFileHandle.getFile. This will cause a stat syscall to be performed to get the current size of the file. It will also do an open call to get a handle to the file. The user gets a File with the specified size, linked to the handle.
  2. The user calls File.stream(). This will do another stat call to get the current size (and will throw if this is different to the original size). At this point we start reading bytes from the stream.

What happens if I now shrink the file while the streaming of the file is still ongoing? Is it going to complete, but return the same number of bytes that size says the file has? Is it going to throw?
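
One possible answer to these questions, sketched below, is to re-check the file against its snapshot before streaming starts and after every chunk, and to fail if the total byte count ends up different from the recorded size. This is an illustration under assumptions, not how any browser is known to behave: `snapshotStream` is a hypothetical helper, and `statFn` and `openFn` are hypothetical callbacks standing in for the runtime's stat and open calls.

```javascript
// Wraps a byte source so that reads fail once the file no longer matches
// the { size, mtime } snapshot taken when the File object was created.
function snapshotStream(snapshot, statFn, openFn) {
  let reader;
  let received = 0;
  const check = async () => {
    const now = await statFn();
    if (now.size !== snapshot.size || now.mtime !== snapshot.mtime) {
      throw new Error("file changed since snapshot");
    }
  };
  return new ReadableStream({
    async start() {
      await check();
      reader = (await openFn()).getReader();
    },
    async pull(controller) {
      const { done, value } = await reader.read();
      if (done) {
        // A short read means the file shrank without the stat noticing.
        if (received !== snapshot.size) throw new Error("size mismatch");
        controller.close();
        return;
      }
      received += value.byteLength;
      if (received > snapshot.size) throw new Error("size mismatch");
      await check();
      controller.enqueue(value);
    },
    cancel(reason) {
      return reader.cancel(reason);
    },
  });
}
```

This narrows the race window but does not close it: a modification between a check and the next read can still slip through, and a same-size rewrite that preserves the modification time is invisible to it.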

@jimmywarting
Contributor

jimmywarting commented Jul 6, 2021

I'm not that technical... I do not know how the browser does it. All I know is that it does not copy the file.

Either way, I think it's important that we can get a File instance somehow, for use with FormData and for handling (large) files in a standard way,
whether through the File System Access API or a Deno-namespaced call such as const file = Deno.getFile('./readme.md')

I really hope a Deno.openAsBlob(path, { type }) gets implemented (similar to Node.js's fs.openAsBlob).

@rektide

rektide commented Aug 13, 2021

[Ed: @ry's proposal is already using File System Handles, looks great.]

Access Handle is being developed to allow more direct access to files in the File System Access specification, getting away from the atomic-write and related problems folks have mentioned.

There's also the Storage Foundation API, which has been under development and is similarly intended to provide lower-level access. There's ongoing discussion about how these initiatives might merge.

Edit: the Access Handle and Storage Foundation (aka Origin Private File System/OPFS) work appears to be getting moved to WHATWG. See wicg/FSA#342 and whatwg/sg#171. 🤞

@josephrocca
Contributor

josephrocca commented Feb 23, 2022

Just chiming in here again to say that I've been writing a lot of little utility scripts in the past few weeks using the File System Access API (to clean datasets, reorganise files, etc.) and they all first prompt for a directory to base themselves out of with

let dirHandle = await window.showDirectoryPicker();

It would be really nice if these scripts worked "out of the box" with Deno. This would require an interactive input, like prompt(). As mentioned here, a command line directory/file picker with a UI like ncdu would make sense I think (i.e. use arrow keys to navigate files/directories: up and down to scroll through files in directory, right to enter a sub-directory, left to go to parent, and select using enter key). I think it's this "level" of isomorphism that makes Deno feel so nice to use as a web developer. Something as simple as prompt() working in Deno is pleasantly surprising to newcomers, and I think Deno should lean into that (where it makes sense, of course).

I mention this again not because it's an immediate concern, but just so that (I hope) Deno doesn't go through any one-way doors with the design that would prevent this sort of thing from being implemented in the future. (Please ignore this comment if there's no risk of this.)

@jed

jed commented May 23, 2023

Given that Safari now supports OPFS, and FileSystemSyncAccessHandle seems to address most of the concerns @lucacasonato raised here, perhaps it's worth another look?

@jimmywarting
Contributor

Firefox also has OPFS now.

@disintegrator

disintegrator commented Dec 13, 2023

Wanted to share my experience around this issue. I'm currently working on an SDK which makes multipart/form-data requests using fetch and FormData. These requests typically include files - sometimes large files. We want to be able to stream these files from the filesystem rather than first read their contents into memory before appending to FormData. Here are three examples, two for Node.js and one for Bun, that work out of the box:

// Node.js v20+

import fs from "node:fs";

async function run() {
  const file = await fs.openAsBlob("./sample.txt");
  const fd = new FormData();
  fd.append("file", file);

  const res = await fetch("https://httpbin.org/anything/upload", {
    method: "POST",
    body: fd,
  });

  console.log(await res.json());
}

run();

// Node.js v18
import { fileFrom } from "fetch-blob/from.js";

async function run() {
  const file = await fileFrom("./sample.txt");
  const fd = new FormData();
  fd.append("file", file);

  const res = await fetch("https://httpbin.org/anything/upload", {
    method: "POST",
    body: fd,
  });

  console.log(await res.json());
}

run();

// Bun
async function run() {
  const file = Bun.file("./sample.txt");
  const fd = new FormData();
  fd.append("file", file);

  const res = await fetch("https://httpbin.org/anything/upload", {
    method: "POST",
    body: fd,
  });

  console.log(await res.json());
}

run();

What's especially interesting is how the third-party library fetch-blob produces streamable file handles that are compatible with Node.js' FormData type. This is because the FormData class in undici, Node.js' official Fetch API library, is liberal in what it considers to be a Blob-like object (see here). Deno on the other hand has a very rigid definition which excludes Blob-like objects from third-party libraries - see this line in 21_formdata.js.

I'm wondering if Deno's FormData type can lean more on duck typing to permit third-party, Blob-like objects or perhaps introduce Deno.openAsBlob which supplants third-party libs. Until then, there doesn't appear to be an option to stream large files backed by the filesystem as part of multipart requests.
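
For illustration, a duck-typed check along these lines might look like the sketch below. It is not the code of undici or of Deno; `isBlobLike` here is a hypothetical helper, and the exact set of properties it probes is an assumption about what a consumer such as FormData would need:

```javascript
// Duck-typed check: accepts real Blobs plus objects that look like one.
function isBlobLike(value) {
  if (value instanceof Blob) return true;
  return (
    typeof value === "object" &&
    value !== null &&
    typeof value.size === "number" &&
    typeof value.type === "string" &&
    typeof value.stream === "function" &&
    typeof value.arrayBuffer === "function" &&
    // Third-party implementations usually tag themselves as Blob or File.
    /^(Blob|File)$/.test(value[Symbol.toStringTag])
  );
}
```

The trade-off is that a duck-typed object can misreport its `size` or return a misbehaving stream, so a runtime that accepts such objects has to treat them as untrusted input.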

@jimmywarting
Contributor

jimmywarting commented Dec 13, 2023

Hi @disintegrator,
creator/author of fetch-blob here.

fetch-blob was created long before Node.js decided to ship a Blob/File implementation of its own,
so it could not extend any built-in class, as in class File extends Blob { ... }

I think Blob was introduced some time around Node.js v16, if I remember correctly.

Between then and the arrival of openAsBlob in v20+, I saw some people try to create their own version of disk-based blobs/files before they discovered that fetch-blob existed,
and they solved this a bit differently than fetch-blob did:
they extended the native Blob built into Node.js and overrode .size, .stream() and .slice() instead.

This approach has a slight advantage: because the results are instances of the built-in Blob, other tools that consume blobs are more likely to accept them.

A while back I tried something similar to get disk-based blobs/files in Deno:

export const openAsFile = async fileUrl => {
  const url = new URL(fileUrl)
  const basename = url.pathname.split('/').pop()
  // Deno.stat reads the metadata without opening (and having to close) the file
  const stat = await Deno.stat(url)
  if (!stat.isFile) throw new Error('not found')
  return new DiskFile([], basename, { path: url, stat })
}

const DiskFile = class File extends globalThis.File {
  #path = ''
  // own properties that shadow the inherited size/lastModified getters
  size = 0
  lastModified = 0

  constructor (parts, fileName, options) {
    // the parts are ignored: the content lives on disk, not in memory
    super([], fileName, options)
    const stat = options.stat
    this.#path = options.path
    this.size = stat.size
    // mtime can be null on platforms that do not report it
    this.lastModified = stat.mtime ? +stat.mtime : 0
  }

  stream() {
    // the readable closes the underlying file once it is fully read or cancelled
    return Deno.openSync(this.#path).readable
  }

  async text() {
    let str = ''
    const ts = this.stream().pipeThrough(new TextDecoderStream())
    for await (const chunk of ts) str += chunk
    return str
  }

  async arrayBuffer() {
    // TODO: handle sliced blob
    // TODO: validate that the file has not changed.
    const uint8 = await Deno.readFile(this.#path)
    return uint8.buffer
  }

  slice (start, end, type = '') {
    throw new Error('Not impl.')
  }
}

Very incomplete, but it at least gets the job done.


10 participants