Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SyncAccessHandle read() & write() and the use of "at:" #49

Closed
jesup opened this issue Sep 16, 2022 · 12 comments
Closed

SyncAccessHandle read() & write() and the use of "at:" #49

jesup opened this issue Sep 16, 2022 · 12 comments

Comments

@jesup
Copy link
Contributor

jesup commented Sep 16, 2022

The WPT tests seem to assume you must pass 'at', and if you don't all reads start at 0. This is ... odd, especially for implementing the equivalent to read(), which maintains an internal position in the file, and reads from there. It means for almost all reads (or writes), you must pass at, which means you must track the expected position in the file - likely in a wrapper around syncaccesshandle.read()/write(). Emscriptem certainly can track the location in the file and insert at:nnnn always, but this seems like an odd choice for an API (and not great for performance, due to the extra parsing and validation and the basically forced extra kernel call to seek -- we could avoid the seek() call if we also track the position and elide if if it's a no-op, but that's not worth it either).

I imagine this is legacy decisions from the non-OPFS File System API

Though maybe not legacy -- SyncAccessHandles came later. Can (should?) we change this? If you're writing a bunch of data, it's a whole lot more (unnecessary) bookkeeping and overhead.

   position += handle.write(data1, {at: position});
   position += handle.write(data2, {at: position});

But you can't quite do that, since if there's an error and a partial write, you won't notice it. To be pedantic about catching errors, you need to do:

try {
  written = handle.write(data1, {at: position});
  if (written != data1.length()) { /* error */} else { postion += written;}
  written = handle.write(data2, {at: position});
  if (written != data2.length()) { /* error */} else { postion += written;}
} catch(e) { ... }

With classic posix-like read/write (with an assumed position update), it would be

  handle.write(data1);
  handle.write(data2);

or

try {
  if (handle.write(data1) != data1.length) { /* error */}
  if (handle.write(data2) != data2.length) { /* error */}
} catch(e) { ... }
@jesup jesup mentioned this issue Sep 16, 2022
3 tasks
@a-sully
Copy link
Collaborator

a-sully commented Oct 10, 2022

Given that we've already shipped by defaulting to index 0, I don't think we should change that default. Unlike the change to all-sync SyncAccessHandles, which is technically not backwards-compatible but which we expect will be a no-op for virtually all uses, changing the default here would be a silent behavioral change.

However, this seems like a reasonable thing to allow developers to opt into. Something like { at: 'cursor' }?

"Seek" would be achievable by writing zero zero bytes at a given offset. But that raises the question of whether we need the equivalent of Posix's SEEK_CUR: a way to seek to an offset from the cursor. Conceptually, { at: 'cursor' + 10 }

@jesup
Copy link
Contributor Author

jesup commented Oct 13, 2022

If all current users are using emscripten, and it always passes at: (and per above it has to unless it just does a single read to read the entire file), then we could change it and no current application would be affected.

@jesup
Copy link
Contributor Author

jesup commented Oct 13, 2022

In fact, the application we're using to test worked perfectly with a moving cursor until I realized the WPTs tested for this default-to-0 behavior and switched it (and filed this issue)

@jesup
Copy link
Contributor Author

jesup commented Nov 9, 2022

Two points:

  1. The only app that could be affected by a change is one that re-reads the file starting at 0 (or re-writes the file starting at 0) without passing at:, and had previously read/written to the file. If as I believe emscripten is always passing 'at':, then this will basically never happen. There's little reason to re-read starting at 0 (and we have the file locked against other SyncAccessHandles). The only reason I can think of for write to default to 0 would be to write a new checkpoint, or some such -- but again, given emscripten, I doubt this is occurring.
  2. You now have two interfaces for dealing with files here: WritableFileStream, which has a cursor: wfs.write("1234567890"); wfs.seek(6); wfs.truncate(5); wfs.write('abc') -> '12345abc', and FileSystemSyncAccessHandle, which doesn't.

@a-sully
Copy link
Collaborator

a-sully commented Nov 15, 2022

I agree that the "default-to-cursor" behavior is better and this API probably should have been specified that way initially (especially considering that it matches the other file-writing API).

That being said, this API is being used in the wild and I want to make sure we don't silently break those applications...

That being said, we're lucky that this API is predominantly used by a handful of libraries (for now). As long as they're in the loop and not opposed to this change, it seems reasonable to make it.

@ankoh @sgbeal @tlively do you have strong opinions against the "default-to-cursor" behavior?

@tlively
Copy link

tlively commented Nov 15, 2022

"default-to-cursor" makes sense to me. Emscripten would continue to use "at:" like it always has, so no Emscripten applications would be broken by this change.

@sgbeal
Copy link

sgbeal commented Nov 15, 2022

... do you have strong opinions against the "default-to-cursor" behavior?

We're always specifying the "at" position (because that's how the sqlite3 API is shaped), so we'd be unaffected by any change in behavior when "at" isn't provided. Have it at!

@a-sully
Copy link
Collaborator

a-sully commented Nov 15, 2022

Thanks for the feedback! Changing the behavior SGTM. @jesup would you like to put up a PR?

@jesup
Copy link
Contributor Author

jesup commented Nov 29, 2022

I will

@jesup
Copy link
Contributor Author

jesup commented Nov 29, 2022

Actually that's a pain, since the SyncAccessHandle PR #21 hasn't been merged. I'll write some mods to that.
(edit) No, it did merge. Why doesn't my fork show it? Investigating

@jesup
Copy link
Contributor Author

jesup commented Nov 30, 2022

#76

@annevk
Copy link
Member

annevk commented Dec 3, 2022

Forgot to close this via the commit message. 730e2ad fixed this.

@annevk annevk closed this as completed Dec 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

5 participants