-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WasmFS JS API: Implement truncate #19543
WasmFS JS API: Implement truncate #19543
Conversation
assert(s.st_size == 8); | ||
|
||
EM_ASM( | ||
FS.truncate('truncatetest', 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As with all these changes it would be good to look for other tests that already use this API, if any? If we were previously disabling, or not running, those tests under WASMFS it would be good to enable them now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't see anything explicitly calling FS.truncate()
in the test dir, so I added them to test_fs_js_api
for now. Are there any places besides the test directory that could potentially contain disabled tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it would only be in the test dir. I guess we had not tests for that API. Thanks for looking into it. Hopefully after we are done with this we will have much better test coverage for the JS API!
src/library_wasmfs.js
Outdated
truncate: (path, len) => { | ||
if (len < 0) { | ||
throw new FS.ErrnoError({{{ cDefs.EINVAL }}}); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If one calls truncate("foo", -1)
I would probably expect it to try to resize of MAX_INT or something like that.. so maybe we don't need this check and we can just let the underlying call fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I was thinking, but I had one test that tried calling truncate
with -10. I tried printing out the value the syscall receives, which was 4294867286
. But instead of setting the return value to -1 and setting errno
accordingly, I think the syscall aborts.
Aborted()
RuntimeError: Aborted(). Build with -sASSERTIONS for more info.
at abort (/home/ubuntu/Documents/emsdk/emscripten/out/test/test_fs_js_api.js:1:5674)
at _abort (/home/ubuntu/Documents/emsdk/emscripten/out/test/test_fs_js_api.js:1:14749)
at <anonymous>:wasm-function[187]:0xd2a9
at <anonymous>:wasm-function[197]:0xe13e
at <anonymous>:wasm-function[172]:0xba9a
at <anonymous>:wasm-function[178]:0xce81
at _wasmfs_truncate (<anonymous>:wasm-function[261]:0x13101)
This is the error. It doesn't seem to match the pattern of how we've been returning JS API errors before (Normally the syscall returns -1
and I can return errno
to JS, then throw an error from JS with an error name and code). Is this close to what you were saying by letting the call fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree we don't want to syscall aborting like that. If you build with -g
or --profiling-funcs
you should get functions names for those wasm functions in the backtrace BTW.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is strange because doTruncate
does return -EINVAL when its input is negative. It would be good to get to the bottom of this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that doTruncate
takes in off_t size
, but truncate64
and ftruncate64
take in size as a uint64_t
, so is it possible that the size is always positive in doTruncate
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is needed for the Node backend, as it silently accepts negative offsets rather than failing with EINVAL
, see: nodejs/node#35632.
Though, it looks like setSize()
is currently not implemented for the Node backend:
emscripten/system/lib/wasmfs/backends/node_backend.cpp
Lines 120 to 122 in 48b0645
int setSize(off_t size) override { | |
WASMFS_UNREACHABLE("TODO: implement NodeFile::setSize"); | |
} |
(i.e. this PR won't work for
-sNODERAWFS
+ -sWASMFS
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's really interesting. I don't think I've been testing with NODEFS
or NODERAWFS
. Should I start adding those flags to test_fs_js_api
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If setSize()
(and insertMove()
due to FS.rename()
) is supported in WasmFS' Node backend (which is probably better as a follow-up), you could do:
--- a/test/test_core.py
+++ b/test/test_core.py
@@ -6017,6 +6017,7 @@ Module = {
def test_fs_writeFile(self):
self.do_run_in_out_file_test('fs/test_writeFile.cpp')
+ @also_with_noderawfs
def test_fs_js_api(self):
self.set_setting("FORCE_FILESYSTEM")
self.do_runf(test_file('fs/test_fs_js_api.c'), 'success')
After that, you can verify with:
$ ./test/runner wasmfs.test_fs_js_api* core2.test_fs_js_api*
Whether the test passes on both memory and Node backend, either with or without WasmFS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After going back to the problem this morning, I realized I never checked that the right value was getting passed to js_api.cpp
. I originally had _wamsfs_truncate
in js_api.cpp
take in size as type off_t
, and for some reason negative values would not be passed in correctly. After changing off_t
to long
, the tests pass without making any changes to syscalls.cpp
. I also tried passing in an int64_t
to js_api.cpp
, but that didn't seem to work. I initially wanted to use off_t
to match the POSIX API, but could there be a potential problem with using that type in js_api.cpp
?
Out of curiosity, I tried printing the values in syscalls.cpp
, and I could print uint64_t
as a signed integer using %lld
. When the uint64_t
then gets passed to doTruncate
(Which takes in size as type off_t
), does C automatically force the uint64_t
to become signed? What if the unsigned value is larger than the signed max?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
off_t
is defined here:
emscripten/system/lib/libc/musl/arch/emscripten/bits/alltypes.h
Lines 240 to 243 in 48b0645
#if defined(__NEED_off_t) && !defined(__DEFINED_off_t) | |
typedef _Int64 off_t; | |
#define __DEFINED_off_t | |
#endif |
So, it corresponds to int64_t
. I just opened PR #19559 for this.
Regarding the syscall interface, I think for consistency all the native syscalls take primitive types.. see system/lib/libc/musl/arch/emscripten/syscall_arch.h. If changing the types of those functions to |
system/lib/wasmfs/js_api.cpp
Outdated
if (err == -1) { | ||
return errno; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Syscalls don't set errno, so this doesn't look right. Syscalls return negative error codes and libc handles those by setting errno appropriately. I would expect this code to be checking whether err
is a negative error code (not necessarily -1) instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible that errno
is being set by libc
behind the scenes? I've been doing this for the other js_api.cpp
functions and the error codes match with the legacy API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, no, all those places should be fixed. For example, in _wasmfs_mknod
, it checks whether __syscall_mknodat
returns -1, and if so returns errno
. But __syscall_mknodat
never sets errno and doesn't return -1 on errors; instead, it either returns -EINVAL
or -EPERM
or other negative error codes that doOpen
can return.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what went wrong now. The whole time I thought the syscalls were returning -1
, so the errno
values were returning correctly - turns out the if
statement is always false, so the methods were just returning the syscall return value and I never checked the actual return values of the syscalls before, which made all the tests pass. Those issues have all been addressed in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM % separable comment.
system/lib/wasmfs/js_api.cpp
Outdated
int _wasmfs_truncate(char *path, long size) { | ||
return __syscall_truncate64((intptr_t)path, size); | ||
} | ||
|
||
int _wasmfs_ftruncate(int fd, long size) { | ||
return __syscall_ftruncate64(fd, size); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there's no extra logic necessary here, is it possible to skip these wrappers and just call the underlying syscall directly from the JS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense, although I'm having trouble figuring out how to import the syscalls to library_wasmfs.js
. I've tried using __syscall_truncate64
and ___syscall_truncate64
(Three underscores) and neither worked. Normally, I would add _wasmfs_truncate
to emcc.py
, but I don't think it worked. How should I go about making the syscall available to call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe that could be a followup.
When calling __syscall_truncate64
from JS you would use an extra underscore... but when adding it to the settings.REQUIRED_EXPORTS
list in emcc.py you would not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing this as a follow-up sounds good to me.
As a side note, should we worry about the issue with using |
I'm hoping we can land #19559 today BTW.. since that tests already passed. |
emcc.py
Outdated
@@ -2338,6 +2338,8 @@ def phase_linker_setup(options, state, newargs): | |||
'_wasmfs_read', | |||
'_wasmfs_pread', | |||
'_wasmfs_symlink', | |||
'__syscall_truncate64', | |||
'__syscall_ftruncate64', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was there a reason to use the raw syscalls and not a _wasmfs_*
wrapper? Wrappers have the benefit that changes to the syscalls don't affect them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thomas suggested a little earlier that since truncate
doesn't have any additional processing, it might make sense to directly use the raw syscall. However, I do see your point that a wrapper would be beneficial. Would it be preferable to go back to calling the wrapper?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for now you should continue to use wrapper functions and we can separately/later consider weather they are actually needed since (IIUC) these are not the only once that are simple pass-through functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't block landing anything on this conversation, but I still think pass-through wrappers aren't that useful. If we update the syscalls (which should be extremely rare), then we would still have to update their callers. Whether those callers are in C or JS doesn't make much of a difference as far as I can tell.
system/lib/wasmfs/js_api.cpp
Outdated
return -errno; | ||
} | ||
return err; | ||
return __syscall_openat(AT_FDCWD, (intptr_t)path, flags, mode); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these bugfixes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We realized that this check isn't needed, since the syscalls do not actually set errno
. The tests were passing before because err
was never equal to -1, so the js_api.cpp
function directly returned the error without using errno, giving the desired effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make a separate PR for this cleanup stuff?
This PR implements
truncate
andftruncate
. Tests were also added totest_fs_js_api
.One thing I noticed was that both the man pages and
doTruncate
syscall note thattruncate
andftruncate
take in size as typeoff_t
, but__syscall_truncate64
and__syscall_ftruncate64
take in size as auint64_t
. This means that passing in negative numbers directly to__syscall_truncate64
and__syscall_ftruncate64
gives a weird number and no error, so I had to add a manual check inlibrary_wasmfs.js
, which I didn't do for the other methods. Was there a specific reason that the two syscalls take inuint64_t
instead ofoff_t
?