Corrupted file read/write #50717
Comments
I'm hopeful https://github.com/libuv/libuv/releases/tag/v1.47.0 might resolve it. But until then this is quite a serious issue. |
@nodejs/fs |
|
If necessary I can demo the issue on the production system where it occurs over a Zoom call. |
Does calling |
Sorry that was a typo |
One more thing. Are you checking for partial writes? |
Partial writes should not be possible here. But even if they were, I'm not sure it matters. The file size is exactly the highWaterMark and stays that size regardless of how many times write is called. Also, the read never completes and reads an infinite amount of data. |
fwiw, I've set up a shared samba folder on my computer which I'm accessing via the cifs driver and couldn't reproduce the issue. |
Hm. Must be some combination of kernel ver + docker ver + smb + ioring. |
We are working on upgrading kernel + docker version. However, it will take a few weeks before I can confirm whether or not that fixes the issue. |
In the meantime, can you try running your script for a bit with perf like this and post the results somewhere? There might be some useful info there:
$ perf record -e "io_uring:*" -e "cifs:*" -- node test.mjs
# store the data into a file
$ perf script -i perf.data > output.log |
Unfortunately that doesn't seem to work so well. Inside the node bullseye container:
root@a4bae0cec5c2:/usr/src/app# apt-get update
root@a4bae0cec5c2:/usr/src/app# apt-get install linux-perf
root@a4bae0cec5c2:/usr/src/app# perf
/usr/bin/perf: line 13: exec: perf_5.15: not found |
What's happening is that the perf version installed in the container doesn't match the host kernel version. You should manually build and install the correct perf version in the container (here's a useful link with instructions you can adapt to your use case). Also, remember to run the container with the |
Here's the result from perf running in the |
@deadbeef84 Thanks for sending the results. From what I understand, the write operation is the issue here, as it always writes at offset 0. The read operation seems to be working just fine. See in this portion of the logs (in bold) how the offsets in the reads change correctly while those in the writes do not:
I'm not sure that upgrading the kernel will fix this: I briefly looked at the io_uring and cifs commits between those versions and didn't see anything obviously related. Worth giving it a try though. |
Are you able to test with #50650 to see if the further kernel restrictions in libuv around when io_uring is enabled helps? |
I believe we encountered a potentially related issue, but with WebSockets that intermittently disconnect in a production environment where more users are on our platform. It doesn't happen in testing with one or two connections. We're using cWS https://github.com/encharm/cWS The issue started happening after we updated from Ubuntu 20.04 LTS to 22.04 LTS. The update triggered the io_uring code path, which wasn't available in the older kernel. I confirmed that setting UV_USE_IO_URING=0 fixes the issue. Will libuv 1.47.0 be backported into Node 20 LTS? |
Sorry, I think I ran a modified version with explicit |
@deadbeef84 thanks for the response. I would like to keep investigating this but I would need to be able to reproduce your environment. How difficult would it be? |
@Rush, that seems to be a separate issue (though also related to io_uring). Could you open a new issue describing it and, if possible, include a reproducer? Thanks! |
We noticed some corrupted files on our system. File read/write does not seem to advance the file handle offset, so writes always write to offset 0 and reads always read from offset 0.
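A minimal sketch of how the symptom described above could be checked, assuming the target path is on the affected mount. The helper name `checkImplicitOffset` and the chunk sizes are hypothetical and not taken from the original script; it only uses `fs/promises` calls (`open`, `FileHandle.write`, `stat`).

```javascript
const fsp = require('node:fs/promises');

// Hypothetical helper (not from the original report): writes `count` chunks
// without an explicit position, then compares the resulting file size with
// the number of bytes that should have been appended.
async function checkImplicitOffset(file, count = 4, chunkSize = 16 * 1024) {
  const chunk = Buffer.alloc(chunkSize, 0x61);
  const handle = await fsp.open(file, 'w');
  try {
    for (let i = 0; i < count; i++) {
      // No position argument: each write should land after the previous one.
      await handle.write(chunk, 0, chunk.length);
    }
  } finally {
    await handle.close();
  }
  const { size } = await fsp.stat(file);
  const expected = count * chunkSize;
  return { expected, actual: size, ok: size === expected };
}
```

On a healthy filesystem `ok` should be `true`. If the offset is not advancing as described, `actual` would presumably stay at the size of a single chunk.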
From preliminary testing the issue seems to have been introduced in v20.3. I suspect #48078 and libuv/libuv#3952.
Host kernel: 5.15.104-0-lts
Docker Daemon: 20.10.9
Docker Image: node:21.1.0-bullseye and 20.9.0-bullseye
The issue only seems to occur when writing/reading to a specific SMB mount.
If we instead pass an explicit position to write/read then everything works as expected.
This also affects fs.createWriteStream and fs.createReadStream.
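The explicit-position workaround mentioned above might look roughly like the following sketch. The helpers `writeAllAt` and `readAllAt` are hypothetical names introduced for illustration; they track the position in user code instead of relying on the file handle's implicit offset, and the write loop also covers partial writes.

```javascript
const fsp = require('node:fs/promises');

// Hypothetical helper: writes every chunk at a position tracked here,
// retrying until each chunk is fully written.
async function writeAllAt(handle, chunks) {
  let position = 0;
  for (const chunk of chunks) {
    let written = 0;
    while (written < chunk.length) {
      const { bytesWritten } = await handle.write(
        chunk, written, chunk.length - written, position);
      written += bytesWritten;
      position += bytesWritten;
    }
  }
  return position; // total bytes written
}

// Hypothetical helper: reads the whole file using explicit positions.
async function readAllAt(handle, chunkSize = 64 * 1024) {
  const parts = [];
  let position = 0;
  for (;;) {
    const buf = Buffer.alloc(chunkSize);
    const { bytesRead } = await handle.read(buf, 0, chunkSize, position);
    if (bytesRead === 0) break; // end of file
    parts.push(buf.subarray(0, bytesRead));
    position += bytesRead;
  }
  return Buffer.concat(parts);
}
```

For the stream APIs there is no per-write position to pass, so this sketch only applies to code that works with file handles directly.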