Stop over-allocating receive buffers for large writes #112
Conversation
The maxWrite constant used to size buffers was 16 MiB, even though InitRequest.Respond always tells the kernel we want writes of 128 KiB. Caveat: mount_darwin.go implies that on darwin we were still telling the kernel 16 MiB, but experimentally it was capping it at 1 MiB. Now use exactly those sizes on those operating systems. On FreeBSD continue to use 128 KiB, because I'm not sure what it does. Allocating and zeroing buffers in allocMessage is a large source of CPU usage under a read-heavy load; I believe this change will help there.
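For concreteness, a minimal sketch of the sizing described above, reusing the names maxWrite and allocMessage from the discussion; the runtime.GOOS switch and the 4 KiB of header headroom are assumptions of this sketch, not necessarily how the patch itself lays things out:

```go
package fuse

import "runtime"

// maxWrite is the largest write payload we advertise to the kernel.
// Sketch only: the real package may keep these values in GOOS-specific files.
func maxWrite() int {
	switch runtime.GOOS {
	case "darwin":
		// osxfuse was observed to cap writes at 1 MiB regardless of what
		// InitRequest.Respond advertises.
		return 1 << 20
	default:
		// Linux, and conservatively FreeBSD: 128 KiB, matching what
		// InitRequest.Respond tells the kernel.
		return 128 << 10
	}
}

// allocMessage sizes each receive buffer for the largest request the kernel
// can actually send, instead of a fixed 16 MiB.
func allocMessage() []byte {
	// 4 KiB of headroom for the request header is an assumed value.
	return make([]byte, maxWrite()+4096)
}
```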
👍 In a test case I get a ~30% speedup with this patch.
@tv42 Can we get a yay/nay?
Trying to find enough time.. trust me this is not forgotten.
@tv42 I can relate. No problem.
Super +1, also related to #120 haha.
On #106 (the original version of this patch), @tv42 commented "Bumping that to 16MB improved OSX performance hugely". I assume that this only affects performance on large write operations. If that's so, I have a slightly different idea for solving this problem: allocate 16MB buffers, but copy small requests into new small buffers. Most requests will be under (say) 512 bytes, so the copying won't be too expensive. And it will save about 99% of the space. This works particularly well for the "forget storms" that Linux sometimes sends. @tv42 would that be more likely to be merged? If so, I'll post it.
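A hypothetical sketch of that alternative, assuming a sync.Pool of large buffers and a 512-byte cutoff (both placeholders, not anything in the package today):

```go
package fuse

import "sync"

const (
	largeBufSize = 16 << 20 // big enough for the largest possible request
	smallCutoff  = 512      // most requests (e.g. forgets) fit well under this
)

// largeBufPool recycles the big scratch buffers so we rarely pay for
// allocating and zeroing 16 MiB.
var largeBufPool = sync.Pool{
	New: func() interface{} { return make([]byte, largeBufSize) },
}

// readRequest reads one request into a pooled large buffer; small requests
// are copied into a right-sized buffer so the large one can be reused
// immediately.
func readRequest(read func([]byte) (int, error)) ([]byte, error) {
	buf := largeBufPool.Get().([]byte)
	n, err := read(buf)
	if err != nil {
		largeBufPool.Put(buf)
		return nil, err
	}
	if n <= smallCutoff {
		small := make([]byte, n)
		copy(small, buf[:n])
		largeBufPool.Put(buf)
		return small, nil
	}
	// Large request: the caller keeps the big buffer and returns it to the
	// pool when done (not shown).
	return buf[:n], nil
}
```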
If you're really feeling ambitious, you could do a full-on segregated list version, and over time benchmarks could tune the size of the buckets, heh.
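And a rough sketch of that segregated-list variant, with arbitrary bucket sizes standing in for whatever benchmarks would settle on:

```go
package fuse

import "sync"

// One pool per size class, so a request is served from (and returned to)
// the smallest bucket that fits it. Bucket sizes are placeholders.
var bucketSizes = [...]int{512, 4 << 10, 128 << 10, 16 << 20}

var buckets [len(bucketSizes)]sync.Pool

func init() {
	for i := range buckets {
		size := bucketSizes[i]
		buckets[i].New = func() interface{} { return make([]byte, size) }
	}
}

// getBuffer returns a buffer of at least n bytes from the smallest bucket
// that fits; requests larger than every bucket are allocated directly.
func getBuffer(n int) []byte {
	for i, size := range bucketSizes {
		if n <= size {
			return buckets[i].Get().([]byte)
		}
	}
	return make([]byte, n)
}

// putBuffer recycles a buffer into the bucket matching its capacity.
func putBuffer(buf []byte) {
	for i, size := range bucketSizes {
		if cap(buf) == size {
			buckets[i].Put(buf[:size])
			return
		}
	}
}
```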
The big question is: do we really, really need to read every request in one go? Is there really no way to read the header first and then read the body of the request? Public documentation on the semantics of the fuse device is sparse.
Yes, we need to read every request in one go. At least that's my read of the Linux kernel code and of the MacFuse code, which appears to be a direct port of the Linux stuff.
(On the other hand, if we issue a too-small read, we just get EIO and can try again with a larger buffer.)
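A sketch of that fallback, assuming EIO is the only signal we get; the doubling policy and the 16 MiB ceiling are made up for illustration:

```go
package fuse

import (
	"errors"
	"os"
	"syscall"
)

// readWithRetry reads one request from the fuse device, growing the buffer
// and retrying if the kernel rejects a too-small read with EIO.
func readWithRetry(dev *os.File, buf []byte) ([]byte, error) {
	for {
		n, err := dev.Read(buf)
		if err == nil {
			return buf[:n], nil
		}
		if errors.Is(err, syscall.EIO) && len(buf) < 16<<20 {
			// EIO here may mean the buffer was too small for the pending
			// request (the kernel doesn't tell us which); grow and retry.
			buf = make([]byte, 2*len(buf))
			continue
		}
		return nil, err
	}
}
```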
This might be extremely naive, so feel free to yell at me (I'm just looking at https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/fuse.h etc. now), but if the header has a defined size, or a min/max size, even if it means the Conn.rio lock is held a tiny bit longer, could you not read the header first and then the body?
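A hypothetical sketch of that two-step read, which, as the replies below explain, the fuse device refuses: any read shorter than the full request fails with EIO.

```go
package fuse

import (
	"encoding/binary"
	"io"
	"os"
)

// readInTwoSteps reads the fixed-size header, learns the total request
// length from its first field, then reads the body. Against /dev/fuse the
// first short read already fails with EIO, so this is illustration only.
func readInTwoSteps(dev *os.File) ([]byte, error) {
	const headerSize = 40 // sizeof(struct fuse_in_header) in current Linux headers
	hdr := make([]byte, headerSize)
	if _, err := io.ReadFull(dev, hdr); err != nil {
		return nil, err // in practice: EIO from the short read
	}
	// fuse_in_header.len is the total message length, in host byte order
	// (little-endian assumed here).
	total := binary.LittleEndian.Uint32(hdr[:4])
	if int(total) < headerSize {
		return nil, io.ErrUnexpectedEOF // malformed length
	}
	buf := make([]byte, total)
	copy(buf, hdr)
	if _, err := io.ReadFull(dev, buf[headerSize:]); err != nil {
		return nil, err
	}
	return buf, nil
}
```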
The header at least appears to have a defined size, so that's nice, but that last comment might still be nonsense.
@peter-edge If I understand you correctly, you're proposing reading the header first, then the rest of the message. Unfortunately, that won't work, because of the somewhat unusual way that fuse works. Fuse has the weird rule that you have to read a whole request (header+body) in one read() syscall. If you try to read just the header (or anything short of the whole thing) you'll get EIO.
Heh, I just learned that, was reading more... commented too soon. So that's annoying.
Two answers for that:
Dug through the history for the reasoning behind the 16MB. I'll push something that sorts this out Very Soon Now(tm).
This should be solved by 9e8e652. Had to jump through hoops to avoid OSXFUSE bugs, so ended up using my own commits with more of the reasoning written down.
This is a straight rebase of #106 by @jacobsa.