Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop over-allocating receive buffers for large writes #112

Closed
wants to merge 1 commit into from

Conversation

damz
Copy link

@damz damz commented Nov 10, 2015

This is a straight rebase of #106 by @jacobsa.

The maxWrite constant used to size buffers was 16 MiB, even though
InitRequest.Respond always tells the kernel we want writes of 128 KiB.
Caveat: mount_darwin.go implies that on darwin we were still telling
the kernel 16 MiB, but experimentally it was capping this to 1 MiB.

Now use exactly those sizes on those operating systems. On FreeBSD
continue to use 128 KiB, because I'm not sure what it does.

Allocating and zeroing buffers in allocMessage is a large source of CPU
usage in a read-heavy load; I believe this change will help there.
@kahing
Copy link

kahing commented Dec 9, 2015

👍 in a testcase I get ~30% speedup with this patch

@damz
Copy link
Author

damz commented Dec 10, 2015

@tv42 Can we get a yay/nay?

@tv42
Copy link
Member

tv42 commented Dec 10, 2015

Trying to find enough time.. trust me this is not forgotten.

@damz
Copy link
Author

damz commented Dec 10, 2015

@tv42 I can relate. No problem.

@bufdev
Copy link

bufdev commented Jan 29, 2016

Super +1, also related to #120 haha.

@dturner-tw
Copy link

On #106 (the original version of this patch), @tv42 commented "Bumping that to 16MB improved OSX performance hugely"

I assume that this only affects performance on large write operations. If that's so, I have a slightly different idea for solving this problem: allocate 16MB buffers, but copy small requests into new small buffers. Most requests will be under (say) 512 bytes, so the copying won't be too expensive. And it will save about 99% of the space. This works particularly well for "forget storms" that Linux sometimes sends. @tv42 would that be more likely to be merged? If so, I'll post it.

@bufdev
Copy link

bufdev commented Jan 29, 2016

If you're really feeling ambitious, you could do a full-on segregated list version, and over time, benchmarks could tune the size of the buckets heh.

@damz
Copy link
Author

damz commented Jan 29, 2016

The big question is: do we really, really need to read every request in one go? Is there really no way to read the header first and then read the body of the request?

Public documentation on the semantics of the fuse device is sparse.

@tv42
Copy link
Member

tv42 commented Jan 29, 2016

@damz: Yes, the kernel interface (at this point) is one read(2) is one message; there are no length prefixes in the data.

As for everyone concerned with performance, the real fix (for Linux) is vmsplice: #35

@damz
Copy link
Author

damz commented Jan 29, 2016

@tv42 I guess you mean splice(2)? Well, for linux it is a bit moot, if it only supports a maximum of 128 kB writes anyway.

@dturner-tw
Copy link

Yes, we need to read every request in one go. At least that's my read of the Linux kernel code and of the MacFuse code which appears to be a direct port of the Linux stuff.

@dturner-tw
Copy link

(On the other hand, if we issue a too-small read, we just get EIO and can try again with a larger buffer)

@bufdev
Copy link

bufdev commented Jan 29, 2016

This might be extremely naive, so feel free to yell at me (I'm just looking at https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/fuse.h etc now), but if the header has a defined size, or a min/max size, even if it means Conn.rio lock is locked a tiny bit longer, could you not do syscall.Read for this size of the header, and then do a subsequent read afterwards?

@bufdev
Copy link

bufdev commented Jan 29, 2016

The header at least appears to be a defined size, so that's nice, but that last comment might still be nonsense.

@dturner-tw
Copy link

@peter-edge If I understand you correctly, you're proposing reading the header first, then the rest of the message. Unfortunately, that won't work, because of the somewhat unusual way that fuse works. Fuse has the weird rule that you have to read a whole request (header+body) in one read() syscall. If you try to read just the header (or anything short of the whole thing) you'll get EIO.

@bufdev
Copy link

bufdev commented Jan 29, 2016

Heh I just learned that, was reading more...commented too soon.

So that's annoying.

@tv42
Copy link
Member

tv42 commented Jun 24, 2016

@dturner-tw:

This works particularly well for "forget storms" that Linux sometimes sends

Two answers for that:

@tv42
Copy link
Member

tv42 commented Jun 24, 2016

Dug through the history. The reasoning for the 16MB maxWrite value was that OSXFUSE crapped out without it! I misremembered that as performance -- though I think it did help that too. See #42

I'll push something that sorts this out Very Soon Now(tm).

@tv42
Copy link
Member

tv42 commented Jun 24, 2016

This should solved by 9e8e652. Had to jump through hoops to avoid OSXFUSE bugs, so ended up using my own commits with more of the reasoning written down.

@tv42 tv42 closed this Jun 24, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants