
[Q]: splice operations with two pipes #192

Closed
Mic92 opened this issue Oct 18, 2017 · 18 comments
Mic92 commented Oct 18, 2017

Is there a reason you used two pipes to write the header, rather than a single pipe plus vmsplice? https://github.com/hanwen/go-fuse/blob/master/fuse/splice_linux.go#L22 I am currently working on my own implementation in Rust and would be interested in why you did it that way.

Owner

hanwen commented Oct 18, 2017

The reply header includes the response size, which you can only know after you have read the data out of the fd.

Author

Mic92 commented Oct 18, 2017

What libfuse does is fall back, on a short read from splice, to sending the data via an iovec instead. I expect the average case to have fewer context switches compared to your implementation, however. Have you considered doing this? I have not measured both variants.


hanwen commented Oct 19, 2017

oh, that is a good idea; I should have looked at libfuse when I implemented this.

btw, don't copy my API (returning a ReadResult), which is awkward. I somewhat regret that I didn't simply stick with Read(buf []byte), which is much more straightforward. The bazil.org FUSE API, which passes in the request so you can do req.Reply( .. ), is also more straightforward (but a little less composable).


hanwen commented Oct 19, 2017

see 42d2adc

It's not obvious to me that this is that much better. It would be useful to see some benchmarks; I think it depends on the size and frequency of the partial reads.

You could decrease the number of syscalls further by not clearing the pipe after successful reads, but that would make error handling more complicated.


Mic92 commented Oct 19, 2017

@Nikratio I would be very interested in your opinion as well.

@Nikratio

I'm afraid I can't contribute much. The exact rationale for adding splice support has been lost to the dust of history. For libfuse3, I would have liked to either always use splice or never use it (since I am pretty sure that the majority of filesystem developers and users have no real idea when to use it or not use it). Unfortunately, at the time I couldn't find any good benchmarks and didn't have the time to come up with something myself either (you'd first need to define what exactly a representative workload is).


Mic92 commented Oct 20, 2017

So only @szmi could know that.


hanwen commented Oct 20, 2017

some random measurements:

for go-fuse:

  • splice yields a 10% throughput improvement compared to no splice
  • opportunistic (what the question was originally about) yields another 10% improvement for small files. Curiously, it makes no difference for large files.

it's possible that the difference is larger for libfuse, since libfuse has less memory (de)allocation overhead. Let me test.


hanwen commented Oct 20, 2017

I tried testing with libfuse3, but example/passthrough and example/passthrough_fh seem to read through userspace. passthrough_ll looks as if it should be better (it mentions splice), but it is actually 2x slower for bulk reads.

go install github.com/hanwen/go-fuse/example/loopback; fusermount -u /tmp/x/ ; loopback /tmp/x /boot

with splice:

$ go install github.com/hanwen/go-fuse/example/benchmark-read-throughput && benchmark-read-throughput -bs 128 -limit 30000 /tmp/x/initramfs-0-rescue-12a4c82a414b4f18983362ce2122f69a.img
block size 128 kb: 30035.8 MB in 17.790569359s: 1688.30 MBs/s

without

$ go install github.com/hanwen/go-fuse/example/benchmark-read-throughput && benchmark-read-throughput -bs 128 -limit 30000 /tmp/x/initramfs-0-rescue-12a4c82a414b4f18983362ce2122f69a.img
block size 128 kb: 30035.8 MB in 19.25596025s: 1559.82 MBs/s

libfuse3

fusermount -u /tmp/z ; example/passthrough -f /tmp/z/
$ go install github.com/hanwen/go-fuse/example/benchmark-read-throughput && benchmark-read-throughput -bs 128 -limit 30000 /tmp/z/boot/initramfs-0-rescue-12a4c82a414b4f18983362ce2122f69a.img
block size 128 kb: 30035.8 MB in 20.437629672s: 1469.63 MBs/s

passthrough_fh
block size 128 kb: 30035.8 MB in 33.194200978s: 904.85 MBs/s

passthrough_ll
block size 128 kb: 30035.8 MB in 29.325975204s: 1024.21 MBs/s


Mic92 commented Oct 20, 2017

I noticed yesterday that passthrough_ll doesn't use splice for reads unless:

diff --git a/lib/fuse_lowlevel.c b/lib/fuse_lowlevel.c
index 031793a..d48539a 100644
--- a/lib/fuse_lowlevel.c
+++ b/lib/fuse_lowlevel.c
@@ -1913,6 +1913,7 @@ static void do_init(fuse_req_t req, fuse_ino_t nodeid, const void *inarg)
        LL_SET_DEFAULT(1, FUSE_CAP_ASYNC_DIO);
        LL_SET_DEFAULT(1, FUSE_CAP_IOCTL_DIR);
        LL_SET_DEFAULT(1, FUSE_CAP_ATOMIC_O_TRUNC);
+       LL_SET_DEFAULT(1, FUSE_CAP_SPLICE_WRITE);
        LL_SET_DEFAULT(se->op.write_buf, FUSE_CAP_SPLICE_READ);
        LL_SET_DEFAULT(se->op.getlk && se->op.setlk,
                       FUSE_CAP_POSIX_LOCKS);

is applied.


hanwen commented Oct 20, 2017

I also tried cluefs which uses bazil.org/fuse. I removed the trace() calls, but

block size 128 kb: 145.8 MB in 6.003518478s: 24.29 MBs/s

(why do people like to use bazil.org/fuse? The mind boggles.)


hanwen commented Oct 20, 2017

with Mic's patch:

$ go install github.com/hanwen/go-fuse/example/benchmark-read-throughput && benchmark-read-throughput -bs 128 -limit 30000 /tmp/z/boot/initramfs-0-rescue-12a4c82a414b4f18983362ce2122f69a.img
block size 128 kb: 30035.8 MB in 17.019056007s: 1764.84 MBs/s

so, a little faster than go-fuse (which is expected), but only a little (4%, which is pretty good).


hanwen commented Oct 20, 2017

bazil.org/fuse uses a fresh buffer for each read,

https://github.com/bazil/fuse/blob/master/fs/serve.go#L1199

so large reads are dominated by allocation costs in the FUSE daemon.


hanwen commented Oct 20, 2017

also, you asked about vmsplice, but in case of splicing, that is only useful for writing the header, no? What is the advantage of vmsplice over write(2) ?


Mic92 commented Oct 20, 2017

Yes, this is just useful to write the header.


Mic92 commented Oct 24, 2017

I wonder if memfd_create or an ordinary tmpfs-backed file could be used instead of a pipe, since it allows seeking and writing at different offsets. Maybe it is slower because it requires more copies?

memfd would be way slower.


Mic92 commented Oct 24, 2017

A micro-benchmark from me comparing splice, memfd, full discard, and header discard for one memory page: https://gist.github.com/Mic92/c25ed7c331f6db927b246465420a55d7


hanwen commented Nov 5, 2017

I'm going to close this for now.

@hanwen closed this as completed Nov 5, 2017