Code cleanup and fix #1 #2

Merged
merged 2 commits into from Feb 7, 2020
Conversation

@CarterLi (Contributor) commented Feb 6, 2020

I got about a 20% performance reduction when not sending redundant text; details unknown.

@frevib (Owner) commented Feb 6, 2020

Thanks for taking the time! I will merge when I have some time to look at it.

Strange that we have -20% performance.

@frevib (Owner) commented Feb 6, 2020

I get about -75% performance when not sending redundant text.

@CarterLi (Contributor, Author) commented Feb 7, 2020

> Do you get better performance with sendmsg/recvmsg?

No, I don't. But sendmsg/recvmsg won't raise SIGPIPE when there is a huge number of connections.
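(For context, a minimal sketch of the per-call flag involved; this is an illustration, not code from this PR. sendmsg accepts MSG_NOSIGNAL per call, which turns a SIGPIPE on a closed peer into an -EPIPE completion instead.)

```c
#include <liburing.h>
#include <sys/socket.h>

/* Queue a sendmsg that will not raise SIGPIPE if the peer has gone away;
   the error is reported as cqe->res == -EPIPE instead. */
static void queue_send(struct io_uring *ring, int fd, struct msghdr *msg)
{
    struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
    io_uring_prep_sendmsg(sqe, fd, msg, MSG_NOSIGNAL);
    io_uring_sqe_set_data(sqe, (void *)(long)fd);
}
```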

> We might have to wait until Linux 5.6, which will support send/recv.

send/recv won't be much better than sendmsg/recvmsg, except that send/recv are easier to set up.
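(To illustrate "easier to set up" — a hedged sketch assuming the liburing prep helpers that accompany the 5.6 support: send/recv take a plain buffer, while sendmsg/recvmsg need an iovec wired into a struct msghdr first.)

```c
#include <liburing.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* recv: just a buffer and a length. */
static void queue_recv(struct io_uring *ring, int fd, char *buf, size_t len)
{
    struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
    io_uring_prep_recv(sqe, fd, buf, len, 0);
}

/* recvmsg: the same operation needs an iovec plus a msghdr. */
static void queue_recvmsg(struct io_uring *ring, int fd, struct iovec *iov,
                          struct msghdr *msg)
{
    msg->msg_iov = iov;
    msg->msg_iovlen = 1;
    struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
    io_uring_prep_recvmsg(sqe, fd, msg, 0);
}
```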

I will test send/recv when Ubuntu's daily build comes out. For now, send/recv are unusable due to this bug.

> I get about -75% performance when not sending redundant text.

Really?

@CarterLi (Contributor, Author) commented Feb 7, 2020

I'd like to contribute my (customized) code to the liburing examples; it could be useful for benchmarking for the creator of io_uring, IMO.

That said, it's modified from your repo. Would you agree with that?


```c
struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
iovecs[fd].iov_len = size;
```
@frevib (Owner) commented Feb 7, 2020

For some reason this line (`iovecs[fd].iov_len = size;`) causes a ~70% performance drop.

Without `iovecs[fd].iov_len = size;`:

```
max@ubuntu:~/hdv/source/rust_echo_bench$ cargo run --release -- --address "localhost:6666" --number 50 --duration 30 --length 512
    Finished release [optimized] target(s) in 0.01s
     Running `target/release/echo_bench --address 'localhost:6666' --number 50 --duration 30 --length 512`
Benchmarking: localhost:6666
50 clients, running 512 bytes, 30 sec.

Speed: 387685 request/sec, 387685 response/sec
Requests: 11630554
Responses: 11630554
```

With `iovecs[fd].iov_len = size;`:

```
max@ubuntu:~/hdv/source/rust_echo_bench$ cargo run --release -- --address "localhost:6666" --number 50 --duration 30 --length 512
    Finished release [optimized] target(s) in 0.01s
     Running `target/release/echo_bench --address 'localhost:6666' --number 50 --duration 30 --length 512`
Benchmarking: localhost:6666
50 clients, running 512 bytes, 30 sec.

Speed: 124698 request/sec, 124698 response/sec
Requests: 3740948
Responses: 3740947
```

@quininer commented Feb 8, 2020

I think the lack of this line was causing the wrong behavior.

readv does not modify iov_len, which causes writev to always send 2048 bytes.

Since bufs is uninitialized, this may cause UB.

Ah, this seems to be expected behavior. That's remarkable; I'm interested in the reason.
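(To make the mechanism concrete, a minimal sketch of the fixed echo path; it assumes the per-fd iovecs array and MAX_MESSAGE_LEN buffers used in this repo, with all other setup elided.)

```c
#include <liburing.h>
#include <sys/uio.h>

#define MAX_MESSAGE_LEN 2048

/* One iovec per connection fd, each backed by a MAX_MESSAGE_LEN buffer. */
extern struct iovec iovecs[];

/* Called when a readv completion arrives with cqe->res == bytes_read. */
static void queue_echo(struct io_uring *ring, int fd, size_t bytes_read)
{
    struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
    /* readv leaves iov_len untouched, so without this line the writev
       would always send MAX_MESSAGE_LEN (2048) bytes rather than only
       the bytes that were actually received. */
    iovecs[fd].iov_len = bytes_read;
    io_uring_prep_writev(sqe, fd, &iovecs[fd], 1, 0);
}
```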

@frevib (Owner) commented Feb 8, 2020

> I think the lack of this line was causing the wrong behavior.
>
> readv does not modify iov_len, which causes writev to always send 2048 bytes.
>
> Since bufs is uninitialized, this may cause UB.
>
> Ah, this seems to be expected behavior. That's remarkable; I'm interested in the reason.

It might have something to do with the test tool used; I see the same performance jump/drop with the epoll echo server.

It is odd, however, that performance drops by 70-80% with io_uring but only 50% with epoll.

@quininer commented Feb 8, 2020

You're right: rust_echo_bench expects the server to return data of the same length, and returning too much data causes shorter read times.

See https://github.com/haraldh/rust_echo_bench/blob/master/src/main.rs#L126

@frevib (Owner) commented

Good point, thanks.

I'd like to keep this echo server as simple as possible; the goal is to let people learn the basics of io_uring. Maybe handling partial writes is a bit too much ;)

@CarterLi (Contributor, Author) commented Feb 8, 2020

I still don't know what a partial write means. Could you explain it further?

EDIT: did you mean situations where message lengths are greater than MAX_MESSAGE_LEN?

Also, please see my fork repo; I fixed the issue there and got some benchmark results.

@CarterLi (Contributor, Author) commented

In addition, are these benchmark tools really reliable? How can you ensure that it's not the tool that limits the performance results?

@frevib (Owner) commented Feb 9, 2020

> I still don't know what a partial write means. Could you explain it further?
>
> EDIT: did you mean situations where message lengths are greater than MAX_MESSAGE_LEN?

A partial write happens when we try to write more bytes to the socket send buffer than are actually free in that buffer. The write fills the buffer and then returns, having written only part of the bytes. If the receiving end expects the full message back, this can lead to odd behaviour.
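(As an illustration only, not code from this repo: one way to handle a short writev completion is to advance the iovec past the bytes already sent and queue a writev for the remainder.)

```c
#include <liburing.h>
#include <sys/uio.h>

extern struct iovec iovecs[]; /* per-fd iovecs, as in the snippet above */

/* Called when a writev completion returns fewer bytes than were queued. */
static void resubmit_remainder(struct io_uring *ring, int fd, size_t sent)
{
    /* Skip past what the kernel already wrote... */
    iovecs[fd].iov_base = (char *)iovecs[fd].iov_base + sent;
    iovecs[fd].iov_len -= sent;

    /* ...and queue another writev for the rest of the message. */
    struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
    io_uring_prep_writev(sqe, fd, &iovecs[fd], 1, 0);
}
```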

> Also, please see my fork repo; I fixed the issue there and got some benchmark results.

Nice, I’ll have a look asap!

@frevib (Owner) commented

> In addition, are these benchmark tools really reliable? How can you ensure that it's not the tool that limits the performance results?

No, they are not. @quininer found out there's a problem when returning more bytes than were received: #2 (comment)

That would make the epoll echo server ~70% more performant than the io_uring echo server on my machine, which also doesn't make any sense :)

@frevib (Owner) commented Feb 7, 2020

> I'd like to contribute my (customized) code to the liburing examples; it could be useful for benchmarking for the creator of io_uring, IMO.
>
> That said, it's modified from your repo. Would you agree with that?

Absolutely, go for it :)

@frevib (Owner) commented Feb 7, 2020

> We might have to wait until Linux 5.6, which will support send/recv.
>
> send/recv won't be much better than sendmsg/recvmsg, except that send/recv are easier to set up.

Agree. On another note, I've had the problem that, when using io_uring_prep_readv, cqe->res was returning negative values on Linux < 5.4 (maybe lower). So I would like to see what send/recv does.

> I will test send/recv when Ubuntu's daily build comes out. For now, send/recv are unusable due to this bug.
>
> I get about -75% performance when not sending redundant text.
>
> Really?

Yes, see #2 (review)

@CarterLi (Contributor, Author) commented Feb 7, 2020

> We might have to wait until Linux 5.6, which will support send/recv.
>
> send/recv won't be much better than sendmsg/recvmsg, except that send/recv are easier to set up.
>
> Agree. On another note, I've had the problem that, when using io_uring_prep_readv, cqe->res was returning negative values on Linux < 5.4 (maybe lower). So I would like to see what send/recv does.

A negative value returned in cqe->res is -errno. What values were returned?

> I will test send/recv when Ubuntu's daily build comes out. For now, send/recv are unusable due to this bug.
>
> I get about -75% performance when not sending redundant text.
>
> Really?
>
> Yes, see #2 (review)

I saw your tweet:

> simple echo server. io_uring +99% performance, -45% cpu usage.

Wasn't the performance boost too huge? ;)

@frevib (Owner) commented Feb 7, 2020

> We might have to wait until Linux 5.6, which will support send/recv.
>
> send/recv won't be much better than sendmsg/recvmsg, except that send/recv are easier to set up.
>
> Agree. On another note, I've had the problem that, when using io_uring_prep_readv, cqe->res was returning negative values on Linux < 5.4 (maybe lower). So I would like to see what send/recv does.
>
> A negative value returned in cqe->res is -errno. What values were returned?

If I remember correctly, -22, and for a different echo server, -14. I can roll back the kernel and try.

> I will test send/recv when Ubuntu's daily build comes out. For now, send/recv are unusable due to this bug.
>
> I get about -75% performance when not sending redundant text.
>
> Really?
>
> Yes, see #2 (review)
>
> I saw your tweet:
>
> simple echo server. io_uring +99% performance, -45% cpu usage.
>
> Wasn't the performance boost too huge? ;)

Yes, unrealistically huge. I can't believe it's only io_uring itself doing it; maybe vectored read/write also has something to do with it.

@frevib self-assigned this Feb 7, 2020
@frevib merged commit feefec2 into frevib:master Feb 7, 2020
@CarterLi (Contributor, Author) commented Feb 7, 2020

> We might have to wait until Linux 5.6, which will support send/recv.
>
> send/recv won't be much better than sendmsg/recvmsg, except that send/recv are easier to set up.
>
> Agree. On another note, I've had the problem that, when using io_uring_prep_readv, cqe->res was returning negative values on Linux < 5.4 (maybe lower). So I would like to see what send/recv does.
>
> A negative value returned in cqe->res is -errno. What values were returned?
>
> If I remember correctly, -22, and for a different echo server, -14. I can roll back the kernel and try.

1. -22 is -EINVAL (Invalid argument).
2. -14 is -EFAULT (Bad address). It can be caused by a kernel bug; if you are sure, you may file a bug report here.
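(For reference, a small sketch of decoding such completions; it assumes only liburing's cqe interface and standard strerror, and is an illustration rather than code from this PR.)

```c
#include <liburing.h>
#include <stdio.h>
#include <string.h>

/* io_uring stores -errno in cqe->res on failure, so -22 decodes to
   EINVAL ("Invalid argument") and -14 to EFAULT ("Bad address"). */
static void check_cqe(struct io_uring_cqe *cqe)
{
    if (cqe->res < 0)
        fprintf(stderr, "request failed: %s (errno %d)\n",
                strerror(-cqe->res), -cqe->res);
}
```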
