Add sendmmsg support for UDP #1034
Catch up with latest iperf3 changes
635e879
to
6f8918e
@bmah888 I have changed the PR to use The UDP throughput enhancement achieved on high-throughput interfaces is quite substantial, especially when the UDP messages are not large (which is usually the case).
@bmah888 Any plan to have this PR reviewed and merged?
i = 0; /* count of messages sent */
r = 0; /* total bytes sent */
while (i < sp->sendmmsg_buffered_packets_count) {
    j = sendmmsg(sp->socket, &sp->msg[i], sp->sendmmsg_buffered_packets_count - i, MSG_DONTWAIT);
Before each sendmmsg(2) call, you should poll for socket write readiness. In my tests, this can significantly improve performance.
Can you add more details:
- How do you do the polling?
- From your experience, what should be done if the socket is not ready for writing? The current design of the code is that the function does not return until everything has been sent (or there is an error). Are you suggesting that, when the socket is not ready for writing, the function should return successfully without sending anything, or before everything has been sent?
- Do you understand why the method you suggest improves performance? I am asking because iperf3 will retry sending in any case.
Thanks
Do you understand why the method you suggest improves performance?
In my case, I was doing raw syscalls in my Go program. The UDP socket opened by the Go runtime is in non-blocking mode. With sendmsg(2), if the send operation would block, the call returns -EAGAIN or -EWOULDBLOCK, which the Go runtime handles by polling for socket write readiness with epoll. The calling goroutine can then be parked by the runtime to free the OS thread. (My limited understanding of Go internals might be inaccurate.)
Now with sendmmsg(2), according to the manual, a nonblocking call sends as many messages as possible (up to the limit specified by vlen) and returns immediately. I treated a partial return value (fewer messages sent than requested) the same way as -EAGAIN and -EWOULDBLOCK: instead of immediately calling sendmmsg(2) again, I instructed the Go runtime to poll for write readiness before the next sendmmsg(2) call. This change yielded a 10% increase in throughput.
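The retry strategy described above could look roughly like this in C. It is a minimal sketch, not the PR's code or the commenter's Go program: the helper name send_burst_polled is hypothetical, and it assumes the glibc sendmmsg() wrapper on Linux, which reports EAGAIN/EWOULDBLOCK through errno instead of a negative return value.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not from the PR): send `count` queued messages,
 * waiting with poll(2) for write readiness whenever the kernel accepts
 * fewer messages than requested.
 * Returns the number of messages sent, or -1 on error with errno set.
 */
static int send_burst_polled(int fd, struct mmsghdr *msgs, unsigned int count)
{
    unsigned int sent = 0;

    assert(msgs != NULL || count == 0);
    while (sent < count) {
        int n = sendmmsg(fd, &msgs[sent], count - sent, MSG_DONTWAIT);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            if (errno != EAGAIN && errno != EWOULDBLOCK)
                return -1;
            n = 0; /* nothing accepted: treat like a short send */
        }
        sent += (unsigned int)n;
        if (sent < count) {
            /* Short or empty send: sleep until the socket is writable
             * instead of calling sendmmsg() again right away. */
            struct pollfd pfd = { .fd = fd, .events = POLLOUT };
            int r;
            do {
                r = poll(&pfd, 1, -1);
            } while (r < 0 && errno == EINTR);
            if (r < 0)
                return -1;
        }
    }
    return (int)sent;
}
```

Whether this helps iperf3 the way it helped the Go program would need to be measured; the sketch only shows the control flow.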
The current design of the code is that the function does not return until everything has been sent (or there is an error).
I'm not familiar with iperf3's code base. I just read some code, and it seems to me that iperf3 uses sockets in blocking mode for UDP tests. In this case, maybe it's better to simply drop the MSG_DONTWAIT flag; sendmmsg(2) would then only return when all messages have been sent. This saves even more syscall overhead.
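A blocking variant of the call might be sketched as follows. The helper name send_burst_blocking is hypothetical and the code is not taken from the PR; it assumes a blocking socket and the glibc sendmmsg() wrapper on Linux.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not from the PR): one blocking sendmmsg() call,
 * with flags = 0 instead of MSG_DONTWAIT, restarted only on EINTR.
 * Returns the number of messages sent, or -1 with errno set.
 * The caller should still compare the result with `count`: the man page
 * allows a smaller return value, for example when an error interrupts
 * the batch after some messages have already been sent.
 */
static int send_burst_blocking(int fd, struct mmsghdr *msgs, unsigned int count)
{
    int n;

    assert(msgs != NULL && count > 0);
    do {
        n = sendmmsg(fd, msgs, count, 0);
    } while (n < 0 && errno == EINTR);
    return n;
}
```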
@database64128, thanks a lot for the detailed explanation.
I will have to check how easy it is to implement this. To minimize iperf3 design changes, the approach I took for sendmmsg is to accumulate the packets iperf3 is sending and send them in bursts using sendmmsg. It may be that, instead of the for loop, sendmmsg can be called once. In that case, all the packets that were not sent can either be moved to the beginning of the buffer or ignored. (The issue with ignoring them is that the packets are numbered, so the numbering of new packets should start from the last packet successfully sent.)
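Moving the unsent packets to the beginning of the buffer could be sketched as below. The function names are hypothetical and the layout is an assumption: each struct mmsghdr is taken to point (through its iovec) at its own payload buffer, so the header array is rotated rather than overwritten, which keeps every header paired with a distinct buffer. Renumbering of packets is not handled here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/socket.h>

/* Reverse the headers in the half-open range [lo, hi). */
static void reverse_headers(struct mmsghdr *m, unsigned int lo, unsigned int hi)
{
    while (lo + 1 < hi) {
        struct mmsghdr tmp = m[lo];
        m[lo] = m[hi - 1];
        m[hi - 1] = tmp;
        lo++;
        hi--;
    }
}

/*
 * Hypothetical helper (not from the PR): after a single sendmmsg() call
 * reported `sent` of `count` messages, rotate the array left by `sent`
 * so the unsent headers occupy slots [0, count - sent) in their original
 * order and the already-sent headers become the free slots at the tail.
 * Returns the number of messages still pending.
 */
static unsigned int requeue_unsent(struct mmsghdr *msgs, unsigned int count,
                                   unsigned int sent)
{
    assert(sent <= count);
    /* Rotation by three reversals: [A B] -> [B A]. */
    reverse_headers(msgs, 0, sent);
    reverse_headers(msgs, sent, count);
    reverse_headers(msgs, 0, count);
    return count - sent;
}
```

For example, with four queued messages and a return value of 1, the three unsent headers end up in slots 0 to 2 and the next new packet can be written into slot 3.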
Version of iperf3 (or development branch, such as master or 3.1-STABLE) to which this pull request applies: 3.10.1 latest master
Issues fixed (if any):
UDP throughput issue #873
Brief description of code changes (suitable for use as a commit message):
Add sendmmsg support for sending UDP messages for enhanced throughput.
sendmmsg is used by setting the -Z option (which is currently used only for TCP), as it is regarded as UDP's alternative to TCP's zero copy.
The number of packets that are sent by each call to sendmmsg is the burst size set by the -b option.
Note: configure.ac was changed, so running bootstrap.sh; configure is required for the changes to take effect. (New defines are HAVE_SENDMMSG, HAVE_RECVMMSG and HAVE_SEND_RECVMMSG.)
recvmmsg is not used because tests showed that it does not help throughput and may even hurt it. However, the changes for testing recvmmsg are commented out in iperf_udp_recv() and not removed, in case further evaluation is desired. If it is not, then all changes to iperf_udp_recv can be removed.
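The idea of accumulating a burst of packets and handing them to one sendmmsg call can be illustrated with the sketch below. It is not the PR's implementation: struct burst and the burst_* functions are hypothetical names, the packet size is assumed to be fixed, and the socket is assumed to be connected and blocking.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical burst buffer: `capacity` fixed-size packet slots. */
struct burst {
    unsigned int capacity;  /* burst size (number of slots) */
    unsigned int count;     /* slots currently filled */
    size_t pkt_size;        /* bytes per packet */
    char *payload;          /* capacity * pkt_size bytes */
    struct iovec *iov;      /* one iovec per slot */
    struct mmsghdr *msgs;   /* one message header per slot */
};

/* Allocate the slots and point each header at its own payload slice. */
static int burst_init(struct burst *b, unsigned int capacity, size_t pkt_size)
{
    assert(capacity > 0 && pkt_size > 0);
    memset(b, 0, sizeof(*b));
    b->payload = calloc(capacity, pkt_size);
    b->iov = calloc(capacity, sizeof(*b->iov));
    b->msgs = calloc(capacity, sizeof(*b->msgs));
    if (!b->payload || !b->iov || !b->msgs) {
        free(b->payload);
        free(b->iov);
        free(b->msgs);
        return -1;
    }
    for (unsigned int i = 0; i < capacity; i++) {
        b->iov[i].iov_base = b->payload + (size_t)i * pkt_size;
        b->iov[i].iov_len = pkt_size;
        b->msgs[i].msg_hdr.msg_iov = &b->iov[i];
        b->msgs[i].msg_hdr.msg_iovlen = 1;
    }
    b->capacity = capacity;
    b->pkt_size = pkt_size;
    return 0;
}

/* Return the next free payload slot to fill, or NULL when the burst is full. */
static char *burst_next_slot(struct burst *b)
{
    char *slot;

    if (b->count == b->capacity)
        return NULL;
    slot = b->payload + (size_t)b->count * b->pkt_size;
    b->count++;
    return slot;
}

/*
 * Send every filled slot. Returns the number of messages sent, or -1 on
 * error (in which case the buffer is left untouched for the caller).
 */
static int burst_flush(struct burst *b, int fd)
{
    unsigned int sent = 0;

    while (sent < b->count) {
        int n = sendmmsg(fd, &b->msgs[sent], b->count - sent, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        sent += (unsigned int)n;
    }
    b->count = 0;
    return (int)sent;
}
```

A caller would fill slots with burst_next_slot() until it returns NULL and then call burst_flush(), so one system call covers the whole burst.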