gen_udp:send execution time increases with caller's queue length #6455
Comments
I think we have the same issue in our app.
We also collect msacc, top processes (https://gist.github.com/stolen/9a28ed9403c724541b0ee5fcd822613e), and message queues.
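For readers who want a quick look at queue build-up without the linked gist, the following is a minimal sketch using only `erlang:processes/0` and `erlang:process_info/2`. The module name `queue_top` and the function `top/1` are hypothetical and are not taken from the gist or from this thread.

```erlang
-module(queue_top).
-export([top/1]).

%% Return up to N {Pid, QueueLen} pairs, longest message queue first.
%% process_info/2 returns 'undefined' for a process that has already
%% exited; the generator pattern below silently skips those.
top(N) when is_integer(N), N >= 0 ->
    Lens = [{Pid, Len}
            || Pid <- erlang:processes(),
               {message_queue_len, Len} <-
                   [erlang:process_info(Pid, message_queue_len)]],
    Sorted = lists:reverse(lists:keysort(2, Lens)),
    lists:sublist(Sorted, N).
```

Calling `queue_top:top(5)` from a shell attached to the affected node would list the five processes with the longest mailboxes at that moment.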
We know what the problem is (the receive in
@KennethL Thank you for looking at this. I thought it might be a bug, but you are right and Reading again the code of Why does Looking at the underlying C function in Thank you :)
I ran the benchmark with If anyone is interested in the numbers, I pasted the output at the end of the benchmark. I didn't know about this option when I opened the bug, so I think it can be used as an alternative. Maybe the bug can be closed, but I'll let you decide whether it is still worth looking at or you prefer to close it. Thank you
* raimo/prim_inet-send-optimization/GH-6455:
  Use the 'local' option in term_to_binary
  Fail noisily for error between prim_inet and inet_drv
  Rework how caller is handled
  Fix whitebox test
  Pass a reference to port command for receive optimization of reply
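The last commit title refers to the receive optimization that Erlang/OTP applies when a `receive` matches only on a reference created shortly before it: messages that were already in the mailbox when the reference was created cannot contain it, so they do not need to be scanned. The following is a minimal sketch of that pattern with ordinary processes rather than ports. The module `recv_opt_sketch` and its functions are hypothetical and are not the prim_inet code from the merged branch.

```erlang
-module(recv_opt_sketch).
-export([call/2, echo_server/0]).

%% Send a request tagged with a fresh reference and wait for the reply.
%% Every clause of the receive matches on Ref, which was created in this
%% function just before the send, so older queued messages can be skipped.
call(Server, Request) ->
    Ref = make_ref(),
    Server ! {request, self(), Ref, Request},
    receive
        {reply, Ref, Result} -> Result
    after 5000 ->
        error(timeout)
    end.

%% Trivial server that echoes each request back with the caller's reference.
echo_server() ->
    receive
        {request, From, Ref, Request} ->
            From ! {reply, Ref, Request},
            echo_server();
        stop ->
            ok
    end.
```

A reply tagged only with a long-lived term, such as a port or a fixed atom, gives the runtime no such hint, which is consistent with the full queue scan described in the issue.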
I have merged a solution into our 'master' branch for OTP-26.0-rc2 with internal ticket ID OTP-18520. Sadly enough there are more operations, such as
@RaimoNiskanen Thank you :)
Describe the bug
gen_udp:send gets slowed down by the message queue that the caller has. I suspect the problem is inside the gen_udp:send implementation, which calls erlang:port_command and waits in a receive for the completion in this line. I don't think it puts any mark in the queue, so the receive needs to do a full queue scan looking for the inet_reply.
We have been hit by this issue a few times in production recently. We have a process whose sole responsibility is receiving messages and forwarding them to a UDP socket, and it built up a queue of 2M messages. At some point, the forwarder was getting messages at a regular rate but the queue wasn't going down. We found that the clients weren't receiving messages at the usual rate. On the server side, we found the process was spending all its time in prim_inet:do_sendto, so we thought it was stuck in gen_udp:send, but it was actually processing them very slowly. Looking at my benchmark, it would take approximately 50ms to process each message with a queue of 2M as we had.
To Reproduce
I wrote a benchmark that reproduces the issue.
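The benchmark itself is not reproduced in this text. As a rough illustration of the same idea, the following is a minimal sketch that fills the caller's mailbox with unrelated messages and then times one `gen_udp:send/4` call over loopback. The module name `udp_queue_bench`, its functions, and the `filler` messages are assumptions made for illustration and are not the author's code.

```erlang
-module(udp_queue_bench).
-export([run/1, time_send/1]).

%% Time one send for each queue length in the list.
%% Returns [{QueueLen, Microseconds}].
run(QueueLens) ->
    [{Len, time_send(Len)} || Len <- QueueLens].

%% Run the measurement in a fresh process so that its mailbox holds
%% exactly QueueLen filler messages when gen_udp:send/4 is called.
time_send(QueueLen) ->
    Parent = self(),
    {Pid, MRef} = spawn_monitor(fun() ->
        {ok, Recv} = gen_udp:open(0, [binary, {active, false}]),
        {ok, Port} = inet:port(Recv),
        {ok, Send} = gen_udp:open(0, [binary]),
        %% Messages that the reply receive inside the send path never matches.
        lists:foreach(fun(N) -> self() ! {filler, N} end,
                      lists:seq(1, QueueLen)),
        {Us, ok} = timer:tc(gen_udp, send,
                            [Send, {127,0,0,1}, Port, <<"ping">>]),
        ok = gen_udp:close(Send),
        ok = gen_udp:close(Recv),
        Parent ! {self(), Us}
    end),
    receive
        {Pid, Micros} ->
            erlang:demonitor(MRef, [flush]),
            Micros;
        {'DOWN', MRef, process, Pid, Reason} ->
            error(Reason)
    end.
```

Something like `udp_queue_bench:run([0, 10000, 1000000])` could then be compared across OTP releases. Absolute numbers depend on the machine and release, so none are given here.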
Expected behavior
I would expect gen_udp:send to take constant time, or at least not to be influenced so much by the caller's queue size.
Affected versions
I checked OTP 25.1 and master.
Additional context
I opened a related topic in the Erlang Forums before opening this one, where I posted the code and the debugging I did in more detail.