Sometimes it's important to check for socket writeability before trying to write #446
Comments
On further discussion (see the curio issue), it sounds like the tentative conclusion is:
|
At the lowest level in asyncio (i.e. if you have a socket), if you just
start sending with loop.sock_sendall(), you will indeed be hit by this if
the optimization misfires. But an app can work around this using
loop.add_writer().
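A minimal sketch of that socket-level workaround, assuming a non-blocking socket and a selector-based event loop. The helper names `wait_writable` and `careful_sendall` are made up for illustration and are not asyncio APIs; only `loop.add_writer()`, `loop.remove_writer()`, and `loop.sock_recv()` come from asyncio itself.

```python
import asyncio
import socket


async def wait_writable(loop, sock):
    """Suspend until the event loop reports `sock` as writable."""
    fut = loop.create_future()

    def on_writable():
        # One-shot callback: unregister first, then wake the waiter.
        loop.remove_writer(sock)
        if not fut.done():
            fut.set_result(None)

    loop.add_writer(sock, on_writable)
    try:
        await fut
    finally:
        # Harmless if already removed; also covers cancellation.
        loop.remove_writer(sock)


async def careful_sendall(loop, sock, data):
    """Send all of `data`, checking writability before every send()."""
    view = memoryview(data)
    while view:
        await wait_writable(loop, sock)
        try:
            sent = sock.send(view)
        except (BlockingIOError, InterruptedError):
            continue  # spurious wakeup: wait for writability again
        view = view[sent:]


async def demo():
    # Hypothetical round trip over a local socket pair.
    loop = asyncio.get_running_loop()
    a, b = socket.socketpair()
    a.setblocking(False)
    b.setblocking(False)
    try:
        await careful_sendall(loop, a, b"hello")
        return await loop.sock_recv(b, 5)
    finally:
        a.close()
        b.close()
```

The cost is one extra trip through the selector per write, which is exactly what the eager-write optimization tries to avoid.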
At the next level you have a Protocol/Transport pair, which has a
synchronous write() method that contains this optimization. That's not so
easy to work around at the app side, but there is a Protocol API that could
be used for this: pause_writing()/resume_writing(). We can probably change
the default transport implementation so that it uses these more
aggressively, without API changes.
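For reference, this is roughly how an application can already use pause_writing()/resume_writing() today. It is only a sketch: the class name `FlowControlledProtocol` and its `send()` helper are hypothetical, and setting the high-water mark to 0 is one possible choice, not something proposed in this thread.

```python
import asyncio


class FlowControlledProtocol(asyncio.Protocol):
    """Tracks the transport's flow-control callbacks (illustrative only)."""

    def __init__(self):
        self.transport = None
        self._can_write = asyncio.Event()
        self._can_write.set()

    def connection_made(self, transport):
        self.transport = transport
        # Assumption: a high-water mark of 0 makes the transport call
        # pause_writing() as soon as its user-space buffer is non-empty.
        transport.set_write_buffer_limits(high=0)

    def pause_writing(self):
        # Called by the transport when its buffer exceeds the high-water mark.
        self._can_write.clear()

    def resume_writing(self):
        # Called by the transport when its buffer drains below the low-water mark.
        self._can_write.set()

    async def send(self, data):
        # Hypothetical helper: wait until the transport wants more data.
        await self._can_write.wait()
        self.transport.write(data)
```

Note that this only bounds the transport's own buffer. Whether write() attempts an immediate send() on the socket is still up to the transport implementation, which is the part the comment above suggests changing.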
asyncio streams are built on top of Protocol/Transport pairs, so at that
level we should be able to benefit from whatever we do for the previous
level.
@glyph has this reached Twisted yet?
PS Jim Gettys has been complaining about this for years. Glad something's
finally being done about it. And @njsmith, thanks for the clear
explanations!
|
I guess this is trying to address the buffer bloat problem?... |
@gjcarneiro: bufferbloat is a many-headed hydra, but yeah, this is about bufferbloat in the context of per-socket send buffers specifically. The discussion thread on the curio issue has lots more details. |
Not However, in the process of investigating this, I learned that we apparently removed the eager-write optimization many years ago: Digging into the history and viewing some of the discussion around that time, it seems that we were aware that it punished us pretty brutally on certain micro-benchmarks, but there's no realistic benchmark we could find where it impacts performance significantly. @dabeaz points out over on the other ticket that it's a massive performance penalty to an echo-server benchmark, and that's true; however, echo is not a realistic application. If you want to do anything interesting you need to talk to at least one other back-end service, which means that you need to carefully manage the relationship between two transports, which means you need a producer/consumer hookup. Once you have that, you can't really get the meat of the optimization that eager-writes give you, which is the ability to avoid the extra select/epoll/kqueue(etc) syscall between It also does punish the writer on benchmarks where you are synthesizing data on the CPU rather than getting it or processing it from a different remote source, but That said, I don't think Twisted is a great model to look towards for good support for tunables; tuning has historically been a weak point for us, because users who have significant performance demands almost always end up fixing them by making scaling up and down easier rather than optimizing throughput. Also, the only application where this sort of tuning tends to make any difference is something that is just shuttling around huge volumes of data without really processing it, and if you're doing that you're more likely to use HAProxy or something. That said, I really appreciate learning about this nuance of |
I should note that I have an interest in adding support for TCP_NOTSENT_LOWAT to Twisted because it's highly valuable for HTTP/2, where keeping send buffers small prevents control frames from getting blocked behind buffered stream data. That means asyncio is likely to want to provide APIs of that kind as well. However, I disagree with @njsmith's assertion that asyncio just wants to start using it by default. In particular, for bulk unframed data transfers where throughput is more important than reactivity, applications will want to avoid spinning up the Python event loop wherever possible: for that reason, large writes are ideal and using TCP_NOTSENT_LOWAT with a bad value will have nasty negative performance impacts. The biggest case of this is for protocols like FTP and HTTP/1.1, particularly when In the worst case of a 100% CPU-utilisation event loop, aggressively low values of TCP_NOTSENT_LOWAT can lead to pauses in data transfer because the event loop isn't able to respond to the POLLOUT event before the kernel send buffer empties entirely. It is much better for asyncio to expose this kind of tuneable rather than opt into it by default. Let application developers decide what the performance characteristics of their protocols should be. |
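As a rough illustration of the opt-in approach, an application can already set the option on a raw socket itself. This is a sketch under assumptions: the helper name `set_notsent_lowat` is made up, the `socket.TCP_NOTSENT_LOWAT` constant is only present on some platforms and Python versions (hence the `getattr` guard), and the right threshold is workload-dependent.

```python
import socket


def set_notsent_lowat(sock, nbytes):
    """Try to set TCP_NOTSENT_LOWAT on `sock`; return True on success.

    Returns False when the constant is missing from the socket module or
    the kernel rejects the option, so callers can treat it as best-effort.
    """
    opt = getattr(socket, "TCP_NOTSENT_LOWAT", None)
    if opt is None:
        return False
    try:
        sock.setsockopt(socket.IPPROTO_TCP, opt, nbytes)
    except OSError:
        return False
    return True
```

A latency-sensitive protocol might call this with a small value (say, 16 KiB, an arbitrary example) right after connecting, while a bulk-transfer protocol would simply not call it.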
Ah, but that can be handled by the library too. On OS X, the splitting of
On Nov 15, 2016 04:44, "Cory Benfield" wrote:
|
I recently discovered that Linux/OS X provide an important API (TCP_NOTSENT_LOWAT) that lets applications avoid queuing up excessive data inside the kernel's socket send buffers. (The socket send buffers are generally too big, for various reasons.) Unfortunately, it turns out that this API works by controlling when a socket is marked writeable by select and friends, but does not affect whether a send call will succeed, so while you might think these are the same thing, they actually aren't. [Edit: it turns out that this description is actually incorrect on Linux, though probably true on macOS -- see] I initially filed a bug on curio about this because curio was assuming they were the same, so I won't repeat all the details: dabeaz/curio#83
@dabeaz points out that asyncio seems to make the same invalid optimization, so filing a bug here too.