Distinguish unwritable channel from blocked transport thread #77838
Labels
:Distributed Coordination/Network
Http and internode communication implementations
>enhancement
Team:Distributed (Obsolete)
Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.
Today we log slow processing of outbound messages like this:
sending transport message [Request{indices:data/write/bulk[s][r]}{123456}{false}{false}{false}] of size [1234] on [Netty4TcpChannel{localAddress=/99.99.99.99:9999, remoteAddress=99.99.99.99/99.99.99.99:9999, profile=default}] took [12345ms] which is above the warn threshold of [5000ms] with success [true]
The
took
time records the time until the message is completely written to the channel, which includes time spent waiting for the channel to become writeable as well as time spent waiting for the transport thread to get around to the transmission. We care about the latter (it indicates a bug) much more than the former (it indicates network saturation). It would be good to distinguish these two causes.The text was updated successfully, but these errors were encountered: