Sniffer.sniffOnFailure performs blocking request on AsyncClient callback, freezing all subsequent requests until it fails #27984
Labels
:Clients/Java Low Level REST Client
Elasticsearch version: 5.5.3
Plugins installed: []
JVM version: Java HotSpot(TM) 64-Bit Server VM (build 25.65-b01, mixed mode)
OS version: Darwin MacBook-Pro-2.local 17.3.0 Darwin Kernel Version 17.3.0: Thu Nov 9 18:09:22 PST 2017; root:xnu-4570.31.3~1/RELEASE_X86_64 x86_64
Description of the problem including expected versus actual behavior:
Initializing the RestClient in the following manner (using SniffOnFailureListener):
causes a thread race and a freeze, due to a lock in Apache HttpAsyncClient, when cluster nodes terminate.
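The freeze can be illustrated without Elasticsearch at all. The following is a minimal sketch, not the client's actual code: a single-threaded `ExecutorService` is a hypothetical stand-in for the async HTTP client's I/O thread, and a `CompletableFuture` that is never completed stands in for a sniff request to a node that has gone away. The class and method names are invented for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class EventLoopBlockDemo {

    /** Returns how many milliseconds a follow-up task waited behind a blocking callback. */
    static long delayBehindBlockingCallback(long sniffTimeoutMillis) {
        // Hypothetical stand-in for the async client's single I/O thread.
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();
        try {
            CountDownLatch callbackStarted = new CountDownLatch(1);
            // Simulated failure callback: waits synchronously for a sniff
            // response that never arrives, holding the event loop thread.
            eventLoop.execute(() -> {
                callbackStarted.countDown();
                CompletableFuture<String> sniffResponse = new CompletableFuture<>();
                try {
                    sniffResponse.get(sniffTimeoutMillis, TimeUnit.MILLISECONDS);
                } catch (TimeoutException | ExecutionException | InterruptedException e) {
                    // The simulated sniff gave up; only now is the thread free again.
                }
            });
            callbackStarted.await();

            // Simulated "next request": it cannot start until the callback returns.
            long submittedAt = System.nanoTime();
            Callable<Long> nextRequest = System::nanoTime;
            long startedAt = eventLoop.submit(nextRequest).get();
            return TimeUnit.NANOSECONDS.toMillis(startedAt - submittedAt);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            eventLoop.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("next request waited ~" + delayBehindBlockingCallback(500) + " ms");
    }
}
```

In this sketch the follow-up task waits for roughly the whole simulated sniff timeout, even though the task itself does no work.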
The issue seems to be in org.elasticsearch.client.sniff.Sniffer#sniffOnFailure. This method starts the sniffing process in a call stack that originates in the async event loop's failure() callback. Blocking on the async HTTP client thread stalls all other requests to that HTTP client, with no timeout of their own, until the sniffing request itself times out.
Thread dump in blocked state:
Steps to reproduce:
Expected: the request execution using performRequest should not exceed ~1s
Actual: all calls to the REST client made after the onFailure sniffing method begins executing are blocked until the sniffing request triggered by the earlier request failure times out.
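One way to get closer to the expected behavior is for the failure callback to hand the sniff to a separate thread rather than wait for it. The sketch below shows the idea with JDK classes only. It is an assumption about a possible approach, not the client's implementation: the two executors are hypothetical stand-ins for the HTTP client's I/O thread and a dedicated sniffing thread, and a never-completed `CompletableFuture` stands in for a sniff request that hangs until its timeout.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class OffloadedSniffDemo {

    /** Returns how many milliseconds a follow-up task waited while a sniff ran elsewhere. */
    static long delayWithOffloadedSniff(long sniffTimeoutMillis) {
        // Hypothetical stand-ins: one I/O thread, one dedicated sniffing thread.
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();
        ExecutorService sniffExecutor = Executors.newSingleThreadExecutor();
        try {
            CountDownLatch sniffStarted = new CountDownLatch(1);
            // Simulated failure callback: it only schedules the sniff and returns.
            eventLoop.execute(() -> sniffExecutor.execute(() -> {
                sniffStarted.countDown();
                CompletableFuture<String> sniffResponse = new CompletableFuture<>();
                try {
                    // The slow wait now happens on the sniffing thread.
                    sniffResponse.get(sniffTimeoutMillis, TimeUnit.MILLISECONDS);
                } catch (TimeoutException | ExecutionException | InterruptedException e) {
                    // Simulated sniff timed out or was cancelled at shutdown.
                }
            }));
            sniffStarted.await();

            // Simulated "next request": the event loop thread is already free.
            long submittedAt = System.nanoTime();
            Callable<Long> nextRequest = System::nanoTime;
            long startedAt = eventLoop.submit(nextRequest).get();
            return TimeUnit.NANOSECONDS.toMillis(startedAt - submittedAt);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            eventLoop.shutdownNow();
            sniffExecutor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("next request waited ~" + delayWithOffloadedSniff(500) + " ms");
    }
}
```

Here the follow-up task should start almost immediately, because the simulated sniff timeout is spent on the sniffing thread instead of the I/O thread.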
Provide logs (if relevant):