Add Circuit breaker on Transport ResponseHandlers #66196

easyice · 2020-12-11T02:32:22Z

For every in-flight transport request, a handler is added in org.elasticsearch.transport.Transport.ResponseHandlers:

public long add(ResponseContext<? extends TransportResponse> holder) {
    long requestId = newRequestId();
    ResponseContext existing = handlers.put(requestId, holder);
    assert existing == null : "request ID already in use: " + requestId;
    return requestId;
}

Sometimes users send so many requests, such as queries, that they exceed the system's capacity. CPU utilization reaches 100%, but the client keeps sending a large number of requests, and each one is added to ResponseHandlers#handlers.

Eventually the Elasticsearch nodes run out of memory (OOM).

This has happened to me a few times. I dumped the JVM heap and opened it with Eclipse Memory Analyzer, which showed ResponseHandlers#handlers retaining 7.7GB of memory.
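
To make the request concrete, here is a minimal sketch of what a limit on registered handlers could look like. It is not Elasticsearch code: the class name BoundedResponseHandlers, the maxInFlight parameter, and the choice of IllegalStateException are all assumptions made for illustration. It caps only the number of handlers, not the bytes each handler retains.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: a handler registry that rejects new requests past a fixed in-flight limit. */
class BoundedResponseHandlers<T> {
    private final Map<Long, T> handlers = new ConcurrentHashMap<>();
    private final AtomicLong requestIdGenerator = new AtomicLong();
    private final AtomicLong inFlight = new AtomicLong();
    private final long maxInFlight; // assumed limit, not an existing Elasticsearch setting

    BoundedResponseHandlers(long maxInFlight) {
        this.maxInFlight = maxInFlight;
    }

    /** Registers a handler and returns its request ID, or throws if the limit would be exceeded. */
    long add(T holder) {
        if (inFlight.incrementAndGet() > maxInFlight) {
            inFlight.decrementAndGet(); // undo the reservation before rejecting
            throw new IllegalStateException("too many in-flight requests, limit is " + maxInFlight);
        }
        long requestId = requestIdGenerator.incrementAndGet();
        handlers.put(requestId, holder);
        return requestId;
    }

    /** Removes and returns the handler once its response (or failure) arrives; null if unknown. */
    T remove(long requestId) {
        T holder = handlers.remove(requestId);
        if (holder != null) {
            inFlight.decrementAndGet();
        }
        return holder;
    }

    long inFlightCount() {
        return inFlight.get();
    }
}
```

With a limit of 2, a third add call fails fast instead of growing the map, and a slot frees up again once remove is called for a completed request.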

elasticmachine · 2020-12-11T09:44:15Z

Pinging @elastic/es-distributed (Team:Distributed)

DaveCTurner · 2021-01-14T08:48:27Z

I don't think there's a general transport-level solution for limiting the size retained by a response handler since the memory retained by a handler is pretty much independent of the transport layer. The OP doesn't tell us anything about what requests correspond with the problematic response handlers, nor what version they're using. Likely culprits include searches (in which case this duplicates #67478), stats (in which case this duplicates #55550) or bulks (in which case this is resolved by the new indexing pressure mechanisms). @easyice to what requests do the problematic handlers relate?

Hailei · 2021-01-14T09:03:26Z

> I don't think there's a general transport-level solution for limiting the size retained by a response handler since the memory retained by a handler is pretty much independent of the transport layer.

I couldn't agree more.

@easyice is my coworker, and the cluster mentioned in this issue is the one I work on too.

ES version: 6.8.0. In the last two incidents the requests were bulk requests. @DaveCTurner

DaveCTurner · 2021-01-14T09:16:12Z

Ok, there is a solution for limiting the size retained by bulk response handlers, and it's already implemented (as of 7.9) so I think there's no further action needed here.
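
For readers unfamiliar with the idea, byte-based admission control can be sketched roughly as follows. This is a simplified illustration only, not the actual Elasticsearch implementation: the class InFlightBytesLimiter, its method names, and the rejection exception are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: rejects new work once the bytes held by in-flight requests exceed a limit. */
class InFlightBytesLimiter {
    private final AtomicLong currentBytes = new AtomicLong();
    private final long limitBytes; // assumed limit chosen by the caller

    InFlightBytesLimiter(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    /**
     * Reserves requestBytes for one request. Returns a callback that releases the
     * reservation; running it more than once has no further effect.
     */
    Runnable acquire(long requestBytes) {
        long total = currentBytes.addAndGet(requestBytes);
        if (total > limitBytes) {
            currentBytes.addAndGet(-requestBytes); // roll back before rejecting
            throw new IllegalStateException(
                "rejected: in-flight bytes " + total + " would exceed limit " + limitBytes);
        }
        AtomicBoolean released = new AtomicBoolean(false);
        return () -> {
            if (released.compareAndSet(false, true)) {
                currentBytes.addAndGet(-requestBytes);
            }
        };
    }

    long currentBytes() {
        return currentBytes.get();
    }
}
```

The caller would run the returned callback when the response handler completes, whether with a response or a failure, so that the reservation tracks how long the request body is actually retained.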

easyice · 2021-01-15T10:03:27Z

@DaveCTurner Thanks for the reply. Let me add the related requests: this doesn't just appear with bulk requests, it also appears with search requests, like this:

DaveCTurner · 2021-01-15T10:07:32Z

Indeed, we're tracking the issue for search responses at #67478.

easyice added the >enhancement and needs:triage labels Dec 11, 2020

DaveCTurner added the :Distributed Coordination/Network label Dec 11, 2020

elasticmachine added the Team:Distributed label Dec 11, 2020

ywelsch added the team-discuss label Jan 5, 2021

jimczi removed the needs:triage label Jan 12, 2021

DaveCTurner added the feedback_needed label Jan 14, 2021

DaveCTurner closed this as completed Jan 14, 2021

williamrandolph mentioned this issue Jan 27, 2021

Add circuit breaker for response sizes #67478

Open
