SearchShardFailure handling results in OOM #48910

Closed
altinp opened this issue Nov 7, 2019 · 4 comments · Fixed by #51885
Labels
:Core/Infra/Logging Log management and logging utilities

Comments


altinp commented Nov 7, 2019

This is either the same as, or closely related to, #27596 reported by @mattweber, which was hoped to be fixed in 6.1+ but was eventually closed for lack of feedback. It likely was not fixed.

  • ES 6.6.2 (and 5.6.x)
  • latest OpenJDK 1.8
  • CentOS 7
  • ES heap 31GB
  • log-level: info
    (This is at a client site and I don't have direct access to some details, but I can obtain them if needed.)

The problem is reproducible in various clusters, ranging from 8 to 50+ nodes, with thousands of shards. If a query fails (or, more precisely, is rejected) on every shard, e.g. because it trips too_many_clauses, an OOM will eventually occur when:
free_heap < 2 * serialized_query_string_size * num_shards
where serialized_query_string_size is at least as large as the user-submitted query.
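As a rough back-of-the-envelope check (the factor of 2 matches the two copies seen in the heap dumps, one in the cause and one in the reason; all numbers below are illustrative, not measured constants):

# Rough estimate of the heap retained by shard failures for a rejected query.
# Illustrative only; the factor of 2 reflects the two serialized copies
# (cause.detailMessage and reason) observed per ShardSearchFailure.

def failure_overhead_bytes(serialized_query_bytes: int, num_shards: int) -> int:
    return 2 * serialized_query_bytes * num_shards

def will_oom(free_heap_bytes: int, serialized_query_bytes: int, num_shards: int) -> bool:
    return free_heap_bytes < failure_overhead_bytes(serialized_query_bytes, num_shards)

MB, GB = 1024 ** 2, 1024 ** 3
# Numbers from the heap-dump analysis below: ~11.5 MB per string, 1061 shards.
print(failure_overhead_bytes(int(11.5 * MB), 1061) / GB)   # ~23.8 GB retained
print(will_oom(20 * GB, int(11.5 * MB), 1061))             # True for ~20 GB of free heap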

So this is a case of the remedy being worse than the disease: in ES 6.6 the terms query now allows up to 64K terms (and hence eventual clauses) by default, finally in line with the default terms-lookup limit, but if you write the query with separate term clauses (more than 1K but fewer than 64K), it is the protective mechanism itself that kills the coordinating node, just as in 5.6 with a terms query of more than 1K terms.
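For illustration (field name and values are placeholders, mirroring the example below), the shape that trips the per-shard clause limit versus the single terms query that stays within the 64K-terms limit looks roughly like this:

# Illustrative only: the same lookup written as N separate term clauses
# (subject to the per-shard max clause count) versus one terms query
# (subject to the much higher default terms limit mentioned above).
values = [f"foo{i}" for i in range(5000)]

# Shape that can trip too_many_clauses on every shard:
bool_of_term_clauses = {
    "query": {
        "bool": {
            "should": [{"term": {"from.email.keyword": v}} for v in values]
        }
    }
}

# Equivalent single terms query that avoids the per-clause limit:
single_terms_query = {
    "query": {
        "terms": {"from.email.keyword": values}
    }
}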

I have analyzed the heap dumps and can provide reports. The cause seems to be that ShardSearchFailure holds on to two error strings, each containing the whole pretty-printed query: one inside the cause (QueryShardException) and the other in the reason. In our ES 5.6 test these strings were about the same size as the user query; in the ES 6.6 test, a simple 4 MB user query (on-disk size) such as:

{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "from.email.keyword": "foo1"
          }
        }, 
        {
          "term": {
            "from.email.keyword": "foo2"
          }
...

led to an 11.5 MB in-heap string for the cause and for the reason (each):

2 * 11.5 MB * 1061 shards ≈ 24 GB => OOM

Questions:

  • If the query trips a global limit, can't this be short-circuited at the coordinating node before the query is sent out to all the data nodes (at least when that node is not coordinating-only)? If not, could the coordinating node detect that one of the incoming failures is of a global-tripwire type and drop the rest?
  • Trying to return more than the first X bytes of the user query is clearly not useful.
  • We have users and automated clients that can always submit (or programmatically generate!) queries with too many term clauses, etc.

I will next submit screenshots and reports from the heap analysis.

NOTE: I don't think the issue is specific to logging the error; it is about preparing the query response to return to the client. That response contains at least one shard failure per index, as can be seen when the query is limited to a small number of indices/shards and returns without an OOM.


altinp commented Nov 8, 2019

Here is a screenshot from Eclipse MAT:
MAT-1


altinp commented Nov 8, 2019

And zooming into one of the shard failures:
MAT-2-zoomin

imotov added the :Core/Infra/Logging Log management and logging utilities label Nov 8, 2019
@elasticmachine

Pinging @elastic/es-core-infra (:Core/Infra/Logging)


altinp commented Nov 18, 2019

crash-query.json.zip
Hi @imotov,
It looks like this can be reproduced locally quite easily, and on the latest release (7.4) as well:

  • Get and run the latest release, using defaults (1 GB heap)
    • a 2- or 3-node local cluster works too, with the same results
  • Create an index with 80 shards (it can stay empty)
  • POST a term query like the one above/attached; to create a sizable ~4 MB request I had to go up to 40K clauses (a generation sketch follows this list)
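A minimal reproduction sketch, assuming a default local node at http://localhost:9200 and an index named crash (both are placeholders; the field name and values mirror the example above, and real values are longer, which is what pushes the request toward 4 MB):

import json
import urllib.request

ES = "http://localhost:9200"  # assumption: default single local node

def request(method, path, payload=None):
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(
        f"{ES}{path}", data=data,
        headers={"Content-Type": "application/json"}, method=method)
    with urllib.request.urlopen(req) as resp:
        return resp.status, resp.read()

# 1. Empty index with 80 primary shards.
request("PUT", "/crash",
        {"settings": {"number_of_shards": 80, "number_of_replicas": 0}})

# 2. Bool query with 40,000 term clauses.
clauses = [{"term": {"from.email.keyword": f"foo{i}"}} for i in range(40_000)]
body = {"query": {"bool": {"should": clauses}}}
print("request size ~", len(json.dumps(body)) // (1024 * 1024), "MB")

try:
    status, resp = request("POST", "/crash/_search", body)
    print(status, len(resp), "bytes returned")
except Exception as exc:  # the node may OOM or drop the connection first
    print("search failed:", exc)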

The same OOM is produced. Note that if you turn off all logging, the OOM is almost immediate.
The heap dump looks much the same: the 4 MB request I submitted is serialized to a ~6 MB string internally (once "boost":1.0 etc. are added to each clause), and two error strings containing it are attached to each shard failure, as cause.detailMessage and reason.

So 2 * 6 MB * 80 shards = 960 MB. In fact, the OOM occurs when the total from these strings reaches about 750 MB, given the system's own baseline heap usage.

VisualVM 1.4.4 screenshot, 2019-11-18 14:22:41

elasticsearch.log

jimczi added a commit to jimczi/elasticsearch that referenced this issue Feb 4, 2020
QueryBuilders that throw exceptions on shards when building the Lucene query
return the full serialization of the query builder in the exception message.
For large queries that fail to execute due to the max boolean clause limit, this means
that we keep a reference to these big messages for every shard that participates
in the request. In order to limit the memory needed to hold these query shard
exceptions on the coordinating node, this change removes the query builder
serialization from the shard exception. The query is known by the user, so
there should be no need to repeat it in every shard exception. We could also
omit the entire stack trace for known bad-request exceptions, but that would deserve
a separate issue/PR.

Closes elastic#51843
Closes elastic#48910
jimczi added a commit that referenced this issue Feb 6, 2020
…ge (#51885)

QueryBuilders that throw exceptions on shards when building the Lucene query
return the full serialization of the query builder in the exception message.
For large queries that fail to execute due to the max boolean clause limit, this means
that we keep a reference to these big messages for every shard that participates
in the request. In order to limit the memory needed to hold these query shard
exceptions on the coordinating node, this change removes the query builder
serialization from the shard exception. The query is known by the user, so
there should be no need to repeat it in every shard exception. We could also
omit the entire stack trace for known bad-request exceptions, but that would deserve
a separate issue/PR.

Closes #51843
Closes #48910
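The actual change is in Elasticsearch's Java code (#51885). Purely to illustrate the pattern the commit message describes, and not the real implementation, dropping the serialized query from each per-shard failure reason looks roughly like this:

# Generic illustration of the pattern, not the actual Elasticsearch change:
# build the per-shard failure reason without embedding the serialized query,
# so N failing shards no longer pin N multi-megabyte strings on the
# coordinating node.

def shard_failure_reason(shard_id, error, serialized_query=None):
    reason = f"[shard {shard_id}] failed to create query: {error}"
    if serialized_query is not None:
        # Old behaviour: append the whole query (the part the fix removes).
        reason += f" for query: {serialized_query}"
    return reason

big_query = "x" * 6_000_000  # stand-in for a ~6 MB serialized query
# Note: the "old" list below really does materialize ~480 MB of strings.
old = [shard_failure_reason(i, "too_many_clauses", big_query) for i in range(80)]
new = [shard_failure_reason(i, "too_many_clauses") for i in range(80)]
print(sum(map(len, old)) // 1_000_000, "MB vs", sum(map(len, new)), "bytes")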