Bound the number of search results returned by elasticsearch #4026

Closed
nicktgr15 opened this issue Oct 31, 2013 · 7 comments
@nicktgr15

Hello,

When I send a search request (POST) like the following to Elasticsearch through Kibana, I get a Java heap space error and my Elasticsearch node can't recover.

{"query":{"filtered":{"query":{"bool":{"should":[{"query_string":{"query":"*"}}]}},"filter":{"bool":{"must":[{"match_all":{}},{"bool":{"must":[{"match_all":{}}]}}]}}}},"highlight":{"fields":{},"fragment_size":2147483647,"pre_tags":["@start-highlight@"],"post_tags":["@end-highlight@"]},"size":1000000,"sort":[{"_id":{"order":"desc"}}]}

Question: Is it possible to restrict the maximum value of "size" that a client can use? More generally, is it possible to bound the number of results returned by Elasticsearch to avoid out-of-memory errors?

I don't want to increase the heap size (currently 1 GB), as this would not solve the problem.
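One way to bound "size" without any server-side setting is to cap it in whatever sits between the client and Elasticsearch (a proxy or a thin application layer). The following is only a minimal sketch under that assumption: `clamp_search_body` and `MAX_SIZE` are hypothetical names, the cap of 10000 is arbitrary, and a high "from" offset is not handled here.

```python
import json

# Hypothetical cap, chosen for illustration only; not an Elasticsearch setting.
MAX_SIZE = 10_000


def clamp_search_body(raw_body: str, max_size: int = MAX_SIZE) -> str:
    """Return the search request body with "size" capped at max_size.

    Bodies without a "size" field are passed through unchanged.
    """
    body = json.loads(raw_body)
    size = body.get("size")
    if isinstance(size, int) and size > max_size:
        body["size"] = max_size
    return json.dumps(body)


if __name__ == "__main__":
    request = '{"query": {"match_all": {}}, "size": 1000000}'
    print(clamp_search_body(request))
```

The clamped body would then be forwarded to the search endpoint in place of the original one.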

Regards,
Nick

@nicktgr15
Author

It looks like the reason the nodes were unable to recover was that the cluster was getting into a split-brain state (multiple master nodes).

In general I don't think that there is a way to limit the number of results returned by a query.

@javanna
Member

javanna commented Nov 4, 2013

We plan to add something called a "circuit breaker" that prevents queries from bringing down a node when there is not enough memory. The related issue is #2929.
In your case the size is far too high, though, so I would suggest simply lowering it to a reasonable number of documents.

@ghost ghost assigned javanna Nov 4, 2013
@javanna javanna removed their assignment Aug 1, 2014
@javanna javanna added the discuss label Aug 1, 2014
@clintongormley
Contributor

Rather than adding a setting specifically to limit the size of the priority queue, we should aim to limit the amount of memory used by a request, and how long a request can run. This potentially allows admins to specify different policies for different users.

First step is to add the priority queue to the circuit breaker.

@bobbyhubbard

Just to be clear, #5466 addresses a bug where specifying a size above 999999 causes significant performance degradation. The size of the index and the memory consumed seem to have absolutely nothing to do with the issue. I can reproduce this bug even with an index with 1 tiny document in it, so it can't be related to loading a huge result set in memory. For example, in our production ES cluster (v1.1.1):

PUT sizebugtest/nada/1
{
  "key":"value"
}
PUT sizebugtest/nada/2
{
  "key":"value2"
}

#returns both documents in 2-3ms
GET sizebugtest/nada/_search?   
#returns both documents in 3-5ms
GET sizebugtest/nada/_search?size=999999
#returns both documents in 8-25ms
GET sizebugtest/nada/_search?size=9999999
#returns both documents in 50-100ms
GET sizebugtest/nada/_search?size=99999999
#returns both documents in 7000-30,000ms!! Sometimes times out. same 2 documents!
GET sizebugtest/nada/_search?size=999999999
#400 - Awesome...no longer an int...!
GET sizebugtest/nada/_search?size=9999999999

Why the significant difference in response time for the same index simply by specifying a different size? That seems like a different issue from what #4026 addresses, IMO.

@clintongormley
Contributor

@bobbyhubbard No, they are related. Specifying a large size (or a high from offset) means creating a large priority queue. By adding the size of the priority queue to the circuit breaker, we can abort the search if too much memory is required to service the request. That's a good generic solution instead of having a separate setting for each little part of the request.
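The approach described in this comment can be illustrated roughly as follows. This is a sketch of the general idea only, not Elasticsearch's implementation: `MemoryBreaker`, `CircuitBreakingError`, `estimate_queue_bytes`, and the 16 bytes per queue slot are all assumptions made for the example. The point it shows is that the estimate depends on from + size, not on how many documents the index holds.

```python
class CircuitBreakingError(Exception):
    """Raised when a reservation would push usage past the limit (hypothetical)."""


class MemoryBreaker:
    """Tracks estimated memory use and refuses reservations over a fixed limit."""

    def __init__(self, limit_bytes: int) -> None:
        self.limit_bytes = limit_bytes
        self.used_bytes = 0

    def reserve(self, nbytes: int) -> None:
        # Check before allocating anything, so an oversized request is
        # rejected instead of exhausting the heap.
        if self.used_bytes + nbytes > self.limit_bytes:
            raise CircuitBreakingError(
                f"would use {self.used_bytes + nbytes} bytes, "
                f"limit is {self.limit_bytes}"
            )
        self.used_bytes += nbytes

    def release(self, nbytes: int) -> None:
        self.used_bytes = max(self.used_bytes - nbytes, 0)


# Assumed per-slot cost of the priority queue; the real figure is not
# given in this thread.
BYTES_PER_QUEUE_SLOT = 16


def estimate_queue_bytes(offset: int, size: int) -> int:
    """Estimate queue memory from the requested window (from + size)."""
    return (offset + size) * BYTES_PER_QUEUE_SLOT


if __name__ == "__main__":
    breaker = MemoryBreaker(limit_bytes=64 * 1024 * 1024)
    breaker.reserve(estimate_queue_bytes(0, 10))  # small request passes
    try:
        breaker.reserve(estimate_queue_bytes(0, 999_999_999))
    except CircuitBreakingError as err:
        print("search aborted:", err)
```

A single limit of this kind covers a large size and a high from offset alike, which is the argument made above for preferring it over a dedicated maximum-size setting.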

@bobbyhubbard

Ah ok. BTW - I just upgraded our dev environment to 1.3.2 to confirm if this was still an issue or not. In production running 1.1.1 I can reproduce it all day long using the test case above. However, I cannot reproduce it against 1.3.2.

@clintongormley
Contributor

Closing in favour of #9311


4 participants