Soft limit added for max number of search contexts #29252

diwasjoshi · 2018-03-26T21:02:50Z

This change introduces a soft limit on the number of search contexts that can be open per node. The limit is configured with the node.max_search_context node setting (default 100).
Relates to #25244

Checklist

Have you signed the contributor license agreement? Yes
Have you followed the contributor guidelines? Yes
If submitting code, have you built your formula locally prior to submission with gradle check? Yes. Tests failing. Review needed before proceeding.
If submitting code, is your pull request against master? Unless there is a good reason otherwise, we prefer pull requests against master and will backport as needed. Yes
If submitting code, have you checked that your submission is for an OS that we support? Yes
If you are submitting this code for a class then read our policy for that. NA

elasticmachine · 2018-03-26T21:02:52Z

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

elasticmachine · 2018-03-29T18:12:49Z

Pinging @elastic/es-search-aggs

jimczi

Thanks @diwasjoshi! I left some comments regarding formatting, naming, and when we should check the number of open search contexts. The default limit might be too low since it applies to all searches (not only scrolls), so I'd set something like 500 or 1000. That might be too high for scrolls, though, so maybe we should apply this limit to scrolls only?

jimczi · 2018-03-29T18:18:15Z

server/src/main/java/org/elasticsearch/search/SearchService.java

@@ -227,10 +229,10 @@ private void setDefaultSearchTimeout(TimeValue defaultSearchTimeout) {
    private void setDefaultAllowPartialSearchResults(boolean defaultAllowPartialSearchResults) {
        this.defaultAllowPartialSearchResults = defaultAllowPartialSearchResults;
    }
-    
+


nit: this change is not needed

jimczi · 2018-03-29T18:18:40Z

server/src/main/java/org/elasticsearch/search/SearchService.java

                this::setDefaultAllowPartialSearchResults);
-
-
+


nit: this is not needed

jimczi · 2018-03-29T18:18:56Z

server/src/main/java/org/elasticsearch/search/SearchService.java

@@ -161,7 +163,7 @@
    private volatile TimeValue defaultSearchTimeout;

    private volatile boolean defaultAllowPartialSearchResults;
-    
+


nit: please keep the formatting

jimczi · 2018-03-29T18:22:18Z

server/src/main/java/org/elasticsearch/search/SearchService.java

@@ -198,10 +200,10 @@ public SearchService(ClusterService clusterService, IndicesService indicesServic
        clusterService.getClusterSettings().addSettingsUpdateConsumer(DEFAULT_SEARCH_TIMEOUT_SETTING, this::setDefaultSearchTimeout);

        defaultAllowPartialSearchResults = DEFAULT_ALLOW_PARTIAL_SEARCH_RESULTS.get(settings);
-        clusterService.getClusterSettings().addSettingsUpdateConsumer(DEFAULT_ALLOW_PARTIAL_SEARCH_RESULTS, 
+        clusterService.getClusterSettings().addSettingsUpdateConsumer(DEFAULT_ALLOW_PARTIAL_SEARCH_RESULTS,


This can be added in the SearchService.

jimczi · 2018-03-29T18:22:22Z

server/src/main/java/org/elasticsearch/node/Node.java

+     * maximum of 100 is defensive to prevent generating too many search contexts.
+     */
+    public static final Setting<Integer> MAX_SEARCH_CONTEXT_SETTING =
+        Setting.intSetting("node.max_search_context", 100, 0, Property.NodeScope);


You can move this setting to SearchService. I would also name it search.max_open_context or something similar: it is a per-node setting, but it relates to search, so this name would be consistent with the other search settings.

jpountz · 2018-04-04T09:26:37Z

100 is too low in my opinion as this could be lower than the number of threads in the search thread pool on nodes that have many CPUs.
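
To illustrate the sizing concern, the sketch below compares the proposed default of 100 with the size of the search thread pool for a few CPU counts. The formula `((processors * 3) / 2) + 1` is an assumption about Elasticsearch's default search pool sizing and is not stated in this thread; the class and method names are hypothetical.

```java
// Hypothetical helper, not part of the PR.
class SearchPoolSizeSketch {
    // ASSUMPTION: default search thread pool size is ((processors * 3) / 2) + 1.
    static int searchThreadPoolSize(int processors) {
        return ((processors * 3) / 2) + 1;
    }

    // True when a limit on open contexts is smaller than the number of search threads,
    // i.e. the limit could reject searches before the thread pool is even saturated.
    static boolean limitBelowPool(int limit, int processors) {
        return limit < searchThreadPoolSize(processors);
    }

    public static void main(String[] args) {
        int proposedLimit = 100; // default proposed in this PR
        for (int cpus : new int[] {8, 32, 64, 96}) {
            System.out.println(cpus + " CPUs -> " + searchThreadPoolSize(cpus)
                + " search threads, limit too low: " + limitBelowPool(proposedLimit, cpus));
        }
    }
}
```

Under that assumption, a limit of 100 only drops below the pool size on very large machines, but a single search thread can hold more than one context, so the margin is smaller than the arithmetic alone suggests.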

jimczi · 2018-03-29T18:23:11Z

server/src/main/java/org/elasticsearch/search/SearchService.java

@@ -566,6 +568,14 @@ final SearchContext createAndPutContext(ShardSearchRequest request) throws IOExc

    final SearchContext createContext(ShardSearchRequest request) throws IOException {
        final DefaultSearchContext context = createSearchContext(request, defaultSearchTimeout);
+        System.out.println("----activecontexts----" + activeContexts.size());


please remove this or use the logger instead.

jimczi · 2018-03-29T18:29:32Z

server/src/main/java/org/elasticsearch/search/SearchService.java

@@ -566,6 +568,14 @@ final SearchContext createAndPutContext(ShardSearchRequest request) throws IOExc

    final SearchContext createContext(ShardSearchRequest request) throws IOException {
        final DefaultSearchContext context = createSearchContext(request, defaultSearchTimeout);
+        System.out.println("----activecontexts----" + activeContexts.size());
+        if (activeContexts.size() >= Node.MAX_SEARCH_CONTEXT_SETTING.get(settings)) {


You can check the size before creating the search context; otherwise you need to close the context when the check fails. This method is not synchronized, though, so you may not get an accurate count from the concurrent map. I'm not sure that is an issue. The alternative would be to synchronize the size check with the addition of the new search context to the map, but that would add contention in the search service. @s1monw any opinion on this?

s1monw · 2018-04-11

Calling ConcurrentHashMap#size() can be quite expensive IMO. I think we should keep track of the open contexts in a counter instead of using the map. I don't think being a little off here makes a difference, so I don't think we need any synchronization changes here.
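
A minimal sketch of the two suggestions above: check the limit before creating anything, and read a dedicated counter instead of the map's size. `ContextRegistry`, its methods, and the message text are hypothetical stand-ins, not the SearchService code. The check and the increment are deliberately not atomic together, so concurrent callers can briefly overshoot the limit, which is the "a little off" trade-off accepted in the comment.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the part of a search service that tracks open contexts.
class ContextRegistry {
    private final ConcurrentHashMap<Long, String> activeContexts = new ConcurrentHashMap<>();
    private final AtomicLong idGenerator = new AtomicLong();
    // Dedicated counter so the limit check never has to ask the map for its size.
    private final AtomicInteger openContexts = new AtomicInteger();
    private final int maxOpenContexts;

    ContextRegistry(int maxOpenContexts) {
        this.maxOpenContexts = maxOpenContexts;
    }

    long open(String description) {
        // Check first, so nothing has to be closed again when the limit is hit.
        // No lock is taken: this is a soft limit and may be slightly exceeded under races.
        if (openContexts.get() >= maxOpenContexts) {
            throw new IllegalStateException(
                "too many open search contexts, limit is [" + maxOpenContexts + "]");
        }
        long id = idGenerator.incrementAndGet();
        activeContexts.put(id, description);
        openContexts.incrementAndGet();
        return id;
    }

    boolean close(long id) {
        // Only decrement when this call actually removed the context.
        if (activeContexts.remove(id) != null) {
            openContexts.decrementAndGet();
            return true;
        }
        return false;
    }

    int openCount() {
        return openContexts.get();
    }
}

class ContextRegistryDemo {
    public static void main(String[] args) {
        ContextRegistry registry = new ContextRegistry(2);
        long first = registry.open("query 1");
        registry.open("query 2");
        try {
            registry.open("query 3");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        registry.close(first);
        registry.open("query 3"); // succeeds once a slot has been freed
        System.out.println("open contexts: " + registry.openCount());
    }
}
```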

jimczi · 2018-03-29T18:29:54Z

server/src/test/java/org/elasticsearch/search/SearchServiceTests.java

@@ -409,7 +447,7 @@ public void testCanMatch() throws IOException {
            Strings.EMPTY_ARRAY, false, new AliasFilter(null, Strings.EMPTY_ARRAY), 1f, allowPartialSearchResults)));

        assertTrue(service.canMatch(new ShardSearchLocalRequest(indexShard.shardId(), 1, SearchType.QUERY_THEN_FETCH,
-            new SearchSourceBuilder(), Strings.EMPTY_ARRAY, false, new AliasFilter(null, Strings.EMPTY_ARRAY), 1f, 
+            new SearchSourceBuilder(), Strings.EMPTY_ARRAY, false, new AliasFilter(null, Strings.EMPTY_ARRAY), 1f,


nit: not needed

jpountz

I left some comments in addition to Jim's.

jpountz · 2018-04-04T09:13:51Z

server/src/test/java/org/elasticsearch/search/SearchServiceTests.java

+        try (SearchContext context = service.createAndPutContext(new ShardSearchLocalRequest(indexShard.shardId(), 1, SearchType.DEFAULT,
+            new SearchSourceBuilder(), new String[0], false, new AliasFilter(null, Strings.EMPTY_ARRAY), 1.0f, true))) {
+            assertNotNull(context);
+        } catch (IllegalStateException ex) {


can you use expectThrows to check for the exception/message?
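
One reason for this request: a test written as try/catch can pass without the exception ever being thrown unless it explicitly fails after the call. The sketch below shows the shape of an expectThrows-style check. The `expectThrows` helper here is a small local stand-in written for illustration, not the test framework's implementation, and `createAndPutContext(int, int)` is a hypothetical substitute for the real method.

```java
import java.util.concurrent.Callable;

class ExpectThrowsSketch {
    // Local stand-in: runs the action, returns the expected exception, fails otherwise.
    static <T extends Throwable> T expectThrows(Class<T> expected, Callable<?> action) {
        try {
            action.call();
        } catch (Throwable t) {
            if (expected.isInstance(t)) {
                return expected.cast(t);
            }
            throw new AssertionError("unexpected exception type: " + t.getClass().getName(), t);
        }
        // Reached only when the action completed normally.
        throw new AssertionError("expected " + expected.getSimpleName() + " but nothing was thrown");
    }

    // Hypothetical stand-in for the method under test.
    static String createAndPutContext(int openContexts, int maxOpenContexts) {
        if (openContexts >= maxOpenContexts) {
            throw new IllegalStateException(
                "too many open search contexts, limit is [" + maxOpenContexts + "]");
        }
        return "context";
    }

    public static void main(String[] args) {
        IllegalStateException ex = expectThrows(IllegalStateException.class,
            () -> createAndPutContext(2, 2));
        // The returned exception lets the test assert on the message as well as the type.
        System.out.println(ex.getMessage());
    }
}
```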

jpountz · 2018-04-04T09:26:37Z

server/src/main/java/org/elasticsearch/node/Node.java

+     * maximum of 100 is defensive to prevent generating too many search contexts.
+     */
+    public static final Setting<Integer> MAX_SEARCH_CONTEXT_SETTING =
+        Setting.intSetting("node.max_search_context", 100, 0, Property.NodeScope);


100 is too low in my opinion as this could be lower than the number of threads in the search thread pool on nodes that have many CPUs.

jpountz · 2018-04-04T09:27:59Z

server/src/main/java/org/elasticsearch/search/SearchService.java

@@ -566,6 +568,13 @@ final SearchContext createAndPutContext(ShardSearchRequest request) throws IOExc

    final SearchContext createContext(ShardSearchRequest request) throws IOException {
        final DefaultSearchContext context = createSearchContext(request, defaultSearchTimeout);
+        if (activeContexts.size() >= Node.MAX_SEARCH_CONTEXT_SETTING.get(settings)) {


Should we rather add this check at the beginning of createAndPutContext? This method only creates a new context; it doesn't add it to the list of active contexts.

jpountz · 2018-04-04T09:33:51Z

server/src/main/java/org/elasticsearch/search/SearchService.java

@@ -566,6 +568,13 @@ final SearchContext createAndPutContext(ShardSearchRequest request) throws IOExc

    final SearchContext createContext(ShardSearchRequest request) throws IOException {
        final DefaultSearchContext context = createSearchContext(request, defaultSearchTimeout);
+        if (activeContexts.size() >= Node.MAX_SEARCH_CONTEXT_SETTING.get(settings)) {
+            throw new IllegalStateException(


I think ElasticsearchException would let us return an HTTP status code indicating that the request should be retried later. Wouldn't this be better than a 500/Internal Error?
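
A rough sketch of the idea, using plain Java stand-ins rather than Elasticsearch classes: an exception that carries its own HTTP status, so the REST layer can answer with a "retry later" code instead of falling back to 500. The class names are hypothetical, and the choice of 429 (Too Many Requests) is an assumption; the comment above does not name a specific code.

```java
// Hypothetical exception type that carries the HTTP status to report.
class TooManyOpenContextsException extends RuntimeException {
    private final int httpStatus;

    TooManyOpenContextsException(String message, int httpStatus) {
        super(message);
        this.httpStatus = httpStatus;
    }

    int httpStatus() {
        return httpStatus;
    }
}

class StatusMappingSketch {
    static final int TOO_MANY_REQUESTS = 429;     // assumed "retry later" status
    static final int INTERNAL_SERVER_ERROR = 500; // fallback for unclassified exceptions

    // Stand-in for the REST layer's mapping from exception to response status.
    static int statusFor(Throwable t) {
        if (t instanceof TooManyOpenContextsException) {
            return ((TooManyOpenContextsException) t).httpStatus();
        }
        return INTERNAL_SERVER_ERROR;
    }

    public static void main(String[] args) {
        Throwable plain = new IllegalStateException("too many open search contexts");
        Throwable typed = new TooManyOpenContextsException(
            "too many open search contexts", TOO_MANY_REQUESTS);
        System.out.println("IllegalStateException -> " + statusFor(plain));
        System.out.println("TooManyOpenContextsException -> " + statusFor(typed));
    }
}
```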

jimczi · 2018-04-09T08:45:16Z

I discussed this with @jpountz and we agreed that the value of the default limit needs more discussion. 100 is too low, and we need to decide whether scrolls should be included in the limit. We'll discuss this internally in our search meeting, and I'll update this issue once we have reached an agreement.

diwasjoshi · 2018-04-16T13:53:42Z

@jimczi @jpountz I have addressed all review comments, including the change suggested by @s1monw to track open contexts with a new counter. I will wait for an update on the default limit from your side.

mayya-sharipova · 2018-04-16T17:05:01Z

server/src/main/java/org/elasticsearch/search/SearchService.java

@@ -547,6 +557,13 @@ private SearchContext findContext(long id, TransportRequest request) throws Sear
    }

    final SearchContext createAndPutContext(ShardSearchRequest request) throws IOException {
+        if (numActiveContexts >= MAX_OPEN_CONTEXT.get(settings)) {


Can this function be called by multiple threads at the same time? And if so, should numActiveContexts be volatile?

jimczi · 2018-04-16

Yes it can, @mayya-sharipova, but if we restrict the counting to scroll queries, let's use an AtomicInteger. The overhead should be limited since we create the search context for a scroll query only on the initial request.

jimczi

Thanks @diwasjoshi. We discussed internally and decided that this setting should only be applied to scroll queries (request.scroll() != null). This way it is easier to choose a default value (we agreed on 500 being a good start) and we don't mix queries in the same counter. Would you mind changing this PR to restrict the counting to scroll queries?
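
A minimal sketch of what scroll-only counting with an AtomicInteger could look like. `ScrollContextLimiter` and its method names are hypothetical; the `isScroll` flag stands in for `request.scroll() != null`, and 500 is the default agreed on above. Incrementing first and rolling back on overflow keeps the check and the update in a single atomic step, so the counter itself never drifts.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical limiter that counts scroll contexts only.
class ScrollContextLimiter {
    static final int DEFAULT_MAX_OPEN_SCROLL_CONTEXTS = 500; // default agreed on in this thread

    private final AtomicInteger openScrollContexts = new AtomicInteger();
    private final int maxOpenScrollContexts;

    ScrollContextLimiter(int maxOpenScrollContexts) {
        this.maxOpenScrollContexts = maxOpenScrollContexts;
    }

    // Call before creating a context; isScroll stands in for request.scroll() != null.
    void onNewContext(boolean isScroll) {
        if (!isScroll) {
            return; // regular searches are not counted
        }
        if (openScrollContexts.incrementAndGet() > maxOpenScrollContexts) {
            openScrollContexts.decrementAndGet(); // roll back the reservation
            throw new IllegalStateException(
                "too many open scroll contexts, limit is [" + maxOpenScrollContexts + "]");
        }
    }

    // Call when a context is freed.
    void onFreeContext(boolean isScroll) {
        if (isScroll) {
            openScrollContexts.decrementAndGet();
        }
    }

    int openScrollContexts() {
        return openScrollContexts.get();
    }
}

class ScrollContextLimiterDemo {
    public static void main(String[] args) {
        ScrollContextLimiter limiter = new ScrollContextLimiter(2);
        limiter.onNewContext(true);
        limiter.onNewContext(false); // not counted
        limiter.onNewContext(true);
        try {
            limiter.onNewContext(true);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("open scroll contexts: " + limiter.openScrollContexts());
    }
}
```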

colings86 · 2018-07-27T13:07:10Z

@diwasjoshi are you still interested in working on this change?

diwasjoshi · 2018-07-29T21:32:04Z

@colings86 sorry I couldn't work on this, I will work on the review changes this week.

pgomulka · 2018-10-01T07:40:51Z

@diwasjoshi it feels like you were close to finishing this. Do you still intend to work on this issue?

colings86 · 2018-10-24T11:01:52Z

Since there hasn't been any activity on this PR for a while, I am inclined to close it for now. @diwasjoshi, if you are still interested in working on this and have time to update the PR with the latest master branch and address the review comments, please reopen this PR.

diwasjoshi force-pushed the fix-softLimitSearchContexts branch 2 times, most recently from 289c092 to 3587714 Compare March 26, 2018 21:11

soft limit added for max number of search contexts

2ee128d

diwasjoshi force-pushed the fix-softLimitSearchContexts branch from 3587714 to 2ee128d Compare March 26, 2018 21:12

jimczi added the :Search/Search label Mar 29, 2018

jimczi added the >feature label Mar 29, 2018

jimczi requested changes Mar 29, 2018

diwasjoshi force-pushed the fix-softLimitSearchContexts branch 6 times, most recently from c371373 to 4aad80a Compare April 1, 2018 21:00

review fixes

044d013

diwasjoshi force-pushed the fix-softLimitSearchContexts branch from 4aad80a to 044d013 Compare April 1, 2018 21:01

jpountz requested changes Apr 4, 2018

jimczi added the discuss label Apr 9, 2018

diwasjoshi added 3 commits April 10, 2018 23:20

max_open_context moved to searchservice

ec7eb51

counter for max open contexts

9e84a7a

formatting fixes

f279cef

diwasjoshi force-pushed the fix-softLimitSearchContexts branch from ee3d06d to 9e84a7a Compare April 16, 2018 13:37

mayya-sharipova reviewed Apr 16, 2018

jimczi requested changes Apr 16, 2018

jimczi mentioned this pull request Apr 16, 2018

Keep tombstones for expired search context ids #29543

Closed

tomcallahan removed the discuss label Jul 27, 2018

colings86 closed this Oct 24, 2018

jimczi mentioned this pull request Dec 3, 2018

Added soft limit to open scroll contexts #25244 #36009

Merged
