Ensure external refreshes will also refresh internal searcher to minimize segment creation #27253
…ize segment creation We cut over to internal and external IndexReader/IndexSearcher in elastic#26972, which uses two independent searcher managers. The downside is that refreshes of the external reader never clear the internal version map, which in turn triggers additional and potentially unnecessary segment flushes, since memory must be freed. Under heavy indexing load with low refresh intervals this can cause excessive segment creation, which causes high GC activity and significantly increases the number of required segment merges. This change adds a dedicated external reference manager that delegates refreshes to the internal reference manager and then `steals` the refreshed reader from the internal reference manager for external usage. This ensures that external and internal readers are consistent on an external refresh. As a side effect, this also releases old segments referenced by the internal reference manager, which could otherwise hold on to already merged-away segments until it is refreshed due to a flush or indexing activity.
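The delegation described above can be illustrated with a small, dependency-free model. This is only a sketch under stated assumptions, not the code from this PR: `ModelReader`, `InternalManagerModel`, `ExternalManagerModel` and the `versionMapEntries` counter are hypothetical stand-ins for the Lucene reader, the two reference managers and the version map, and real reference counting and locking are left out.

```java
// Hypothetical stand-in for an index reader: it only records the
// indexing generation it was opened at.
final class ModelReader {
    final long generation;

    ModelReader(long generation) {
        this.generation = generation;
    }
}

// Hypothetical model of the internal manager. In this model a refresh is
// the only event that empties the version map counter.
final class InternalManagerModel {
    private long indexedGeneration = 0;
    private ModelReader current = new ModelReader(0);
    int versionMapEntries = 0; // stand-in for the size of the version map

    void index() {
        indexedGeneration++;
        versionMapEntries++;
    }

    // Opens a new reader if anything was indexed since the last refresh.
    boolean maybeRefresh() {
        if (current.generation == indexedGeneration) {
            return false;
        }
        current = new ModelReader(indexedGeneration);
        versionMapEntries = 0; // the refresh lets the version map be cleared
        return true;
    }

    ModelReader acquire() {
        return current;
    }
}

// Hypothetical model of the external manager: it never opens a reader on
// its own. It refreshes the internal manager and reuses that reader.
final class ExternalManagerModel {
    private final InternalManagerModel internal;
    private ModelReader current;

    ExternalManagerModel(InternalManagerModel internal) {
        this.internal = internal;
        this.current = internal.acquire();
    }

    void maybeRefresh() {
        internal.maybeRefresh();      // delegate the refresh
        current = internal.acquire(); // "steal" the refreshed reader
    }

    ModelReader acquire() {
        return current;
    }
}

final class RefreshDelegationDemo {
    public static void main(String[] args) {
        InternalManagerModel internal = new InternalManagerModel();
        ExternalManagerModel external = new ExternalManagerModel(internal);
        internal.index();
        internal.index();
        external.maybeRefresh();
        System.out.println("versionMapEntries=" + internal.versionMapEntries);
        System.out.println("sameReader=" + (external.acquire() == internal.acquire()));
    }
}
```

In this model an external refresh always passes through the internal manager, so the version map counter is reset and both managers hand out the same reader afterwards. With two independent managers, only an internal refresh would have reset the counter.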
LGTM. Left some nits that would have helped me while reading this PR. feel free to do with them as you please.
@@ -224,8 +224,9 @@ public InternalEngine(EngineConfig engineConfig) {
         throw e;
     }
 }
-internalSearcherManager = createSearcherManager(new SearcherFactory(), false);
-externalSearcherManager = createSearcherManager(new SearchFactory(logger, isClosed, engineConfig), true);
+externalSearcherManager = createSearcherManager(new SearchFactory(logger, isClosed,
nit: this fits on one line. yay.
also, good that this boolean is gone.
+private final SearcherFactory searcherFactory;
+private final SearcherManager internalSearcherManager;
+
+ExternalSearcherManager(SearcherManager manager, SearcherFactory searcherFactory) throws IOException {
nit: manager -> internalManager
@@ -456,18 +526,17 @@ private String loadOrGenerateHistoryUUID(final IndexWriter writer, boolean force
     return uuid;
 }

-private SearcherManager createSearcherManager(SearcherFactory searcherFactory, boolean readSegmentsInfo) throws EngineException {
+private ExternalSearcherManager createSearcherManager(SearchFactory factory) throws EngineException {
     boolean success = false;
     SearcherManager searcherManager = null;
nit: searcherManager -> internalSearcherManager
@@ -456,18 +526,17 @@ private String loadOrGenerateHistoryUUID(final IndexWriter writer, boolean force
     return uuid;
 }

-private SearcherManager createSearcherManager(SearcherFactory searcherFactory, boolean readSegmentsInfo) throws EngineException {
+private ExternalSearcherManager createSearcherManager(SearchFactory factory) throws EngineException {
nit: factory -> externalSearcherFactory
@@ -4226,7 +4226,7 @@ public void assertSameReader(Searcher left, Searcher right) {
 List<LeafReaderContext> rightLeaves = ElasticsearchDirectoryReader.unwrap(right.getDirectoryReader()).leaves();
 assertEquals(rightLeaves.size(), leftLeaves.size());
 for (int i = 0; i < leftLeaves.size(); i++) {
-    assertSame(leftLeaves.get(i).reader(), rightLeaves.get(0).reader());
+    assertSame(leftLeaves.get(i).reader(), rightLeaves.get(i).reader());
sneaky :)
yeah suddenly we have a test that triggered it 💃
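The index fix in `assertSameReader` discussed above can be reproduced without Lucene. The sketch below is an illustration only: `LeafComparisonDemo`, `sameLeavesBuggy` and `sameLeavesFixed` are hypothetical names, and plain `Object` instances stand in for leaf readers. It suggests why the `get(0)` variant goes unnoticed as long as each reader has a single leaf.

```java
import java.util.Arrays;
import java.util.List;

final class LeafComparisonDemo {
    // Buggy variant: every left element is compared with the FIRST right element.
    static boolean sameLeavesBuggy(List<Object> left, List<Object> right) {
        if (left.size() != right.size()) {
            return false;
        }
        for (int i = 0; i < left.size(); i++) {
            if (left.get(i) != right.get(0)) {
                return false;
            }
        }
        return true;
    }

    // Fixed variant: elements are compared position by position.
    static boolean sameLeavesFixed(List<Object> left, List<Object> right) {
        if (left.size() != right.size()) {
            return false;
        }
        for (int i = 0; i < left.size(); i++) {
            if (left.get(i) != right.get(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Object a = new Object();
        Object b = new Object();
        // A single leaf on each side: the buggy check passes, so the bug stays hidden.
        System.out.println(sameLeavesBuggy(Arrays.asList(a), Arrays.asList(a)));       // true
        // Two identical leaf lists: the buggy check wrongly reports a mismatch.
        System.out.println(sameLeavesBuggy(Arrays.asList(a, b), Arrays.asList(a, b))); // false
        System.out.println(sameLeavesFixed(Arrays.asList(a, b), Arrays.asList(a, b))); // true
    }
}
```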
@elasticmachine test this please
* master: (22 commits) Update Tika version to 1.15 Aggregations: bucket_sort pipeline aggregation (#27152) Introduce templating support to timezone/locale in DateProcessor (#27089) Increase logging on qa:mixed-cluster tests Update to AWS SDK 1.11.223 (#27278) Improve error message for parse failures of completion fields (#27297) Ensure external refreshes will also refresh internal searcher to minimize segment creation (#27253) Remove optimisations to reuse objects when applying a new `ClusterState` (#27317) Decouple `ChannelFactory` from Tcp classes (#27286) Fix find remote when building BWC Remove colons from task and configuration names Add unreleased 5.6.5 version number testCreateSplitIndexToN: do not set `routing_partition_size` to >= `number_of_routing_shards` Snapshot/Restore: better handle incorrect chunk_size settings in FS repo (#26844) Add limits for ngram and shingle settings (#27211) (#27318) Correct comment in index shard test Roll translog generation on primary promotion ObjectParser: Replace IllegalStateException with ParsingException (#27302) scripted_metric _agg parameter disappears if params are provided (#27159) Update discovery-ec2.asciidoc ...
…mize segment creation (#27253) We cut over to internal and external IndexReader/IndexSearcher in #26972, which uses two independent searcher managers. The downside is that refreshes of the external reader never clear the internal version map, which in turn triggers additional and potentially unnecessary segment flushes, since memory must be freed. Under heavy indexing load with low refresh intervals this can cause excessive segment creation, which causes high GC activity and significantly increases the number of required segment merges. This change adds a dedicated external reference manager that delegates refreshes to the internal reference manager and then `steals` the refreshed reader from the internal reference manager for external usage. This ensures that external and internal readers are consistent on an external refresh. As a side effect, this also releases old segments referenced by the internal reference manager, which could otherwise hold on to already merged-away segments until it is refreshed due to a flush or indexing activity.
@s1monw It appears the "backport pending" label can be removed now?
* 6.x: (27 commits) Reduce synchronization on field data cache [Test] #27342 Fix SearchRequests#testValidate Fail queries with scroll that explicitely set request_cache (#27342) add json-processor support for non-map json types (#27335) Upgrade AWS SDK Jackson Databind to 2.6.7.1 Properly format IndexGraveyard deletion date as date (#27362) Stop responding to ping requests before master abdication (#27329) [Test] Fix POI version in packaging tests Added release notes for 6.0.0 Update 6.0.0-beta1.asciidoc Allow affix settings to specify dependencies (#27161) Tests: Improve size regex in documentation test (#26879) reword comment Ensure external refreshes will also refresh internal searcher to minimize segment creation (#27253) Remove unnecessary logger creation for doc values field data [DOCS] Fixed link to docker content Plugins: Add versionless alias to all security policy codebase properties (#26756) [DOCS] Move X-Pack-specific Docker content (#27333) [Test] Fix S3BlobStoreContainerTests.testNumberOfMultiparts() Set minimum_master_nodes to all nodes for REST tests (#27344) ...
@jasontedor agreed. I removed the backport pending label.