[DOCS] add warning for read-write indices in force merge documentation #28869
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?
@ppf2 would you like to have a look at this?
Looks good to me overall, I left some minor comments.
Force merge should only be called against *read-only indices*. Running
force merge against a read-write index can cause very large segments to be produced
(>5Gb per segment), and the merge policy will never consider it for merging again until
it has 75% of deleted docs. This can cause very large segments to remain in the shards.
I don't think this 75% is correct, it would be shard_size/5GB but I think it's fine to say "until it mostly consists of deleted docs".
@@ -10,6 +10,12 @@ This call will block until the merge is complete. If the http connection is
lost, the request will continue in the background, and any new requests will
block until the previous force merge is complete.

[WARNING]
Force merge should only be called against *read-only indices*. Running
Just do WARNING: Force merge...
and our documentation processor will make a nice warning block out of this paragraph.
@jpountz your requested changes have been added and the commit was rebased/squashed back into one
lgtm thanks all!
Pinging @elastic/es-core-infra
Hi and thanks for this very useful addition to the docs 😌 Do you think it would be worth adding an accompanying tip about re-running force merge after altering a lot of documents? I'll attempt to explain by example: Our use case is analytics on time series data. We have one index per month, and new data is only written to the newer indices. We force merge the older indices for optimal query times. Every now and again we need to add a new attribute to all of the docs and populate it using a scripted update by query or similar. Having read this issue (thanks again!), I now know that our old, already-force-merged indices are not eligible for cleanup because they were force merged in the past. Adding a note might help future travellers. wdyt?
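To make the scenario above concrete, here is a minimal sketch of how one might measure how many deleted docs an old index is still carrying after a bulk update. It assumes the response of `GET /<index>/_stats/docs` exposes `_all.primaries.docs.count` (live docs) and `_all.primaries.docs.deleted`; the function name and the sample numbers are hypothetical and only there for illustration.

```python
def deleted_fraction(stats):
    """Fraction of docs in the primaries that are deleted but not yet merged away.

    `stats` is assumed to be the parsed JSON body of GET /<index>/_stats/docs.
    """
    docs = stats["_all"]["primaries"]["docs"]
    total = docs["count"] + docs["deleted"]
    # Guard against an empty index to avoid dividing by zero.
    return docs["deleted"] / total if total else 0.0


# Hypothetical sample shaped like a stats response, not real cluster output.
sample = {"_all": {"primaries": {"docs": {"count": 750, "deleted": 250}}}}
print(deleted_fraction(sample))
```

A high value on an index that was force merged earlier would be consistent with the behaviour described in this thread, where the large merged segment is not picked up by the merge policy until it mostly consists of deleted docs.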
@jpountz can you have another look please?
@gondalez I have seen too many issues when force-merge was run too often, and basically none when it wasn't run when it should have been, so I'd rather keep the messaging clear for now and only say that force-merge should only run against read-only indices.
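As an illustration of the "read-only first, then force merge" guidance, the sketch below builds the two REST calls in order: a settings update that sets `index.blocks.write`, followed by `_forcemerge`. This is one possible way to follow the advice, not something prescribed in this PR. The helper name and the index name `logs-2018-01` are hypothetical, and the sketch only constructs the requests rather than sending them.

```python
import json


def read_only_force_merge_plan(index, max_num_segments=1):
    """Return the REST calls, in order, as (method, path, body) tuples."""
    return [
        # Block writes first so the index is read-only before merging.
        ("PUT", f"/{index}/_settings", json.dumps({"index.blocks.write": True})),
        # Then force merge the read-only index down to the requested segment count.
        ("POST", f"/{index}/_forcemerge?max_num_segments={max_num_segments}", None),
    ]


# Print the plan for a hypothetical monthly index.
for method, path, body in read_only_force_merge_plan("logs-2018-01"):
    print(method, path, body or "")
```

Keeping the plan as plain data makes the ordering explicit and easy to review before handing it to whichever HTTP client is in use.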
* master: Upgrade to Lucene-7.4-snapshot-6705632810 (#30519) add version compatibility from 6.4.0 after backport, see #30319 (#30390) Security: Simplify security index listeners (#30466) Add proper longitude validation in geo_polygon_query (#30497) Remove Discovery.AckListener.onTimeout() (#30514) Build: move generated-resources to build (#30366) Reindex: Fold "with all deps" project into reindex (#30154) Isolate REST client single host tests (#30504) Solve Gradle deprecation warnings around shadowJar (#30483) SAML: Process only signed data (#30420) Remove BWC repository test (#30500) Build: Remove xpack specific run task (#30487) AwaitsFix IntegTestZipClientYamlTestSuiteIT#indices.split tests LLClient: Add setJsonEntity (#30447) Expose CommonStatsFlags directly in IndicesStatsRequest. (#30163) Silence IndexUpgradeIT test failures. (#30430) Bump Gradle heap to 1792m (#30484) [docs] add warning for read-write indices in force merge documentation (#28869) Avoid deadlocks in cache (#30461) Test: remove hardcoded list of unconfigured ciphers (#30367) mute SplitIndexIT due to #30416 Docs: Test examples that recreate lang analyzers (#29535) BulkProcessor to retry based on status code (#29329) Add GET Repository High Level REST API (#30362) add a comment explaining the need for RetryOnReplicaException on missing mappings Add `coordinating_only` node selector (#30313) Stop forking groovyc (#30471) Avoid setting connection request timeout (#30384) Use date format in `date_range` mapping before fallback to default (#29310) Watcher: Increase HttpClient parallel sent requests (#30130) # Conflicts: # x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/LocalStateCompositeXPackPlugin.java
* 6.x: Upgrade to Lucene-7.4-snapshot-6705632810 (#30519) Remove Discovery.AckListener.onTimeout() (#30514) Build: move generated-resources to build (#30366) Reindex: Fold "with all deps" project into reindex (#30154) Isolate REST client single host tests (#30504) Remove BWC repository test (#30500) Build: Remove xpack specific run task (#30487) AwaitsFix IntegTestZipClientYamlTestSuiteIT#indices.split tests LLClient: Add setJsonEntity (#30447) [docs] add warning for read-write indices in force merge documentation (#28869) Avoid deadlocks in cache (#30461) BulkProcessor to retry based on status code (#29329) Avoid setting connection request timeout (#30384) Test: remove hardcoded list of unconfigured ciphers (#30367) Add GET Repository High Level REST API (#30362) mute SplitIndexIT due to #30416 Docs: Test examples that recreate lang analyzers (#29535) add a comment explaining the need for RetryOnReplicaException on missing mappings Pass the task to broadcast actions (#29672) Stop forking groovyc (#30471) Add `coordinating_only` node selector (#30313) Fix accidental error in changelog Use date format in `date_range` mapping before fallback to default (#29310) Watcher: Increase HttpClient parallel sent requests (#30130) [Security][Tests] Azeri(Turkish) locale tripps opensaml dependency
* es/ccr: (78 commits) Upgrade to Lucene-7.4-snapshot-6705632810 (elastic#30519) add version compatibility from 6.4.0 after backport, see elastic#30319 (elastic#30390) Security: Simplify security index listeners (elastic#30466) Add proper longitude validation in geo_polygon_query (elastic#30497) Remove Discovery.AckListener.onTimeout() (elastic#30514) Build: move generated-resources to build (elastic#30366) Reindex: Fold "with all deps" project into reindex (elastic#30154) Isolate REST client single host tests (elastic#30504) Solve Gradle deprecation warnings around shadowJar (elastic#30483) SAML: Process only signed data (elastic#30420) Remove BWC repository test (elastic#30500) Build: Remove xpack specific run task (elastic#30487) AwaitsFix IntegTestZipClientYamlTestSuiteIT#indices.split tests Enable soft-deletes in v6.4 LLClient: Add setJsonEntity (elastic#30447) [CCR] Read changes from Lucene instead of translog (elastic#30120) Expose CommonStatsFlags directly in IndicesStatsRequest. (elastic#30163) Silence IndexUpgradeIT test failures. (elastic#30430) Bump Gradle heap to 1792m (elastic#30484) [docs] add warning for read-write indices in force merge documentation (elastic#28869) ...
I think some information on what to do if a force merge was already issued would be welcomed. How do you get the index back to a state where the normal cleanup policies will apply?
This PR updates force merge documentation:
Closes #28843