[DOCS] add warning for read-write indices in force merge documentation #28869

Merged: 1 commit merged into elastic:master on May 9, 2018

Conversation

mueedc-zz
@mueedc-zz mueedc-zz commented Mar 1, 2018

This PR updates force merge documentation:

  • Add warning message to specify force merge API calls should only be used against read-only indices

Closes #28843
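
As an illustration of the workflow the warning recommends, here is a minimal Python sketch that builds the two REST calls involved: blocking writes on an index, then force merging it. The helper name `force_merge_plan` and the index name `logs-2018.02` are hypothetical; the sketch only constructs the requests and does not send them, and using the `index.blocks.write` setting is one assumed way to make an index read-only.

```python
import json


def force_merge_plan(index, max_num_segments=1):
    """Return the (method, path, body) calls for a read-only force merge.

    Hypothetical helper: it only builds the requests, it does not send them.
    """
    if not index or "/" in index:
        raise ValueError("index must be a non-empty name without '/'")
    return [
        # 1. Block writes so the index is effectively read-only.
        ("PUT", f"/{index}/_settings", json.dumps({"index.blocks.write": True})),
        # 2. Force merge the now read-only index down to a few segments.
        ("POST", f"/{index}/_forcemerge?max_num_segments={max_num_segments}", None),
    ]


if __name__ == "__main__":
    for method, path, body in force_merge_plan("logs-2018.02"):
        print(method, path, body or "")
```

The calls are returned as plain tuples so they can be passed to whichever HTTP client is already in use.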

@elasticmachine
Collaborator

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

@mueedc-zz mueedc-zz changed the title from "[docs] add warning for read-write indices in force merge documentation" to "[DOCS] add warning for read-write indices in force merge documentation" on Mar 1, 2018
@javanna
Member

javanna commented Mar 2, 2018

@ppf2 would you like to have a look at this?

@javanna javanna requested a review from ppf2 March 2, 2018 09:59
@javanna javanna added the >docs General docs changes label Mar 2, 2018
Contributor

@jpountz jpountz left a comment

Looks good to me overall, I left some minor comments.

Force merge should only be called against *read-only indices*. Running
force merge against a read-write index can cause very large segments to be produced
(>5Gb per segment), and the merge policy will never consider it for merging again until
it has 75% of deleted docs. This can cause very large segments to remain in the shards.

I don't think this 75% is correct; it would be shard_size/5GB, but I think it's fine to say "until it mostly consists of deleted docs".

@@ -10,6 +10,12 @@ This call will block until the merge is complete. If the http connection is
lost, the request will continue in the background, and any new requests will
block until the previous force merge is complete.

[WARNING]
Force merge should only be called against *read-only indices*. Running

Just do `WARNING: Force merge...` and our documentation processor will make a nice warning block out of this paragraph.

@mueedc-zz
Author

@jpountz your requested changes have been added, and the commit has been rebased and squashed back into one.

@ppf2
Member

ppf2 commented Mar 15, 2018

lgtm thanks all!

@colings86 colings86 added the :Data Management/Indices APIs APIs to create and manage indices and templates label Apr 24, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra

@gondalez

Hi and thanks for this very useful addition to the docs 😌

Do you think it would be worth adding an accompanying tip around re-running force merge after altering a lot of documents?

I'll attempt to explain by example:

Our use case is analytics on time series data. We have one index per month, and new data is only written to the newer indices. We force merge the older indices for optimal query times.

Every now and again we need to add a new attribute to all of the docs and populate it using a scripted update by query or similar.
At the end of the process we have to manually force merge the older indices to purge the many deleted docs and reclaim space. If we do not manually force merge, the deleted docs remain.

Having read this issue (thanks again!) I now know that our old, already-force-merged indices are not eligible for cleanup because they have been force merged in the past.

Adding a note might help future travellers. wdyt?
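
To make the scenario above concrete, here is a minimal Python sketch of how one might decide whether an old index is worth a manual re-merge after a bulk update. The helper names, the index name `metrics-2018-01`, and the 20% threshold are hypothetical. The doc counts are assumed to come from the index stats (`docs.count` and `docs.deleted`). Using `only_expunge_deletes=true` is one option assumed here for illustration; a plain force merge, as described in the comment, is the other.

```python
def deleted_ratio(docs_count, docs_deleted):
    """Fraction of documents in the index that are marked as deleted."""
    total = docs_count + docs_deleted
    return docs_deleted / total if total else 0.0


def expunge_request(index, docs_count, docs_deleted, threshold=0.2):
    """Return a (method, path) force merge call, or None if not worthwhile.

    The threshold is an arbitrary illustrative value, not a recommendation.
    """
    if deleted_ratio(docs_count, docs_deleted) < threshold:
        return None
    # Ask for a merge that only targets segments containing deleted docs.
    return ("POST", f"/{index}/_forcemerge?only_expunge_deletes=true")


if __name__ == "__main__":
    # After an update by query, every updated doc leaves a deleted copy behind.
    print(expunge_request("metrics-2018-01", docs_count=500, docs_deleted=500))
```

Keeping the decision in a pure function makes it easy to run against stats for many monthly indices before issuing any requests.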

@mueedc-zz
Author

@jpountz I can add a tip about re-running force merge after altering many documents, as per @gondalez's comment. Let me know your thoughts.

@javanna
Member

javanna commented May 7, 2018

@jpountz can you have another look please?

@javanna javanna added the review label May 7, 2018
@jpountz
Contributor

jpountz commented May 9, 2018

@gondalez I have seen too many issues when force-merge was run too often, and basically none when it wasn't run when it should have been, so I'd rather keep the messaging clear for now and only say that force-merge should only run against read-only indices.

@jpountz jpountz added :Data Management/Indices APIs APIs to create and manage indices and templates >docs General docs changes and removed :Data Management/Indices APIs APIs to create and manage indices and templates >docs General docs changes labels May 9, 2018
@jpountz jpountz merged commit bf141a3 into elastic:master May 9, 2018
dnhatn added a commit that referenced this pull request May 10, 2018
* master:
  Upgrade to Lucene-7.4-snapshot-6705632810 (#30519)
  add version compatibility from 6.4.0 after backport, see #30319 (#30390)
  Security: Simplify security index listeners (#30466)
  Add proper longitude validation in geo_polygon_query (#30497)
  Remove Discovery.AckListener.onTimeout() (#30514)
  Build: move generated-resources to build (#30366)
  Reindex: Fold "with all deps" project into reindex (#30154)
  Isolate REST client single host tests (#30504)
  Solve Gradle deprecation warnings around shadowJar (#30483)
  SAML: Process only signed data (#30420)
  Remove BWC repository test (#30500)
  Build: Remove xpack specific run task (#30487)
  AwaitsFix IntegTestZipClientYamlTestSuiteIT#indices.split tests
  LLClient: Add setJsonEntity (#30447)
  Expose CommonStatsFlags directly in IndicesStatsRequest. (#30163)
  Silence IndexUpgradeIT test failures. (#30430)
  Bump Gradle heap to 1792m (#30484)
  [docs] add warning for read-write indices in force merge documentation (#28869)
  Avoid deadlocks in cache (#30461)
  Test: remove hardcoded list of unconfigured ciphers (#30367)
  mute SplitIndexIT due to #30416
  Docs: Test examples that recreate lang analyzers  (#29535)
  BulkProcessor to retry based on status code (#29329)
  Add GET Repository High Level REST API (#30362)
  add a comment explaining the need for RetryOnReplicaException on missing mappings
  Add `coordinating_only` node selector (#30313)
  Stop forking groovyc (#30471)
  Avoid setting connection request timeout (#30384)
  Use date format in `date_range` mapping before fallback to default (#29310)
  Watcher: Increase HttpClient parallel sent requests (#30130)

# Conflicts:
#	x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/LocalStateCompositeXPackPlugin.java
dnhatn added a commit that referenced this pull request May 10, 2018
* 6.x:
  Upgrade to Lucene-7.4-snapshot-6705632810 (#30519)
  Remove Discovery.AckListener.onTimeout() (#30514)
  Build: move generated-resources to build (#30366)
  Reindex: Fold "with all deps" project into reindex (#30154)
  Isolate REST client single host tests (#30504)
  Remove BWC repository test (#30500)
  Build: Remove xpack specific run task (#30487)
  AwaitsFix IntegTestZipClientYamlTestSuiteIT#indices.split tests
  LLClient: Add setJsonEntity (#30447)
  [docs] add warning for read-write indices in force merge documentation (#28869)
  Avoid deadlocks in cache (#30461)
  BulkProcessor to retry based on status code (#29329)
  Avoid setting connection request timeout (#30384)
  Test: remove hardcoded list of unconfigured ciphers (#30367)
  Add GET Repository High Level REST API (#30362)
  mute SplitIndexIT due to #30416
  Docs: Test examples that recreate lang analyzers  (#29535)
  add a comment explaining the need for RetryOnReplicaException on missing mappings
  Pass the task to broadcast actions (#29672)
  Stop forking groovyc (#30471)
  Add `coordinating_only` node selector (#30313)
  Fix accidental error in changelog
  Use date format in `date_range` mapping before fallback to default (#29310)
  Watcher: Increase HttpClient parallel sent requests (#30130)
  [Security][Tests] Azeri(Turkish) locale tripps opensaml dependency
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request May 14, 2018
* es/ccr: (78 commits)
  Upgrade to Lucene-7.4-snapshot-6705632810 (elastic#30519)
  add version compatibility from 6.4.0 after backport, see elastic#30319 (elastic#30390)
  Security: Simplify security index listeners (elastic#30466)
  Add proper longitude validation in geo_polygon_query (elastic#30497)
  Remove Discovery.AckListener.onTimeout() (elastic#30514)
  Build: move generated-resources to build (elastic#30366)
  Reindex: Fold "with all deps" project into reindex (elastic#30154)
  Isolate REST client single host tests (elastic#30504)
  Solve Gradle deprecation warnings around shadowJar (elastic#30483)
  SAML: Process only signed data (elastic#30420)
  Remove BWC repository test (elastic#30500)
  Build: Remove xpack specific run task (elastic#30487)
  AwaitsFix IntegTestZipClientYamlTestSuiteIT#indices.split tests
  Enable soft-deletes in v6.4
  LLClient: Add setJsonEntity (elastic#30447)
  [CCR] Read changes from Lucene instead of translog (elastic#30120)
  Expose CommonStatsFlags directly in IndicesStatsRequest. (elastic#30163)
  Silence IndexUpgradeIT test failures. (elastic#30430)
  Bump Gradle heap to 1792m (elastic#30484)
  [docs] add warning for read-write indices in force merge documentation (elastic#28869)
  ...
@silashansen

I think some information on what to do if a force merge was already issued would be welcome.

How do you get the index back to a state where the normal cleanup policies will apply?
