Refresh after force-merge #76345

Merged

DaveCTurner
Contributor

Today a force-merge will write out the newly-merged segments and then
flush them, but it does not automatically release the segments from
before the merge. This will retain extra data on disk until the next
refresh, which could be a long time away if the user has disabled
periodic refreshes or is not searching the index frequently.

With this commit we change the behaviour so that a force-merge is always
followed by a refresh.

Closes #74649
Backport of #76221
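
To make the disk-usage effect described above concrete, here is a minimal toy model. It is not Elasticsearch or Lucene code: the `ToyShard` class, its fields, and the segment names are all hypothetical, and the only assumption it encodes is the one stated in the description, namely that pre-merge segments stay on disk for as long as an open searcher still references them.

```python
class ToyShard:
    """Toy model (hypothetical): segment files plus one open searcher view."""

    def __init__(self, segments):
        self.committed = set(segments)      # segments in the latest commit
        self.searcher_view = set(segments)  # segments the open searcher references

    def files_on_disk(self):
        # A segment can only be deleted once nothing references it any more.
        return self.committed | self.searcher_view

    def refresh(self):
        # Reopen the searcher on the latest commit, dropping old references.
        self.searcher_view = set(self.committed)

    def force_merge(self, refresh_after):
        # Write a single merged segment and flush (commit) it.
        self.committed = {"merged"}
        if refresh_after:
            self.refresh()


# Behaviour before the change: old segments linger until some later refresh.
old = ToyShard(["s1", "s2", "s3"])
old.force_merge(refresh_after=False)
print(sorted(old.files_on_disk()))  # ['merged', 's1', 's2', 's3']

# Behaviour after the change: the refresh releases the old segments at once.
new = ToyShard(["s1", "s2", "s3"])
new.force_merge(refresh_after=True)
print(sorted(new.files_on_disk()))  # ['merged']
```

In the first case the extra segments remain until `refresh()` is eventually called, which matches the "could be a long time away" scenario when periodic refreshes are disabled.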

@DaveCTurner added the >enhancement, :Distributed Indexing/Engine, backport, and v7.15.0 labels on Aug 11, 2021
@elasticmachine added the Team:Distributed (Obsolete) label on Aug 11, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@DaveCTurner
Contributor Author

So this is a bit different in 7.x because of synced-flush. In particular, we already refresh the internal searcher in `tryRenewSyncCommit` under some circumstances "to ensure we release unreferenced segments", but this doesn't help if the external searcher is still holding onto old segments. I think we should just refresh anyway: Lucene can tell that the second refresh is a no-op and avoid doing additional work.
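
The point about the internal and external searchers can be sketched with a second toy model. Again, this is not the Elasticsearch implementation: `ToyEngine`, `refresh_internal`, and `reopen_count` are hypothetical names, and the "skip if already current" check is only a stand-in for Lucene detecting that a refresh has nothing to do.

```python
class ToyEngine:
    """Toy model (hypothetical): one commit, two independent searcher views."""

    def __init__(self, segments):
        self.committed = set(segments)
        self.internal_view = set(segments)  # internal searcher
        self.external_view = set(segments)  # external searcher used by searches
        self.reopen_count = 0               # how many reopens did real work

    def files_on_disk(self):
        # Old segments survive while either searcher still references them.
        return self.committed | self.internal_view | self.external_view

    def refresh_internal(self):
        # Refresh only the internal searcher; a no-op if it is already current.
        if self.internal_view != self.committed:
            self.internal_view = set(self.committed)
            self.reopen_count += 1

    def refresh(self):
        # Refresh both searchers, skipping any view that is already current.
        self.refresh_internal()
        if self.external_view != self.committed:
            self.external_view = set(self.committed)
            self.reopen_count += 1


engine = ToyEngine(["s1", "s2"])
engine.committed = {"merged"}            # force-merge wrote and flushed one segment

engine.refresh_internal()                # internal-only refresh
print(sorted(engine.files_on_disk()))    # ['merged', 's1', 's2']

engine.refresh()                         # unconditional refresh afterwards
print(sorted(engine.files_on_disk()))    # ['merged']
print(engine.reopen_count)               # 2: the internal view was not reopened twice
```

The internal-only refresh leaves the old segments on disk because the external view still references them, while the follow-up refresh does real work only for the view that was stale.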

Contributor

@henningandersen left a comment

LGTM.

I agree with your assessment: it is hard to come up with a case where this would really result in extra work being done or extra segments being produced.

@DaveCTurner merged commit cb2451d into elastic:7.x on Aug 11, 2021
@DaveCTurner deleted the 2021-08-11-refresh-on-force-merge-7x branch on August 11, 2021, 14:23