ILM may get stuck waiting for segment count to reach expected number #43245

Closed
dakrone opened this issue Jun 14, 2019 · 2 comments · Fixed by #43246
Labels
>bug :Data Management/ILM+SLM Index and Snapshot lifecycle management

Comments

@dakrone
Member

dakrone commented Jun 14, 2019

When ILM attempts a force merge, it immediately moves to a step that waits until the segment count reaches the expected number. However, a force merge can silently stop (for example, if the shard being merged moves to a different node, the force merge does not continue on the new node).

This can leave the index stuck indefinitely in the `SegmentCountStep`.
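
To make the failure mode concrete, here is a minimal Python sketch. It is not the ILM implementation (which is Java); `ShardStats`, `segment_count_condition_met`, and `poll_until_met` are hypothetical names used only to model a wait step whose condition depends on a merge that has already stopped.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ShardStats:
    """Hypothetical stand-in for per-shard segment statistics."""
    segment_count: int


def segment_count_condition_met(shards: List[ShardStats], max_num_segments: int) -> bool:
    """Return True only when every shard has at most max_num_segments segments."""
    return all(s.segment_count <= max_num_segments for s in shards)


def poll_until_met(shards: List[ShardStats], max_num_segments: int, max_polls: int) -> Optional[int]:
    """Model a wait step by re-checking the condition up to max_polls times.

    Returns the number of polls needed, or None if the condition was never met.
    The cap exists only so this sketch terminates; the step described in the
    issue has no such limit, which is why it can wait indefinitely.
    """
    for attempt in range(1, max_polls + 1):
        if segment_count_condition_met(shards, max_num_segments):
            return attempt
    return None


# One shard's force merge stopped part-way (for example, after the shard
# relocated), so its segment count never changes between polls.
stalled = [ShardStats(segment_count=1), ShardStats(segment_count=5)]
result = poll_until_met(stalled, max_num_segments=1, max_polls=1000)
```

In this model `result` stays `None` no matter how large `max_polls` is, because nothing restarts the merge on the second shard.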

@dakrone dakrone added >bug :Data Management/ILM+SLM Index and Snapshot lifecycle management labels Jun 14, 2019
@dakrone dakrone self-assigned this Jun 14, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-core-features

dakrone added a commit to dakrone/elasticsearch that referenced this issue Jun 14, 2019
It's possible for force merges kicked off by ILM to silently stop (due
to a node relocating for example). In which case, the segment count may
not reach what the user configured. In the subsequent `SegmentCountStep`
waiting for the expected segment count may wait indefinitely. Because of
this, this commit makes force merges "best effort" and then changes the
`SegmentCountStep` to simply report (at INFO level) if the merge was not
successful.

Relates to elastic#42824
Resolves elastic#43245
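
The "best effort" behavior described in this commit message can be sketched roughly as follows. This is an illustrative Python model, not the code from the commit: `unmerged_shards`, `check_segment_count`, and the logger name are assumptions made for the example.

```python
import logging
from typing import Dict

logger = logging.getLogger("ilm_sketch")  # hypothetical logger name


def unmerged_shards(shard_segments: Dict[int, int], max_num_segments: int) -> Dict[int, int]:
    """Return the shards (id -> segment count) that still exceed the target."""
    return {shard: n for shard, n in shard_segments.items() if n > max_num_segments}


def check_segment_count(index: str, shard_segments: Dict[int, int], max_num_segments: int) -> bool:
    """Best-effort check: always report the step as complete.

    If some shards still have more than max_num_segments segments, this is
    logged at INFO level instead of blocking the lifecycle.
    """
    remaining = unmerged_shards(shard_segments, max_num_segments)
    if remaining:
        logger.info(
            "[%s] best effort force merge to %d segments did not succeed for shards %s",
            index,
            max_num_segments,
            sorted(remaining),
        )
    return True
```

With this shape, an index whose merge stopped early still advances to the next step; the only trace of the incomplete merge is the INFO log line.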
dakrone added a commit that referenced this issue Jun 17, 2019
It's possible for force merges kicked off by ILM to silently stop (due
to a node relocating for example). In which case, the segment count may
not reach what the user configured. In the subsequent `SegmentCountStep`
waiting for the expected segment count may wait indefinitely. Because of
this, this commit makes force merges "best effort" and then changes the
`SegmentCountStep` to simply report (at INFO level) if the merge was not
successful.

Relates to #42824
Resolves #43245
@gaocx2000cn

About 'Make ILM force merging best effort (#43246)'

On a cluster with 3 shards across 3 data nodes, a force merge with max_num_segments=1 takes about twice as long on Elasticsearch 7.0.1 as on Elasticsearch 6.8.13.
Is this a bug?
