-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fail force-merges on read-only engines #64756
Conversation
Pinging @elastic/es-distributed (:Distributed/Engine) |
A few thoughts:
Adding the |
Hi @DaveCTurner , I have updated the PR to support index write block for force merge API instead. Would you please help to review again?
|
That doesn't work either, it's fine to force-merge an index that merely has a write block. Indeed ILM applies such a block before force-merging. Conversely there's no requirement for a frozen index to have a write block. It does by default but you can remove it. |
Yes, I could remove the write block for frozen index. I have two questions:
|
We discussed this as a team and concluded that simply throwing an unconditional
We preferred the second idea: we think it would cover most cases to fail the merge iff the number of segments in the shard was greater than the |
So the rough approximation is that if shard already contains exactly |
Fewer is ok, otherwise yes that's right. |
Hi @DaveCTurner , I have updated the PR, fail force merge if max_num_segments is fewer than current, otherwise do no-op. Would you please help to review? |
@elasticmachine ok to test |
Thanks for merging master @howardhuanghua :) I was just about to ask... |
Thank you @DaveCTurner , since TencentCloudES repository is forked from elastic, this PR is merged from TencentCloudES to elastic, so I cannot tick the box to let maintainers push code to the branch. Next time I will branch from elastic repository directly, that would be worked. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, except for one missing corner case.
@@ -375,6 +375,14 @@ public void flush(boolean force, boolean waitIfOngoing) throws EngineException { | |||
@Override | |||
public void forceMerge(boolean flush, int maxNumSegments, boolean onlyExpungeDeletes, | |||
boolean upgrade, boolean upgradeOnlyAncientSegments, String forceMergeUUID) { | |||
if (maxNumSegments < lastCommittedSegmentInfos.size()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This rejects force-merge requests that do not specify the maxNumSegments
parameter at all, since that comes through to here as maxNumSegments == ForceMergeSegments.Defaults.MAX_NUM_SEGMENTS == -1
. We should accept these requests too, it's ok for them to do nothing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right, I have updated the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @howardhuanghua.
Today we treat all force-merges on a read-only (e.g. frozen) engine as no-ops, indicating to the client that they succeeded even if they had no effect. This commit corrects that behaviour, resolving the resulting confusion, by rejecting force-merges on read-only engines that are definitely not no-ops.
Today we treat all force-merges on a read-only (e.g. frozen) engine as no-ops, indicating to the client that they succeeded even if they had no effect. This commit corrects that behaviour, resolving the resulting confusion, by rejecting force-merges on read-only engines that are definitely not no-ops. Co-authored-by: Howard <[email protected]>
In our customer's production cluster, customer complains that some of the indices cannot do force merge to clean deleted docs.
Here is the output of force merge api call:
From
_cat/segments
api, we could see this index has more than one segments in each shard:We try to open low level log to check why force merge could not be executed:
But we got nothing output about the above index force merge operation. We cost long time to add extra log to figure out that the index has been frozen^^.
Currently even index has write block, we still allow user to execute force merge api, but I think we at least need to add some trace level log to let user know why the merge cannot be executed. So this PR add some useful low level log info to indicate index is read only during refresh, flush and force merge operations.