Fix trimUnsafeCommits for indices created before 6.2 #57187
Conversation
Pinging @elastic/es-distributed (:Distributed/Recovery)
    try {
        maybeCheckIndex(); // check index here and won't do it again if ops-based recovery occurs
        recoveryState.setStage(RecoveryState.Stage.TRANSLOG);
        if (safeCommit.isPresent() == false) {
            assert globalCheckpoint == UNASSIGNED_SEQ_NO || indexSettings.getIndexVersionCreated().before(Version.V_6_2_0) :
The new test found this issue: the index has a synced flush, but the global checkpoint is still unassigned.
I'm not sure I understand what issue is being addressed here. How is moving this condition further down (after maybeCheckIndex) helping?
Is the issue that we have not properly moved to the translog stage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, which part of the new tests shows this issue, and is it something that can also be triggered with a single restart?
Sorry, I should have explained better.

> Is the issue that we have not properly moved to the translog stage?

That's correct. Previously, we did not move the recovery stage from INDEX to TRANSLOG if we did not have a safe commit, which can be the case if the index was created before 6.2 or the global checkpoint is still unassigned. In that case we expect a file-based recovery to happen, and we move the recovery stage to TRANSLOG in the clean_files step. However, if the shard has a synced flush, we skip the clean_files step and trip the assertion.

> Also, which part of the new tests shows this issue, and is it something that can also be triggered with a single restart?

Yes, I will add it to the full cluster restart suite.
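The stage-transition problem described above can be sketched as a tiny model. This is a hypothetical illustration, not the actual Elasticsearch classes: `oldFlow`/`fixedFlow` and their parameters are invented names standing in for the recovery paths discussed in this thread.

```java
// Minimal model of the recovery-stage transition bug. Illustrative only.
public class RecoveryStageSketch {
    enum Stage { INDEX, TRANSLOG }

    // Old behaviour: TRANSLOG was only set on the safe-commit (ops-based)
    // path, or inside clean_files on the file-based path. A synced-flush
    // match with no safe commit skipped both and left the stage at INDEX.
    static Stage oldFlow(boolean hasSafeCommit, boolean cleanFilesRuns) {
        Stage stage = Stage.INDEX;
        if (hasSafeCommit) {
            stage = Stage.TRANSLOG;   // ops-based recovery moved the stage
        } else if (cleanFilesRuns) {
            stage = Stage.TRANSLOG;   // file-based recovery moved it in clean_files
        }
        return stage;                 // sync_id match: stays at INDEX
    }

    // The fix: move to TRANSLOG unconditionally, before the safe-commit branch.
    static Stage fixedFlow(boolean hasSafeCommit, boolean cleanFilesRuns) {
        return Stage.TRANSLOG;
    }

    public static void main(String[] args) {
        // no safe commit, and clean_files skipped because of a sync_id match
        if (oldFlow(false, false) != Stage.INDEX) throw new AssertionError();
        if (fixedFlow(false, false) != Stage.TRANSLOG) throw new AssertionError();
        System.out.println("ok");
    }
}
```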
I've left one comment, otherwise looking good.
Test failure here also looks relevant
@ywelsch It's ready again.
LGTM. Thanks Nhat!
Thanks Yannick.
If an upgraded node is restarted multiple times without flushing a new index commit, then we will wrongly exclude all commits from the starting commits. This bug is reproducible with these minimal steps: (1) create an empty index on 6.1.4 with translog retention disabled, (2) upgrade the cluster to 7.7.0, (3) restart the upgraded cluster. The problem is that the new translog policy can trim the translog without a new index commit, while the existing commit still refers to the previous translog generation. Closes #57091
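The exclusion bug in that commit message can be modeled in a few lines. This is a hedged sketch under assumed names (`Commit`, `keepRecoverableCommits` are illustrative, not the real Elasticsearch code): each commit records the translog generation it needs for replay, and a commit whose generation has been trimmed away is filtered out of the starting commits.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of the starting-commits selection bug. Names invented.
public class StartingCommitsSketch {
    // Each index commit records the translog generation it requires.
    record Commit(long translogGen) {}

    // Keep only commits whose recorded generation is still on disk. If the
    // translog is trimmed past the sole commit's generation without a new
    // flush, every commit is excluded, which is exactly the reported bug.
    static List<Commit> keepRecoverableCommits(List<Commit> commits, long minAvailableTranslogGen) {
        List<Commit> kept = new ArrayList<>();
        for (Commit c : commits) {
            if (c.translogGen() >= minAvailableTranslogGen) kept.add(c);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Commit> commits = List.of(new Commit(3)); // the only commit references gen 3
        // a restart trims the translog up to gen 5 without writing a new commit
        List<Commit> kept = keepRecoverableCommits(commits, 5);
        if (!kept.isEmpty()) throw new AssertionError(); // all commits wrongly excluded
        System.out.println("ok");
    }
}
```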
If the previous peer recovery failed after copying segment files, then the safe commit invariant won't hold in the next recovery. Relates #57187
If the recovery source is on an old node (before 7.2), then the recovery target won't have the safe commit after phase1 because the recovery source does not send the global checkpoint in the clean_files step. And if the recovery fails and retries, then the recovery stage won't transition properly. If a sync_id is used in peer recovery, then the clean_files step won't be executed to move the stage to TRANSLOG. This issue was addressed in #57187, but not forward-ported to 8.0. Closes #57708
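The safe-commit rule that paragraph relies on can be sketched as follows. This is an illustrative model, not the real implementation; `Commit` and `safeCommit` are assumed names, and the only detail taken from Elasticsearch itself is the `UNASSIGNED_SEQ_NO` sentinel value.

```java
import java.util.List;
import java.util.Optional;

// Sketch of the safe-commit rule: the newest commit whose max sequence
// number is at or below the known global checkpoint. Illustrative only.
public class SafeCommitSketch {
    static final long UNASSIGNED_SEQ_NO = -2; // sentinel used by Elasticsearch

    record Commit(long maxSeqNo) {}

    // commits are assumed ordered oldest to newest
    static Optional<Commit> safeCommit(List<Commit> commits, long globalCheckpoint) {
        Optional<Commit> safe = Optional.empty();
        for (Commit c : commits) {
            if (globalCheckpoint != UNASSIGNED_SEQ_NO && c.maxSeqNo() <= globalCheckpoint) {
                safe = Optional.of(c); // newest qualifying commit wins
            }
        }
        return safe;
    }

    public static void main(String[] args) {
        List<Commit> commits = List.of(new Commit(5), new Commit(9));
        // a pre-7.2 source never sends the checkpoint, so it stays unassigned:
        // no commit can qualify as safe after phase1
        if (safeCommit(commits, UNASSIGNED_SEQ_NO).isPresent()) throw new AssertionError();
        // once the checkpoint is known, the newest covered commit is safe
        if (safeCommit(commits, 9).orElseThrow().maxSeqNo() != 9) throw new AssertionError();
        System.out.println("ok");
    }
}
```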
Backport of elastic/elasticsearch#57187 Fixes #11756. (cherry picked from commit c05620a)
# Conflicts:
#	docs/appendices/release-notes/unreleased.rst
#	server/src/main/java/org/elasticsearch/index/shard/IndexShard.java