
Fix trimUnsafeCommits for indices created before 6.2 #57187

Merged: 3 commits, May 27, 2020

Conversation

dnhatn (Member) commented May 27, 2020

If an upgraded node is restarted multiple times without flushing a new index commit, then we will wrongly exclude all commits from the starting commits. This bug is reproducible with these minimal steps: (1) create an empty index on 6.1.4 with translog retention disabled, (2) upgrade the cluster to 7.7.0, (3) restart the upgraded cluster. The problem is that the new translog policy can trim the translog without a new index commit, while the existing commit still refers to the previous translog generation.

Closes #57091
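To make the failure mode concrete, here is a minimal sketch in plain Java. This is not Elasticsearch code; the method name `startingCommits` and the generation numbers are illustrative. It shows how a filtering step that only keeps commits whose referenced translog generation is still on disk ends up excluding every commit when the translog was trimmed without a new commit being written:

```java
import java.util.List;
import java.util.stream.Collectors;

public class TrimUnsafeCommitsSketch {

    // Keep only index commits whose referenced translog generation is still on disk.
    // If the translog policy trimmed generations without a new commit being flushed,
    // the surviving commit can reference a generation below minTranslogGenOnDisk.
    static List<Long> startingCommits(List<Long> commitTranslogGens, long minTranslogGenOnDisk) {
        return commitTranslogGens.stream()
                .filter(gen -> gen >= minTranslogGenOnDisk)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The only commit (carried over from the pre-6.2 index) references generation 1,
        // but repeated restarts trimmed the translog up to generation 3 without flushing
        // a new commit, so every commit is wrongly excluded.
        List<Long> kept = startingCommits(List.of(1L), 3L);
        System.out.println(kept.isEmpty()); // prints "true"
    }
}
```

An empty starting-commit set is what makes the shard unrecoverable here: there is no commit left to open the engine from, even though the existing commit is perfectly usable.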

@dnhatn dnhatn added >bug blocker :Distributed Indexing/Recovery Anything around constructing a new shard, either from a local or a remote source. v7.8.0 v7.7.1 v7.9.0 labels May 27, 2020
@dnhatn dnhatn requested a review from ywelsch May 27, 2020 05:51
@elasticmachine (Collaborator) commented:

Pinging @elastic/es-distributed (:Distributed/Recovery)

@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label May 27, 2020
try {
maybeCheckIndex(); // check index here and won't do it again if ops-based recovery occurs
recoveryState.setStage(RecoveryState.Stage.TRANSLOG);
if (safeCommit.isPresent() == false) {
assert globalCheckpoint == UNASSIGNED_SEQ_NO || indexSettings.getIndexVersionCreated().before(Version.V_6_2_0) :
dnhatn (Member Author):

The new test found this issue where the index has a synced flush, but the global checkpoint is still unassigned.

ywelsch (Contributor):

I'm not sure I understand what issue is being addressed here. How is moving this condition further down (after maybeCheckIndex) helping?
Is the issue that we have not properly moved to the translog stage?

ywelsch (Contributor):

Also, which part of the new tests show this issue, and is it something that can also be triggered with a single restart?

dnhatn (Member Author):

Sorry, I should have explained better.

Is the issue that we have not properly moved to the translog stage?

That's correct. Previously, we did not move the recovery stage from INDEX to TRANSLOG if we did not have a safe commit, which can be the case if the index was created before 6.2 or the global checkpoint is still unassigned. In that case we expect a file-based recovery to happen, and we move the recovery stage to TRANSLOG in the clean-files step. However, if the shard has a synced flush, we won't execute the clean-files step and will trip the assertion.

Also, which part of the new tests show this issue, and is it something that can also be triggered with a single restart?

Yes, I will add it to the full cluster restart suite.
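The stage-ordering problem described above can be sketched as a toy model (not actual Elasticsearch code; the method names `oldOrdering` and `fixedOrdering` are illustrative). The old code only advanced the stage in the clean-files step of a file-based recovery, which a synced-flush recovery skips; the fix advances the stage unconditionally, before the safe-commit check:

```java
public class RecoveryStageSketch {
    enum Stage { INDEX, TRANSLOG }

    // Old ordering: without a safe commit, the stage was only advanced in the
    // clean-files step of a file-based recovery, which a synced flush skips.
    static Stage oldOrdering(boolean hasSafeCommit, boolean syncedFlush) {
        Stage stage = Stage.INDEX;
        if (hasSafeCommit) {
            stage = Stage.TRANSLOG;       // ops-based path advances the stage
        } else if (syncedFlush == false) {
            stage = Stage.TRANSLOG;       // clean-files step of file-based recovery
        }
        return stage;                     // synced flush + no safe commit: stuck at INDEX
    }

    // Fixed ordering (this PR): advance the stage before the safe-commit check,
    // so the TRANSLOG-stage assertion holds on every recovery path.
    static Stage fixedOrdering() {
        return Stage.TRANSLOG;
    }

    public static void main(String[] args) {
        System.out.println(oldOrdering(false, true)); // prints "INDEX"
        System.out.println(fixedOrdering());          // prints "TRANSLOG"
    }
}
```

In the `oldOrdering(false, true)` case the shard stays at INDEX, which is exactly the state that tripped the assertion.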

ywelsch (Contributor) left a review:

I've left one comment, o.w. looking good.


ywelsch (Contributor) commented May 27, 2020

Test failure here also looks relevant

@dnhatn dnhatn requested a review from ywelsch May 27, 2020 12:20
dnhatn (Member Author) commented May 27, 2020

@ywelsch It's ready again.

ywelsch (Contributor) left a review:

LGTM. Thanks Nhat!

dnhatn (Member Author) commented May 27, 2020

Thanks Yannick.

@dnhatn dnhatn merged commit ba5a085 into elastic:7.7 May 27, 2020
@dnhatn dnhatn deleted the 7.7-translog-policy branch May 27, 2020 15:05
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request May 27, 2020
Closes elastic#57091
dnhatn added a commit that referenced this pull request May 27, 2020
Closes #57091
dnhatn added a commit that referenced this pull request May 27, 2020
If the previous peer recovery failed after copying segment files, then
the safe commit invariant won't hold in the next recovery.

Relates #57187
dnhatn added a commit that referenced this pull request May 27, 2020
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request May 27, 2020
dnhatn added a commit that referenced this pull request May 27, 2020
dnhatn added a commit that referenced this pull request Jun 15, 2020
If the recovery source is on an old node (before 7.2), then the recovery
target won't have the safe commit after phase1 because the recovery
source does not send the global checkpoint in the clean_files step. And
if the recovery fails and retries, then the recovery stage won't
transition properly. If a sync_id is used in peer recovery, then the
clean_files step won't be executed to move the stage to TRANSLOG.

This issue was addressed in #57187, but not forward-ported to 8.0.

Closes #57708
seut added a commit to crate/crate that referenced this pull request Jan 6, 2022
seut added a commit to crate/crate that referenced this pull request Jan 6, 2022
mergify bot pushed a commit to crate/crate that referenced this pull request Jan 10, 2022
mergify bot pushed a commit to crate/crate that referenced this pull request Jan 10, 2022
Backport of elastic/elasticsearch#57187

Fixes #11756.

(cherry picked from commit c05620a)

# Conflicts:
#	docs/appendices/release-notes/unreleased.rst
#	server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
seut added a commit to crate/crate that referenced this pull request Jan 10, 2022