Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip global checkpoint sync for closed indices #41874

Merged
merged 7 commits into from
May 21, 2019

Conversation

dnhatn
Copy link
Member

@dnhatn dnhatn commented May 7, 2019

The verifying-before-close step ensures the global checkpoints on all shard copies are in sync; thus, we don' t need to sync global checkpoints for closed indices.

Relate #33888 (comment)

@dnhatn dnhatn added >non-issue :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. v8.0.0 v7.2.0 labels May 7, 2019
@dnhatn dnhatn requested review from ywelsch and henningandersen May 7, 2019 03:17
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@dnhatn dnhatn mentioned this pull request May 7, 2019
50 tasks
@dnhatn dnhatn changed the title Closed indices should not sync global checkpoint Do not fail global checkpoint sync on closed indices May 7, 2019
Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be an option to skip resync for closed indices? I think we rely on either having a perfect commit to use or doing a file based recovery anyway.


@Override
protected boolean allowClosedIndices() {
return true; // it's okay to sync global checkpoints on closed indices
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks dangerous, since ReadOnlyEngine.syncTransLog() does nothing. So we will accept the global checkpoint on replica, but it will not be persisted.

Looks like RecoveryTarget.finalizeRecovery also relies on that method, so maybe we should implement it. I think this means that the global checkpoint is not persisted, but the replica is marked in sync. If this replica then later becomes primary after for instance a full restart, it could have an old global checkpoint and would fail the shard? Probably a different issue/PR (if I am right).

@dnhatn
Copy link
Member Author

dnhatn commented May 8, 2019

@henningandersen Thanks for looking.

Would it be an option to skip resync for closed indices?

Yes, we can skip as resync is a noop for closed indices. I explored this option then prefer other option to avoid adding one more exception for closed indices. I can make this change again - it's pretty contained.

This looks dangerous, since ReadOnlyEngine.syncTransLog() does nothing. So we will accept the global checkpoint on replica, but it will not be persisted.

We can add an assertion to make sure that the persisted global checkpoint equals to the in-memory value.

@dnhatn dnhatn changed the title Do not fail global checkpoint sync on closed indices Skip global checkpoint sync for closed indices May 20, 2019
@dnhatn
Copy link
Member Author

dnhatn commented May 20, 2019

@henningandersen I discussed this with Tanguy and Yannick during GAH. We need to keep resync; otherwise, replicas won't see the new primary term. I updated this PR to skip global checkpoint sync for closed indices. Can you please take another look? Thank you!

@dnhatn dnhatn requested a review from henningandersen May 20, 2019 15:52
Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dnhatn
Copy link
Member Author

dnhatn commented May 21, 2019

Thanks @henningandersen @ywelsch.

@dnhatn dnhatn merged commit 5785941 into elastic:master May 21, 2019
@dnhatn dnhatn deleted the donot_sync_checkpoint branch May 21, 2019 23:54
dnhatn added a commit that referenced this pull request May 22, 2019
The verifying-before-close step ensures the global checkpoints on all
shard copies are in sync; thus, we don' t need to sync global
checkpoints for closed indices.

Relate #33888
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
The verifying-before-close step ensures the global checkpoints on all
shard copies are in sync; thus, we don' t need to sync global
checkpoints for closed indices.

Relate elastic#33888
@jakelandis jakelandis removed the v8.0.0 label Jul 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. >non-issue v7.2.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants