Auto-expand replicas only after failing nodes #30553
Merged
ywelsch
merged 4 commits into
elastic:master
from
ywelsch:auto-expand-replicas-on-node-removal
May 14, 2018
ywelsch
added
>bug
v7.0.0
:Distributed Coordination/Allocation
All issues relating to the decision making around placing a shard (both master logic & on the nodes)
v6.4.0
labels
May 13, 2018
Pinging @elastic/es-distributed
bleskes
approved these changes
May 14, 2018
LGTM. Thanks for the extra iteration.
…as-on-node-removal
Thanks @bleskes
ywelsch
added a commit
that referenced
this pull request
May 14, 2018
#30423 combined auto-expansion in the same cluster state update where nodes are removed. Because the auto-expansion step ran before the dead nodes were disassociated from the routing table, it could remove replicas from live nodes instead of dead ones. This commit reverses the order so that, when nodes leave the cluster, the auto-expand-replica functionality only triggers after the shards on the removed nodes have been failed. This ensures that active shards on other live nodes are not failed if the primary resided on a now-dead node. Instead, one of the replicas on the live nodes first gets promoted to primary, and the auto-expansion (removing replicas) only triggers in a follow-up step (but still in the same cluster state update).
Relates to #30456 and follow-up of #30423
dnhatn
added a commit
that referenced
this pull request
May 15, 2018
* 6.x:
  Revert "Silence IndexUpgradeIT test failures. (#30430)"
  [DOCS] Remove references to changelog and to highlights
  Revert "Mute ML upgrade test (#30458)"
  [ML] Fix BWC version for backport of #30125
  [Docs] Improve section detailing translog usage (#30573)
  [Tests] Relax allowed delta in extended_stats aggregation (#30569)
  Fail if reading from closed KeyStoreWrapper (#30394)
  [ML] Reverse engineer Grok patterns from categorization results (#30125)
  Derive max composite buffers from max content len
  Update build file due to doc file rename
  SQL: Extract SQL request and response classes (#30457)
  Remove the changelog (#30593)
  Revert "Add deprecation warning for default shards (#30587)"
  Silence IndexUpgradeIT test failures. (#30430)
  Add deprecation warning for default shards (#30587)
  [DOCS] Adds 6.4.0 release highlight pages
  [DOCS] Adds release highlight pages (#30590)
  Docs: Document how to rebuild analyzers (#30498)
  [DOCS] Fixes title capitalization in security content
  LLRest: Add equals and hashcode tests for Request (#30584)
  [DOCS] Fix realm setting names (#30499)
  [DOCS] Fix path info for various security files (#30502)
  Docs: document precision limitations of geo_bounding_box (#30540)
  Fix non existing javadocs link in RestClientTests
  Auto-expand replicas only after failing nodes (#30553)
Labels
>bug
:Distributed Coordination/Allocation
All issues relating to the decision making around placing a shard (both master logic & on the nodes)
v6.4.0
v7.0.0-beta1
#30423 combined auto-expansion in the same cluster state update where nodes are removed. Because the auto-expansion step ran before the dead nodes were disassociated from the routing table, it could remove replicas from live nodes instead of dead ones. This PR reverses the order so that, when nodes leave the cluster, the auto-expand-replica functionality only triggers after the shards on the removed nodes have been failed. This ensures that active shards on other live nodes are not failed if the primary resided on a now-dead node.
Instead, one of the replicas on the live nodes first gets promoted to primary, and the auto-expansion (removing replicas) only triggers in a follow-up step (but still in the same cluster state update).
Relates to #30456 (comment)
and follow-up of #30423
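To illustrate why the order matters, here is a minimal toy model in Java. It is not the Elasticsearch implementation: the class `AutoExpandOrderSketch`, its method names, and the representation of a shard's copies as a list of node names (first entry is the primary) are all assumptions made for this sketch. It also assumes an auto-expand setting that wants one copy per live node, i.e. `liveNodes - 1` replicas.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical toy model, not Elasticsearch code.
// A shard's copies are a list of node names; index 0 is the primary.
class AutoExpandOrderSketch {

    // Drop copies that live on removed nodes. If the primary was dropped,
    // the first surviving replica moves to index 0, i.e. is promoted.
    static List<String> failShardsOnRemovedNodes(List<String> copies, Set<String> liveNodes) {
        List<String> survivors = new ArrayList<>();
        for (String node : copies) {
            if (liveNodes.contains(node)) {
                survivors.add(node);
            }
        }
        return survivors;
    }

    // Trim replicas until at most (liveNodes - 1) remain. The primary at
    // index 0 is never trimmed; replicas are removed from the end of the list.
    static List<String> autoExpandReplicas(List<String> copies, Set<String> liveNodes) {
        int desiredReplicas = Math.max(0, liveNodes.size() - 1);
        List<String> result = new ArrayList<>(copies);
        while (result.size() - 1 > desiredReplicas) {
            result.remove(result.size() - 1);
        }
        return result;
    }

    // Order described as problematic: auto-expand first, then fail shards.
    static List<String> buggyOrder(List<String> copies, Set<String> liveNodes) {
        return failShardsOnRemovedNodes(autoExpandReplicas(copies, liveNodes), liveNodes);
    }

    // Order after this change: fail shards first, then auto-expand.
    static List<String> fixedOrder(List<String> copies, Set<String> liveNodes) {
        return autoExpandReplicas(failShardsOnRemovedNodes(copies, liveNodes), liveNodes);
    }

    public static void main(String[] args) {
        List<String> copies = List.of("n1", "n2", "n3"); // primary on n1
        Set<String> liveNodes = Set.of("n2", "n3");      // n1 has left the cluster
        System.out.println("buggy order: " + buggyOrder(copies, liveNodes));
        System.out.println("fixed order: " + fixedOrder(copies, liveNodes));
    }
}
```

In this scenario the primary sits on the departed node `n1`, so every replica is on a live node. Running auto-expansion first trims a live replica and then loses the primary, leaving a single copy on `n2`. Failing the shards first promotes `n2` and keeps `n3` as its replica, so auto-expansion has nothing to remove.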