Handle leader task being dead in RestoreState #3502

Merged
merged 1 commit into from
Nov 15, 2017

schmichael
Member

Fixes the panic mentioned in
#3420 (comment)

While a leader task dying serially stops all follower tasks, the
synchronizing of state is asynchronous. Nomad can shut down before all
follower tasks have updated their state to dead, thus saving the state
necessary to hit this panic: a non-terminal alloc with a dead
leader.

The actual fix is a simple nil check so that we no longer assume a
non-terminal alloc's leader has a TaskRunner.
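
To illustrate the shape of the nil check, here is a minimal sketch. It is
not the actual diff: `allocRunner`, `taskRunner`, `restore`, and the
"running"/"dead" state strings are simplified, hypothetical stand-ins for
the client's real types, and the sketch assumes every alloc has a named
leader and that dead tasks get no runner on restore.

```go
package main

import (
	"fmt"
	"sort"
)

// taskRunner is a hypothetical stand-in for Nomad's TaskRunner.
type taskRunner struct {
	name string
}

// allocRunner is a hypothetical, heavily simplified stand-in for the
// client's per-allocation runner.
type allocRunner struct {
	leader  string                 // name of the leader task (assumed non-empty)
	runners map[string]*taskRunner // only tasks that were not dead get a runner
}

// restore rebuilds task runners from persisted task states ("running" or
// "dead") and returns the sorted names of the tasks that should be resumed.
func (ar *allocRunner) restore(states map[string]string) []string {
	ar.runners = make(map[string]*taskRunner)
	for name, state := range states {
		if state != "dead" {
			ar.runners[name] = &taskRunner{name: name}
		}
	}

	// The nil check: a non-terminal alloc may have been saved with a dead
	// leader, so the leader's runner cannot be assumed to exist.
	leaderRunner := ar.runners[ar.leader]
	if leaderRunner == nil {
		// The leader is already dead, so no follower is resumed.
		ar.runners = make(map[string]*taskRunner)
		return nil
	}

	names := make([]string, 0, len(ar.runners))
	for name := range ar.runners {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	ar := &allocRunner{leader: "web"}

	// State saved mid-shutdown: leader dead, follower not yet marked dead.
	fmt.Println(ar.restore(map[string]string{"web": "dead", "sidecar": "running"}))

	// Healthy state: both tasks are resumed.
	fmt.Println(ar.restore(map[string]string{"web": "running", "sidecar": "running"}))
}
```

Without the `leaderRunner == nil` branch, any later use of the leader's
runner in this sketch would dereference a nil pointer, which corresponds
to the class of panic described above.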

Binary attached:

nomad.gz

@schmichael schmichael mentioned this pull request Nov 3, 2017
@schmichael schmichael force-pushed the b-3420-restore-dead-leader branch from 38b27ea to 7195e50 Compare November 13, 2017 18:58
Contributor

@dadgar dadgar left a comment

Good catch! Add to changelog and LGTM

@schmichael schmichael force-pushed the b-3420-restore-dead-leader branch from 7195e50 to 0de0e1d Compare November 15, 2017 18:36
@schmichael schmichael merged commit 0cdc12b into master Nov 15, 2017
@schmichael schmichael deleted the b-3420-restore-dead-leader branch November 15, 2017 19:21
@github-actions
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 17, 2023