Skip to content
This repository has been archived by the owner on Aug 25, 2021. It is now read-only.

Remove preStop hook that calls consul leave from Consul servers #764

Merged
merged 5 commits into from
Jan 22, 2021

Conversation

ishustava
Copy link
Contributor

@ishustava ishustava commented Jan 13, 2021

Background

Currently, when Consul servers are restarted, we call the consul leave command in the preStop hook that is registered with the servers. This approach has a couple of problems:

  1. This command does not work when ACLs are enabled on the cluster because this command requires a valid ACL token with agent:write permission. Furthermore, we cannot reasonably make it work because even if the server's ACL token had that permission, we could not retrieve it because it's stored in the Consul's data directory. We could work around this by removing the preStop hook and instead setting leave_on_terminate to true on the servers, which would accomplish the same behavior. However, this solution is not recommended because of point 2.
  2. The voting members of the consul cluster (aka the consul servers) should not gracefully leave the cluster. That is so that the quorum size is not affected during upgrades or downtime. See Leader still elected when below quorum consul#5947.

Changes proposed in this PR:

  • Remove the preStop hook calling consul leave from the consul server pods

How I've tested this PR:

  • Deployed Consul from this helm chart, ran kubectl rollout restart sts/..., and checked that servers came up healthy.
  • Deployed the current stable version of the helm chart, upgraded to this version and checked that the upgrade goes through without problems

How I expect reviewers to test this PR:

Code review

Checklist:

  • CHANGELOG entry added (HashiCorp engineers only, community PRs should not add a changelog entry)

@ishustava ishustava marked this pull request as ready for review January 22, 2021 00:16
@ishustava ishustava requested review from a team, ndhanushkodi and kschoche and removed request for a team January 22, 2021 00:16
Copy link
Contributor

@kschoche kschoche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @ishustava !
I'd personally call it a bug fix but all looks great, thanks for taking this on.

@ishustava
Copy link
Contributor Author

I'd personally call it a bug fix but all looks great, thanks for taking this on.

good point, @kschoche. Let me fix that.

@ishustava ishustava changed the title Provide a deterministic node ID for each consul server Remove preStop hook that calls consul leave from Consul servers Jan 22, 2021
@ishustava ishustava added the bug Something isn't working label Jan 22, 2021
CHANGELOG.md Outdated Show resolved Hide resolved
Copy link
Contributor

@ndhanushkodi ndhanushkodi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 looks great!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants