This repository has been archived by the owner on Jul 27, 2023. It is now read-only.

Add playbook to force consul leader election #948

Merged 3 commits on Jun 20, 2016


langston-barrett
Contributor

See discussion on #763, related to #566

@stevendborrelli
Contributor

Does this require a consul restart or reload to take effect?

We could also populate this file with the inventory data.

@langston-barrett
Contributor Author

As @ChrisAubuchon noted in #763, "consul.json already has retry_after_leave set. That causes Consul to rejoin."

We could definitely repopulate peers.json with inventory data, but since retry_join is set to that same list of IPs (namely, consul_servers_group), it shouldn't make a difference.

@ryane
Contributor

ryane commented Jan 5, 2016

@stevendborrelli, @siddharthist Is any more work needed on this? Has it been tested? Should we add a small doc explaining why you would use this?

@TanyaCouture
Contributor

@ryane @siddharthist OS testing is in progress: deploying and installing the playbook. Are there other tests that should be run?

@ryane
Contributor

ryane commented Jan 5, 2016

@TanyaCouture yes, we definitely want to verify if this playbook will fix a consul cluster that is unable to elect a leader (after a crash, for example)
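
One way to check for a leader before and after running the playbook is Consul's HTTP status endpoint, `/v1/status/leader`, which returns the leader's address as a JSON string and an empty string when there is none. The sketch below is illustrative only and is not part of this PR: the function names are hypothetical, and the base URL assumes a local agent on the default HTTP port 8500.

```python
import json
import urllib.request
from typing import Optional


def parse_leader(body: str) -> Optional[str]:
    """Return the leader address from a /v1/status/leader response body.

    The body is a JSON string; an empty string means no leader is elected.
    """
    leader = json.loads(body)
    return leader if leader else None


def fetch_leader(base_url: str = "http://127.0.0.1:8500",
                 timeout: float = 5.0) -> Optional[str]:
    """Query a Consul agent (assumed address) for the current Raft leader."""
    url = f"{base_url}/v1/status/leader"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_leader(resp.read().decode("utf-8"))
```

Calling `fetch_leader()` on a server node and getting `None` would indicate the cluster still has no leader.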

@TanyaCouture
Contributor

  1. Successful OpenStack installation.
  2. The consul log showed: failed to sync remote state: No cluster leader
  3. Restarted consul: ansible all -m service -a "name=consul state=restarted" --become
  4. The updated consul log showed: No cluster leader
  5. Ran the new playbook and restarted consul again
  6. The updated consul log still showed: No cluster leader
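
The log check in steps 2, 4, and 6 could be automated along these lines. This is a minimal sketch: the helper name is hypothetical, and it assumes the relevant Consul log lines have already been collected as text (for example, only the lines written after the most recent restart).

```python
from typing import Iterable

# Message quoted in the test steps above.
NO_LEADER_MESSAGE = "No cluster leader"


def reports_no_leader(log_lines: Iterable[str]) -> bool:
    """Return True if any of the given log lines contains the no-leader message."""
    return any(NO_LEADER_MESSAGE in line for line in log_lines)
```

Passing only the lines logged after the restart avoids counting stale errors from before the playbook ran.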

@langston-barrett
Contributor Author

We should take a look at @abn's playbook for doing this.

@stevendborrelli
Contributor

We've had enough folks having issues with consul that adding something similar to @abn's playbook makes sense. I don't think the current PR is sufficient.

@langston-barrett
Contributor Author

@stevendborrelli I'll update ASAP.

@kbroughton
Contributor

One thing I had to do in #1338 was modify the peers.json content with a default(groups['all']) for when consul_dc_group is not set. You might want to test with consul_dc_group unset to verify.

{% for host in groups[consul_servers_group] | intersect(groups[consul_dc_group] | default(groups['all'])) %}"{{ hostvars[host].private_ipv4 }}:8300"{% if not loop.last %}, 
{% endif %}{% endfor %}
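
To make the template's logic easier to follow, here is a rough Python equivalent of what it computes. It is a sketch under assumptions, not the playbook's code: the function name, the dictionary shapes, and the sample inventory are invented for illustration, and the result is printed as a JSON array on the assumption that the surrounding template wraps the entries in brackets.

```python
import json
from typing import Dict, List, Optional


def consul_peers(groups: Dict[str, List[str]],
                 private_ipv4: Dict[str, str],
                 servers_group: str,
                 dc_group: Optional[str] = None,
                 port: int = 8300) -> List[str]:
    """Build "ip:port" peer entries for servers that are also in the DC group.

    When dc_group is unset or unknown, fall back to groups["all"],
    mirroring the default(groups['all']) in the template.
    """
    dc_hosts = groups.get(dc_group) if dc_group else None
    if dc_hosts is None:
        dc_hosts = groups["all"]
    allowed = set(dc_hosts)
    # Keep the order of the servers group while filtering by DC membership.
    return [f"{private_ipv4[host]}:{port}"
            for host in groups[servers_group] if host in allowed]


# Hypothetical inventory, for illustration only.
groups = {
    "all": ["control-01", "control-02", "worker-01"],
    "role=control": ["control-01", "control-02"],
    "dc=dc1": ["control-01", "worker-01"],
}
ips = {"control-01": "10.0.0.1", "control-02": "10.0.0.2", "worker-01": "10.0.0.3"}

print(json.dumps(consul_peers(groups, ips, "role=control")))
print(json.dumps(consul_peers(groups, ips, "role=control", dc_group="dc=dc1")))
```

With the DC group unset, every server in the servers group is listed; with it set, only servers that belong to that datacenter remain.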

This is a variant of the playbook that worked for me and for @RaunoVV on Gitter.

https://gitter.im/CiscoCloud/mantl?at=573398303a05b11b6a4c11e6
@ryane
Contributor

ryane commented May 12, 2016

Added an updated version of the playbook that worked for me and for @RaunoVV on Gitter. It could probably use more testing and validation.

@ryane modified the milestone: 1.2 (Jun 9, 2016)
@ryane
Contributor

ryane commented Jun 20, 2016

This has been successfully tested a couple of times and definitely fixes the issue in some scenarios. It is safe to merge now; if we identify other scenarios where it doesn't work, we can create new issues for them.

@ryane merged commit 339e68f into master on Jun 20, 2016
@ryane deleted the feature/force-leader-election branch on June 20, 2016, 17:31