This repository has been archived by the owner on Nov 18, 2017. It is now read-only.

How should Postgres behave if etcd is unavailable / unpredictable? #7

Open
Winslett opened this issue May 7, 2015 · 2 comments

Comments

@Winslett
Contributor

Winslett commented May 7, 2015

Some of the scenarios to consider are:

  • the governor for the Postgres primary cannot communicate with etcd, but the rest of the cluster can
  • no governor in the Postgres cluster can communicate with etcd
  • etcd crashes and recovers; after recovery, the etcd leader key's TTL has expired
  • etcd crashes and recovers; after recovery, the initialization key is empty, which would cause problems when new members come online and race to initialize
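
To make the last scenario concrete, here is a minimal sketch of the initialization race and one possible guard against it. Everything in it is an assumption for illustration: `FakeKV` is an in-memory stand-in rather than a real etcd client, and `startup_action`, the `"initialize"` key name, and the `has_local_data` flag are hypothetical. As far as I know, etcd's v2 keys API offers a comparable create-only write through its `prevExist=false` condition, but this sketch does not call etcd.

```python
import threading
from typing import Dict


class FakeKV:
    """In-memory stand-in for a key-value store (hypothetical, not an etcd client)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._lock = threading.Lock()

    def create_if_absent(self, key: str, value: str) -> bool:
        """Atomically create key; return True only for the caller that created it."""
        with self._lock:
            if key in self._data:
                return False
            self._data[key] = value
            return True


def startup_action(kv: FakeKV, member: str, has_local_data: bool) -> str:
    """Decide what a member does at startup (hypothetical policy).

    Only the member that wins the race for the initialization key may
    bootstrap, and a winner that already has a data directory keeps it
    instead of running initdb again.
    """
    won_race = kv.create_if_absent("initialize", member)
    if not won_race:
        return "join_as_replica"
    if has_local_data:
        # The key was lost (e.g. after an etcd crash) but this member has data.
        return "keep_existing_data"
    return "initdb"
```

Note the limitation: if the key is empty after an etcd recovery, a brand-new member with no data could still win the race ahead of members that hold the real data, so this guard alone does not solve the scenario.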

The first question to answer is: should the Postgres cluster go read-only if etcd fails? Or should the cluster keep the current primary, but without automatic failover?
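
The two options can be written down as a small decision function, which may make the trade-off easier to discuss. This is only a sketch under stated assumptions: `Policy`, `Action`, and `decide` are hypothetical names, not part of governor, and how a node would detect that etcd is unreachable is left out entirely.

```python
from enum import Enum


class Policy(Enum):
    """Hypothetical cluster-wide setting for behavior when etcd is unreachable."""
    READ_ONLY = "read_only"        # the primary stops accepting writes
    KEEP_PRIMARY = "keep_primary"  # the primary keeps serving writes


class Action(Enum):
    CONTINUE = "continue"                  # etcd reachable: normal operation
    DEMOTE_TO_READ_ONLY = "demote"         # stop accepting writes
    HOLD_WITHOUT_FAILOVER = "hold"         # keep current role, no leader changes


def decide(is_primary: bool, etcd_reachable: bool, policy: Policy) -> Action:
    """Pick an action for one node from its role and its view of etcd."""
    if etcd_reachable:
        return Action.CONTINUE
    if not is_primary:
        # Assumption: a replica never promotes itself without etcd.
        return Action.HOLD_WITHOUT_FAILOVER
    if policy is Policy.READ_ONLY:
        return Action.DEMOTE_TO_READ_ONLY
    return Action.HOLD_WITHOUT_FAILOVER
```

For example, `decide(True, False, Policy.READ_ONLY)` returns `Action.DEMOTE_TO_READ_ONLY`, while the same call with `Policy.KEEP_PRIMARY` returns `Action.HOLD_WITHOUT_FAILOVER`. The function only sees one node's view, so it cannot tell the first scenario (only the primary is cut off) from the second (nobody can reach etcd).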

@tvb

tvb commented May 7, 2015

I think it is best to go read-only so replication from the newly elected master can continue.

@bjoernbessert

In my opinion, I would prefer to keep the current primary and not take any "decision" when you lose the "brain". In general, I try to avoid situations where the HA software/layer itself is able to bring down the protected application while the protected application itself has no problems at all.

The Postgres cluster can go read-only as an additional safety measure, but it's not a must-have.
