Merge branch 'vitabaks:master' into master

FactorT authored May 7, 2024
2 parents 99379d8 + 0384a32 commit 5bb00eb

Showing 51 changed files with 663 additions and 403 deletions.
2 changes: 1 addition & 1 deletion .config/python/dev/requirements.txt
@@ -4,7 +4,7 @@ molecule-plugins==23.5.3
ansible-lint==24.2.0
yamllint==1.35.1
attrs==23.2.0
black==24.2.0
black==24.3.0
bracex==2.4
cffi==1.16.0
click==8.1.7
122 changes: 41 additions & 81 deletions README.md
@@ -9,103 +9,66 @@

### Production-ready PostgreSQL High-Availability Cluster (based on "Patroni" and DCS "etcd" or "consul"). Automating with Ansible.

This Ansible playbook is designed for deploying a PostgreSQL high availability cluster on dedicated physical servers for a production environment. The cluster can also be deployed on virtual machines and in the Cloud.
The **postgresql_cluster** project is designed to deploy and manage high-availability PostgreSQL clusters in production environments. This solution is tailored for use on dedicated physical servers, virtual machines, and within both on-premises and cloud-based infrastructures.

In addition to deploying new clusters, this playbook also supports deployment over an already existing and running PostgreSQL instance. You can convert your basic PostgreSQL installation to a high-availability cluster. Just specify the variable `postgresql_exists='true'` in the inventory file.
**Attention!** Your PostgreSQL will be stopped before running in cluster mode (please plan for a short downtime of databases).
This project not only facilitates the creation of new clusters but also offers support for integrating with pre-existing PostgreSQL instances. If you intend to upgrade your conventional PostgreSQL setup to a high-availability configuration, simply set `postgresql_exists=true` in the inventory file. Be aware that initiating cluster mode requires temporarily stopping your existing PostgreSQL service, which will lead to a brief period of database downtime. Please plan this transition accordingly.
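As a sketch, converting an existing standalone instance amounts to flagging its host in the inventory before running the playbook (the host name and IP below are placeholders, not values from your environment):

```ini
# inventory — hypothetical excerpt
[master]
10.128.64.140 hostname=pgnode01 postgresql_exists=true
```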

:trophy: **Use the [sponsoring](https://github.com/vitabaks/postgresql_cluster#sponsor-this-project) program to get personalized support, or just to contribute to this project.**

---

:trophy: **Use the [sponsoring](https://github.com/vitabaks/postgresql_cluster#sponsor-this-project) program to get personalized support, or just to contribute to this project.**
### Supported setups of Postgres Cluster

## Index
- [Cluster types](#cluster-types)
- [[Type A] PostgreSQL High-Availability with HAProxy Load Balancing](#type-a-postgresql-high-availability-with-haproxy-load-balancing)
- [[Type B] PostgreSQL High-Availability only](#type-b-postgresql-high-availability-only)
- [[Type C] PostgreSQL High-Availability with Consul Service Discovery (DNS)](#type-c-postgresql-high-availability-with-consul-service-discovery-dns)
- [Compatibility](#compatibility)
- [Supported Linux Distributions:](#supported-linux-distributions)
- [PostgreSQL versions:](#postgresql-versions)
- [Ansible version](#ansible-version)
- [Requirements](#requirements)
- [Port requirements](#port-requirements)
- [Recommendations](#recommenations)
- [Deployment: quick start](#deployment-quick-start)
- [Variables](#variables)
- [Cluster Scaling](#cluster-scaling)
- [Steps to add a new postgres node](#steps-to-add-a-new-postgres-node)
- [Steps to add a new balancer node](#steps-to-add-a-new-balancer-node)
- [Restore and Cloning](#restore-and-cloning)
- [Create cluster with pgBackRest:](#create-cluster-with-pgbackrest)
- [Create cluster with WAL-G:](#create-cluster-with-wal-g)
- [Point-In-Time-Recovery:](#point-in-time-recovery)
- [Maintenance](#maintenance)
- [Changing PostgreSQL configuration parameters](#changing-postgresql-configuration-parameters)
- [Update the PostgreSQL HA Cluster](#update-the-postgresql-ha-cluster)
- [PostgreSQL major upgrade](#postgresql-major-upgrade)
- [Disaster Recovery](#disaster-recovery)
- [etcd](#etcd)
- [PostgreSQL (databases)](#postgresql-databases)
- [How to start from scratch](#how-to-start-from-scratch)
- [License](#license)
- [Author](#author)
- [Sponsor this project](#sponsor-this-project)
- [Feedback, bug-reports, requests, ...](#feedback-bug-reports-requests-)

## Cluster types
![postgresql_cluster](images/postgresql_cluster.png#gh-light-mode-only)
![postgresql_cluster](images/postgresql_cluster.dark_mode.png#gh-dark-mode-only)

You have three schemes available for deployment:

### [Type A] PostgreSQL High-Availability with HAProxy Load Balancing
![TypeA](images/TypeA.png)
#### 1. PostgreSQL High-Availability only

> To use this scheme, specify `with_haproxy_load_balancing: true` in variable file vars/main.yml
This is a simple scheme without load balancing (used by default).

This scheme provides the ability to distribute the load on reading. This also allows us to scale out the cluster (with read-only replicas).
##### Components of high availability:

- port 5000 (read / write) master
- port 5001 (read only) all replicas
- [**Patroni**](https://github.com/zalando/patroni) is a template for you to create your own customized, high-availability solution using Python and - for maximum accessibility - a distributed configuration store like ZooKeeper, etcd, Consul or Kubernetes. Used to automate the management of PostgreSQL instances and handle automatic failover.

###### If the variable `synchronous_mode` is `true` (vars/main.yml):
- port 5002 (read only) synchronous replica only
- port 5003 (read only) asynchronous replicas only
- [**etcd**](https://github.com/etcd-io/etcd) is a distributed reliable key-value store for the most critical data of a distributed system. etcd is written in Go and uses the [Raft](https://raft.github.io/) consensus algorithm to manage a highly-available replicated log. It is used by Patroni to store information about the status of the cluster and PostgreSQL configuration parameters.

> :heavy_exclamation_mark: Your application must support sending read requests to a custom port (e.g. 5001) and write requests (e.g. 5000).
[What is Distributed Consensus?](http://thesecretlivesofdata.com/raft/)

##### Components of high availability:
[**Patroni**](https://github.com/zalando/patroni) is a template for you to create your own customized, high-availability solution using Python and - for maximum accessibility - a distributed configuration store like ZooKeeper, etcd, Consul or Kubernetes. Used to automate the management of PostgreSQL instances and handle automatic failover.
"vip-manager" is used to provide a single entry point (VIP) for database access.

[**etcd**](https://github.com/etcd-io/etcd) is a distributed reliable key-value store for the most critical data of a distributed system. etcd is written in Go and uses the [Raft](https://raft.github.io/) consensus algorithm to manage a highly-available replicated log. It is used by Patroni to store information about the status of the cluster and PostgreSQL configuration parameters.
- [**vip-manager**](https://github.com/cybertec-postgresql/vip-manager) (_optional, if the `cluster_vip` variable is specified_) is a service that gets started on all cluster nodes and connects to the DCS. If the local node owns the leader-key, vip-manager starts the configured VIP. In case of a failover, vip-manager removes the VIP on the old leader and the corresponding service on the new leader starts it there.
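If a single VIP entry point is desired in this scheme, it is enabled simply by setting the variable that vip-manager keys off (the address below is a placeholder; choose a free IP in your cluster's subnet):

```yaml
# vars/main.yml — example value, adjust to your network
cluster_vip: "10.128.64.150"
```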

[What is Distributed Consensus?](http://thesecretlivesofdata.com/raft/)
- [**PgBouncer**](https://pgbouncer.github.io/features.html) (optional, if the `pgbouncer_install` variable is `true`) is a connection pooler for PostgreSQL.

##### Components of load balancing:
[**HAProxy**](http://www.haproxy.org/) is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications.
#### 2. PostgreSQL High-Availability with HAProxy Load Balancing

[**confd**](https://github.com/kelseyhightower/confd) manage local application configuration files using templates and data from etcd or consul. Used to automate HAProxy configuration file management.
To use this scheme, specify `with_haproxy_load_balancing: true` in the variable file vars/main.yml

[**Keepalived**](https://github.com/acassen/keepalived) provides a virtual high-available IP address (VIP) and single entry point for databases access.
Implementing VRRP (Virtual Router Redundancy Protocol) for Linux.
In our configuration keepalived checks the status of the HAProxy service and in case of a failure delegates the VIP to another server in the cluster.
This scheme provides the ability to distribute the load on reading. This also allows us to scale out the cluster (with read-only replicas).

[**PgBouncer**](https://pgbouncer.github.io/features.html) is a connection pooler for PostgreSQL.
- port 5000 (read / write) master
- port 5001 (read only) all replicas

###### If the variable `synchronous_mode` is `true` (vars/main.yml):
- port 5002 (read only) synchronous replica only
- port 5003 (read only) asynchronous replicas only

### [Type B] PostgreSQL High-Availability only
![TypeB](images/TypeB.png)
:heavy_exclamation_mark: Note: Your application must support sending read requests to a custom port 5001 and write requests to port 5000.
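Since port 5000 always targets the master and port 5001 the replicas, the application only needs to pick a port by query intent. Below is a minimal, hypothetical sketch of that routing rule (the `make_dsn` helper, host value, and database name are illustrative, not part of the playbook):

```python
# Port map from the HAProxy scheme above:
# 5000 -> master (read/write), 5001 -> all replicas (read-only).
PORTS = {"write": 5000, "read": 5001}

def make_dsn(host: str, readonly: bool = False) -> str:
    """Build a libpq-style DSN, choosing the port by query intent."""
    port = PORTS["read" if readonly else "write"]
    return f"host={host} port={port} dbname=mydb"

print(make_dsn("10.128.64.140"))                 # port=5000 -> master
print(make_dsn("10.128.64.140", readonly=True))  # port=5001 -> replicas
```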

This is a simple scheme without load balancing `Used by default`
##### Components of load balancing:

"vip-manager" is used to provide a single entry point (VIP) for database access, if the variable `cluster_vip` is specified (optional).
- [**HAProxy**](http://www.haproxy.org/) is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications.

[**vip-manager**](https://github.com/cybertec-postgresql/vip-manager) is a service that gets started on all cluster nodes and connects to the DCS. If the local node owns the leader-key, vip-manager starts the configured VIP. In case of a failover, vip-manager removes the VIP on the old leader and the corresponding service on the new leader starts it there. \
Written in Go. Cybertec Schönig & Schönig GmbH https://www.cybertec-postgresql.com
- [**confd**](https://github.com/kelseyhightower/confd) manages local application configuration files using templates and data from etcd or consul. Used to automate HAProxy configuration file management.

- [**Keepalived**](https://github.com/acassen/keepalived) (_optional, if the `cluster_vip` variable is specified_) provides a virtual high-available IP address (VIP) and a single entry point for database access.
It implements VRRP (Virtual Router Redundancy Protocol) for Linux. In our configuration, keepalived checks the status of the HAProxy service and, in case of a failure, delegates the VIP to another server in the cluster.

### [Type C] PostgreSQL High-Availability with Consul Service Discovery (DNS)
![TypeC](images/TypeC.png)
#### 3. PostgreSQL High-Availability with Consul Service Discovery (DNS)

> To use this scheme, specify `dcs_type: consul` in variable file vars/main.yml
To use this scheme, specify `dcs_type: consul` in the variable file vars/main.yml

This scheme is suitable for master-only access and for load balancing (using DNS) for reading across replicas. Consul [Service Discovery](https://developer.hashicorp.com/consul/docs/concepts/service-discovery) with [DNS resolving](https://developer.hashicorp.com/consul/docs/discovery/dns) is used as a client access point to the database.

@@ -120,13 +83,13 @@ Example: `replica.postgres-cluster.service.dc1.consul`, `replica.postgres-cluste

It requires the installation of Consul in client mode on each application server for service DNS resolution (or use [forward DNS](https://developer.hashicorp.com/consul/tutorials/networking/dns-forwarding?utm_source=docs) to the remote consul server instead of installing a local consul client).
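For the local-client option mentioned above, a minimal Consul agent configuration in client mode might look like the sketch below. The values are assumptions: `retry_join` lists the consul server IPs from this repository's sample inventory, and the datacenter name mirrors its `dc1` example.

```json
{
  "server": false,
  "datacenter": "dc1",
  "retry_join": ["10.128.64.140", "10.128.64.142", "10.128.64.143"]
}
```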

---

## Compatibility
RedHat and Debian based distros (x86_64)

###### Supported Linux Distributions:
- **Debian**: 10, 11, 12
- **Ubuntu**: 18.04, 20.04, 22.04
- **Ubuntu**: 20.04, 22.04
- **CentOS**: 7, 8
- **CentOS Stream**: 8, 9
- **Oracle Linux**: 7, 8, 9
@@ -218,15 +181,14 @@ If you’d prefer a cross-data center setup, where the replicating databases are
There are quite a few things to consider if you want to create a really robust etcd cluster, but there is one rule: *do not place all etcd members in your primary data center*. See some [examples](https://www.cybertec-postgresql.com/en/introduction-and-how-to-etcd-clusters-for-patroni/).


- **How to prevent data loss in case of autofailover (synchronous_modes and pg_rewind)**:
- **How to prevent data loss in case of autofailover (synchronous_modes)**:

For performance reasons, synchronous replication is disabled by default.

To minimize the risk of losing data on autofailover, you can configure settings in the following way:
- synchronous_mode: 'true'
- synchronous_mode_strict: 'true'
- synchronous_commit: 'on' (or 'remote_apply')
- use_pg_rewind: 'false' (enabled by default)
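Expressed as a vars/main.yml fragment, the loss-minimizing recommendation above would be (quoting style follows the repository's existing examples):

```yaml
# vars/main.yml — settings from the recommendation above
synchronous_mode: 'true'
synchronous_mode_strict: 'true'
synchronous_commit: 'on'   # or 'remote_apply'
```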

---

@@ -591,13 +553,6 @@ A new installation can now be made from scratch.

---

## License
Licensed under the MIT License. See the [LICENSE](./LICENSE) file for details.

## Author
Vitaliy Kukharik (PostgreSQL DBA) \
[email protected]

## Sponsor this project

Join our sponsorship program to directly contribute to our project's growth and gain exclusive access to personalized support. Your sponsorship is crucial for innovation and progress. Become a sponsor today!
@@ -614,7 +569,12 @@ Support our work through a crypto wallet:

USDT (TRC20): `TSTSXZzqDCUDHDjZwCpuBkdukjuDZspwjj`

---
## License
Licensed under the MIT License. See the [LICENSE](./LICENSE) file for details.

## Author
Vitaliy Kukharik (PostgreSQL DBA) \
[email protected]

## Feedback, bug-reports, requests, ...
Are [welcome](https://github.com/vitabaks/postgresql_cluster/issues)!
6 changes: 6 additions & 0 deletions add_balancer.yml
@@ -6,6 +6,12 @@
any_errors_fatal: true
gather_facts: true
pre_tasks:
- name: "Set variable: ansible_python_interpreter"
ansible.builtin.set_fact:
ansible_python_interpreter: "/usr/bin/env python3"
when: "'python3' not in (ansible_python_interpreter | default(''))"
tags: always

- name: Include main variables
ansible.builtin.include_vars: "vars/main.yml"
tags: always
6 changes: 6 additions & 0 deletions add_pgnode.yml
@@ -9,6 +9,12 @@
- ansible.builtin.import_tasks: roles/patroni/handlers/main.yml

pre_tasks:
- name: "Set variable: ansible_python_interpreter"
ansible.builtin.set_fact:
ansible_python_interpreter: "/usr/bin/env python3"
when: "'python3' not in (ansible_python_interpreter | default(''))"
tags: always

- name: Include main variables
ansible.builtin.include_vars: "vars/main.yml"
tags: always
6 changes: 6 additions & 0 deletions balancers.yml
@@ -9,6 +9,12 @@
vip_manager_disable: false # or 'true' for disable vip-manager service (if installed)

pre_tasks:
- name: "Set variable: ansible_python_interpreter"
ansible.builtin.set_fact:
ansible_python_interpreter: "/usr/bin/env python3"
when: "'python3' not in (ansible_python_interpreter | default(''))"
tags: always

- name: Include main variables
ansible.builtin.include_vars: "vars/main.yml"
tags: always
6 changes: 6 additions & 0 deletions config_pgcluster.yml
@@ -3,6 +3,12 @@
hosts: postgres_cluster
gather_facts: true
pre_tasks:
- name: "Set variable: ansible_python_interpreter"
ansible.builtin.set_fact:
ansible_python_interpreter: "/usr/bin/env python3"
when: "'python3' not in (ansible_python_interpreter | default(''))"
tags: always

- name: Include main variables
ansible.builtin.include_vars: "vars/main.yml"

6 changes: 6 additions & 0 deletions consul.yml
@@ -8,6 +8,12 @@
environment: "{{ proxy_env | default({}) }}"

pre_tasks:
- name: "Set variable: ansible_python_interpreter"
ansible.builtin.set_fact:
ansible_python_interpreter: "/usr/bin/env python3"
when: "'python3' not in (ansible_python_interpreter | default(''))"
tags: always

- name: Include main variables
ansible.builtin.include_vars: "vars/main.yml"
tags: always
6 changes: 6 additions & 0 deletions deploy_pgcluster.yml
@@ -10,6 +10,12 @@
environment: "{{ proxy_env | default({}) }}"

pre_tasks:
- name: "Set variable: ansible_python_interpreter"
ansible.builtin.set_fact:
ansible_python_interpreter: "/usr/bin/env python3"
when: "'python3' not in (ansible_python_interpreter | default(''))"
tags: always

- name: Include main variables
ansible.builtin.include_vars: "vars/main.yml"
tags: always
6 changes: 6 additions & 0 deletions etcd_cluster.yml
@@ -7,6 +7,12 @@
gather_facts: true

pre_tasks:
- name: "Set variable: ansible_python_interpreter"
ansible.builtin.set_fact:
ansible_python_interpreter: "/usr/bin/env python3"
when: "'python3' not in (ansible_python_interpreter | default(''))"
tags: always

- name: Include main variables
ansible.builtin.include_vars: "vars/main.yml"
tags: always
Binary file added images/postgresql_cluster.dark_mode.png
Binary file added images/postgresql_cluster.png
35 changes: 18 additions & 17 deletions inventory
@@ -2,13 +2,12 @@
# The cluster components will listen on the specified IP addresses.
# Attention! Specify private IP addresses so that the cluster does not listen on public IP addresses.
# For deploying via public IPs, add 'ansible_host=public_ip_address' variable for each node.

# "postgresql_exists='true'" if PostgreSQL is already exists and running
#
# "postgresql_exists=true" if PostgreSQL already exists and is running
# "hostname=" variable is optional (used to change the server name)
# "new_node=true" to add a new server to an existing cluster using the add_pgnode.yml playbook

# In this example, all components will be installed on PostgreSQL nodes.
# You can deploy the haproxy balancers and the etcd or consul cluster on other dedicated servers (recommended).
# patroni_tags="key=value" the Patroni tags in "key=value" format separated by commas.
# balancer_tags="key=value" the Balancer tags for the /replica, /sync, /async endpoints. Add the tag to the 'patroni_tags' variable first.

# if dcs_exists: false and dcs_type: "etcd"
[etcd_cluster] # recommendation: 3, or 5-7 nodes
@@ -21,24 +20,26 @@
10.128.64.140 consul_node_role=server consul_bootstrap_expect=true consul_datacenter=dc1
10.128.64.142 consul_node_role=server consul_bootstrap_expect=true consul_datacenter=dc1
10.128.64.143 consul_node_role=server consul_bootstrap_expect=true consul_datacenter=dc1
#10.128.64.144 consul_node_role=client consul_datacenter=dc1
#10.128.64.144 consul_node_role=client consul_datacenter=dc2
#10.128.64.145 consul_node_role=client consul_datacenter=dc2

# if with_haproxy_load_balancing: true
[balancers]
10.128.64.140
10.128.64.142
10.128.64.143
#10.128.64.144 new_node=true
10.128.64.140 # balancer_tags="datacenter=dc1"
10.128.64.142 # balancer_tags="datacenter=dc1"
10.128.64.143 # balancer_tags="datacenter=dc1"
#10.128.64.144 balancer_tags="datacenter=dc2"
#10.128.64.145 balancer_tags="datacenter=dc2" new_node=true

# PostgreSQL nodes
[master]
10.128.64.140 hostname=pgnode01 postgresql_exists=false
10.128.64.140 hostname=pgnode01 postgresql_exists=false # patroni_tags="datacenter=dc1"

[replica]
10.128.64.142 hostname=pgnode02 postgresql_exists=false
10.128.64.143 hostname=pgnode03 postgresql_exists=false
#10.128.64.144 hostname=pgnode04 postgresql_exists=false new_node=true
10.128.64.142 hostname=pgnode02 postgresql_exists=false # patroni_tags="datacenter=dc1"
10.128.64.143 hostname=pgnode03 postgresql_exists=false # patroni_tags="datacenter=dc1"
#10.128.64.144 hostname=pgnode04 postgresql_exists=false patroni_tags="datacenter=dc2"
#10.128.64.145 hostname=pgnode04 postgresql_exists=false patroni_tags="datacenter=dc2" new_node=true

[postgres_cluster:children]
master
Expand All @@ -55,9 +56,9 @@ ansible_ssh_port='22'
ansible_user='root'
ansible_ssh_pass='secretpassword' # the "sshpass" package is required to use "ansible_ssh_pass"
#ansible_ssh_private_key_file=
#ansible_python_interpreter='/usr/bin/python3' # required to use python3
ansible_python_interpreter='/usr/bin/env python3'

[pgbackrest:vars]
ansible_user='postgres'
ansible_ssh_pass='secretpassword'
#ansible_user='postgres'
#ansible_ssh_pass='secretpassword'
