Disaster Recovery setup in Postgres

High availability and disaster recovery capabilities are essential for any production-ready database system. PostgreSQL is reliable, but by itself it does not provide automatic failover, leader election, or cluster coordination. To achieve a fully automated DR environment, we can combine four mature open-source tools: Patroni, Etcd, HAProxy, and Docker.

This article explains how to build a three-node containerized PostgreSQL DR setup using these components. The architecture includes one primary PostgreSQL node, one replica node, and a third node dedicated to running Etcd and HAProxy. This design provides a simple but robust foundation for high availability, automatic failover, and transparent client routing.

Architecture Overview

The architecture consists of three separate servers (or virtual machines), each running containers for specific roles.

Node 1: Patroni + PostgreSQL (Primary)

This node runs:

  • Patroni to manage PostgreSQL

  • The PostgreSQL data directory

This node becomes the initial leader of the cluster during bootstrapping.

Node 2: Patroni + PostgreSQL (Replica)

This node runs:

  • Patroni

  • PostgreSQL as a hot standby replica

The replica continuously receives WAL data from the primary node and stays ready to take over when required.

Node 3: Etcd + HAProxy

This is the control plane of the cluster. It runs:

  • Etcd, which is used as the Distributed Configuration Store (DCS)

  • HAProxy, which acts as a single routing endpoint for client connections

Node 3 does not store database data. Its responsibility is cluster coordination and routing.

Role of Each Component

Docker

Docker provides a clean, reproducible, and isolated runtime environment for each service. Because Patroni, Etcd, PostgreSQL, and HAProxy run inside containers, the setup is easy to deploy, rebuild, scale, or migrate.

Etcd

Etcd is a strongly consistent distributed key-value store. Patroni uses it to maintain cluster metadata, including:

  • Leader lock (who is the primary)

  • Health information of each node

  • Failover coordination

  • Heartbeat and status information

Etcd is central to ensuring that only one node acts as the primary at a time, which prevents data corruption.
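To see what Patroni actually stores, you can list the keys it writes to Etcd. The sketch below assumes the cluster scope pg-cluster used later in this article and Patroni's default /service/<scope> key prefix in the etcd v3 keyspace:

# List the cluster keys Patroni maintains (leader, members, config, ...)
docker exec etcd etcdctl get --prefix /service/pg-cluster/ --keys-only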

Patroni

Patroni provides the automation layer for PostgreSQL high availability. It handles:

  • Initial cluster bootstrap

  • Leader election

  • Replication configuration

  • Automatic failover and promotion of replicas

  • Continuous health monitoring

  • Integration with Etcd

Patroni exposes a REST API on each node, which HAProxy queries to determine which node is currently the primary.
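You can query this API directly to see exactly what HAProxy sees. As a quick check (assuming the REST API is reachable on port 8008, as configured later in this article), the /master endpoint returns HTTP 200 only on the current leader and 503 everywhere else:

# Returns 200 on the current primary, 503 on replicas
curl -s -o /dev/null -w "%{http_code}\n" http://<node1-ip>:8008/master

# Full member status (role, state, timeline) as JSON
curl -s http://<node1-ip>:8008/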

PostgreSQL

PostgreSQL runs inside each Patroni container. On the primary node it accepts read/write traffic, while the replica node uses streaming replication to continuously synchronize its data directory.
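Standard PostgreSQL views let you confirm that replication is healthy once both nodes are running. A minimal check, assuming local connections inside the containers are trusted (otherwise add -h localhost and the password from the Patroni configuration):

# On the primary: one row per connected standby
docker exec -it patroni-primary psql -U postgres -c "SELECT client_addr, state, sync_state FROM pg_stat_replication;"

# On the replica: returns 't' while the node is a standby
docker exec -it patroni-replica psql -U postgres -c "SELECT pg_is_in_recovery();"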

HAProxy

HAProxy acts as a reverse proxy and load balancer for PostgreSQL connections. Instead of pointing applications directly at a particular PostgreSQL server, all connections go through HAProxy. HAProxy uses Patroni’s REST API to determine which node is currently the primary and routes traffic accordingly. During failover, connections automatically shift to the new primary with no application-side configuration changes.
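From the application's point of view, the database endpoint is simply HAProxy's address and port (port 5000 in the configuration used later in this article). For example, with psql and the demo credentials from the Patroni configuration:

# Applications connect to HAProxy, never to an individual PostgreSQL node
psql "host=<node3-ip> port=5000 user=postgres password=postgres dbname=postgres"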

Cluster Workflow

The cluster operates as follows:

  1. Etcd starts on Node 3 and listens on port 2379.

  2. Patroni on Node 1 and Node 2 both connect to Etcd.

  3. Patroni performs leader election. Node 1 becomes the primary, and Node 2 registers as a replica.

  4. Patroni configures PostgreSQL streaming replication automatically.

  5. HAProxy, also on Node 3, queries the Patroni REST API on each node to identify the current primary.

  6. All client connections are directed to HAProxy, which forwards the traffic to the current primary.

  7. If Node 1 fails, Patroni promotes Node 2 to primary. Etcd records this change.

  8. HAProxy begins routing traffic to Node 2 without requiring any manual intervention.

This results in a fully automated failover setup.
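While the cluster is running, the same patronictl command used in the failover test below gives a live view of which node currently holds the leader role:

# Refresh the cluster topology every two seconds
watch -n 2 "docker exec patroni-primary patronictl list"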

Step-by-Step Deployment

Below is a condensed version of how the cluster is deployed using Docker.

1. Create a Docker Network on Each Node

docker network create patroni-net

A user-defined bridge network lets the containers on each node reach one another by container name.
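You can confirm that the network exists on each node before starting any containers:

# Verify the user-defined network was created
docker network inspect patroni-net --format '{{.Name}}: {{.Driver}}'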

2. Deploy Etcd on Node 3

Create a directory and add the following docker-compose file:

version: "3.8"

services:
  etcd:
    image: quay.io/coreos/etcd:v3.5.9
    container_name: etcd
    command: >
      etcd --name etcd0
      --data-dir /etcd-data
      --advertise-client-urls http://<node3-ip>:2379
      --listen-client-urls http://0.0.0.0:2379
    ports:
      - "2379:2379"
    networks:
      - patroni-net

networks:
  patroni-net:
    external: true

Start Etcd:

docker compose up -d

Verify that Etcd is reachable:

curl http://<node3-ip>:2379/version
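A healthy single-node Etcd responds with a small JSON document similar to the following (the exact versions may differ):

{"etcdserver":"3.5.9","etcdcluster":"3.5.0"}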

3. Deploy Patroni + PostgreSQL on Node 1 (Primary)

Create a patroni-primary.yml configuration file:

scope: pg-cluster
name: patroni-primary

restapi:
  listen: 0.0.0.0:8008
  connect_address: patroni-primary:8008

etcd3:
  host: <node3-ip>:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
  initdb:
    - encoding: UTF8
    - data-checksums

postgresql:
  listen: 0.0.0.0:5432
  connect_address: patroni-primary:5432
  data_dir: /data/pgdata
  authentication:
    superuser:
      username: postgres
      password: postgres
    replication:
      username: replicator
      password: replicator

Create a docker-compose file:

version: "3.8"

services:
  primary:
    image: ghcr.io/zalando/spilo-16:3.3-p1
    container_name: patroni-primary
    volumes:
      - ./patroni-primary.yml:/config/patroni.yml
    networks:
      - patroni-net
    ports:
      - "5432:5432"
      - "8008:8008"

networks:
  patroni-net:
    external: true

Start the container:

docker compose up -d
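Before moving on, check that Patroni bootstrapped the cluster and took the leader lock. A quick sketch, assuming the REST API is published on port 8008 as in the compose file above; the JSON output should report the node's role as master:

# Follow Patroni's bootstrap progress in the container logs
docker logs -f patroni-primary

# The node should report itself as the leader via its REST API
curl -s http://<node1-ip>:8008/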

4. Deploy Patroni + PostgreSQL on Node 2 (Replica)

Create a patroni-replica.yml file similar to the primary, with the name updated:

scope: pg-cluster
name: patroni-replica

restapi:
  listen: 0.0.0.0:8008
  connect_address: patroni-replica:8008

etcd3:
  host: <node3-ip>:2379

postgresql:
  listen: 0.0.0.0:5432
  connect_address: patroni-replica:5432
  data_dir: /data/pgdata
  authentication:
    superuser:
      username: postgres
      password: postgres
    replication:
      username: replicator
      password: replicator

Create a docker-compose file like the one used for Node 1 (with the container name and mounted configuration file changed to patroni-replica), then start Node 2:

docker compose up -d
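After a short delay the replica should appear next to the primary in the cluster view. The same patronictl command used later in the failover test works here as well:

# Node 1 should be listed as Leader and Node 2 as Replica
docker exec -it patroni-primary patronictl list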

5. Deploy HAProxy on Node 3

Create the HAProxy configuration file:

defaults
    mode tcp
    timeout connect 5s
    timeout client 30m
    timeout server 30m

frontend postgresql
    bind *:5000
    mode tcp
    default_backend patroni_nodes

backend patroni_nodes
    mode tcp
    option tcp-check
    tcp-check connect
    tcp-check send "GET /master HTTP/1.1\r\nHost: localhost\r\n\r\n"
    tcp-check expect rstring "\"role\": \"master\""
    server primary <node1-ip>:5432 check port 8008
    server replica <node2-ip>:5432 check port 8008

Run HAProxy:

docker run -d --name haproxy \
  -v $(pwd)/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg \
  --network patroni-net \
  -p 5000:5000 haproxy
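To confirm that HAProxy really routes to the primary, connect through port 5000 and check whether the server you land on is in recovery; against the current leader this returns 'f':

# A connection through HAProxy should always reach the read/write primary
psql -h <node3-ip> -p 5000 -U postgres -c "SELECT pg_is_in_recovery();"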

Failover Testing

  1. Confirm that Node 1 is the leader:

docker exec -it patroni-primary patronictl list

  2. Stop Node 1:

docker stop patroni-primary

Patroni on Node 2 will promote itself as the new primary. Etcd records the leadership change and HAProxy automatically reroutes traffic.

  3. Restart Node 1:

docker start patroni-primary

Node 1 re-enters the cluster as a replica.
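The role change can be confirmed with the same cluster listing as before; Node 1 should now appear as a replica while Node 2 keeps the leader role:

# Node 2 remains Leader, Node 1 has rejoined as Replica
docker exec -it patroni-replica patronictl list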

Conclusion

This three-node architecture delivers a fully automated PostgreSQL DR solution with transparent failover and a dedicated configuration and routing layer. By leveraging Patroni, Etcd, HAProxy, and Docker, the environment becomes both powerful and portable.


Lochan R
