nual intervention to policy enforcement, which scales with infrastructure-as-code and automated routing.
Core Solution
Replication setup requires four coordinated layers: database configuration, network isolation, application routing, and validation pipeline. The following implementation uses PostgreSQL as the reference architecture, but the patterns apply to MySQL, MariaDB, and cloud-native managed databases.
Step 1: Primary Configuration
Configure the primary to expose WAL (Write-Ahead Log) streams, reserve replication slots, and isolate replication traffic.
-- postgresql.conf
wal_level = replica
max_wal_senders = 10
max_replication_slots = 10
wal_keep_size = 2GB
synchronous_commit = on
synchronous_standby_names = 'replica1'
Create a dedicated replication slot to prevent WAL removal before the replica consumes it:
SELECT pg_create_physical_replication_slot('replica1_slot');
Step 2: Replica Provisioning
Initialize the replica using pg_basebackup, configure streaming replication, and enforce read-only mode.
pg_basebackup -h primary.internal -U repl_user -D /var/lib/postgresql/data \
--wal-method=stream --checkpoint=fast --slot=replica1_slot
On the replica, create standby.signal and configure connection to primary:
# postgresql.conf (replica)
primary_conninfo = 'host=primary.internal port=5432 user=repl_user password=secret application_name=replica1'
restore_command = 'cp /var/lib/postgresql/wal_archive/%f %p'
recovery_target_timeline = 'latest'
Step 3: Application Routing & Consistency Boundaries
Replicas must not be exposed blindly to read traffic. Implement lag-aware routing that excludes replicas exceeding the RPO threshold.
TypeScript validation router:
import { Pool, PoolClient } from 'pg';
interface ReplicaHealth {
instance: string;
lagBytes: number;
isReady: boolean;
}
export class ReplicationRouter {
private readonly MAX_LAG_BYTES = 10 * 1024 * 1024; // 10MB
private pools: Map<string, Pool> = new Map();
constructor(private replicas: string[]) {
this.replicas.forEach(host => {
this.pools.set(host, new Pool({ host, user: 'app_read', database: 'app_db' }));
});
}
async getHealthyReplica(): Promise<PoolClient | null> {
const healthChecks = await Promise.all(
this.replicas.map(async host => {
const pool = this.pools.get(host)!;
try {
const res = await pool.query(`
SELECT
pg_wal_lsn_diff(pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn()) AS lag_bytes
`);
const lagBytes = Number(res.rows[0].lag_bytes);
return { instance: host, lagBytes, isReady: lagBytes <= this.MAX_LAG_BYTES };
} catch {
return { instance: host, lagBytes: Infinity, isReady: false };
}
})
);
const healthy = healthChecks.find(r => r.isReady);
if (!healthy) return null;
return this.pools.get(healthy.instance)!.connect();
}
}
Step 4: Monitoring & Validation Pipeline
Lag metrics alone are insufficient. Track WAL generation rate, checkpoint duration, and replication slot retention age. Export metrics to Prometheus via pg_stat_replication, pg_stat_archiver, and pg_replication_slots. Alert on:
- Slot age > 2 hours
- WAL send queue depth > 100MB
- Replica replay pause > 5 seconds
- Synchronous commit timeout spikes
Architecture decisions:
- Replication network isolation: Route replication traffic over a dedicated VPC subnet or interface to prevent client query contention from starving WAL shipping.
- Slot management: Use
pg_replication_slots with automated cleanup on replica decommission. Never drop slots manually in production.
- Failover governance: Implement automated promotion only after quorum verification and LSN alignment. Use tools like Patroni, MHA, or cloud-native failover controllers.
- Read consistency boundaries: Define explicit RPO/RTO SLAs per workload. Financial reads require strict lag thresholds; analytics workloads tolerate async drift.
Pitfall Guide
-
Treating replication lag as a static threshold instead of a time-series signal
Lag spikes during checkpoint stalls or long-running transactions. Alert on sustained lag (>30s) and rate-of-change, not instantaneous values. Use exponential moving averages to filter noise.
-
Not reserving replication slots
Without slots, the primary removes WAL segments once they're archived, breaking replication after brief downtime. Always create slots and monitor pg_replication_slots.active. Drop slots only after replica decommissioning.
-
Mixing backup and replication strategies
Replication is not backup. It propagates corruption, drops, and logical errors instantly. Maintain separate point-in-time recovery (PITR) with WAL archiving and periodic full snapshots. Never rely on replicas for disaster recovery without validation.
-
Ignoring transaction ID wraparound on long-running replicas
PostgreSQL uses 32-bit XIDs. Replicas that fall behind for months may hit wraparound, forcing a vacuum freeze or shutdown. Monitor age(datfrozenxid) and schedule aggressive autovacuum on replicas if they serve read traffic.
-
Failover without quorum verification
Promoting a replica during a network partition creates split-brain. Require quorum checks, LSN alignment, and client disconnect before promotion. Use fencing mechanisms (fence_agent, pg_rewind) to prevent dual-primary scenarios.
-
Hardcoding replica endpoints
Replicas scale, fail, and rotate. Use service discovery (Consul, Kubernetes Endpoints, or DNS SRV) with health-aware routing. Hardcoded IPs break during scaling events and increase failover time.
-
Skipping post-promotion consistency validation
After promotion, verify pg_is_in_recovery() returns false, check for unapplied WAL, and run schema/data checksums. Stale promotions leave orphaned connections and inconsistent indexes.
Best practices from production:
- Implement automated
pg_verify_checksums on replicas weekly
- Use connection poolers (PgBouncer, ProxySQL) with lag-aware routing
- Run chaos tests simulating network partitions and WAL sender crashes
- Enforce RPO/RTO in CI/CD with synthetic failover drills
- Document replication topology as code (Terraform, Ansible, or GitOps)
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|
| Low-latency read scaling | Asynchronous + lag-aware routing | Maximizes throughput; lag filtered at application layer | Low infrastructure, moderate dev effort |
| Zero-downtime failover (RPO=0) | Synchronous replication + automated promotion | Guarantees no data loss; requires low-latency network | High network cost, strict topology constraints |
| Cross-region disaster recovery | Semi-synchronous + WAL archiving + async read replica | Balances RPO with cross-AZ latency; archiving enables PITR | Moderate infrastructure, higher storage costs |
| Cost-optimized analytics | Asynchronous logical replication to separate cluster | Isolates analytical load; tolerates minutes of lag | Low primary impact, separate compute costs |
Configuration Template
Primary (postgresql.conf)
wal_level = replica
max_wal_senders = 10
max_replication_slots = 10
wal_keep_size = 2GB
synchronous_commit = on
synchronous_standby_names = 'replica1'
listen_addresses = '*'
Primary (pg_hba.conf)
host replication repl_user 10.0.0.0/8 md5
host all all 10.0.0.0/8 md5
Replica (postgresql.conf)
primary_conninfo = 'host=primary.internal port=5432 user=repl_user password=secret application_name=replica1'
restore_command = 'cp /var/lib/postgresql/wal_archive/%f %p'
recovery_target_timeline = 'latest'
hot_standby = on
Replica initialization
touch /var/lib/postgresql/data/standby.signal
pg_ctl -D /var/lib/postgresql/data start
Slot creation (SQL)
SELECT pg_create_physical_replication_slot('replica1_slot');
Quick Start Guide
- Spin up primary container:
docker run --name pg-primary -e POSTGRES_PASSWORD=secret -p 5432:5432 -d postgres:16-alpine
- Create replication user and slot:
docker exec pg-primary psql -U postgres -c "CREATE ROLE repl_user WITH REPLICATION LOGIN PASSWORD 'secret'; SELECT pg_create_physical_replication_slot('repl_slot');"
- Initialize replica:
docker run --name pg-replica -e POSTGRES_PASSWORD=secret -v pgdata-replica:/var/lib/postgresql/data -d postgres:16-alpine then copy base backup: pg_basebackup -h 127.0.0.1 -U repl_user -D /var/lib/postgresql/data --wal-method=stream --checkpoint=fast --slot=repl_slot
- Start replica in standby mode:
touch /var/lib/postgresql/data/standby.signal, set primary_conninfo in postgresql.conf, then pg_ctl -D /var/lib/postgresql/data start. Verify with SELECT * FROM pg_stat_replication; on primary.
Total setup time: ~4 minutes. Validation: replica streams WAL, lag stays under 50ms under normal load, and read routing excludes the replica if lag exceeds threshold.