deterministic sequence: baseline monitoring β memory allocation β I/O and checkpoint behavior β connection management β query and index optimization. Deviating from this order causes symptom-chasing and configuration drift.
Step 1: Establish Baseline Monitoring
Enable pg_stat_statements before making changes. Without baseline metrics, tuning is guesswork.
-- Enable extension (requires superuser)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Track all statements including those from extensions
ALTER SYSTEM SET pg_stat_statements.track = 'all';
SELECT pg_reload_conf();
Query the top resource consumers:
SELECT query, calls, total_exec_time, mean_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
Step 2: Memory Allocation
PostgreSQL uses shared memory for buffer management and per-session memory for sorting/hashing. Misallocation causes swapping or cache thrashing.
shared_buffers: Cache for frequently accessed data pages. Set to 25% of system RAM. Do not exceed 30%; Linux page cache handles OS-level caching efficiently.
effective_cache_size: Estimate of OS + PostgreSQL cache available for query planning. Set to 75% of RAM. Guides the planner to prefer index scans over sequential scans when data is likely cached.
work_mem: Memory per operation (sort, hash join, materialization). Set conservatively: work_mem = (RAM * 0.25) / max_connections. Overallocation triggers swap under concurrent sorts.
Step 3: WAL and Checkpoint Tuning
Write-Ahead Log configuration dictates write throughput and crash recovery time.
wal_buffers: Set to -1 (auto-manages to 1/32 of shared_buffers).
max_wal_size: Increase to 4GB for write-heavy workloads. Reduces checkpoint frequency.
checkpoint_completion_target: Set to 0.9. Spreads checkpoint I/O over 90% of the checkpoint interval, preventing I/O spikes.
Step 4: Connection Management
PostgreSQL forks a process per connection. Default max_connections = 100 collapses under modern async runtimes. Use connection pooling instead of increasing max_connections.
// TypeScript: PGBouncer connection routing
import { Pool } from 'pg';
const pool = new Pool({
host: 'pgbouncer.internal',
port: 6432,
database: 'production_db',
user: 'app_user',
password: process.env.DB_PASSWORD,
max: 50, // PGBouncer handles multiplexing
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 2000,
});
export const query = async (text: string, params?: any[]) => {
const client = await pool.connect();
try {
const res = await client.query(text, params);
return res;
} finally {
client.release();
}
};
Step 5: Query and Index Optimization
Use EXPLAIN (ANALYZE, BUFFERS) to validate execution plans. Focus on:
- Covering indexes: Include all queried columns to avoid heap fetches.
- Partial indexes: Filter on high-selectivity predicates (
WHERE status = 'active').
- Expression indexes: Precompute
LOWER(email) or date_trunc('day', created_at).
-- Covering index example
CREATE INDEX idx_orders_customer_status_covering
ON orders (customer_id, status) INCLUDE (total_amount, created_at);
-- Validate execution
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT customer_id, total_amount
FROM orders
WHERE status = 'active' AND customer_id = 48291;
Architecture rationale: Memory configuration resolves 60% of latency issues. WAL/checkpoint tuning resolves 25% of write bottlenecks. Connection pooling prevents process overhead. Index optimization handles the remaining 15%. This sequence prevents over-indexing before memory is sized, and avoids connection exhaustion before WAL is tuned.
Pitfall Guide
- Setting
shared_buffers to 50% of RAM: Linux page cache already caches frequently accessed files. Doubling cache layers increases memory pressure without improving hit rates. Stick to 25%.
- Over-allocating
work_mem: work_mem is per-operation, not per-connection. A query with two sorts and a hash join consumes work_mem * 3. Under 200 concurrent connections, this triggers swap and kills performance. Calculate conservatively.
- Ignoring
autovacuum thresholds: Dead tuples accumulate, causing table bloat and index degradation. Default thresholds trigger too late for high-write tables. Adjust autovacuum_vacuum_scale_factor and autovacuum_analyze_scale_factor per table.
- Indexing every filtered column: Indexes slow writes, increase WAL volume, and require maintenance. Create indexes only for columns appearing in
WHERE, JOIN, or ORDER BY clauses with selectivity > 10%.
- Tuning without load testing: Changing parameters in isolation provides no validation. Use
pgbench or k6 with production-like query distributions. Measure before/after.
- Relying on ORM-generated queries: ORMs emit
SELECT *, missing LIMIT, and implicit cross joins. Validate every generated query with EXPLAIN (ANALYZE). Add .select() and .where() constraints explicitly.
- Disabling
fsync for performance: fsync = off eliminates crash safety. Data loss on power failure or kernel panic is guaranteed. Use synchronous_commit = off for non-critical writes instead, if durability trade-offs are acceptable.
Best practices from production:
- Use
pgtune as a starting baseline, not a final configuration.
- Monitor
pg_stat_bgwriter for checkpoint frequency and pg_stat_io for I/O patterns.
- Right-size
work_mem based on actual sort/hash operations reported in pg_stat_statements.
- Schedule
VACUUM FULL only during maintenance windows; prefer autovacuum tuning for continuous operation.
- Index bloat detection: query
pg_stat_user_tables and rebuild indexes when n_dead_tup > n_live_tup * 0.2.
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|
| Read-heavy analytics (70%+ SELECT) | Increase effective_cache_size, add covering indexes, enable shared_preload_libraries = 'pg_stat_statements' | Maximizes buffer hit ratio; reduces disk I/O for repeated scans | Defers read replica provisioning by 3β4x |
| Write-heavy transactional (high INSERT/UPDATE) | Tune max_wal_size, set checkpoint_completion_target = 0.9, increase maintenance_work_mem | Reduces checkpoint I/O spikes; accelerates index builds and vacuuming | Lowers provisioned IOPS costs by 30β50% |
| Mixed OLTP with 100+ concurrent connections | Deploy PGBouncer, reduce max_connections to 50, right-size work_mem | Prevents process-fork overhead; avoids memory swapping under concurrency | Eliminates need for vertical scaling until 5x traffic growth |
| Legacy ORM application with unoptimized queries | Enable log_min_duration_statement = 200, audit slow queries, add targeted indexes | Identifies ORM-generated full table scans; provides low-effort optimization path | Reduces cloud database tier costs by 20β40% |
Configuration Template
# postgresql.conf - Production OLTP Baseline (32GB RAM, 8 vCPU)
# Memory
shared_buffers = '8GB'
effective_cache_size = '24GB'
work_mem = '16MB'
maintenance_work_mem = '1GB'
# WAL & Checkpoints
wal_buffers = '-1'
max_wal_size = '4GB'
checkpoint_completion_target = '0.9'
checkpoint_timeout = '15min'
# Connections & Logging
max_connections = '100'
log_min_duration_statement = '200'
log_checkpoints = 'on'
log_connections = 'off'
log_disconnections = 'off'
# Query Planner
random_page_cost = '1.1'
effective_io_concurrency = '200'
default_statistics_target = '100'
# Autovacuum
autovacuum_max_workers = '3'
autovacuum_naptime = '30s'
track_activities = 'on'
track_counts = 'on'
track_io_timing = 'on'
# Extensions
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.track = 'all'
Quick Start Guide
- Snapshot current state: Run
SELECT * FROM pg_stat_bgwriter; and SELECT * FROM pg_stat_io; to capture baseline I/O and checkpoint behavior.
- Apply configuration: Replace
postgresql.conf parameters with the Production Template matching your RAM/CPU. Execute SELECT pg_reload_conf(); to apply without downtime.
- Deploy connection pooler: Run PGBouncer in transaction mode (
pool_mode = transaction), point application connection strings to PGBouncer port 6432, and set application pool max to 30.
- Validate with load: Execute
pgbench -c 50 -j 4 -T 300 -f production_query.sql to simulate production concurrency. Compare p95 latency and TPS against baseline.
- Index critical paths: Query
pg_stat_statements for top 5 slow queries. Add covering or partial indexes where EXPLAIN (ANALYZE, BUFFERS) shows high Heap Fetches or Seq Scan. Re-run load test to confirm improvement.