s a systematic pipeline: measure, analyze, restructure, and validate. The following implementation targets PostgreSQL as the reference engine, but the principles apply to MySQL, MariaDB, and compatible cloud databases.
Step 1: Establish Baseline Measurement
Enable query logging and statistics collection before making changes. Blind optimization introduces regressions.
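# Enable slow query logging (postgresql.conf)
log_min_duration_statement = 200   # milliseconds
log_statement = 'none'
log_duration = off
# pg_stat_statements must be preloaded; changing this setting requires a restart
shared_preload_libraries = 'pg_stat_statements'
-- Then install the extension in each database you want to monitor
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;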
Query the top consumers:
SELECT query, calls, total_exec_time, mean_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
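To decide where to start, it can help to express each statement's cost as a share of the total. The following is a minimal sketch, not part of pg_stat_statements itself: `rankByTimeShare` and the sample rows are hypothetical, and it assumes the result rows have already been converted to numbers (node-postgres returns `bigint` columns such as `calls` as strings by default).

```typescript
// Shape of the columns selected from pg_stat_statements in the query above.
interface StatementStat {
  query: string;
  calls: number;
  total_exec_time: number; // milliseconds
  mean_exec_time: number; // milliseconds
  rows: number;
}

interface RankedStatement extends StatementStat {
  share: number; // fraction of the summed total_exec_time, between 0 and 1
}

// Hypothetical helper: annotate each statement with its share of total
// execution time and sort from most to least expensive.
function rankByTimeShare(stats: StatementStat[]): RankedStatement[] {
  const total = stats.reduce((sum, s) => sum + s.total_exec_time, 0);
  return stats
    .map((s) => ({ ...s, share: total > 0 ? s.total_exec_time / total : 0 }))
    .sort((a, b) => b.total_exec_time - a.total_exec_time);
}

// Sample rows are invented for illustration only.
const ranked = rankByTimeShare([
  { query: "SELECT ... FROM orders", calls: 100, total_exec_time: 250, mean_exec_time: 2.5, rows: 100 },
  { query: "SELECT ... FROM transactions", calls: 10, total_exec_time: 750, mean_exec_time: 75, rows: 5000 },
]);
console.log(ranked.map((r) => `${(r.share * 100).toFixed(1)}%  ${r.query}`));
```

A statement with few calls but a large share is usually a better first target than a cheap statement that runs often.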
Step 2: Analyze Execution Plans
Run EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) on target queries. Focus on:
- Seq Scan vs Index Scan
- Hash Join vs Nested Loop vs Merge Join
- Sort operations spilling to disk
- Rows Removed by Filter ratios
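Some of these signals can be picked out of the text plan mechanically. The sketch below is one possible approach, assuming `FORMAT TEXT` output; `scanPlan`, `wasteThreshold`, and the sample plan are hypothetical, and join-strategy choices are left for manual review because they depend on context.

```typescript
interface PlanFinding {
  kind: "seq_scan" | "disk_sort" | "filter_waste";
  detail: string;
}

// Scan EXPLAIN (ANALYZE, FORMAT TEXT) output line by line.
// wasteThreshold is an assumed cut-off: flag a filter that discards more
// than this fraction of the rows its plan node read.
function scanPlan(planText: string, wasteThreshold: number = 0.9): PlanFinding[] {
  const findings: PlanFinding[] = [];
  let lastActualRows: number | null = null; // rows emitted by the most recent node

  for (const raw of planText.split("\n")) {
    const line = raw.trim();

    const actual = /\(actual(?: time=[\d.]+\.\.[\d.]+)? rows=(\d+)/.exec(line);
    if (actual) lastActualRows = Number(actual[1]);

    const seq = /Seq Scan on (\S+)/.exec(line);
    if (seq) findings.push({ kind: "seq_scan", detail: `sequential scan on ${seq[1]}` });

    const disk = /Sort Method: external (?:merge|sort)\s+Disk: (\d+)kB/.exec(line);
    if (disk) findings.push({ kind: "disk_sort", detail: `sort spilled ${disk[1]}kB to disk` });

    const removed = /Rows Removed by Filter: (\d+)/.exec(line);
    if (removed && lastActualRows !== null) {
      const discarded = Number(removed[1]);
      const read = discarded + lastActualRows;
      if (read > 0 && discarded / read > wasteThreshold) {
        findings.push({ kind: "filter_waste", detail: `filter discarded ${discarded} of ${read} rows` });
      }
    }
  }
  return findings;
}

// Invented sample plan, shaped like real EXPLAIN ANALYZE text output.
const samplePlan = [
  "Sort  (cost=25000.00..25120.00 rows=48000 width=28) (actual time=310.100..352.300 rows=50000 loops=1)",
  "  Sort Key: created_at DESC",
  "  Sort Method: external merge  Disk: 10240kB",
  "  ->  Seq Scan on transactions  (cost=0.00..20410.00 rows=48000 width=28) (actual time=0.015..145.210 rows=50000 loops=1)",
  "        Filter: (status = 'completed'::text)",
  "        Rows Removed by Filter: 950000",
].join("\n");

console.log(scanPlan(samplePlan));
```

A sequential scan is not always a problem. On small tables, or when most rows match, it is often the cheapest plan, so treat the findings as prompts for review.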
Step 3: Index Strategy & Composite Ordering
Indexes are not free. They increase write amplification and storage. Apply them surgically based on query patterns.
Composite index column order follows this rule:
- Equality filters first
- Range filters second
- Order-by columns third (if matching sort direction)
-- Unoptimized: frequent query filters on status, date range, and sorts by created_at
SELECT id, user_id, amount, status
FROM transactions
WHERE status = 'completed'
AND created_at >= '2024-01-01' AND created_at < '2024-04-01' -- half-open range: covers all of March 31 when created_at is a timestamp
ORDER BY created_at DESC;
-- Optimized index: equality → range → sort alignment
CREATE INDEX idx_transactions_status_created
ON transactions (status, created_at DESC);
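The ordering rule can be sketched as a small helper that sorts columns by role before emitting DDL. This is an illustration under stated assumptions: `buildCompositeIndex`, `ColumnRole`, and the generated index name are hypothetical, and identifiers are interpolated without quoting, so only trusted, hard-coded names should be passed in.

```typescript
type ColumnRole = "equality" | "range" | "sort";

interface IndexColumn {
  name: string;
  role: ColumnRole;
  direction?: "ASC" | "DESC"; // optional sort direction for the index column
}

const ROLE_ORDER: Record<ColumnRole, number> = { equality: 0, range: 1, sort: 2 };

// Hypothetical helper: order columns equality -> range -> sort, drop
// duplicates (a column is often both the range and the sort key), and
// emit a CREATE INDEX statement. Identifiers are NOT escaped.
function buildCompositeIndex(table: string, columns: IndexColumn[]): string {
  const ordered = columns
    .map((c, i) => ({ c, i }))
    .sort((a, b) => ROLE_ORDER[a.c.role] - ROLE_ORDER[b.c.role] || a.i - b.i)
    .map((x) => x.c);

  const seen: Record<string, boolean> = {};
  const unique = ordered.filter((c) => {
    if (seen[c.name]) return false;
    seen[c.name] = true;
    return true;
  });

  const cols = unique.map((c) => (c.direction === "DESC" ? `${c.name} DESC` : c.name));
  const indexName = `idx_${table}_${unique.map((c) => c.name).join("_")}`;
  return `CREATE INDEX ${indexName} ON ${table} (${cols.join(", ")});`;
}

// The transactions query above: status is an equality filter, created_at
// is both the range filter and the sort key.
const ddl = buildCompositeIndex("transactions", [
  { name: "created_at", role: "range", direction: "DESC" },
  { name: "status", role: "equality" },
  { name: "created_at", role: "sort", direction: "DESC" },
]);
console.log(ddl);
```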
Step 4: Query Rewriting Patterns
Replace anti-patterns with planner-friendly structures.
Before (N+1 + implicit SELECT *):
import { Pool } from 'pg';
const pool = new Pool({ connectionString: process.env.DATABASE_URL });
async function getUserOrders(userId: string) {
  const orders = await pool.query(
    `SELECT * FROM orders WHERE user_id = $1`, [userId]
  );
  // N+1 anti-pattern: one extra query per order
  const enriched = await Promise.all(
    orders.rows.map(async (order) => {
      const items = await pool.query(
        `SELECT * FROM order_items WHERE order_id = $1`, [order.id]
      );
      return { ...order, items: items.rows };
    })
  );
  return enriched;
}
After (Single query with JOIN + covering columns + explicit typing):
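import { Pool, QueryResult } from 'pg';
const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20,
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
});
// One row per (order, item) pair. Item columns are NULL for orders without items.
interface OrderRow {
  id: string;
  user_id: string;
  total: number; // NUMERIC columns arrive as strings unless a type parser is registered
  item_id: string | null;
  sku: string | null;
  quantity: number | null;
  price: number | null;
}
interface OrderItem {
  id: string;
  sku: string;
  quantity: number;
  price: number;
}
interface OrderWithItems {
  id: string;
  user_id: string;
  total: number;
  items: OrderItem[];
}
async function getUserOrdersOptimized(userId: string): Promise<OrderWithItems[]> {
  const res: QueryResult<OrderRow> = await pool.query(
    `SELECT
       o.id, o.user_id, o.total,
       oi.id AS item_id, oi.sku, oi.quantity, oi.price
     FROM orders o
     LEFT JOIN order_items oi ON oi.order_id = o.id
     WHERE o.user_id = $1
     ORDER BY o.created_at DESC`,
    [userId]
  );
  // Group in the application layer (cheaper than repeated round-trips).
  // A Map preserves insertion order, so the ORDER BY survives grouping;
  // a plain object would reorder integer-like ids.
  const grouped = new Map<string, OrderWithItems>();
  for (const row of res.rows) {
    let order = grouped.get(row.id);
    if (!order) {
      order = { id: row.id, user_id: row.user_id, total: row.total, items: [] };
      grouped.set(row.id, order);
    }
    if (row.item_id !== null) {
      // Assumes sku, quantity, and price are NOT NULL in order_items
      order.items.push({
        id: row.item_id,
        sku: row.sku!,
        quantity: row.quantity!,
        price: row.price!,
      });
    }
  }
  return Array.from(grouped.values());
}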
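When orders carry many wide columns, the JOIN repeats them once per item. An alternative worth considering is a two-query batch fetch: one query for the orders, one for all their items via `order_id = ANY($1)`, then stitching in memory. The sketch below shows only the stitching step so it runs without a database; `attachItems` and the row shapes are assumptions for illustration.

```typescript
// Assumed row shapes for the two queries described in the comments below.
interface Order {
  id: string;
  user_id: string;
}
interface OrderItem {
  id: string;
  order_id: string;
  sku: string;
  quantity: number;
}
interface OrderWithItems extends Order {
  items: OrderItem[];
}

// Query 1 (assumed): SELECT id, user_id FROM orders
//                    WHERE user_id = $1 ORDER BY created_at DESC
// Query 2 (assumed): SELECT id, order_id, sku, quantity FROM order_items
//                    WHERE order_id = ANY($1)   -- $1 is the array of order ids
// Two round-trips in total, regardless of how many orders exist.
function attachItems(orders: Order[], items: OrderItem[]): OrderWithItems[] {
  const byOrder = new Map<string, OrderItem[]>();
  items.forEach((item) => {
    const bucket = byOrder.get(item.order_id);
    if (bucket) bucket.push(item);
    else byOrder.set(item.order_id, [item]);
  });
  // orders.map keeps the ordering returned by query 1
  return orders.map((o) => ({ ...o, items: byOrder.get(o.id) || [] }));
}

// Invented sample data.
const stitched = attachItems(
  [
    { id: "20", user_id: "u1" },
    { id: "7", user_id: "u1" },
  ],
  [
    { id: "a", order_id: "7", sku: "SKU-1", quantity: 2 },
    { id: "b", order_id: "20", sku: "SKU-2", quantity: 1 },
    { id: "c", order_id: "7", sku: "SKU-3", quantity: 5 },
  ]
);
console.log(stitched.map((o) => `${o.id}: ${o.items.length} item(s)`));
```

Which variant is faster depends on row width and item counts, so it is worth measuring both against realistic data.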
Step 5: Architecture Decisions & Rationale
| Pattern | Use Case | Trade-off |
|---|---|---|
| Materialized Views | Heavy aggregations, dashboard queries, read-heavy analytics | Stale data window; requires refresh strategy |
| Table Partitioning | Time-series or tenant-isolated data >50M rows | Complex DDL; requires partition pruning awareness |
| Read Replicas | Analytical workloads, reporting, background jobs | Replication lag; write consistency boundary |
| Connection Pooling (PgBouncer) | High concurrency, microservice architectures | Transaction pooling limits session variables |
Rationale: Query optimization alone hits a ceiling when data volume exceeds memory capacity. Partitioning and materialized views shift execution cost from runtime to maintenance windows. Read replicas isolate analytical I/O from transactional throughput. Connection pooling amortizes connection setup cost (TCP and authentication handshakes, backend process startup) across many requests and prevents connection exhaustion during traffic bursts.
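For time-series partitioning, partition DDL is usually generated rather than written by hand. The following sketch assumes a parent table already declared with `PARTITION BY RANGE` on a date or timestamp column; `monthlyPartitionDdl` and the `<parent>_<yyyy>_<mm>` naming scheme are hypothetical, and identifiers are interpolated unescaped.

```typescript
// Hypothetical helper: build the DDL for one monthly range partition.
// The upper bound of a range partition is exclusive, so consecutive
// months share a boundary date without overlapping.
function monthlyPartitionDdl(parent: string, year: number, month: number): string {
  if (month < 1 || month > 12) {
    throw new RangeError(`month must be between 1 and 12, got ${month}`);
  }
  const pad = (n: number): string => (n < 10 ? `0${n}` : `${n}`);
  const nextYear = month === 12 ? year + 1 : year;
  const nextMonth = month === 12 ? 1 : month + 1;
  const from = `${year}-${pad(month)}-01`;
  const to = `${nextYear}-${pad(nextMonth)}-01`;
  return (
    `CREATE TABLE ${parent}_${year}_${pad(month)} PARTITION OF ${parent} ` +
    `FOR VALUES FROM ('${from}') TO ('${to}');`
  );
}

console.log(monthlyPartitionDdl("transactions", 2024, 12));
```

Queries benefit from partition pruning only when they filter on the partition key, which is the "pruning awareness" trade-off noted in the table.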
Pitfall Guide
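- Indexing low-cardinality columns: Standalone B-tree indexes on columns with few distinct values (e.g., status, is_active) add storage and write overhead while rarely helping reads, because the planner usually prefers a sequential scan when a predicate matches a large share of the table. Such columns still work well as the leading equality column of a composite index or as the predicate of a partial index.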
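- Composite index misordering: Placing range or sort columns before equality filters cripples index usage. A B-tree index is most selective when the query constrains its leading columns with equality; a range condition on an earlier column forces the scan to read every entry in that range and filter the later columns as it goes.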
- Ignoring planner statistics decay: VACUUM reclaims space; ANALYZE updates planner statistics. Running VACUUM without ANALYZE leaves the planner guessing, causing suboptimal join strategies.
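- Blind work_mem tuning: Increasing work_mem allows larger in-memory sorts and hash tables, but the limit applies per sort or hash operation, not per query or connection, so large values can trigger OOM kills when many complex queries run concurrently. Set it conservatively and watch for temporary-file entries in the logs (enabled with log_temp_files).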
- Caching without invalidation: Redis or application-level caching accelerates reads but introduces consistency violations when underlying data changes. Cache-aside patterns without write-through invalidation or TTL alignment cause stale reads.
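- ORM lazy loading in batch contexts: ORMs optimize for single-entity retrieval. Bulk operations require explicit JOIN, IN, or batch fetch strategies. Lazy loading inside a loop issues one extra query per parent row (the N+1 pattern), so query count grows linearly with result size and multiplies again with each nested relation.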
- Not testing with production-like data: Query plans change at scale. A query using an index scan on 10K rows may switch to a sequential scan on 10M rows because the planner calculates full scan as cheaper than random I/O.
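The work_mem pitfall above comes down to simple arithmetic, sketched here as a rough upper bound. The function names and the scenario values are hypothetical, and the estimate deliberately ignores parallel workers and hash-specific multipliers, both of which can raise the real figure.

```typescript
interface WorkMemScenario {
  workMemMb: number; // work_mem setting, in MB
  concurrentQueries: number; // queries executing at the same moment
  memoryNodesPerQuery: number; // assumed average sort/hash nodes per plan
}

// Rough worst case: every memory-hungry node in every concurrent query
// uses its full work_mem allowance at the same time.
function worstCaseWorkMemMb(s: WorkMemScenario): number {
  return s.workMemMb * s.concurrentQueries * s.memoryNodesPerQuery;
}

// budgetMb is the RAM you are willing to give to query working memory,
// i.e. what remains after shared_buffers and the OS (an assumption).
function fitsBudget(s: WorkMemScenario, budgetMb: number): boolean {
  return worstCaseWorkMemMb(s) <= budgetMb;
}

// Invented scenario: 64MB work_mem, 50 concurrent queries, 3 nodes each.
const scenario: WorkMemScenario = { workMemMb: 64, concurrentQueries: 50, memoryNodesPerQuery: 3 };
console.log(worstCaseWorkMemMb(scenario), "MB worst case");
console.log("fits in 8GB budget:", fitsBudget(scenario, 8192));
```

Real workloads rarely hit this bound, but it shows how quickly a generous setting multiplies under concurrency.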
Production Best Practices:
- Run pg_stat_statements continuously; audit the top 20 queries weekly.
- Use EXPLAIN (ANALYZE, BUFFERS) in CI/CD pipelines for schema migrations.
- Schedule VACUUM ANALYZE on high-churn tables with pg_cron or external jobs.
- Monitor heap_blks_read vs heap_blks_hit in pg_statio_user_tables (or blks_read vs blks_hit in pg_stat_database for a database-wide view) to track cache efficiency.
- Enforce explicit column selection with linting rules that ban SELECT *.
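Cache efficiency is typically tracked as a hit ratio. The sketch below assumes rows read from `pg_statio_user_tables` with its `heap_blks_read` and `heap_blks_hit` counters already converted to numbers; `heapHitRatio`, `tablesBelowTarget`, and the 0.99 target are illustrative choices, not fixed rules.

```typescript
// Subset of pg_statio_user_tables columns used here.
interface TableIo {
  relname: string;
  heap_blks_read: number; // blocks read from disk (or the OS page cache)
  heap_blks_hit: number; // blocks found in shared_buffers
}

// Returns the shared-buffer hit ratio, or null for tables with no I/O yet.
function heapHitRatio(t: TableIo): number | null {
  const total = t.heap_blks_read + t.heap_blks_hit;
  return total === 0 ? null : t.heap_blks_hit / total;
}

// Hypothetical helper: list tables whose hit ratio falls below a target.
function tablesBelowTarget(tables: TableIo[], target: number = 0.99): string[] {
  return tables
    .filter((t) => {
      const ratio = heapHitRatio(t);
      return ratio !== null && ratio < target;
    })
    .map((t) => t.relname);
}

// Invented sample counters.
const ioSample: TableIo[] = [
  { relname: "transactions", heap_blks_read: 250, heap_blks_hit: 750 },
  { relname: "users", heap_blks_read: 0, heap_blks_hit: 1000 },
  { relname: "audit_log", heap_blks_read: 0, heap_blks_hit: 0 },
];
console.log(tablesBelowTarget(ioSample));
```

The counters are cumulative since the last statistics reset, so comparing deltas between two snapshots gives a more current picture than the raw totals.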
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| High read/write ratio (>10:1) | Read replicas + materialized views | Isolates analytical I/O from transactional throughput | +15-25% infra cost, -40% primary DB load |
| Complex aggregations on time-series data | Table partitioning + covering indexes | Enables partition pruning and reduces scan scope | Neutral infra cost, -60% query latency |
| Strict consistency requirements | Query rewrite + optimized indexing + connection pooling | Avoids replication lag while improving execution efficiency | -20% cloud spend, improved SLA compliance |
| Limited budget / shared hosting | Query caching + aggressive indexing + work_mem tuning | Maximizes existing resources without horizontal scaling | Near-zero infra change, -30% query time |
Configuration Template
postgresql.conf (optimization baseline)
shared_buffers = 4GB              # ~25% of RAM (example value for a 16GB host)
effective_cache_size = 12GB       # ~75% of RAM (example value for a 16GB host)
work_mem = 64MB
maintenance_work_mem = 512MB
random_page_cost = 1.1            # assumes SSD storage; the default of 4.0 suits spinning disks
effective_io_concurrency = 200    # assumes SSD storage
wal_level = replica
max_wal_senders = 3
log_min_duration_statement = 200
log_checkpoints = on
log_connections = on
log_disconnections = on
log_lock_waits = on
log_temp_files = 0
PgBouncer.ini
[databases]
myapp = host=127.0.0.1 port=5432 dbname=myapp
[pgbouncer]
listen_port = 6432
listen_addr = *
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 200
default_pool_size = 25
reserve_pool_size = 5
server_idle_timeout = 30
server_lifetime = 3600
TypeScript Connection Pool Config
import { Pool } from 'pg';
export const db = new Pool({
host: process.env.DB_HOST || '127.0.0.1',
port: Number(process.env.DB_PORT) || 6432,
database: process.env.DB_NAME,
user: process.env.DB_USER,
password: process.env.DB_PASSWORD,
max: 20,
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 2000,
statement_timeout: 5000,
query_timeout: 5000,
});
db.on('error', (err) => {
console.error('Unexpected database pool error:', err);
process.exit(1);
});
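The application pool size and the PgBouncer limits have to agree once several instances are running. The check below is a minimal sketch: `checkPoolBudget` and the instance counts are hypothetical, while 20 and 200 mirror `max` and `max_client_conn` from the templates above.

```typescript
interface PoolBudget {
  appInstances: number; // number of running application processes
  poolMaxPerInstance: number; // `max` in the pg Pool config
  maxClientConn: number; // max_client_conn in the PgBouncer config
}

interface PoolBudgetResult {
  totalClientConns: number;
  headroom: number; // negative when the limit is exceeded
  withinLimit: boolean;
}

// Worst case: every instance opens its full pool against PgBouncer.
function checkPoolBudget(b: PoolBudget): PoolBudgetResult {
  const totalClientConns = b.appInstances * b.poolMaxPerInstance;
  return {
    totalClientConns,
    headroom: b.maxClientConn - totalClientConns,
    withinLimit: totalClientConns <= b.maxClientConn,
  };
}

// 8 instances x 20 connections fits under 200; 12 instances would not.
console.log(checkPoolBudget({ appInstances: 8, poolMaxPerInstance: 20, maxClientConn: 200 }));
console.log(checkPoolBudget({ appInstances: 12, poolMaxPerInstance: 20, maxClientConn: 200 }));
```

Running a check like this at deploy time can catch a scale-out that would otherwise surface as connection timeouts under load.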
Quick Start Guide
- Install monitoring extensions: Add pg_stat_statements to shared_preload_libraries and set log_min_duration_statement = 200 in postgresql.conf, restart PostgreSQL, then run CREATE EXTENSION IF NOT EXISTS pg_stat_statements;.
- Identify bottlenecks: Query pg_stat_statements to extract the top 3 queries by total_exec_time. Copy one for analysis.
- Generate execution plan: Run EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) <query>;. Note Seq Scan, Sort, and Rows Removed by Filter lines.
- Apply targeted index: Create a composite index matching equality → range → sort order. Verify usage with EXPLAIN.
- Validate improvement: Re-run the query. Confirm Seq Scan → Index Scan, estimated rows that track actual rows more closely, and a lower Execution Time. Commit the schema change and update the application query if necessary.
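The validation step can be made repeatable by extracting the timing from the saved before and after plans. This is a small sketch assuming text-format EXPLAIN ANALYZE output; `executionTimeMs` and `speedup` are hypothetical helper names.

```typescript
// Pull the total execution time (in ms) out of EXPLAIN ANALYZE text output.
// The match is case-insensitive because older PostgreSQL releases print
// "Execution time" with a lowercase "t".
function executionTimeMs(planText: string): number | null {
  const m = /Execution Time: ([\d.]+) ms/i.exec(planText);
  return m ? Number(m[1]) : null;
}

// Ratio of before/after execution time; values above 1 mean the
// rewritten query or new index is faster. Null if either plan lacks timing.
function speedup(beforePlan: string, afterPlan: string): number | null {
  const before = executionTimeMs(beforePlan);
  const after = executionTimeMs(afterPlan);
  if (before === null || after === null || after === 0) return null;
  return before / after;
}

// Invented plan tails for illustration.
const beforeTail = "Planning Time: 0.210 ms\nExecution Time: 120.000 ms";
const afterTail = "Planning Time: 0.180 ms\nExecution Time: 12.000 ms";
console.log(speedup(beforeTail, afterTail));
```

A single run is noisy because of caching effects, so averaging several runs of each plan gives a fairer comparison.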
Query optimization becomes predictable when treated as a contract between application logic and storage execution. Measure first, rewrite deliberately, and validate against production data distribution. The cost of inaction compounds as data grows, and so does the payoff of systematic optimization.