orces an architectural shift: locking is not a database configuration problem; it is an application-level concurrency strategy. Choosing the right pattern based on conflict probability, consistency requirements, and latency tolerance directly determines whether a system scales linearly or collapses under predictable load.
Core Solution
Implementing a production-grade locking strategy requires a hybrid approach: optimistic locking as the default, pessimistic locking for strict consistency boundaries, and explicit timeout/retry boundaries at the connection level. The following implementation uses TypeScript with pg (node-postgres) to demonstrate precise control over lock behavior.
Step 1: Schema Preparation
Add a BIGINT version column to tables requiring concurrency control. This enables atomic version validation without application-level state tracking.
ALTER TABLE inventory_items
ADD COLUMN version BIGINT NOT NULL DEFAULT 1;
-- No separate index on version is needed: updates locate the row by primary key,
-- and indexing a column that changes on every write only adds write overhead.
Step 2: Optimistic Update with Version Validation
The update query must include the version in the WHERE clause and increment it atomically. If the row was modified concurrently, the result's rowCount is 0.
import { Pool, PoolClient } from 'pg';

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20,
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
});

interface UpdateResult {
  success: boolean;
  attempts: number;
  version: number;
}
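async function updateInventory(
  itemId: string,
  newQuantity: number,
  expectedVersion: number,
  maxRetries = 3
): Promise<UpdateResult> {
  const client = await pool.connect();
  let currentVersion = expectedVersion;
  let attempts = 0;
  try {
    while (attempts < maxRetries) {
      attempts++;
      // One short transaction per attempt. SET LOCAL scopes the timeouts to it,
      // so they do not leak onto the pooled connection.
      await client.query('BEGIN');
      await client.query("SET LOCAL lock_timeout = '500ms'");
      await client.query("SET LOCAL statement_timeout = '2000ms'");
      const res = await client.query(
        `UPDATE inventory_items
         SET quantity = $1, version = version + 1
         WHERE id = $2 AND version = $3
         RETURNING version`,
        [newQuantity, itemId, currentVersion]
      );
      if (res.rowCount === 1) {
        await client.query('COMMIT');
        // pg returns BIGINT columns as strings, so convert explicitly
        return { success: true, attempts, version: Number(res.rows[0].version) };
      }
      await client.query('ROLLBACK');
      // Conflict detected: re-read the real version instead of guessing it.
      // If the new value depends on the old row, recompute it here as well.
      const latest = await client.query(
        'SELECT version FROM inventory_items WHERE id = $1',
        [itemId]
      );
      if (latest.rowCount === 0) break; // the row no longer exists
      currentVersion = Number(latest.rows[0].version);
      // Retry with exponential backoff + jitter, outside any open transaction
      const delay = Math.min(100 * Math.pow(2, attempts - 1), 1000) + Math.random() * 200;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
    return { success: false, attempts, version: currentVersion };
  } catch (err) {
    // Leave the connection clean, then surface the original error
    await client.query('ROLLBACK').catch(() => {});
    throw err;
  } finally {
    client.release();
  }
}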
Step 3: Pessimistic Fallback for Critical Paths
Use SELECT FOR UPDATE only when cross-row atomicity or financial consistency is mandatory. This serializes access but guarantees no stale writes.
async function processPaymentTransfer(fromId: string, toId: string, amount: number): Promise<void> {
  const client = await pool.connect();
  try {
    await client.query('BEGIN ISOLATION LEVEL SERIALIZABLE');
    // SET LOCAL keeps the timeouts from leaking onto the pooled connection
    await client.query("SET LOCAL lock_timeout = '300ms'");
    await client.query("SET LOCAL statement_timeout = '1500ms'");
    // Lock rows in deterministic order to prevent deadlocks
    const locked = await client.query(
      `SELECT id, balance FROM accounts
       WHERE id IN ($1, $2)
       ORDER BY id
       FOR UPDATE`,
      [fromId, toId]
    );
    // Fail fast if either account is missing or both ids are the same
    if (locked.rowCount !== 2) throw new Error('Expected two distinct accounts');
    await client.query(`UPDATE accounts SET balance = balance - $1 WHERE id = $2`, [amount, fromId]);
    await client.query(`UPDATE accounts SET balance = balance + $1 WHERE id = $2`, [amount, toId]);
    await client.query('COMMIT');
  } catch (err: any) {
    // A failed ROLLBACK must not mask the original error
    await client.query('ROLLBACK').catch(() => {});
    if (err.code === '40P01') throw new Error('Deadlock detected');
    if (err.code === '55P03') throw new Error('Lock timeout exceeded');
    // SERIALIZABLE transactions can also abort with a serialization failure
    if (err.code === '40001') throw new Error('Serialization failure');
    throw err;
  } finally {
    client.release();
  }
}
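The same ordering rule applies when application code locks rows one statement at a time rather than in a single query. The following is a minimal sketch, assuming a hypothetical helper named lockOrder that is not part of the implementation above. It uses plain JavaScript string comparison, which may differ from the database's ORDER BY collation; what matters is that every caller uses the same order.

```typescript
// Hypothetical helper: returns ids in one canonical order so that every
// transaction acquires its row locks in the same sequence.
function lockOrder(...ids: string[]): string[] {
  // Deduplicate first: the same id passed twice needs only one lock.
  const unique = Array.from(new Set(ids));
  // Any total order works, provided every caller uses the same one.
  return unique.sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));
}

// Two transfers in opposite directions request their locks in the same order,
// so neither can hold one row while waiting for the other.
const transferAtoB = lockOrder('acct-a', 'acct-b');
const transferBtoA = lockOrder('acct-b', 'acct-a');
console.log(transferAtoB, transferBtoA);
```

Both calls yield the same sequence, which is the property that rules out a circular wait between the two transfers.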
Step 4: Architecture Decisions & Rationale
- Optimistic default: 80% of application writes conflict <5% of the time. Version checks eliminate lock waits entirely for non-conflicting paths.
- Deterministic locking order: ORDER BY id in pessimistic queries prevents circular wait conditions, reducing deadlocks by ~90%.
- Session-level timeouts: lock_timeout aborts a statement that waits too long for a lock, and statement_timeout cancels a statement that runs too long. Together they keep blocked sessions from exhausting the pool.
- Jittered exponential backoff: Prevents retry storms. Without jitter, conflicting transactions retry simultaneously, amplifying contention.
- Connection pool sizing: Keep max connections proportional to CPU cores × 2. Oversized pools raise the probability of lock contention.
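The pool sizing rule can be written as a small calculation. This is a sketch under stated assumptions: poolMax is a hypothetical helper, the ceiling of 50 is an illustrative value, and the core count is passed in because the text does not say which machine's cores are meant.

```typescript
// Hypothetical sizing helper: cores × multiplier, clamped to a ceiling.
// The ceiling is an assumption of this sketch, not a recommendation from the text.
function poolMax(cpuCores: number, multiplier = 2, ceiling = 50): number {
  if (!Number.isInteger(cpuCores) || cpuCores < 1) {
    throw new RangeError('cpuCores must be a positive integer');
  }
  return Math.min(cpuCores * multiplier, ceiling);
}

console.log(poolMax(8)); // 16
```

The result could feed the max option of the pool, for example through an environment variable.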
Pitfall Guide
1. Ignoring lock_timeout at Session Level
PostgreSQL's default lock_timeout is 0, which disables the timeout, so a statement blocked behind a slow transaction waits indefinitely. Connection pools drain as new requests queue behind blocked sessions. For non-critical paths, set lock_timeout = '500ms' in every transaction. On pooled connections prefer SET LOCAL, because a plain SET persists on the connection after the transaction ends.
2. Overusing Pessimistic Locking for Read-Heavy Workloads
SELECT FOR UPDATE serializes access even when no write conflict exists. In catalog, inventory, or user profile updates, this caps throughput. Use optimistic locking unless strict read-modify-write atomicity is legally or financially mandated.
3. Missing Version Increment on Conflicting Reads
Optimistic locking fails when applications read a row, modify it locally, and write back without validating the version. The database accepts the stale write. Always include WHERE version = $current in updates and handle rowCount === 0 as a conflict, not a success.
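To make the conflict case concrete, here is a minimal in-memory sketch of the version check. VersionedRow is a hypothetical stand-in for a table row, not a database client; its write method mimics the rowCount of UPDATE ... WHERE version = $current.

```typescript
interface Row {
  quantity: number;
  version: number;
}

// In-memory stand-in for a single table row (illustration only).
class VersionedRow {
  private row: Row = { quantity: 10, version: 1 };

  read(): Row {
    return { quantity: this.row.quantity, version: this.row.version };
  }

  // Mirrors UPDATE ... WHERE version = $expected: returns the number of rows changed.
  write(quantity: number, expectedVersion: number): number {
    if (this.row.version !== expectedVersion) return 0; // stale write rejected
    this.row = { quantity, version: this.row.version + 1 };
    return 1;
  }
}

const item = new VersionedRow();
const readerA = item.read(); // both clients read version 1
const readerB = item.read();

const first = item.write(readerA.quantity - 3, readerA.version);  // applied
const second = item.write(readerB.quantity - 5, readerB.version); // rejected: version is now 2

console.log(first, second, item.read());
```

The second write returns 0 because the row moved to version 2 after the first write. Without the version comparison it would have silently replaced the first client's change.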
4. Retry Logic Without Jitter
Fixed-interval retries create thundering herd effects. When 100 transactions conflict simultaneously, they all retry at T+100ms, re-triggering contention. Add random jitter (Math.random() * base_delay) to desynchronize retry windows.
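A delay calculation along these lines might look as follows. It is a sketch: backoffDelay is a hypothetical helper, and the injectable random source is an assumption added here so the schedule can be inspected deterministically.

```typescript
// Exponential backoff with additive jitter. The random source defaults to
// Math.random and is injectable only to make the schedule reproducible.
function backoffDelay(
  attempt: number, // 1-based attempt number
  baseMs = 100,
  capMs = 1000,
  jitterMs = 200,
  random: () => number = Math.random
): number {
  const exponential = Math.min(baseMs * Math.pow(2, attempt - 1), capMs);
  return exponential + random() * jitterMs;
}

// With the random term fixed at 0, only the deterministic part remains.
const schedule = [1, 2, 3, 4, 5].map(n => backoffDelay(n, 100, 1000, 200, () => 0));
console.log(schedule); // [100, 200, 400, 800, 1000]
```

With the default random source, each client adds between 0 and 200 ms to these values, so transactions that conflicted at the same instant no longer retry at the same instant.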
5. Locking Unindexed Columns
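PostgreSQL does not escalate row locks to page or table locks, but a locking query whose WHERE clause lacks a supporting index falls back to a sequential scan. A WHERE status = 'pending' without an index reads the whole table, so the statement runs longer and holds its row locks longer, and under SERIALIZABLE the scan takes a relation-level predicate lock that raises the rate of serialization failures. Always index columns used in lock predicates.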
6. Mixing Isolation Levels in the Same Transaction
PostgreSQL fixes a transaction's isolation level before its first query: SET TRANSACTION ISOLATION LEVEL issued after a query has run fails with an error, so a transaction that starts as READ COMMITTED cannot be upgraded to SERIALIZABLE mid-flight. Set the isolation level explicitly in BEGIN and never attempt to change it.
7. Assuming ORMs Handle Locking Automatically
ORMs abstract SQL, but most do not apply concurrency control unless it is configured. entity.save() without version validation or explicit locks will overwrite concurrent changes silently. Enable the ORM's version-column support where it exists, map lifecycle hooks to version checks, or use raw queries for critical paths.
Production Best Practices:
- Monitor pg_stat_activity for wait_event_type = 'Lock'
- Set deadlock_timeout = '1s' in postgresql.conf
- Audit lock waits in CI/CD using pg_stat_statements and pg_locks
- Prefer application-level retries over database-level retries for better observability
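The monitoring item can be sketched as a small helper that lists sessions waiting on a lock together with the sessions blocking them. The names Queryable, BlockedSession, and findBlockedSessions are hypothetical; the query relies on PostgreSQL's pg_blocking_pids function, and a pg Pool or PoolClient is expected to fit the minimal Queryable shape used here.

```typescript
// Minimal structural type so the sketch does not depend on a specific client.
interface Queryable {
  query(text: string): Promise<{ rows: any[] }>;
}

interface BlockedSession {
  pid: number;
  blockedBy: number[];
  waitEvent: string | null;
  query: string;
}

const BLOCKED_SESSIONS_SQL = `
  SELECT pid,
         pg_blocking_pids(pid) AS blocked_by,
         wait_event,
         query
  FROM pg_stat_activity
  WHERE wait_event_type = 'Lock'`;

// Returns one entry per session that is currently waiting on a lock.
async function findBlockedSessions(db: Queryable): Promise<BlockedSession[]> {
  const res = await db.query(BLOCKED_SESSIONS_SQL);
  return res.rows.map(r => ({
    pid: r.pid,
    blockedBy: r.blocked_by,
    waitEvent: r.wait_event,
    query: r.query,
  }));
}
```

Running this on an interval and logging any non-empty result is one way to surface lock waits before they show up as pool exhaustion.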
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Read-heavy catalog updates (<5% conflict) | Optimistic (Version Check) | Eliminates lock waits, maximizes throughput | Lowest compute, highest scalability |
| Financial transfers, inventory reservations | Pessimistic (FOR UPDATE) + Serializable | Guarantees atomicity, prevents double-spend | Higher latency, predictable resource usage |
| Cross-service coordination, leader election | Advisory (Redis/etcd) | Decouples lock state from DB, scales horizontally | Infra cost increases, DB load decreases |
| Batch imports, idempotent writes | Optimistic + Retry (3x) | Handles transient conflicts without serializing | Moderate latency, high success rate |
| Regulatory audit trails, immutable logs | Append-only + Optimistic | No updates, only inserts; version tracks lineage | Zero lock contention, storage scales linearly |
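The matrix can be read as a lookup from workload traits to a strategy. The sketch below encodes that reading; the Workload flags, the Strategy labels, and the order in which the checks run are assumptions of this example, not rules the matrix specifies.

```typescript
type Strategy =
  | 'optimistic'
  | 'pessimistic-serializable'
  | 'external-advisory'
  | 'optimistic-with-retry'
  | 'append-only';

interface Workload {
  appendOnly?: boolean;        // audit trails, immutable logs
  crossService?: boolean;      // coordination spans more than one service
  strictConsistency?: boolean; // financial transfers, inventory reservations
  batchOrIdempotent?: boolean; // batch imports, idempotent writes
}

// Checks run from the most constraining trait to the least (an assumption).
function recommendStrategy(w: Workload): Strategy {
  if (w.appendOnly) return 'append-only';
  if (w.crossService) return 'external-advisory';
  if (w.strictConsistency) return 'pessimistic-serializable';
  if (w.batchOrIdempotent) return 'optimistic-with-retry';
  return 'optimistic'; // read-heavy, low-conflict default
}

console.log(recommendStrategy({ strictConsistency: true }));
console.log(recommendStrategy({}));
```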
Configuration Template
// db/config.ts
import { Pool } from 'pg';

export const dbPool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: parseInt(process.env.DB_POOL_MAX || '16', 10),
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
  application_name: 'production-api',
  // TLS in production. rejectUnauthorized: false skips certificate verification;
  // supply the server CA and set it to true wherever possible.
  ssl: process.env.NODE_ENV === 'production' ? { rejectUnauthorized: false } : false,
});

// db/middleware.ts
// Call immediately after BEGIN: SET LOCAL applies only to the current transaction,
// so the values do not persist on the pooled connection.
export async function withLockTimeout(client: import('pg').PoolClient): Promise<void> {
  await client.query(`
    SET LOCAL lock_timeout = '500ms';
    SET LOCAL statement_timeout = '2000ms';
    SET LOCAL idle_in_transaction_session_timeout = '5000ms';
  `);
}

// db/retry.ts
// Retries fn on deadlock (40P01), lock timeout (55P03), and serialization failure (40001).
export async function withRetry<T>(fn: () => Promise<T>, maxRetries = 3): Promise<T> {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      if (err.code === '40P01' || err.code === '55P03' || err.code === '40001') {
        if (attempt === maxRetries) throw err;
        // Exponential backoff capped at 1s, plus up to 200ms of jitter
        const delay = Math.min(100 * Math.pow(2, attempt - 1), 1000) + Math.random() * 200;
        await new Promise(res => setTimeout(res, delay));
        continue;
      }
      throw err;
    }
  }
  throw new Error('Retry logic exhausted');
}
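The timeout settings are easiest to apply consistently when every transaction goes through one wrapper. The following is a minimal sketch, assuming a hypothetical withTransaction helper and a minimal TxClient shape that a pg PoolClient is expected to fit. It is one possible arrangement, not part of the template above.

```typescript
// Minimal structural type for anything that can run a SQL statement.
interface TxClient {
  query(text: string, values?: unknown[]): Promise<unknown>;
}

// Hypothetical helper: runs `work` inside one transaction with timeouts that
// are scoped to that transaction via SET LOCAL.
async function withTransaction<T>(
  client: TxClient,
  work: (client: TxClient) => Promise<T>
): Promise<T> {
  await client.query('BEGIN');
  try {
    await client.query("SET LOCAL lock_timeout = '500ms'");
    await client.query("SET LOCAL statement_timeout = '2000ms'");
    const result = await work(client);
    await client.query('COMMIT');
    return result;
  } catch (err) {
    // Roll back, but surface the original error even if ROLLBACK itself fails.
    await client.query('ROLLBACK').catch(() => undefined);
    throw err;
  }
}
```

Because the original error is rethrown unchanged, its code property stays available to a retry wrapper that inspects 40P01, 55P03, or 40001.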
Quick Start Guide
- Add version column: Run ALTER TABLE <table> ADD COLUMN version BIGINT NOT NULL DEFAULT 1; on all write-heavy tables.
- Update queries: Replace direct UPDATE statements with UPDATE ... SET ..., version = version + 1 WHERE id = $id AND version = $current.
- Wrap in retry logic: Use the withRetry template to handle 40P01 (deadlock), 55P03 (lock timeout), and 40001 (serialization failure) codes automatically.
- Set timeouts: Execute SET LOCAL lock_timeout = '500ms' immediately after BEGIN in every transaction, so the value does not persist on the pooled connection.
- Monitor: Query SELECT pid, state, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event_type = 'Lock'; to validate lock behavior in staging before production rollout.