es careful isolation level tuning to prevent write skew and phantom reads.
The key insight is that no single approach is optimal. High-performance architectures often hybridize these strategies: using MVCC for reads, OCC for user-profile updates, and Pessimistic locking for critical financial transfers.
Core Solution
Implementing robust concurrency control requires a layered approach: selecting the correct isolation level, applying the appropriate locking strategy, and engineering application-level resilience.
Step 1: Workload Analysis and Strategy Selection
Analyze the read/write ratio and contention points.
- Read-Heavy/Low Contention: Use MVCC with
REPEATABLE READ or SNAPSHOT isolation.
- Write-Heavy/High Contention: Use Optimistic CC with version columns and application-level retries.
- Critical Consistency/High Value: Use Pessimistic Locking (
SELECT FOR UPDATE) within SERIALIZABLE or strict isolation levels.
Step 2: Implementation Patterns
A. Pessimistic Locking (Row-Level)
Use when data integrity is paramount and contention is manageable.
import { Pool, PoolClient } from 'pg';
export async function transferFunds(
pool: Pool,
fromId: string,
toId: string,
amount: number
): Promise<void> {
const client = await pool.connect();
try {
await client.query('BEGIN');
// Lock rows in a deterministic order to prevent deadlocks
// Always lock lower ID first
const [first, second] = [fromId, toId].sort();
await client.query(
'SELECT id, balance FROM accounts WHERE id = $1 FOR UPDATE',
[first]
);
await client.query(
'SELECT id, balance FROM accounts WHERE id = $1 FOR UPDATE',
[second]
);
// Perform checks and updates
const fromRes = await client.query(
'UPDATE accounts SET balance = balance - $1 WHERE id = $2 AND balance >= $1 RETURNING balance',
[amount, fromId]
);
if (fromRes.rowCount === 0) {
throw new Error('Insufficient funds');
}
await client.query(
'UPDATE accounts SET balance = balance + $1 WHERE id = $2',
[amount, toId]
);
await client.query('COMMIT');
} catch (err) {
await client.query('ROLLBACK');
throw err;
} finally {
client.release();
}
}
B. Optimistic Concurrency Control (Versioning)
Use for high-throughput scenarios where conflicts are rare or acceptable to retry.
interface UserSettings {
id: string;
theme: string;
version: number;
}
export async function updateSettingsOptimistic(
db: any,
settings: UserSettings
): Promise<UserSettings> {
const result = await db.query(
`UPDATE user_settings
SET theme = $1, version = version + 1
WHERE id = $2 AND version = $3
RETURNING *`,
[settings.theme, settings.id, settings.version]
);
if (result.rowCount === 0) {
// Conflict detected: version changed since read
throw new OptimisticLockError('Concurrent modification detected');
}
return result.rows[0];
}
C. Retry Wrapper with Exponential Backoff
Mandatory for OCC and transient deadlock errors.
export async function withRetry<T>(
fn: () => Promise<T>,
maxRetries = 3,
baseDelay = 50
): Promise<T> {
for (let attempt = 0; attempt < maxRetries; attempt++) {
try {
return await fn();
} catch (err: any) {
// Check for retryable errors: deadlock, serialization failure, optimistic lock
const isRetryable =
err.code === '40P01' || // PostgreSQL deadlock
err.code === '40001' || // Serialization failure
err instanceof OptimisticLockError;
if (!isRetryable || attempt === maxRetries - 1) {
throw err;
}
// Exponential backoff with jitter to prevent thundering herd
const delay = baseDelay * Math.pow(2, attempt) + Math.random() * 100;
await new Promise(res => setTimeout(res, delay));
}
}
throw new Error('Max retries exceeded');
}
// Usage
await withRetry(() => updateSettingsOptimistic(db, settings));
Step 3: Architecture Decisions
- Idempotency: Ensure retry logic is safe. If the transaction commits but the response is lost, a retry must not duplicate the effect. Use idempotency keys for external API calls within transactions.
- Lock Ordering: Always acquire locks in a consistent order (e.g., sorted by primary key) to eliminate deadlocks.
- Transaction Scope: Keep transactions as short as possible. Move non-critical logic (logging, notifications) outside the transaction boundary using outbox patterns or event emission.
Pitfall Guide
1. Ignoring Default Isolation Levels
- Mistake: Assuming
READ COMMITTED prevents non-repeatable reads.
- Reality: In PostgreSQL,
READ COMMITTED allows non-repeatable reads. A transaction may see different data for the same row across two queries. If business logic requires a consistent snapshot, you must explicitly set REPEATABLE READ or SERIALIZABLE.
- Fix: Audit all transaction blocks. Explicitly set isolation levels where consistency guarantees are required.
2. Deadlocks from Unordered Locking
- Mistake: Acquiring locks on multiple rows in arbitrary order based on input data.
- Reality: If Transaction A locks Row 1 then Row 2, and Transaction B locks Row 2 then Row 1, a deadlock occurs.
- Fix: Implement strict lock ordering. Sort entity IDs before acquiring locks.
3. Retry Storms and Thundering Herd
- Mistake: Retrying immediately or with fixed delays on conflict.
- Reality: When contention spikes, all clients retry simultaneously, amplifying load and causing cascading failures.
- Fix: Implement exponential backoff with jitter. Add jitter to desynchronize retry attempts.
4. N+1 Lock Queries
- Mistake: Iterating over a list and locking rows one by one in application code.
- Reality: This increases round trips and holds locks longer than necessary.
- Fix: Use
SELECT ... FOR UPDATE with an IN clause or JOIN to lock all required rows in a single query.
5. Write Skew Anomalies
- Mistake: Assuming
REPEATABLE READ prevents all anomalies.
- Reality: Write skew occurs when two transactions read overlapping data, make disjoint updates, and commit, violating a constraint that neither transaction checked individually. Example: On-call doctors constraint. Two doctors check the schedule, see the other is on-call, and both request off. The constraint "at least one doctor must be on-call" is violated.
- Fix: Use
SERIALIZABLE isolation or add explicit range locks/conditions to prevent the invariant violation.
6. Long Transactions Holding Locks
- Mistake: Performing slow I/O (HTTP calls, heavy computation) inside a database transaction.
- Reality: Locks are held for the duration of the transaction, blocking other writers and increasing deadlock probability.
- Fix: Fetch data, release lock, perform computation, then re-acquire lock with version check (OCC) or re-validate before final update.
7. Confusing MVCC Visibility with Logical Consistency
- Mistake: Relying on MVCC snapshots to make decisions about current state.
- Reality: A snapshot reflects the state at the transaction start. If you check
balance > 100 in the snapshot but the current committed balance is 50, you may proceed with a deduction that results in a negative balance if not locked.
- Fix: For state-dependent writes, use
SELECT FOR UPDATE or optimistic versioning to ensure the check and update are atomic against the latest state.
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|
| Financial Ledger / Wallet Balance | Pessimistic Locking + Serializable | Strict consistency required; contention is lower than read traffic; deadlocks manageable via ordering. | High Latency per Write, Low Risk. |
| User Profile / Preferences | Optimistic CC | High read/write ratio; conflicts are rare; user experience degrades gracefully with retry. | Low Latency, Low Cost. |
| Inventory Decrement | Optimistic CC with Pre-check or Pessimistic | Depends on volume. For massive flash sales, Pessimistic with sharding or Optimistic with aggressive retry. | Medium Latency, Scalable. |
| Reporting / Analytics | MVCC Snapshot Isolation | Reads must not block writes; consistent snapshot required for accurate aggregation. | Low Impact on OLTP. |
| Batch Processing | Batched Pessimistic | Process chunks with locks; reduces overhead vs row-by-row; allows progress tracking. | Medium Latency, High Throughput. |
Configuration Template
PostgreSQL Isolation Configuration:
-- Set default isolation level for the session
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- Or per transaction
BEGIN ISOLATION LEVEL SERIALIZABLE;
-- ... operations ...
COMMIT;
TypeScript Retry Configuration:
interface RetryConfig {
maxRetries: number;
baseDelayMs: number;
jitterMs: number;
retryableCodes: string[];
}
export const defaultRetryConfig: RetryConfig = {
maxRetries: 3,
baseDelayMs: 50,
jitterMs: 100,
retryableCodes: ['40P01', '40001', 'serialization_failure', 'deadlock_detected'],
};
export function createRetryWrapper(config: RetryConfig) {
return async function retry<T>(fn: () => Promise<T>): Promise<T> {
for (let attempt = 0; attempt < config.maxRetries; attempt++) {
try {
return await fn();
} catch (err: any) {
const isRetryable = config.retryableCodes.includes(err.code) ||
err.message.includes('optimistic') ||
err instanceof OptimisticLockError;
if (!isRetryable || attempt === config.maxRetries - 1) {
throw err;
}
const delay = config.baseDelayMs * Math.pow(2, attempt) + Math.random() * config.jitterMs;
await new Promise(res => setTimeout(res, delay));
}
}
throw new Error('Retry limit exceeded');
};
}
Quick Start Guide
- Identify Contention Points: Review your schema and query patterns. Identify tables with frequent concurrent writes (e.g., counters, balances, inventory).
- Add Versioning: For tables using Optimistic CC, add a
version INT DEFAULT 1 column. Update your ORM or query builder to increment and check this column on updates.
- Implement Retry Wrapper: Create a generic
withRetry utility in your infrastructure layer. Wrap all database transactions in this utility. Configure retryable error codes for your specific database engine.
- Enforce Lock Ordering: If using Pessimistic locking, audit code paths that lock multiple rows. Introduce sorting logic to ensure locks are always acquired in ID order.
- Load Test: Run a load test simulating concurrent writes to critical paths. Monitor for deadlocks, lock waits, and retry rates. Adjust isolation levels and retry parameters based on metrics.