and contention handling. The implementation below uses Redis with Lua scripting for atomicity, background renewal for long-running tasks, and exponential backoff for contention.
Architecture Decisions
- Lease-based over boolean locks: Locks expire automatically to prevent deadlocks from crashed processes
- Lua scripts for atomicity: Redis executes Lua atomically, preventing race conditions between check-and-delete operations
- Background renewal: Extends TTL while work continues, avoiding premature expiration
- UUID-based ownership: Prevents accidental release of locks held by other processes
- Quorum-ready design: The acquisition logic shown here targets a single Redis node, but it can be extended to quorum-based multi-node acquisition for partition tolerance
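To make the quorum point concrete, the following is a minimal sketch of the validity check a Redlock-style multi-node acquisition would apply after contacting every node. It is not part of the implementation in this article: the names `evaluateQuorum` and `QuorumResult` are hypothetical, and the 1% drift allowance plus 2 ms is an assumed safety margin.

```typescript
// Hypothetical helper (not from the article): decides whether a multi-node
// acquisition attempt should be treated as a held lock.
export interface QuorumResult {
  acquired: boolean;
  validityMs: number; // lease time left after subtracting elapsed time and drift
}

export function evaluateQuorum(
  nodeResults: boolean[], // one entry per Redis node: did SET NX PX succeed there?
  ttlMs: number,
  elapsedMs: number, // time spent contacting all nodes
  driftFactor = 0.01 // assumed clock-drift allowance
): QuorumResult {
  const successes = nodeResults.filter(Boolean).length;
  const quorum = Math.floor(nodeResults.length / 2) + 1;
  const driftMs = Math.floor(ttlMs * driftFactor) + 2;
  const validityMs = ttlMs - elapsedMs - driftMs;
  // The lock counts only if a majority agreed AND some lease time remains.
  return { acquired: successes >= quorum && validityMs > 0, validityMs };
}
```

If `acquired` comes back false, a multi-node client would normally release the key on every node it managed to set before retrying.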
Step-by-Step Implementation
1. Define the lock interface
export interface DistributedLock {
  acquire(): Promise<boolean>;
  release(): Promise<void>;
  renew(): Promise<boolean>;
  isAcquired(): boolean;
}
2. Implement the lock manager
import Redis from 'ioredis';
import { randomUUID } from 'node:crypto';
export class RedisDistributedLock implements DistributedLock {
  private readonly lockKey: string;
  private readonly lockValue: string;
  private readonly ttlMs: number;
  private readonly renewalInterval: number;
  private acquired = false;
  private renewalTimer: NodeJS.Timeout | null = null;
  constructor(
    private readonly redis: Redis,
    key: string,
    ttlMs = 10000,
    renewalInterval = 3000
  ) {
    this.lockKey = `lock:${key}`;
    // Unique per lock instance so that only the owner can renew or release it.
    this.lockValue = `${process.pid}:${randomUUID()}`;
    this.ttlMs = ttlMs;
    this.renewalInterval = renewalInterval;
  }
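  async acquire(): Promise<boolean> {
    // SET ... NX returns a status reply (truthy in Lua) on success and nil
    // (false in Lua) when the key already exists, so test truthiness
    // instead of comparing the reply with 1.
    const lua = `
      if redis.call('SET', KEYS[1], ARGV[1], 'NX', 'PX', ARGV[2]) then
        return 1
      else
        return 0
      end
    `;
    const result = await this.redis.eval(lua, 1, this.lockKey, this.lockValue, this.ttlMs);
    this.acquired = result === 1;
    if (this.acquired) {
      this.startRenewal();
    }
    return this.acquired;
  }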
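  async release(): Promise<void> {
    if (!this.acquired) return;
    // Stop renewing first so the timer cannot fire while the key is being deleted,
    // and so local state is cleared even if the Redis call throws.
    this.stopRenewal();
    this.acquired = false;
    // Delete the key only if it still holds this instance's value.
    const lua = `
      if redis.call('GET', KEYS[1]) == ARGV[1] then
        return redis.call('DEL', KEYS[1])
      else
        return 0
      end
    `;
    await this.redis.eval(lua, 1, this.lockKey, this.lockValue);
  }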
  async renew(): Promise<boolean> {
    // Extend the TTL only if this instance still owns the key.
    const lua = `
      if redis.call('GET', KEYS[1]) == ARGV[1] then
        return redis.call('PEXPIRE', KEYS[1], ARGV[2])
      else
        return 0
      end
    `;
    const result = await this.redis.eval(lua, 1, this.lockKey, this.lockValue, this.ttlMs);
    return result === 1;
  }
  isAcquired(): boolean {
    return this.acquired;
  }
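  private startRenewal(): void {
    this.stopRenewal(); // never run two renewal timers for one lock
    this.renewalTimer = setInterval(async () => {
      try {
        const renewed = await this.renew();
        if (!renewed) {
          this.acquired = false;
          this.stopRenewal();
        }
      } catch {
        // A failed renewal (for example, Redis unreachable) means the lease can
        // no longer be trusted; without this catch the rejection would be unhandled.
        this.acquired = false;
        this.stopRenewal();
      }
    }, this.renewalInterval);
  }
  private stopRenewal(): void {
    if (this.renewalTimer) {
      clearInterval(this.renewalTimer);
      this.renewalTimer = null;
    }
  }
}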
3. Contention handling wrapper
export async function withDistributedLock<T>(
  redis: Redis,
  key: string,
  task: () => Promise<T>,
  options: { maxRetries?: number; baseDelayMs?: number; ttlMs?: number } = {}
): Promise<T> {
  // Apply defaults per field so a partial options object does not leave values undefined.
  const { maxRetries = 3, baseDelayMs = 100, ttlMs = 10000 } = options;
  const lock = new RedisDistributedLock(redis, key, ttlMs);
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const acquired = await lock.acquire();
    if (acquired) {
      try {
        return await task();
      } finally {
        await lock.release();
      }
    }
    if (attempt === maxRetries) {
      throw new Error(`Failed to acquire lock after ${maxRetries} retries`);
    }
    // Exponential backoff plus up to 100 ms of random jitter.
    const delay = baseDelayMs * Math.pow(2, attempt) + Math.random() * 100;
    await new Promise(resolve => setTimeout(resolve, delay));
  }
  throw new Error('Unreachable');
}
Rationale
- Lua atomicity prevents the check-then-act race condition that breaks naive implementations
- Background renewal decouples task duration from lock TTL, so a slow task keeps its lock for as long as renewals continue to succeed
- Exponential backoff with jitter prevents thundering herd scenarios during high contention
- Ownership verification ensures processes only release locks they hold, which is critical in GC pause scenarios where a process resumes after its TTL has expired and still believes it holds the lock
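The backoff point can be checked in isolation. The sketch below extracts the delay formula from `withDistributedLock` into a pure function; the name `backoffDelayMs` and the injectable `random` parameter are assumptions added here so the schedule can be inspected deterministically.

```typescript
// Hypothetical helper mirroring the delay formula used in withDistributedLock:
// baseDelayMs * 2^attempt plus a random jitter component.
export function backoffDelayMs(
  attempt: number,
  baseDelayMs = 100,
  jitterMs = 100,
  random: () => number = Math.random // injectable so the schedule can be tested
): number {
  return baseDelayMs * Math.pow(2, attempt) + random() * jitterMs;
}

// With jitter pinned to zero, the schedule doubles on every attempt.
const noJitterSchedule = [0, 1, 2, 3].map(a => backoffDelayMs(a, 100, 100, () => 0));
```

With real jitter, two processes that fail on the same attempt are unlikely to pick the same wake-up time, which spreads their retries out.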
Pitfall Guide
- Using SETNX without TTL: Locks never expire when processes crash or hang, so the system deadlocks until someone intervenes manually. Always pair acquisition with PX or EX to enforce lease semantics.
- Releasing locks without ownership verification: A process that acquires a lock, experiences a GC pause past the TTL, and then attempts to release will delete a lock now held by another process. Lua scripts must verify that the lock value matches the owner UUID before deletion.
- Ignoring clock skew across nodes: Redis TTL relies on server time. In multi-node deployments, clock drift causes premature expiration or extended holds. Use lease renewal and prefer consensus-based systems (etcd, ZooKeeper) when strict temporal guarantees are required.
- No lease renewal for long-running tasks: Fixed TTLs assume predictable execution time. Background workers processing large payloads or waiting on external APIs will exceed TTLs. Implement automatic renewal at an interval of one-third to one-half of the TTL.
- Single-node lock services in production: A single Redis instance is a single point of failure, and network partitions can cause split-brain scenarios where multiple clients believe they hold the same lock. Deploy Redis Sentinel or Cluster for availability (asynchronous replication can still lose a lock during failover), or use quorum-based acquisition (Redlock) for critical paths.
- Blocking retries without jitter: Retry loops with fixed delays cause thundering herd effects. All contending processes wake simultaneously, overwhelming the lock service. Add randomized jitter to backoff calculations.
- Treating locks as transaction boundaries: Distributed locks coordinate access; they do not guarantee consistency. They do not replace idempotency keys, optimistic concurrency control, or compensating transactions. Locks should protect critical sections, not entire business workflows.
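The ownership-verification pitfall is easier to see with a simulation. The sketch below uses an in-memory stand-in for Redis (the `FakeLeaseStore` class is hypothetical and exists only for illustration) with a manually advanced clock, so the "pause past the TTL" sequence can be replayed step by step.

```typescript
// In-memory stand-in for Redis, for illustration only. It is not a real client.
class FakeLeaseStore {
  private entries = new Map<string, { value: string; expiresAt: number }>();
  private now: () => number;

  constructor(now: () => number) {
    this.now = now;
  }

  // Mirrors SET key value NX PX ttl: succeeds only if no live entry exists.
  setIfAbsent(key: string, value: string, ttlMs: number): boolean {
    const current = this.entries.get(key);
    if (current && current.expiresAt > this.now()) return false;
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
    return true;
  }

  // Mirrors the compare-and-delete Lua script: deletes only the caller's own lock.
  deleteIfOwner(key: string, value: string): boolean {
    const current = this.entries.get(key);
    if (!current || current.expiresAt <= this.now() || current.value !== value) return false;
    this.entries.delete(key);
    return true;
  }

  owner(key: string): string | null {
    const current = this.entries.get(key);
    return current && current.expiresAt > this.now() ? current.value : null;
  }
}

let clock = 0;
const store = new FakeLeaseStore(() => clock);

const aAcquired = store.setIfAbsent('lock:job', 'worker-a', 1000);
clock = 1500; // worker A stalls (for example, a GC pause) past its 1000 ms TTL
const bAcquired = store.setIfAbsent('lock:job', 'worker-b', 1000);
// Worker A wakes up and releases; the ownership check turns this into a no-op.
const staleRelease = store.deleteIfOwner('lock:job', 'worker-a');
const ownerAfter = store.owner('lock:job');
```

Without the value comparison in `deleteIfOwner`, worker A's late release would remove worker B's lock and let a third process in.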
Best Practices from Production
- Monitor lock acquisition latency and contention rates; alert when >5% of acquisitions require retries
- Use circuit breakers for the lock service to prevent cascading failures during Redis outages
- Set TTL to 3–5x the expected critical section duration; renewal handles variance
- Never nest distributed locks across different keys without a strict ordering protocol to prevent deadlocks
- Log lock lifecycle events (acquire, renew, release, timeout) with trace IDs for distributed tracing
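One simple ordering protocol for nested locks is to sort the keys and always acquire them in that order. The sketch below assumes this approach; `orderLockKeys`, `acquireAllInOrder`, and the `OrderedLock` shape are hypothetical names, with `OrderedLock` matching the `acquire`/`release` methods of `DistributedLock`.

```typescript
// Minimal lock shape assumed for this sketch.
interface OrderedLock {
  acquire(): Promise<boolean>;
  release(): Promise<void>;
}

// Deduplicate and sort so every process requests the same keys in the same order.
export function orderLockKeys(keys: string[]): string[] {
  return Array.from(new Set(keys)).sort();
}

// Acquire every key in sorted order; if any acquisition fails, release what
// was taken and report failure by returning null.
export async function acquireAllInOrder(
  keys: string[],
  makeLock: (key: string) => OrderedLock // hypothetical factory, e.g. createLock
): Promise<OrderedLock[] | null> {
  const held: OrderedLock[] = [];
  for (const key of orderLockKeys(keys)) {
    const lock = makeLock(key);
    if (!(await lock.acquire())) {
      // Back out in reverse order so no partial set of locks stays held.
      for (const acquired of held.reverse()) {
        await acquired.release();
      }
      return null;
    }
    held.push(lock);
  }
  return held;
}
```

Because two processes that need both `order:1` and `order:2` will each try `order:1` first, neither can end up holding one key while waiting on the other.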
Production Bundle
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Non-critical worker deduplication | Single-Node Redis + TTL | Low latency, simple deployment, acceptable failure risk | Low infrastructure cost |
| Financial transaction coordination | etcd Lease or Redis Redlock | Partition tolerance, clock skew resilience, strong consistency | Moderate infrastructure cost |
| Serverless function coordination | Redis Cluster + Quorum Acquisition | Stateless functions require external lease management with high availability | Pay-per-use Redis cluster cost |
| Database-backed critical paths | Advisory Locks + Application-level retry | Leverages existing DB ACID guarantees, avoids external dependency | Zero additional infrastructure |
Configuration Template
// lock.config.ts
import Redis from 'ioredis';
import { RedisDistributedLock } from './distributed-lock';
export const lockRedis = new Redis({
  host: process.env.REDIS_HOST || 'localhost',
  port: parseInt(process.env.REDIS_PORT || '6379', 10),
  password: process.env.REDIS_PASSWORD,
  maxRetriesPerRequest: 3,
  // Reconnect delay grows by 50 ms per attempt, capped at 2 seconds.
  retryStrategy: (times) => Math.min(times * 50, 2000),
  enableReadyCheck: true,
  // Reconnect when a failover leaves this client attached to a read-only replica.
  reconnectOnError: (err) => err.message.includes('READONLY')
});
export const lockDefaults = {
  ttlMs: 15000,
  renewalInterval: 5000,
  maxRetries: 4,
  baseDelayMs: 150,
  jitterRange: 200
};
export function createLock(key: string) {
  return new RedisDistributedLock(
    lockRedis,
    key,
    lockDefaults.ttlMs,
    lockDefaults.renewalInterval
  );
}
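Since renewal only helps when it fires well before the lease runs out, it can be worth validating the timing values at startup. The following is a minimal sketch under that assumption; `validateLockTiming` and `LockTiming` are hypothetical names, and the half-TTL bound follows the renewal guidance in the Pitfall Guide.

```typescript
// Hypothetical startup check for the timing fields in lockDefaults.
export interface LockTiming {
  ttlMs: number;
  renewalInterval: number;
}

// Returns a list of human-readable problems; an empty list means the values look sane.
export function validateLockTiming(timing: LockTiming): string[] {
  const problems: string[] = [];
  if (!(timing.ttlMs > 0)) {
    problems.push('ttlMs must be a positive number');
  }
  if (!(timing.renewalInterval > 0)) {
    problems.push('renewalInterval must be a positive number');
  }
  if (timing.renewalInterval > timing.ttlMs / 2) {
    // Renewing later than half the TTL leaves little room for a slow or failed renewal.
    problems.push('renewalInterval should be at most half of ttlMs');
  }
  return problems;
}
```

For the template values above (`ttlMs: 15000`, `renewalInterval: 5000`), the check returns an empty list.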
Quick Start Guide
- Install dependencies:
npm install ioredis
- Create lock instance:
const lock = createLock('order-processing:12345');
- Acquire and execute:
const acquired = await lock.acquire();
if (acquired) {
  try {
    // critical section
  } finally {
    await lock.release();
  }
}
- Wrap with retry: Use withDistributedLock(redis, key, task) for automatic backoff and cleanup
- Verify in monitoring: Check the Redis keyspace for lock:* patterns and confirm TTL expiration behavior under load