n below uses ioredis for pipeline support, cluster readiness, and deterministic retry logic.
Step 1: Define Cache Service Architecture
The cache service must abstract serialization, TTL management, and concurrency control. Never expose raw Redis commands to business logic.
import Redis, { RedisOptions } from 'ioredis';
interface CacheConfig {
host: string;
port: number;
password?: string;
maxRetriesPerRequest: number;
enableReadyCheck: boolean;
}
interface CacheMetrics {
hits: number;
misses: number;
errors: number;
}
export class ProductionCache {
private client: Redis;
private metrics: CacheMetrics = { hits: 0, misses: 0, errors: 0 };
constructor(config: CacheConfig) {
this.client = new Redis({
...config,
retryStrategy: (times: number) => Math.min(times * 50, 2000),
maxRetriesPerRequest: config.maxRetriesPerRequest,
enableReadyCheck: config.enableReadyCheck,
// Connect as soon as the client is constructed (the ioredis default) so connection problems surface at startup
lazyConnect: false,
});
this.client.on('error', (err) => {
console.error('[Redis] Connection error:', err.message);
this.metrics.errors++;
});
}
// Serialize with deterministic JSON handling; replace with msgpack for hot paths
private serialize(value: unknown): string {
return JSON.stringify(value);
}
private deserialize<T>(raw: string | null): T | null {
if (!raw) return null;
try {
return JSON.parse(raw) as T;
} catch {
return null;
}
}
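To see what the retryStrategy in the constructor produces, the sketch below evaluates the same expression for a few attempt numbers. The name reconnectDelayMs is a label chosen here for illustration; ioredis calls whatever function is passed as retryStrategy with the attempt count and waits the returned number of milliseconds before reconnecting.

```typescript
// Same expression as the retryStrategy above: linear backoff, capped at 2 seconds.
function reconnectDelayMs(times: number): number {
  return Math.min(times * 50, 2000);
}

// Delay before reconnect attempts 1, 2, 10, 40, and 100.
const schedule = [1, 2, 10, 40, 100].map(reconnectDelayMs);
console.log(schedule); // [ 50, 100, 500, 2000, 2000 ]
```

The cap is reached at attempt 40. Because this function always returns a number, the client should keep retrying every 2 seconds after that rather than giving up.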
Step 2: Implement Cache-Aside with Probabilistic Early Expiration
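Static TTLs cause synchronized expiration: keys written together expire together, and the resulting misses reach the database at the same moment. The code below applies TTL jitter, shortening each key's TTL by a random 5-15% so that expirations, and therefore cache misses, are spread across time.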
async get<T>(key: string, ttl: number): Promise<T | null> {
try {
const raw = await this.client.get(key);
if (raw) {
this.metrics.hits++;
return this.deserialize<T>(raw);
}
this.metrics.misses++;
return null;
} catch {
this.metrics.errors++;
return null;
}
}
async set<T>(key: string, value: T, ttl: number): Promise<void> {
try {
// Probabilistic early expiration: reduce TTL by 5-15% randomly
const jitter = Math.floor(ttl * (0.05 + Math.random() * 0.1));
const effectiveTtl = ttl - jitter;
await this.client.set(key, this.serialize(value), 'EX', effectiveTtl);
} catch {
this.metrics.errors++;
}
}
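The set method above applies TTL jitter, which spreads expirations out but still lets every reader miss once a key has expired. A related technique, often what "probabilistic early expiration" refers to, lets a reader occasionally refresh a key shortly before it expires. The sketch below shows that decision as a pure function. The name shouldRefreshEarly, its parameters, and the idea of storing the last recompute duration (deltaMs) next to the value are assumptions for illustration and are not part of the ProductionCache class.

```typescript
/**
 * XFetch-style early refresh check.
 * nowMs / expiryMs: current time and the key's expiry time, in milliseconds.
 * deltaMs: how long the last recomputation took.
 * beta: tuning knob; values above 1 refresh earlier, values below 1 later.
 * rand: uniform random number in (0, 1], injectable so the function can be tested.
 */
function shouldRefreshEarly(
  nowMs: number,
  expiryMs: number,
  deltaMs: number,
  beta: number = 1,
  rand: number = 1 - Math.random()
): boolean {
  // Math.log(rand) is <= 0, so the subtracted term pushes "now" forward by a random amount.
  return nowMs - deltaMs * beta * Math.log(rand) >= expiryMs;
}

const expiryMs = 10000;
// Far from expiry with a typical random draw: keep serving the cached value.
console.log(shouldRefreshEarly(5000, expiryMs, 200, 1, 0.5)); // false
// At or past expiry: always refresh.
console.log(shouldRefreshEarly(10000, expiryMs, 200, 1, 0.5)); // true
```

Slow recomputations (large deltaMs) widen the early window, so expensive keys are more likely to be refreshed before they expire.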
Step 3: Stampede Mitigation via Lock Coalescing
When multiple requests miss the cache simultaneously, they all hit the database. Lock coalescing ensures only one request rebuilds the cache while others wait.
async getOrSet<T>(
key: string,
ttl: number,
fetchFn: () => Promise<T>
): Promise<T> {
const cached = await this.get<T>(key, ttl);
// Compare against null so cached falsy values (0, false, '') still count as hits
if (cached !== null) return cached;
const lockKey = `${key}:lock`;
const lockAcquired = await this.client.set(lockKey, '1', 'EX', 10, 'NX');
if (lockAcquired) {
try {
const fresh = await fetchFn();
await this.set(key, fresh, ttl);
return fresh;
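} finally {
// Release the lock. Caveat: if fetchFn outlives the 10-second lock expiry, this may delete a lock that another caller now holds
await this.client.del(lockKey);
}
}
// Wait for the lock holder to populate the cache, then retry; the lock's 10-second expiry bounds the wait if the holder crashes
await new Promise((res) => setTimeout(res, 100));
return this.getOrSet(key, ttl, fetchFn);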
}
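The Redis lock above coordinates across processes, but every concurrent caller still issues its own GET and SET NX. A complementary technique, in-process request coalescing (sometimes called single-flight), lets concurrent callers inside one Node.js process share a single in-flight promise. This is a different mechanism from the Redis lock, and the SingleFlight class below is a hypothetical helper, not part of ioredis.

```typescript
// Hypothetical helper: shares one in-flight promise per key within a single process.
class SingleFlight {
  private inFlight = new Map<string, Promise<unknown>>();

  run<T>(key: string, fn: () => Promise<T>): Promise<T> {
    const existing = this.inFlight.get(key);
    if (existing) return existing as Promise<T>;

    const pending = (async () => {
      try {
        // Defer fn by one microtask so the map entry below is registered before it runs.
        return await Promise.resolve().then(fn);
      } finally {
        // Clear the entry on success or failure so the next caller starts a fresh load.
        this.inFlight.delete(key);
      }
    })();
    this.inFlight.set(key, pending);
    return pending;
  }
}
```

Wrapping the call as singleFlight.run(key, () => cache.getOrSet(key, ttl, fetchFn)) would mean at most one caller per process contends for the Redis lock at a time.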
Step 4: Write-Through Pattern for Consistency-Critical Paths
Write-through updates the cache synchronously with the primary store. It guarantees consistency at the cost of write latency. Use it for user sessions, inventory counts, and pricing rules.
async writeThrough<T>(
key: string,
value: T,
ttl: number,
writeToPrimary: (val: T) => Promise<void>
): Promise<void> {
// Write to the primary store first; if it throws, the cache is left untouched
await writeToPrimary(value);
// Then update the cache. A pipeline only batches commands into one round-trip; it is not a transaction spanning both stores
await this.client.set(key, this.serialize(value), 'EX', ttl);
}
}
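One way to handle partial failure in a write-through path is to write the primary store first and invalidate the cache entry if the cache update then fails. The sketch below uses a hypothetical KeyValueCache interface and an in-memory stand-in so that it runs without Redis; with ioredis, set and del would map to client.set(key, value, 'EX', ttl) and client.del(key).

```typescript
// Hypothetical minimal cache interface; an ioredis client could be adapted to it.
interface KeyValueCache {
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
  del(key: string): Promise<void>;
}

async function writeThroughSafe<T>(
  cache: KeyValueCache,
  key: string,
  value: T,
  ttlSeconds: number,
  writeToPrimary: (val: T) => Promise<void>
): Promise<void> {
  // If the primary write throws, the error propagates and the cache is never touched.
  await writeToPrimary(value);
  try {
    await cache.set(key, JSON.stringify(value), ttlSeconds);
  } catch {
    // Best effort: drop the old entry so readers fall back to the primary store.
    await cache.del(key).catch(() => undefined);
  }
}

// In-memory stand-in for Redis, used only to exercise the function (TTL is ignored).
class MemoryCache implements KeyValueCache {
  readonly data = new Map<string, string>();
  async set(key: string, value: string, _ttlSeconds: number): Promise<void> {
    this.data.set(key, value);
  }
  async del(key: string): Promise<void> {
    this.data.delete(key);
  }
}
```

With this ordering a failed cache update costs one extra cache miss, whereas a cache-first ordering can leave a value in the cache that the primary store never accepted.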
Architecture Rationale
- ioredis over redis: Native pipeline support, cluster topology awareness, and deterministic retry strategies. The redis package's connection pooling lacks production-grade backpressure handling.
- Probabilistic expiration over mutex locks: Mutex locks serialize cache misses, creating artificial bottlenecks. Probabilistic TTLs distribute misses naturally. Lock coalescing acts as a safety net for high-concurrency windows.
- Write-through vs write-behind: Write-behind improves write throughput but introduces data loss risk on cache node failure. Write-through is preferred for financial, inventory, and session data where consistency outweighs latency.
- Serialization choice: JSON is debuggable and sufficient for 80% of workloads. Replace with msgpackr or protobuf when payload size exceeds 2KB or serialization consumes >5% of CPU time.
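The 2KB threshold in the serialization bullet is easier to apply with a measurement in hand. A minimal sketch, assuming the UTF-8 byte length of the JSON string is the size that matters; serializedBytes and exceedsJsonBudget are hypothetical helper names.

```typescript
// UTF-8 byte length of the JSON encoding, which is what is sent to Redis as the value.
function serializedBytes(value: unknown): number {
  return new TextEncoder().encode(JSON.stringify(value)).length;
}

// True when a payload crosses the size at which a binary format is worth benchmarking.
function exceedsJsonBudget(value: unknown, limitBytes: number = 2048): boolean {
  return serializedBytes(value) > limitBytes;
}

console.log(serializedBytes({ a: 1 })); // 7
console.log(exceedsJsonBudget({ blob: "x".repeat(3000) })); // true
```

Logging this figure for a sample of real payloads gives a rough distribution to compare against the threshold before switching formats.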
Pitfall Guide
1. Static TTL Assignment
Mistake: Applying uniform TTLs (e.g., 600s) regardless of data volatility.
Impact: Hot data expires unnecessarily; cold data occupies memory. Memory efficiency drops 30-40%.
Best Practice: Tier TTLs by volatility. Static configuration: 24h. User profiles: 1-4h. Real-time metrics: 30-60s. Instrument expiration rates to adjust dynamically.
2. Cache Stampedes
Mistake: Relying on GET/SET without concurrency control during expiration windows.
Impact: Database connection pool exhaustion, P99 latency spikes, cascading failures.
Best Practice: Implement probabilistic early expiration + lock coalescing. For extreme traffic, use cache warming strategies or pre-computed snapshots.
3. Missing Invalidation on Mutations
Mistake: Updating the primary database without purging or updating the cache.
Impact: Silent data staleness. Users see outdated prices, inventory, or permissions.
Best Practice: Bind cache invalidation to mutation paths. Use event-driven invalidation (Redis Pub/Sub, Kafka, or CDC) for distributed systems. Never assume cache consistency is automatic.
4. Caching Cheap Queries
Mistake: Caching queries that execute in <10ms or receive <50 RPS.
Impact: Serialization overhead, network round-trips, and memory allocation exceed query cost. Net performance degradation.
Best Practice: Cache only queries with measurable cost: >10ms execution, >100 RPS, or complex joins/aggregations. Profile before caching.
5. Ignoring Serialization Overhead
Mistake: Serializing large payloads or complex objects without benchmarking.
Impact: CPU spikes, increased latency, and memory fragmentation. JSON.stringify can consume 15-25% of request time for nested objects.
Best Practice: Flatten cache payloads. Use msgpackr for binary efficiency. Measure serialization cost relative to database query time. Cache only when serialization + network < database query.
6. Unbounded Cache Growth
Mistake: Omitting maxmemory-policy or relying on default noeviction.
Impact: Redis rejects writes, causing application crashes or silent failures. Memory leaks compound over days.
Best Practice: Set maxmemory-policy allkeys-lru or volatile-ttl. Monitor evicted_keys and used_memory_peak. Implement key prefixing for bulk invalidation.
7. Treating Cache as Stateless
Mistake: Assuming cache can be dropped and rebuilt instantly without side effects.
Impact: Cache rebuild storms, inconsistent states, and lost rate-limit counters or session data.
Best Practice: Design cache as a stateful subsystem. Implement graceful degradation, cache warming routines, and state reconciliation processes. Never cache non-idempotent or ephemeral state without explicit TTL and invalidation contracts.
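Pitfall 3's advice to bind invalidation to the mutation path can be sketched as a wrapper that deletes the affected keys once the database write succeeds. The names withInvalidation and Invalidator are assumptions for illustration; the del signature mirrors the variadic DEL that ioredis exposes as client.del.

```typescript
// Hypothetical minimal interface: anything that can delete cache keys.
interface Invalidator {
  del(...keys: string[]): Promise<number>;
}

// Runs a mutation, then purges the cache keys derived from its result.
async function withInvalidation<R>(
  cache: Invalidator,
  mutate: () => Promise<R>,
  keysFor: (result: R) => string[]
): Promise<R> {
  // If the mutation throws, nothing is invalidated and the error propagates.
  const result = await mutate();
  const keys = keysFor(result);
  if (keys.length > 0) {
    await cache.del(...keys);
  }
  return result;
}
```

A profile update could then be written as withInvalidation(client, () => updateUserInDB(id, patch), (u) => [`user:profile:${u.id}`]), where updateUserInDB stands in for the application's own write function.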
Production Bundle
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Read-heavy catalog, low update frequency | Cache-Aside + Probabilistic TTL | Maximizes hit ratio, minimizes write overhead | Low memory, 30% infra cost reduction |
| User sessions, auth tokens | Write-Through + Fixed TTL | Guarantees consistency, prevents stale auth states | Moderate write cost, high reliability |
| Inventory counts, pricing rules | Write-Through + Event Invalidation | Eliminates overselling, syncs across microservices | Higher write amplification, prevents revenue loss |
| Real-time analytics, dashboards | Cache-Aside + Short TTL + Pre-warm | Balances freshness with query cost | Low memory, predictable latency floor |
| High-concurrency login endpoints | Cache-Aside + Lock Coalescing | Prevents database storms during peak auth traffic | Minimal memory, 4x latency improvement |
Configuration Template
# docker-compose.yml
version: '3.8'
services:
  redis:
    image: redis:7.2-alpine
    command: redis-server /usr/local/etc/redis/redis.conf
    ports:
      - "6379:6379"
    volumes:
      - ./redis.conf:/usr/local/etc/redis/redis.conf
    deploy:
      resources:
        limits:
          memory: 2G
# redis.conf
maxmemory 1500mb
maxmemory-policy allkeys-lru
save ""
appendonly no
tcp-keepalive 300
timeout 0
hz 10
dynamic-hz yes
lazyfree-lazy-eviction yes
lazyfree-lazy-expire yes
// cache-client.ts
import { ProductionCache } from './ProductionCache';
export const cache = new ProductionCache({
host: process.env.REDIS_HOST || '127.0.0.1',
port: parseInt(process.env.REDIS_PORT || '6379', 10),
password: process.env.REDIS_PASSWORD,
maxRetriesPerRequest: 3,
enableReadyCheck: true,
});
// Usage example
export async function getUserProfile(userId: string) {
return cache.getOrSet(
`user:profile:${userId}`,
3600,
() => fetchUserProfileFromDB(userId)
);
}
Quick Start Guide
- Launch Redis with production config: Run docker compose up -d. Verify maxmemory-policy and eviction settings with redis-cli CONFIG GET maxmemory-policy.
- Install dependencies: npm i ioredis msgpackr (optional for serialization). Create ProductionCache.ts using the template above.
- Instrument metrics: Add Prometheus counters for cache_hits, cache_misses, cache_errors, and redis_memory_used. Expose via /metrics endpoint.
- Test stampede mitigation: Run wrk -t12 -c400 -d30s http://localhost:3000/api/user/123. Monitor Redis connected_clients and database query logs. Verify lock coalescing prevents concurrent DB hits.
- Validate invalidation: Update a user profile via API. Confirm cache key is purged or updated within 50ms. Check hit ratio drops to 0% for that key, then recovers on next read.
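For the metrics step, a minimal sketch of turning the counters kept by ProductionCache into a hit ratio and Prometheus text exposition lines. The CacheMetrics shape is copied from Step 1; hitRatio and renderMetrics are hypothetical helpers, ProductionCache would need a getter to expose its private metrics field, and a real service would more likely use a Prometheus client library.

```typescript
interface CacheMetrics {
  hits: number;
  misses: number;
  errors: number;
}

// Fraction of lookups served from cache; 0 when there have been no lookups.
function hitRatio(m: CacheMetrics): number {
  const lookups = m.hits + m.misses;
  return lookups === 0 ? 0 : m.hits / lookups;
}

// Renders the counters in the Prometheus text exposition format.
function renderMetrics(m: CacheMetrics): string {
  const counters: Array<[string, number]> = [
    ["cache_hits", m.hits],
    ["cache_misses", m.misses],
    ["cache_errors", m.errors],
  ];
  return (
    counters
      .map(([name, value]) => `# TYPE ${name} counter\n${name} ${value}`)
      .join("\n") + "\n"
  );
}
```

The string returned by renderMetrics is what a /metrics handler would send as the response body with a text/plain content type.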