strategy requires explicit consistency boundaries, idempotent mutation propagation, and stampede mitigation. The following architecture uses a cache-aside pattern with hybrid invalidation, implemented in TypeScript with ioredis.
Step 1: Define Consistency Boundaries
Classify data by consistency requirements:
- Strong consistency: Financial balances, inventory counts, session state
- Eventual consistency: Product catalogs, user profiles, analytics aggregates
Map these to invalidation tolerances. Strong data requires event-driven invalidation with mutex protection. Eventual data can tolerate TTL + versioned keys.
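The mapping from data class to invalidation tolerance can be written down as a small lookup table. The sketch below is only illustrative: the names `InvalidationPolicy`, `POLICY_BY_CLASS`, `DATA_CLASSES`, and `policyFor`, and the TTL values, are assumptions and not part of the architecture described here.

```typescript
// Hypothetical names and values throughout; tune per workload.
type Consistency = 'strong' | 'eventual';

interface InvalidationPolicy {
  eventDriven: boolean;   // publish an invalidation event on every write
  useMutex: boolean;      // guard rebuilds with a distributed lock
  versionedKeys: boolean; // rely on versioned keys plus TTL
  ttlSeconds: number;     // assumed TTLs, not recommendations
}

const POLICY_BY_CLASS: Record<Consistency, InvalidationPolicy> = {
  strong: { eventDriven: true, useMutex: true, versionedKeys: false, ttlSeconds: 30 },
  eventual: { eventDriven: false, useMutex: false, versionedKeys: true, ttlSeconds: 300 },
};

// Entity types taken from the classification above.
const DATA_CLASSES: Readonly<Record<string, Consistency | undefined>> = {
  balance: 'strong',
  inventory: 'strong',
  session: 'strong',
  product: 'eventual',
  profile: 'eventual',
  analytics: 'eventual',
};

export function policyFor(entityType: string): InvalidationPolicy {
  // Unclassified types fall back to the stricter policy (an assumption: fail safe).
  const cls = DATA_CLASSES[entityType] ?? 'strong';
  return POLICY_BY_CLASS[cls];
}
```

Defaulting unknown entity types to the strong policy costs some extra invalidation traffic but avoids silently serving stale data for a type nobody classified.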
Step 2: Implement Cache-Aside with Versioning
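Versioned keys prevent stale reads during deployments and let you invalidate a whole namespace at once. Instead of deleting keys, increment a version counter and append it to the cache key. Keys built under the old version become unreachable immediately and expire naturally via TTL. In the implementation below, one counter is shared by every entity under versionKey, so a single increment invalidates all of them.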
import Redis from 'ioredis';

// Shared client; REDIS_HOST and REDIS_PORT come from the environment.
const redis = new Redis({
  host: process.env.REDIS_HOST,
  port: parseInt(process.env.REDIS_PORT || '6379', 10),
  maxRetriesPerRequest: 3,
  // Linear backoff, capped at 2 seconds between reconnect attempts.
  retryStrategy: (times) => Math.min(times * 50, 2000)
});

interface CacheConfig {
  ttlSeconds: number;
  versionKey: string;
  consistency: 'strong' | 'eventual';
}

export class CacheManager {
  constructor(private redis: Redis, private config: CacheConfig) {}

  // Current namespace version; a missing counter is treated as version 0.
  async getVersion(): Promise<number> {
    const v = await this.redis.get(this.config.versionKey);
    return v ? parseInt(v, 10) : 0;
  }

  // Bumping the version makes every key built under the old version unreachable.
  async invalidateVersion(): Promise<void> {
    await this.redis.incr(this.config.versionKey);
  }

  buildKey(entityId: string): string {
    return `${this.config.versionKey}:${entityId}:${this.config.consistency}`;
  }

  async get<T>(entityId: string): Promise<T | null> {
    const version = await this.getVersion();
    const key = `${this.buildKey(entityId)}:${version}`;
    const raw = await this.redis.get(key);
    return raw ? (JSON.parse(raw) as T) : null;
  }

  async set<T>(entityId: string, data: T): Promise<void> {
    const version = await this.getVersion();
    const key = `${this.buildKey(entityId)}:${version}`;
    // Old versions are never deleted; they age out through this TTL.
    await this.redis.set(key, JSON.stringify(data), 'EX', this.config.ttlSeconds);
  }
}
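A read path built on versioned keys might look like the following sketch. It reads the version once and reuses it for both the lookup and the write-back, so a value loaded before a version bump is stored under the old version and not the new one. `KeyValueStore`, `MemoryStore`, `setEx`, and `getOrLoad` are hypothetical names introduced for illustration; the in-memory store only stands in for Redis so the example runs on its own.

```typescript
// Minimal async key-value surface. `setEx` is a hypothetical wrapper that
// would map to SET key value EX ttl on a real Redis client.
interface KeyValueStore {
  get(key: string): Promise<string | null>;
  setEx(key: string, value: string, ttlSeconds: number): Promise<void>;
  incr(key: string): Promise<number>;
}

// In-memory stand-in so the sketch runs without Redis (TTL is ignored here).
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();

  async get(key: string): Promise<string | null> {
    return this.data.get(key) ?? null;
  }

  async setEx(key: string, value: string, _ttlSeconds: number): Promise<void> {
    this.data.set(key, value);
  }

  async incr(key: string): Promise<number> {
    const next = Number(this.data.get(key) ?? '0') + 1;
    this.data.set(key, String(next));
    return next;
  }
}

async function getOrLoad<T>(
  store: KeyValueStore,
  versionKey: string,
  entityId: string,
  ttlSeconds: number,
  loader: () => Promise<T>, // hypothetical database fetch
): Promise<T> {
  // Read the version once and reuse it for both lookup and write-back.
  const version = (await store.get(versionKey)) ?? '0';
  const key = `${versionKey}:${entityId}:${version}`;

  const cached = await store.get(key);
  if (cached !== null) return JSON.parse(cached) as T;

  // Cache miss: load from the source of truth, then populate the cache.
  const fresh = await loader();
  await store.setEx(key, JSON.stringify(fresh), ttlSeconds);
  return fresh;
}
```

Calling `store.incr(versionKey)` after a write makes the next `getOrLoad` miss and reload, while entries under the previous version are left to expire.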
Step 3: Wire Event-Driven Invalidation
For strong consistency, publish invalidation events to a Redis Pub/Sub channel or message broker. Subscribers delete or version-bump affected keys. Use idempotent handlers to prevent duplicate processing.
import { EventEmitter } from 'events';
import Redis from 'ioredis';

export class InvalidationBus extends EventEmitter {
  constructor(private redis: Redis, private channel: string) {
    super();
    this.subscribe();
  }

  private subscribe(): void {
    // A connection in subscriber mode cannot issue regular commands,
    // so listen on a dedicated duplicate of the client.
    const subscriber = this.redis.duplicate();
    subscriber.subscribe(this.channel, (err) => {
      if (err) {
        console.error('[InvalidationBus] Subscribe failed:', err);
        return;
      }
      console.log(`[InvalidationBus] Subscribed to ${this.channel}`);
    });
    subscriber.on('message', (ch, payload) => {
      if (ch !== this.channel) return;
      try {
        const { entityId, type } = JSON.parse(payload);
        this.emit('invalidate', { entityId, type });
      } catch (err) {
        console.error('[InvalidationBus] Parse error:', err);
      }
    });
  }

  async publish(entityId: string, type: 'UPDATE' | 'DELETE'): Promise<void> {
    await this.redis.publish(this.channel, JSON.stringify({ entityId, type }));
  }
}
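The bus above forwards every message it receives, so idempotency has to live in the handler. One possible approach, sketched here, is to deduplicate on a per-event identifier. Note that `eventId` is an added, hypothetical field (the payload above carries only `entityId` and `type`), and `IdempotentHandler` and its memory bound are illustrative assumptions.

```typescript
// Hypothetical event shape: `eventId` is NOT part of the original payload.
interface InvalidationEvent {
  eventId: string;
  entityId: string;
  type: 'UPDATE' | 'DELETE';
}

class IdempotentHandler {
  private seen = new Set<string>();

  constructor(
    private apply: (event: InvalidationEvent) => void,
    private maxRemembered = 10_000, // assumed bound on remembered ids
  ) {}

  // Returns true if the event was applied, false if it was a duplicate.
  handle(event: InvalidationEvent): boolean {
    if (this.seen.has(event.eventId)) return false;
    this.seen.add(event.eventId);

    if (this.seen.size > this.maxRemembered) {
      // Sets iterate in insertion order, so the first value is the oldest id.
      const oldest = this.seen.values().next().value;
      if (oldest !== undefined) this.seen.delete(oldest);
    }

    this.apply(event);
    return true;
  }
}
```

The bounded set trades perfect deduplication for fixed memory: a duplicate that arrives after its id has been evicted is applied again, which is tolerable when the applied action is itself safe to repeat.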
Step 4: Mitigate Cache Stampedes
When a high-traffic key expires, concurrent requests trigger simultaneous database queries. Prevent this with probabilistic early expiration or a distributed mutex.
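import Redis from 'ioredis';
import { v4 as uuidv4 } from 'uuid';

// Delete the lock only if it still holds our token. Running the check and the
// delete in one script closes the gap in which the lock could expire and be
// re-acquired by another client between a separate GET and DEL.
const RELEASE_SCRIPT = `
if redis.call("get", KEYS[1]) == ARGV[1] then
  return redis.call("del", KEYS[1])
end
return 0
`;

export class StampedeGuard {
  constructor(private redis: Redis, private lockTtlMs: number = 3000) {}

  async acquireLock(key: string): Promise<string | null> {
    const lockKey = `lock:${key}`;
    const token = uuidv4();
    // NX: set only if absent. PX: auto-expire so a crashed holder cannot block forever.
    const acquired = await this.redis.set(lockKey, token, 'PX', this.lockTtlMs, 'NX');
    return acquired ? token : null;
  }

  async releaseLock(key: string, token: string): Promise<void> {
    await this.redis.eval(RELEASE_SCRIPT, 1, `lock:${key}`, token);
  }
}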
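The other option named in this step, probabilistic early expiration, can be reduced to a small decision function. The sketch below is one simple variant, assuming the caller already knows the key's remaining TTL (for example from the Redis PTTL command, when it returns a non-negative value). `shouldRefreshEarly` and its default parameters are illustrative assumptions.

```typescript
// Hypothetical helper: decides whether this request should refresh the key
// before it expires. `random` is injectable so the decision is testable.
function shouldRefreshEarly(
  remainingTtlMs: number,  // time left on the key, assumed non-negative
  fullTtlMs: number,       // TTL the key was written with
  windowFraction = 0.2,    // refresh window: the final 20% of the TTL
  probability = 0.1,       // share of requests in the window that refresh
  random: () => number = Math.random,
): boolean {
  // Nothing left on the clock: treat as expired and refresh.
  if (remainingTtlMs <= 0) return true;

  // Outside the window no request refreshes; inside it, only a random few do.
  const inWindow = remainingTtlMs <= fullTtlMs * windowFraction;
  return inWindow && random() < probability;
}
```

Because only a fraction of requests in the window trigger a rebuild, the refresh work is spread out ahead of expiry instead of landing on every caller at the moment the key disappears.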
Architecture Decisions & Rationale
- Cache-aside over write-through: Write-through holds each write request until both the cache and the database have committed. Cache-aside defers cache population to read paths, reducing write latency and simplifying error handling.
- Versioning over direct deletion: Invalidating many keys by deletion means enumerating and removing each one. A version bump invalidates a whole namespace with a single INCR and lets old keys expire naturally. Because every key in that namespace then misses at once, pair version bumps with the stampede guard from Step 4 for hot keys.
- Pub/Sub for invalidation: Low-latency broadcast with no persistence and at-most-once delivery. Acceptable for invalidation because a missed message is bounded by the TTL fallback, and a duplicate message only causes an extra, harmless version increment.
- Distributed mutex for stampedes: Redis SET NX PX provides lightweight mutual exclusion without external coordination services. Fall back to a direct DB query if the lock cannot be acquired before a timeout.
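The lock-then-fallback behavior in the last decision can be sketched as a bounded wait loop. This is only one possible shape: `LockGuard` mirrors the acquire/release methods of the guard from Step 4, while `rebuildWithLock`, the callback parameters, and the wait and poll values are hypothetical.

```typescript
// Structural interface matching the acquireLock/releaseLock shape from Step 4.
interface LockGuard {
  acquireLock(key: string): Promise<string | null>;
  releaseLock(key: string, token: string): Promise<void>;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function rebuildWithLock<T>(
  guard: LockGuard,
  key: string,
  readCache: () => Promise<T | null>,       // hypothetical cache lookup
  loadFromDb: () => Promise<T>,             // hypothetical database query
  writeCache: (value: T) => Promise<void>,  // hypothetical cache write
  waitBudgetMs = 500, // assumed total time to wait for the lock
  pollIntervalMs = 50, // assumed delay between attempts
): Promise<T> {
  const deadline = Date.now() + waitBudgetMs;

  while (Date.now() <= deadline) {
    const token = await guard.acquireLock(key);
    if (token !== null) {
      try {
        // Re-check the cache: another holder may have filled it while we waited.
        const cached = await readCache();
        if (cached !== null) return cached;
        const fresh = await loadFromDb();
        await writeCache(fresh);
        return fresh;
      } finally {
        await guard.releaseLock(key, token);
      }
    }

    // Lock is held elsewhere: check whether the holder has populated the cache.
    const cached = await readCache();
    if (cached !== null) return cached;
    await sleep(pollIntervalMs);
  }

  // Wait budget exhausted: query the database directly without caching.
  return loadFromDb();
}
```

Skipping the cache write on the fallback path is a deliberate choice in this sketch: the lock holder is expected to populate the key, and a second writer would only race with it.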
Pitfall Guide
- Blind TTL extension on read: Extending TTL every time a key is accessed creates "hot" keys that never expire, causing stale data to persist indefinitely. Best practice: Only extend TTL on explicit writes or use sliding expiration with a hard maximum lifetime.
- Cross-entity invalidation gaps: Updating a parent entity without invalidating cached child aggregates or denormalized views. Best practice: Map entity relationships explicitly and trigger batch invalidation events for all dependent cache keys.
- Thundering herd on expiration: Thousands of requests hitting an expired key simultaneously. Best practice: Implement probabilistic early expiration (e.g., a 10% chance of refreshing on each read once the key is within the last 20% of its TTL) or distributed mutex locks for high-traffic keys.
- Race conditions in write-through: Application writes to DB, cache updates, but a concurrent read fetches stale cache before invalidation propagates. Best practice: Use cache-aside for write-heavy paths, or enforce sequential consistency via distributed locks around critical sections.
- Ignoring cache topology: Redis Sentinel and Cluster modes introduce replication lag and partition scenarios. Invalidation messages may arrive on nodes that haven't replicated the mutation yet. Best practice: Route invalidation through a single authoritative node or use consistent hashing with version vectors.
- Unbounded invalidation queues: Message brokers backing up during traffic spikes, causing delayed invalidation and memory pressure. Best practice: Implement backpressure, discard stale invalidation events older than a configurable threshold, and monitor queue depth.
- No cache health observability: Teams monitor hit rate but ignore stale read ratio, invalidation latency, and stampede frequency. Best practice: Instrument cache managers with metrics for cache_stale_ratio, invalidation_propagation_ms, and stampede_prevented_count. Alert on stale ratio > 2% or invalidation latency > 500ms.
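For the cross-entity pitfall above, the explicit relationship map can be as simple as an adjacency list that is walked whenever an entity changes. The following is a minimal sketch under that assumption; `DependencyMap`, `expandInvalidation`, and the example key names are hypothetical.

```typescript
// Hypothetical dependency map: each key lists the cached entities derived from it.
type DependencyMap = Readonly<Record<string, readonly string[] | undefined>>;

// Breadth-first walk that returns the changed key plus everything that
// depends on it, directly or transitively. The visited set guards against cycles.
function expandInvalidation(root: string, deps: DependencyMap): string[] {
  const visited = new Set<string>([root]);
  const queue: string[] = [root];
  const ordered: string[] = [];

  while (queue.length > 0) {
    const current = queue.shift() as string;
    ordered.push(current);
    for (const child of deps[current] ?? []) {
      if (!visited.has(child)) {
        visited.add(child);
        queue.push(child);
      }
    }
  }
  return ordered;
}

// Example relationships (illustrative key names only).
const exampleDeps: DependencyMap = {
  'product:42': ['category:7:listing', 'cart:summary'],
  'category:7:listing': ['home:featured', 'cart:summary'],
};

const keysToInvalidate = expandInvalidation('product:42', exampleDeps);
// keysToInvalidate: product:42, category:7:listing, cart:summary, home:featured
```

The resulting list can then be published as one batch of invalidation events, so dependent aggregates are never left behind when a parent changes.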
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Read-heavy catalog with infrequent updates | TTL-Only + Versioning | Low write volume makes event-driven overhead unnecessary; versioning handles deployments safely | Low infra cost, moderate memory usage |
| Inventory/financial data requiring sub-100ms consistency | Event-Driven + Mutex | Strong consistency demands immediate invalidation; mutex prevents stampede-induced DB thrashing | Higher network overhead, reduced DB load during bursts |
| Multi-region deployment with cross-node replication lag | Hybrid + Version Vectors | Pub/Sub latency varies across regions; version vectors ensure stale reads are detected and refreshed | Moderate infra cost, requires version sync service |
| High-churn session/cache with strict memory limits | TTL-Only with aggressive eviction | Memory constraints favor expiration over explicit invalidation; versioning adds key bloat | Low operational overhead, requires careful TTL tuning |
Configuration Template
// cache-config.ts
import Redis from 'ioredis';
import { CacheManager } from './cache-manager';
import { InvalidationBus } from './invalidation-bus';
import { StampedeGuard } from './stampede-guard';

const redis = new Redis({
  host: process.env.REDIS_HOST || '127.0.0.1',
  port: parseInt(process.env.REDIS_PORT || '6379', 10),
  password: process.env.REDIS_PASSWORD || undefined,
  maxRetriesPerRequest: 3,
  // Linear backoff, capped at 2 seconds between reconnect attempts.
  retryStrategy: (times) => Math.min(times * 100, 2000),
  enableReadyCheck: true,
  connectTimeout: 5000
});

export const cacheManager = new CacheManager(redis, {
  ttlSeconds: 120,
  versionKey: 'app:version:products',
  consistency: 'eventual'
});

export const invalidationBus = new InvalidationBus(redis, 'invalidation:products');
export const stampedeGuard = new StampedeGuard(redis, 2500);

// Graceful shutdown
process.on('SIGTERM', async () => {
  await redis.quit();
  process.exit(0);
});
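The template exports the bus and the cache manager but does not connect them. One way to do that wiring is sketched below, assuming the exported instances satisfy the small structural interfaces shown. `BusLike`, `VersionedCache`, and `wireInvalidation` are hypothetical names used so the example stands alone.

```typescript
// Structural types: the bus only needs `on('invalidate', ...)`, the cache
// only needs `invalidateVersion()`.
interface InvalidateEvent {
  entityId: string;
  type: 'UPDATE' | 'DELETE';
}

interface BusLike {
  on(event: 'invalidate', listener: (e: InvalidateEvent) => void): unknown;
}

interface VersionedCache {
  invalidateVersion(): Promise<void>;
}

function wireInvalidation(
  bus: BusLike,
  cache: VersionedCache,
  onError: (err: unknown, e: InvalidateEvent) => void = (err) =>
    console.error('[wireInvalidation] version bump failed:', err),
): void {
  bus.on('invalidate', (e) => {
    // Event emitters ignore returned promises, so failures are caught here
    // to avoid an unhandled rejection.
    cache.invalidateVersion().catch((err: unknown) => onError(err, e));
  });
}
```

With the exports above this would be called once at startup as `wireInvalidation(invalidationBus, cacheManager)`. Each event then bumps the shared version, which invalidates every entity under that version key and not only the one named in the event.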
Quick Start Guide
- Initialize Redis: Run docker run -d -p 6379:6379 redis:7-alpine to start a local Redis instance.
- Install dependencies: Execute npm i ioredis uuid in your project directory.
- Copy configuration: Paste the Configuration Template into cache-config.ts and ensure environment variables match your Redis endpoint.
- Run integration test: Create a test script that calls cacheManager.set(), publishes an invalidation event, verifies cacheManager.get() returns updated data, and confirms version increment. Execute with npx ts-node test-cache.ts.
- Verify metrics: Check Redis keyspace hits/misses via redis-cli INFO stats and confirm invalidation latency stays under 200ms under 1k RPS load testing with autocannon.