e-aside routing with SWR background refresh, pipeline batching, and connection pooling using ioredis v5. It includes a lightweight distributed mutex to prevent cache stampedes during cold starts or TTL expiry.
import Redis, { RedisOptions } from 'ioredis';
import { createHash } from 'crypto';
import { pack, unpack } from 'msgpackr';
// Redis 7.2 + ioredis v5 configuration
const REDIS_CONFIG: RedisOptions = {
host: process.env.REDIS_HOST || '127.0.0.1',
port: parseInt(process.env.REDIS_PORT || '6379', 10),
maxRetriesPerRequest: 3,
retryStrategy: (times: number) => Math.min(times * 50, 2000),
lazyConnect: true,
enableReadyCheck: true,
// Redis 7.2 optimization: background thread for memory cleanup
lazyFreeUnlinkOnEviction: true,
// Pipeline queue limits to prevent memory bloat during traffic spikes
maxRetriesPerRequest: 3,
};
const SWR_TTL_BUFFER = 30; // seconds to trigger background refresh before expiry
const STAMPEDE_LOCK_TTL = 5000; // ms to hold mutex during cold fetch
const MAX_PIPELINE_SIZE = 50;
export class CacheService {
private readonly client: Redis;
constructor(config: Partial<RedisOptions> = {}) {
this.client = new Redis({ ...REDIS_CONFIG, ...config });
this.client.on('error', (err) => {
console.error('[CacheService] Redis connection error:', err.message);
});
}
// Deterministic key generation with sorted params
private generateKey(resource: string, params: Record<string, unknown>, tenantId: string): string {
const sortedParams = Object.keys(params)
.sort()
.map((k) => `${k}:${params[k]}`)
.join('|');
const paramHash = createHash('sha256').update(sortedParams).digest('hex').slice(0, 12);
return `v1:api:${resource}:${paramHash}:${tenantId}`;
}
// Lightweight distributed mutex using SET NX PX
private async acquireLock(key: string): Promise<boolean> {
const lockKey = `${key}:lock`;
const acquired = await this.client.set(lockKey, '1', 'PX', STAMPEDE_LOCK_TTL, 'NX');
return acquired === 'OK';
}
private async releaseLock(key: string): Promise<void> {
await this.client.del(`${key}:lock`);
}
// Core Cache-Aside + SWR pattern
async get<T>(
resource: string,
params: Record<string, unknown>,
tenantId: string,
ttl: number,
fetchFn: () => Promise<T>
): Promise<T> {
const key = this.generateKey(resource, params, tenantId);
const raw = await this.client.get(key);
if (raw) {
const cached = unpack(Buffer.from(raw, 'base64')) as { data: T; expiresAt: number };
// SWR: Trigger async refresh if within buffer window
if (Date.now() >= cached.expiresAt - SWR_TTL_BUFFER * 1000) {
this.refreshAsync(key, ttl, fetchFn).catch(() => {}); // fire-and-forget
}
return cached.data;
}
// Cold start / TTL expired: prevent stampede
const locked = await this.acquireLock(key);
if (locked) {
try {
const freshData = await fetchFn();
const payload = pack({ data: freshData, expiresAt: Date.now() + ttl * 1000 });
await this.client.set(key, payload.toString('base64'), 'EX', ttl);
return freshData;
} finally {
await this.releaseLock(key);
}
}
// Contended: wait briefly and retry, or fall back to DB
await new Promise((res) => setTimeout(res, 50));
const retryRaw = await this.client.get(key);
if (retryRaw) {
const retry = unpack(Buffer.from(retryRaw, 'base64')) as { data: T; expiresAt: number };
return retry.data;
}
return fetchFn();
}
// Background SWR refresh
private async refreshAsync<T>(
key: string,
ttl: number,
fetchFn: () => Promise<T>
): Promise<void> {
const locked = await this.acquireLock(key);
if (!locked) return; // Another process already refreshing
try {
const freshData = await fetchFn();
const payload = pack({ data: freshData, expiresAt: Date.now() + ttl * 1000 });
await this.client.set(key, payload.toString('base64'), 'EX', ttl);
} finally {
await this.releaseLock(key);
}
}
// Pipeline batch read for N+1 scenarios
async batchGet<T>(
resource: string,
paramSets: Array<{ params: Record<string, unknown>; tenantId: string }>,
ttl: number,
fetchFn: (keys: string[]) => Promise<Map<string, T>>
): Promise<Map<string, T>> {
const keys = paramSets.map((p) => this.generateKey(resource, p.params, p.tenantId));
const pipeline = this.client.pipeline();
keys.forEach((k) => pipeline.get(k));
const results = await pipeline.exec();
const cache = new Map<string, T>();
const misses: string[] = [];
results?.forEach((res, idx) => {
if (res && !res[0]) {
const raw = res[1] as string | null;
if (raw) {
const parsed = unpack(Buffer.from(raw, 'base64')) as { data: T; expiresAt: number };
cache.set(keys[idx], parsed.data);
} else {
misses.push(keys[idx]);
}
} else {
misses.push(keys[idx]); // Error or null
}
});
if (misses.length > 0) {
const fresh = await fetchFn(misses);
const pipelineSet = this.client.pipeline();
fresh.forEach((data, key) => {
const payload = pack({ data, expiresAt: Date.now() + ttl * 1000 });
pipelineSet.set(key, payload.toString('base64'), 'EX', ttl);
});
await pipelineSet.exec();
fresh.forEach((data, key) => cache.set(key, data));
}
return cache;
}
async disconnect(): Promise<void> {
await this.client.quit();
}
}
Step 3: Usage Example & Integration Pattern
const cache = new CacheService();
// Example: Fetch user profile with SWR + stampede protection
const profile = await cache.get(
'user',
{ id: 'u_12345', include: 'preferences' },
'tenant_acme',
300, // 5 min TTL
async () => {
// Simulate DB/API call
return { id: 'u_12345', name: 'Alex', role: 'admin' };
}
);
await cache.disconnect();
Pitfall Guide
| Symptom | Root Cause | Diagnostic Command | Fix |
|---|
| p99 latency spikes every N minutes | TTL expiration thundering herd | redis-cli --latency-history shows periodic latency jumps | Implement SWR buffer + distributed mutex (SET NX PX). Never use exact TTL expiry for hot keys. |
OOM command not allowed or swap thrashing | maxmemory misconfigured or fragmentation ratio > 1.5 | redis-cli INFO memory β mem_fragmentation_ratio | Set maxmemory-policy allkeys-lru. Enable Redis 7.2 lazyfree-lazy-eviction yes. Avoid storing large uncompressed blobs. |
| Serialization mismatches / corrupted data | Mixed JSON.stringify and msgpackr usage across services | redis-cli GET <key> returns unreadable base64 | Enforce single serialization strategy at the gateway layer. Validate payload schema with msgpackr strict mode. |
Connection pool exhaustion / ECONNRESET | Pipeline queue overflow or unbounded retry storms | redis-cli INFO clients β connected_clients | Cap pipeline size (MAX_PIPELINE_SIZE). Use maxRetriesPerRequest: 3 and exponential backoff. Implement circuit breaker for Redis down scenarios. |
| SWR refresh blocking request path | Background refresh runs synchronously or shares thread pool | Node.js event loop lag spikes during refresh | Use fire-and-forget async refresh with separate error handling. Ensure fetchFn is non-blocking and uses connection pooling. |
| Cache key collisions across tenants | Missing tenant ID or unsorted parameters in key | Cross-tenant data leakage in logs | Hash sorted parameter string + tenant ID. Validate key format with regex in CI: /^v1:api:[a-z0-9]+:[a-f0-9]{12}:[a-z0-9_]+$/ |
Production Bundle
Configuration Checklist
Monitoring & Alerting
- Hit Ratio: Target > 85%. Alert if drops below 70% for 5 minutes (indicates key churn or invalidation storm).
- Latency Percentiles: p50 < 2ms, p99 < 10ms. Alert on p99 > 50ms (network contention or eviction thrashing).
- Memory:
used_memory_peak vs maxmemory. Alert at 85% threshold.
- Connections:
connected_clients vs maxclients. Alert if approaching 80%.
- SWR Refresh Rate: Track background refresh success/failure ratio. Alert if failure rate > 5%.
Rollout Strategy
- Canary Deployment: Route 5% of traffic to cached path. Monitor DB IOPS and error rates.
- Fallback Circuit Breaker: If Redis latency > 100ms or error rate > 2%, bypass cache and query DB directly.
- Invalidation Path: Implement explicit
DELETE or PUBLISH on write operations. Never rely solely on TTL for consistency-critical data.
- Load Testing: Simulate 2x peak RPS with cache disabled, then enable. Verify p99 stability and memory bounds.
- Observability: Inject
trace_id and cache_hit: true/false into structured logs. Correlate with APM traces.