ndow Counter** is the mandatory choice. It eliminates the 2x burst vulnerability of fixed windows while maintaining O(1) complexity relative to request volume, unlike the Sliding Window Log. Deploying Fixed Window counters on authentication endpoints is a structural vulnerability that allows attackers to optimize brute-force scripts for boundary exploitation.
Core Solution
Implementing production-grade rate limiting requires a multi-layered approach combining algorithmic precision, distributed state, and dynamic keying.
1. Architecture Decision: Edge vs. Application
- Edge/CDN Level: Best for DDoS mitigation and blocking known bad actors. Limited visibility into application context (e.g., user tier, specific resource sensitivity).
- API Gateway: Ideal for global policy enforcement across microservices. Centralized configuration but introduces a hop.
- Application Level: Required for fine-grained, context-aware limiting (e.g., limiting based on user reputation or specific API key scopes).
Recommendation: Defense in depth. Use Edge for volumetric protection, API Gateway for tenant-level limits, and Application logic for high-value action limits.
2. Algorithm Implementation: Sliding Window Counter
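The Sliding Window Counter improves upon the Fixed Window by keeping a count for the current window and adding a weighted share of the previous window's count. The weight is the fraction of the previous window that still overlaps the sliding window.
Formula:
Effective Count = Current Window Count + (Previous Window Count * Weight)
Weight = 1 - (Elapsed Time in Current Window / Window Duration)
The previous window's contribution therefore decays linearly to zero as the current window progresses, which removes the hard reset that makes boundary bursts possible. For example, 15 seconds into a 60-second window, with 80 requests in the previous window and 10 in the current one, the effective count is 10 + 80 * 0.75 = 70.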
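The calculation can be sketched as a pure function. This is a minimal illustration, assuming the previous window is weighted by the share of it that still overlaps the sliding window (1 - elapsed / window); the function name and parameters are hypothetical.

```typescript
// Hypothetical helper: name and signature are illustrative, not from a library.
function effectiveCount(
  currentCount: number,
  previousCount: number,
  elapsedInCurrentMs: number,
  windowMs: number,
): number {
  // Fraction of the current window that has elapsed, clamped to [0, 1].
  const progress = Math.min(Math.max(elapsedInCurrentMs / windowMs, 0), 1);
  // Share of the previous window still covered by the sliding window.
  const previousWeight = 1 - progress;
  return currentCount + previousCount * previousWeight;
}

// 15 s into a 60 s window, 75% of the previous window still counts:
// 10 + 80 * 0.75 = 70
const example = effectiveCount(10, 80, 15_000, 60_000);
```

At the very start of a window the previous count applies in full, and by the end of the window it no longer contributes.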
3. TypeScript Implementation with Redis
Redis is the standard for distributed rate limiting due to its atomic operations and low latency. The implementation must use Lua scripts to ensure atomicity; otherwise, race conditions between GET and SET commands allow limit bypass.
import Redis from 'ioredis';

// Sliding Window Counter, executed atomically as a Lua script
const RATE_LIMIT_LUA = `
local key = KEYS[1]
local window_size = tonumber(ARGV[1]) -- seconds
local limit = tonumber(ARGV[2])
local now = tonumber(ARGV[3])         -- seconds, may be fractional
local precision = tonumber(ARGV[4])   -- reserved, currently unused

-- Calculate window boundaries
local current_window = math.floor(now / window_size)
local previous_window = current_window - 1
local current_key = key .. ':' .. current_window
local previous_key = key .. ':' .. previous_window

-- Atomic increment and expiry for current window.
-- Rejected requests are counted too, so hammering a blocked key keeps it blocked.
local current_count = redis.call('INCR', current_key)
if current_count == 1 then
  -- Keep the key for two windows so it can serve as the previous window
  redis.call('EXPIRE', current_key, math.ceil(window_size * 2))
end

-- Get previous window count (GET returns false for a missing key)
local previous_count = tonumber(redis.call('GET', previous_key) or 0)

-- Weight the previous window by the share of it still inside the sliding window
local elapsed = now - (current_window * window_size)
local weight = 1 - (elapsed / window_size)
local effective_count = current_count + (previous_count * weight)

-- Over the limit: deny with no remaining quota
if effective_count > limit then
  return {0, 0}
end

-- Return allowed status and remaining quota
return {1, math.floor(limit - effective_count)}
`;

export class SecureRateLimiter {
  private readonly redis: Redis;
  private readonly defaultWindow: number; // seconds

  constructor(redis: Redis, defaultWindowMs: number = 60000) {
    this.redis = redis;
    this.defaultWindow = defaultWindowMs / 1000; // Lua expects seconds
  }

  async isAllowed(
    key: string,
    limit: number,
  ): Promise<{ allowed: boolean; remaining: number; retryAfter?: number }> {
    const now = Date.now() / 1000;

    // Atomic execution prevents race conditions
    const result = (await this.redis.eval(
      RATE_LIMIT_LUA,
      1,
      key,
      this.defaultWindow,
      limit,
      now,
      1, // Precision placeholder
    )) as [number, number];

    const [status, remaining] = result;
    const allowed = status === 1;

    return {
      allowed,
      remaining: Math.max(0, remaining),
      // Conservative hint: one full window, rounded up to whole seconds
      retryAfter: allowed ? undefined : Math.ceil(this.defaultWindow),
    };
  }
}
4. Key Design Strategy
Security relies on granular keying. Never rate limit solely by IP.
- Composite Keys: ratelimit:{endpoint}:{identifier}
- Identifiers:
  - Authenticated: user_id or api_key_hash.
  - Unauthenticated: ip_address combined with fingerprint (if available).
- Resource-Specific: ratelimit:login:{ip} vs ratelimit:search:{api_key}.
- Defense against NAT: Use subnet-based limiting for IPs behind shared proxies, or require token-based identification for sensitive actions.
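The keying scheme above can be sketched as a small key builder. The `Identity` type, the `useSubnet` flag, and the /24 grouping are assumptions made for illustration; IPv6 handling is deliberately left out.

```typescript
// Hypothetical types and helpers for illustration only.
type Identity =
  | { kind: 'user'; userId: string }
  | { kind: 'apiKey'; apiKeyHash: string }
  | { kind: 'anonymous'; ip: string; fingerprint?: string };

// Collapse an IPv4 address to its /24 network so clients behind one NAT share a bucket.
// Anything that is not a dotted-quad IPv4 address is returned unchanged.
function ipv4Subnet24(ip: string): string {
  const parts = ip.split('.');
  const isIpv4 =
    parts.length === 4 && parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255);
  return isIpv4 ? `${parts[0]}.${parts[1]}.${parts[2]}.0/24` : ip;
}

function buildRateLimitKey(endpoint: string, id: Identity, useSubnet = false): string {
  switch (id.kind) {
    case 'user':
      return `ratelimit:${endpoint}:user:${id.userId}`;
    case 'apiKey':
      return `ratelimit:${endpoint}:key:${id.apiKeyHash}`;
    case 'anonymous': {
      const address = useSubnet ? ipv4Subnet24(id.ip) : id.ip;
      // Append the fingerprint only when one is available.
      const suffix = id.fingerprint ? `:${id.fingerprint}` : '';
      return `ratelimit:${endpoint}:ip:${address}${suffix}`;
    }
  }
}

const loginKey = buildRateLimitKey('login', { kind: 'anonymous', ip: '203.0.113.7' });
// "ratelimit:login:ip:203.0.113.7"
```

Keeping the endpoint in the key means a client that exhausts its search quota does not also lock itself out of login.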
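Implement the IETF RateLimit-* draft headers to assist legitimate clients and reduce support burden.
- RateLimit-Limit: The maximum number of requests allowed in the window.
- RateLimit-Remaining: Requests left in the current window.
- RateLimit-Reset: Seconds remaining until the limit resets. The draft uses delta-seconds; some older X-RateLimit-Reset implementations carry a Unix timestamp instead.
- Retry-After: Seconds to wait. RFC 6585 makes it optional on 429 responses, but always send it so clients know when to retry.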
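A minimal sketch of building these headers from a limiter decision follows. It assumes RateLimit-Reset carries the number of seconds until the window resets, as in the IETF draft, and that the limiter can supply that value; `LimitDecision` and `rateLimitHeaders` are hypothetical names.

```typescript
// Hypothetical shape of a limiter decision; not part of any library.
interface LimitDecision {
  allowed: boolean;
  limit: number;
  remaining: number;
  resetSeconds: number; // seconds until the window resets (assumed to come from the limiter)
}

function rateLimitHeaders(d: LimitDecision): Record<string, string> {
  // Header values are whole, non-negative numbers.
  const reset = String(Math.max(0, Math.ceil(d.resetSeconds)));
  const headers: Record<string, string> = {
    'RateLimit-Limit': String(d.limit),
    'RateLimit-Remaining': String(Math.max(0, Math.floor(d.remaining))),
    'RateLimit-Reset': reset,
  };
  // Only rejected requests (HTTP 429) carry Retry-After.
  if (!d.allowed) {
    headers['Retry-After'] = reset;
  }
  return headers;
}

const rejected = rateLimitHeaders({ allowed: false, limit: 5, remaining: -1, resetSeconds: 12.2 });
// rejected['Retry-After'] is "13" and rejected['RateLimit-Remaining'] is "0"
```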
Pitfall Guide
1. Race Conditions in Distributed Counters
Mistake: Using separate GET and SET commands in Redis.
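Impact: Two concurrent requests may both read a count of 99, both write 100, and both succeed. One request is never counted, so under heavy concurrency the real throughput can exceed the limit by roughly the number of in-flight requests.
Fix: Always use a Lua script, or rely on the atomic INCR command and compare its return value against the limit.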
2. Trusting X-Forwarded-For Blindly
Mistake: Extracting IP from headers without validating the proxy chain.
Impact: Attackers can spoof IPs to bypass limits or target other users.
Fix: Configure the load balancer to strip untrusted headers and only trust X-Forwarded-For from known upstream proxies.
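One way to apply this in application code is to walk the X-Forwarded-For chain from right to left and stop at the first hop that is not one of your own proxies. The sketch below assumes proxies are identified by exact IP match; the function name and the trusted-proxy set are hypothetical, and real deployments often need CIDR matching instead.

```typescript
// Hypothetical helper: resolves the client IP while ignoring spoofed header entries.
function resolveClientIp(
  remoteAddr: string,
  xForwardedFor: string | undefined,
  trustedProxies: ReadonlySet<string>,
): string {
  // If the direct peer is not a trusted proxy, the header is attacker-controlled: ignore it.
  if (!trustedProxies.has(remoteAddr) || !xForwardedFor) {
    return remoteAddr;
  }
  const hops = xForwardedFor
    .split(',')
    .map((hop) => hop.trim())
    .filter((hop) => hop.length > 0);
  // Walk right to left: entries appended by our own proxies are trustworthy,
  // and the first address that is not a trusted proxy is the client.
  for (let i = hops.length - 1; i >= 0; i--) {
    const hop = hops[i];
    if (hop !== undefined && !trustedProxies.has(hop)) {
      return hop;
    }
  }
  return remoteAddr;
}

const proxies = new Set(['10.0.0.1', '10.0.0.2']);
// The left-most entry was supplied by the client and is ignored.
const clientIp = resolveClientIp('10.0.0.1', '6.6.6.6, 203.0.113.5, 10.0.0.2', proxies);
// "203.0.113.5"
```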
3. Thundering Herd on Limit Reset
Mistake: Clients retry immediately when the limit resets.
Impact: A spike of requests hits the server exactly at the reset time, causing temporary overload.
Fix: Implement jitter in client retry logic. On the server side, add a small random offset to the Retry-After value so that clients blocked at the same moment do not all retry in the same second.
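Client-side jitter can be as small as the sketch below, which adds a random offset on top of the server's Retry-After value. The one-second default spread and the injectable `random` parameter are assumptions made for illustration and testability.

```typescript
// Hypothetical client helper: how long to wait before retrying after a 429.
function retryDelayMs(
  retryAfterSeconds: number,
  maxJitterMs = 1000,
  random: () => number = Math.random,
): number {
  const baseMs = Math.max(0, retryAfterSeconds) * 1000;
  // Spread retries over [base, base + maxJitterMs) so clients do not fire in lockstep.
  return baseMs + Math.floor(random() * maxJitterMs);
}

// With the random source pinned to 0.5, a 30 s Retry-After becomes 30500 ms.
const pinned = retryDelayMs(30, 1000, () => 0.5);
```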
4. Memory Amplification in Sliding Logs
Mistake: Using Sliding Window Log for high-traffic endpoints.
Impact: Storing a timestamp for every request causes Redis memory usage to explode, leading to OOM kills.
Fix: Use Sliding Window Counter for high throughput. Reserve Sliding Window Log only for ultra-low-volume, high-security endpoints.
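The difference is easy to put numbers on. The sketch below compares entries stored per key, assuming the log records one timestamp for every request in the window; it counts entries only, since byte sizes depend on Redis encoding. The function names are hypothetical.

```typescript
// Entries stored per key by a Sliding Window Log: one timestamp per request in the window.
function slidingLogEntries(requestsPerSecond: number, windowSeconds: number): number {
  return Math.ceil(requestsPerSecond * windowSeconds);
}

// Entries stored per key by a Sliding Window Counter: current and previous window counters.
function slidingCounterEntries(): number {
  return 2;
}

// A single key receiving 100 requests per second under a 60-second window:
const logEntries = slidingLogEntries(100, 60); // 6000 timestamps
const counterEntries = slidingCounterEntries(); // 2 integers
const ratio = logEntries / counterEntries; // 3000 times more entries per key
```

The log's footprint grows with traffic, while the counter's stays constant per key.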
5. Ignoring Low-and-Slow Attacks
Mistake: Setting limits that only block high-frequency bursts.
Impact: Attackers distribute requests over hours, staying under thresholds while harvesting data or cracking credentials.
Fix: Implement long-duration windows (e.g., 100 requests per 24 hours) for sensitive actions, in addition to short windows.
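Combining a short and a long window means a request must fit inside every window before it is accepted. The following in-memory sketch illustrates the idea with a sliding-window-counter estimate per rule; the class and rule names are hypothetical, old counters are never evicted, and a production version would keep this state in Redis.

```typescript
interface WindowRule {
  label: string;
  windowMs: number;
  limit: number;
}

class MultiWindowLimiter {
  private readonly rules: WindowRule[];
  // One counter per (key, rule, window index). Never evicted in this sketch.
  private readonly counts = new Map<string, number>();

  constructor(rules: WindowRule[]) {
    this.rules = rules;
  }

  private countAt(key: string, rule: WindowRule, windowIndex: number): number {
    return this.counts.get(`${key}:${rule.label}:${windowIndex}`) ?? 0;
  }

  // Sliding-window-counter estimate of requests inside the last `windowMs`.
  private estimate(key: string, rule: WindowRule, nowMs: number): number {
    const index = Math.floor(nowMs / rule.windowMs);
    const elapsed = nowMs - index * rule.windowMs;
    const previousWeight = 1 - elapsed / rule.windowMs;
    return this.countAt(key, rule, index) + this.countAt(key, rule, index - 1) * previousWeight;
  }

  isAllowed(key: string, nowMs: number): { allowed: boolean; blockedBy?: string } {
    // Every window must have room; the first full one rejects the request.
    for (const rule of this.rules) {
      if (this.estimate(key, rule, nowMs) + 1 > rule.limit) {
        return { allowed: false, blockedBy: rule.label };
      }
    }
    // Record the request in every window only after all of them accepted it.
    for (const rule of this.rules) {
      const index = Math.floor(nowMs / rule.windowMs);
      const counterKey = `${key}:${rule.label}:${index}`;
      this.counts.set(counterKey, (this.counts.get(counterKey) ?? 0) + 1);
    }
    return { allowed: true };
  }
}

// A burst limit plus a daily cap on the same key (numbers are illustrative).
const layered = new MultiWindowLimiter([
  { label: 'burst', windowMs: 1_000, limit: 3 },
  { label: 'daily', windowMs: 86_400_000, limit: 5 },
]);
```

A slow attacker who sends one request every two seconds never trips the burst rule here, but is stopped by the daily rule on the sixth request.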
6. Hardcoded Limits vs. Tiered Limits
Mistake: Applying a single limit to all users.
Impact: Legitimate enterprise users hit limits; attackers with free accounts operate freely.
Fix: Implement tiered limits based on subscription level, reputation score, or resource cost.
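A tier lookup with a reputation adjustment might look like the sketch below. The base limits, the 0.3 reputation threshold, and the interpretation of the reduction factor as a multiplier are all illustrative assumptions, not recommended values.

```typescript
type Tier = 'anonymous' | 'free' | 'pro' | 'enterprise';

// Illustrative per-minute limits; tune these to your own traffic.
const BASE_LIMITS: Record<Tier, number> = {
  anonymous: 20,
  free: 60,
  pro: 600,
  enterprise: 6000,
};

// Hypothetical helper: `reputation` is assumed to be a score in [0, 1].
function resolveLimit(
  tier: Tier,
  reputation: number,
  reductionFactor = 0.5,
  lowReputationThreshold = 0.3,
): number {
  const base = BASE_LIMITS[tier];
  // Low-reputation clients get a reduced share of their tier's limit.
  const scaled = reputation < lowReputationThreshold ? base * reductionFactor : base;
  // Never drop below one request, so a client can still reach an error page.
  return Math.max(1, Math.floor(scaled));
}

const trustedFree = resolveLimit('free', 0.9); // 60
const suspectFree = resolveLimit('free', 0.1); // 30
```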
7. Missing Rate Limit on Error Paths
Mistake: Rate limiting only successful requests or specific status codes.
Impact: Attackers can trigger expensive error-handling paths (e.g., database lookups for invalid keys) to exhaust resources.
Fix: Count all requests against the limit, regardless of the outcome.
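In practice this means charging the request before the handler runs, rather than after its outcome is known. The wrapper below is a framework-agnostic sketch; `countThenHandle`, `consume`, and the `Outcome` type are hypothetical.

```typescript
type Outcome = { status: number };

// Hypothetical wrapper: `consume` records one request and reports whether it is within the limit.
function countThenHandle<Req>(
  consume: (req: Req) => boolean,
  handler: (req: Req) => Outcome,
): (req: Req) => Outcome {
  return (req) => {
    // Charge the request before doing any work, so error paths are counted too.
    if (!consume(req)) {
      return { status: 429 };
    }
    try {
      return handler(req);
    } catch {
      return { status: 500 };
    }
  };
}

// Usage: a handler that always fails still uses up quota.
let used = 0;
const guarded = countThenHandle<string>(
  () => ++used <= 3,
  () => {
    throw new Error('lookup for invalid API key failed');
  },
);
const statuses = [1, 2, 3, 4].map(() => guarded('req').status); // [500, 500, 500, 429]
```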
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Authentication Endpoints | Sliding Window Counter + Composite Key (IP/User) | Prevents credential stuffing and brute force; eliminates burst vulnerabilities. | Low (Redis ops) |
| Public API Search | Token Bucket + Edge Caching | Smooths traffic spikes; protects backend search indices; allows bursty user behavior. | Medium (Edge compute) |
| High-Volume Logging | Leaky Bucket + Sampling | Enforces constant ingestion rate; prevents log storage exhaustion. | Low |
| Webhook Delivery | Token Bucket per Consumer | Prevents a single consumer from being overwhelmed; ensures delivery fairness. | Low |
| Internal Microservice | In-Memory Sliding Window | Low latency; no external dependency; sufficient for trusted internal traffic. | Negligible |
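Several rows above recommend a Token Bucket, which this article does not otherwise show. As a rough in-memory sketch (class name and parameters are hypothetical, and the clock is passed in explicitly to keep the example deterministic):

```typescript
class TokenBucket {
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full so an idle client may burst immediately
    this.lastRefillMs = nowMs;
  }

  tryConsume(nowMs: number, cost = 1): boolean {
    // Refill in proportion to elapsed time, never beyond capacity.
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = Math.max(this.lastRefillMs, nowMs);
    if (this.tokens < cost) {
      return false;
    }
    this.tokens -= cost;
    return true;
  }
}

// Capacity 2, refilling 1 token per second: two requests pass at once, the third waits.
const bucket = new TokenBucket(2, 1, 0);
const burst = [bucket.tryConsume(0), bucket.tryConsume(0), bucket.tryConsume(0)];
// [true, true, false]
```

The capacity sets how large a burst is tolerated, while the refill rate sets the sustained throughput. That is why the table pairs it with bursty but legitimate traffic such as search.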
Configuration Template
A production-ready configuration structure for a TypeScript-based rate limiter service.
// rate-limit.config.ts

// Minimal request shape the key generators rely on. This is an assumption of the
// template: map it from your framework's request object (e.g. Express req.ip, req.user).
export interface RateLimitRequest {
  ip: string;
  user?: { id: string };
}

export interface RateLimitRule {
  endpoint: string;
  method?: string;
  tier: 'anonymous' | 'free' | 'pro' | 'enterprise';
  windowMs: number;
  maxRequests: number;
  keyGenerator: (req: RateLimitRequest) => string;
  // Optional: Adaptive behavior
  adaptive?: {
    enabled: boolean;
    reputationSource: 'user_score' | 'ip_reputation';
    reductionFactor: number; // Reduce limit by this factor for low reputation
  };
}

export const SECURITY_RULES: RateLimitRule[] = [
  {
    endpoint: '/api/v1/auth/login',
    method: 'POST',
    tier: 'anonymous',
    windowMs: 60000, // 1 minute
    maxRequests: 5,
    keyGenerator: (req) => `auth:ip:${req.ip}`,
    adaptive: {
      enabled: true,
      reputationSource: 'ip_reputation',
      reductionFactor: 0.5,
    },
  },
  {
    endpoint: '/api/v1/auth/login',
    method: 'POST',
    tier: 'free',
    windowMs: 3600000, // 1 hour
    maxRequests: 20,
    // Falls back to the IP when no user is attached to the request
    keyGenerator: (req) => `auth:user:${req.user?.id ?? req.ip}`,
  },
  {
    endpoint: '/api/v1/data/export',
    method: 'POST',
    tier: 'pro',
    windowMs: 86400000, // 24 hours
    maxRequests: 10,
    keyGenerator: (req) => `export:user:${req.user?.id ?? req.ip}`,
  },
];
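At request time a rule has to be selected from this list. The lookup below is a small sketch that matches on endpoint, method, and tier; `RuleLike` is a trimmed stand-in for the rule interface so the example is self-contained, and `findRule` is a hypothetical helper.

```typescript
// Trimmed stand-in for the rule shape, so this example compiles on its own.
interface RuleLike {
  endpoint: string;
  method?: string;
  tier: string;
  windowMs: number;
  maxRequests: number;
}

function findRule<T extends RuleLike>(
  rules: readonly T[],
  endpoint: string,
  method: string,
  tier: string,
): T | undefined {
  return rules.find(
    (rule) =>
      rule.endpoint === endpoint &&
      rule.tier === tier &&
      // A rule without a method applies to every method.
      (rule.method === undefined || rule.method.toUpperCase() === method.toUpperCase()),
  );
}

const sampleRules: RuleLike[] = [
  { endpoint: '/api/v1/auth/login', method: 'POST', tier: 'anonymous', windowMs: 60000, maxRequests: 5 },
  { endpoint: '/api/v1/auth/login', method: 'POST', tier: 'free', windowMs: 3600000, maxRequests: 20 },
];

const matched = findRule(sampleRules, '/api/v1/auth/login', 'post', 'free');
// matched?.maxRequests is 20
```

When no rule matches, the caller has to decide on a default; denying or applying the strictest tier is usually safer than allowing unlimited traffic.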
Quick Start Guide
- Install Dependencies:
  npm install ioredis express # Or use a packaged limiter such as @upstash/ratelimit instead of the custom Lua script
- Initialize Redis Client:
  import Redis from 'ioredis';

  // Falls back to a local instance for development when REDIS_URL is unset
  const redis = new Redis(process.env.REDIS_URL ?? 'redis://localhost:6379');
  const limiter = new SecureRateLimiter(redis);
- Apply Middleware:
  app.use('/api/auth/login', async (req, res, next) => {
    try {
      const key = `login:${req.ip}`;
      const result = await limiter.isAllowed(key, 5);
      res.set('RateLimit-Limit', '5');
      res.set('RateLimit-Remaining', String(result.remaining));
      if (!result.allowed) {
        res.set('Retry-After', String(result.retryAfter));
        return res.status(429).json({ error: 'Too Many Requests' });
      }
      next();
    } catch (err) {
      // Redis failure: let the error handler decide whether to fail open or closed
      next(err);
    }
  });
- Verify Headers: Use curl -v to inspect response headers. Ensure RateLimit-Remaining decrements and that 429 responses include Retry-After.
- Load Test: Run k6 or wrk scripts to simulate distributed traffic. Verify that limits hold and no requests bypass the threshold during concurrent bursts.
Rate limiting is not a set-and-forget configuration. It requires continuous tuning based on traffic patterns, threat intelligence, and business requirements. By adopting sliding window algorithms, atomic distributed storage, and composite keying, you establish a robust security boundary that mitigates abuse without degrading the experience for legitimate users.