tem boundaries. The architecture relies on three components: a semaphore to control active I/O count, a batch processor to manage memory lifecycle, and a retry policy with exponential backoff to handle transient failures.
Step 1: Implement a Concurrency Semaphore
A semaphore tracks active operations and blocks new submissions until a slot frees up. This prevents connection pool exhaustion and keeps heap usage predictable.
class AsyncSemaphore {
  private active: number = 0;
  private queue: Array<() => void> = [];

  constructor(private limit: number) {}

  // Resolves immediately if a slot is free; otherwise waits in FIFO order.
  async acquire(): Promise<void> {
    if (this.active < this.limit) {
      this.active++;
      return;
    }
    return new Promise<void>(resolve => this.queue.push(resolve));
  }

  release(): void {
    const next = this.queue.shift();
    if (next) {
      // Hand the slot directly to the next waiter. `active` stays unchanged,
      // so a later acquire() cannot slip past the limit.
      next();
    } else {
      this.active--;
    }
  }
}
Rationale: The semaphore operates entirely within the event loop. It doesn't block the main thread; it queues callbacks. When a slot opens, the next pending task is allowed to proceed. This matches the OS-level concept of limiting concurrent file descriptors or TCP connections.
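As a quick check of the behavior described above, the following sketch runs eight simulated I/O calls through a semaphore with three slots and records the peak number of overlapping calls. It carries its own copy of the semaphore so it runs standalone; this copy hands a freed slot directly to the next waiter so the active count stays accurate. The names withPermit, fakeIo, and demo are hypothetical helpers introduced only for this illustration.

```typescript
// Standalone copy of the semaphore so this sketch runs on its own.
class AsyncSemaphore {
  private active = 0;
  private queue: Array<() => void> = [];

  constructor(private limit: number) {}

  async acquire(): Promise<void> {
    if (this.active < this.limit) {
      this.active++;
      return;
    }
    return new Promise<void>(resolve => this.queue.push(resolve));
  }

  release(): void {
    const next = this.queue.shift();
    if (next) {
      next(); // pass the slot straight to the next waiter
    } else {
      this.active--;
    }
  }
}

// Hypothetical helper: run `fn` while holding exactly one permit.
async function withPermit<T>(sem: AsyncSemaphore, fn: () => Promise<T>): Promise<T> {
  await sem.acquire();
  try {
    return await fn();
  } finally {
    sem.release();
  }
}

// Simulated I/O call that tracks how many invocations overlap.
let inFlight = 0;
let peak = 0;
async function fakeIo(id: number): Promise<number> {
  inFlight++;
  peak = Math.max(peak, inFlight);
  await new Promise<void>(resolve => setTimeout(resolve, 10));
  inFlight--;
  return id;
}

async function demo(): Promise<{ ids: number[]; peak: number }> {
  const sem = new AsyncSemaphore(3);
  const ids = await Promise.all(
    [1, 2, 3, 4, 5, 6, 7, 8].map(id => withPermit(sem, () => fakeIo(id)))
  );
  return { ids, peak };
}
```

Wrapping acquire and release in a single helper keeps the try/finally pairing in one place, which makes it harder to leak a permit on an error path.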
Step 2: Build a Batch Processor with Memory Awareness
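Limiting how many tasks run at once bounds how many request and response payloads are alive at the same time. Once a task settles, its closure and intermediate buffers become eligible for garbage collection, which prevents memory accumulation from large payloads.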
type Task<T, R> = (input: T) => Promise<R>;

async function executeInChunks<T, R>(
  items: T[],
  concurrencyLimit: number,
  task: Task<T, R>
): Promise<R[]> {
  const semaphore = new AsyncSemaphore(concurrencyLimit);
  // Pre-sized so each result lands at its input index,
  // regardless of the order in which tasks complete.
  const results: R[] = new Array<R>(items.length);

  const pending = items.map(async (item, index) => {
    await semaphore.acquire();
    try {
      results[index] = await task(item);
    } finally {
      // Always free the slot, even when the task throws.
      semaphore.release();
    }
  });

  await Promise.all(pending);
  return results;
}
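Rationale: Unlike naive batching that slices arrays and awaits each slice sequentially, this implementation maintains a steady flow of active tasks: a new task starts as soon as any slot frees up, rather than waiting for the slowest task in a slice. The finally block guarantees semaphore release even if a task throws. Each result is stored as soon as its task finishes, so the V8 engine can reclaim the task's intermediate state. The results array itself still holds every returned value until the function returns.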
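For contrast, the naive slice-and-await batching mentioned in the rationale could look like the sketch below. The function name processInFixedBatches and its parameters are assumptions made for this illustration. Each batch starts all of its tasks at once and the loop idles until the slowest task in that batch finishes, which is the stall the semaphore-based version avoids.

```typescript
// Hypothetical helper illustrating fixed-size batching (not the semaphore approach).
async function processInFixedBatches<T, R>(
  items: readonly T[],
  batchSize: number,
  task: (input: T) => Promise<R>
): Promise<R[]> {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }

  const results: R[] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    // Every task in the batch starts immediately; the next batch cannot
    // begin until the slowest task in this one has settled.
    const batchResults = await Promise.all(batch.map(item => task(item)));
    results.push(...batchResults);
  }
  return results;
}
```

This shape is easy to reason about and is often adequate for small inputs, but throughput drops when task durations vary widely within a batch.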
Step 3: Integrate Resilient Retry Logic
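Retries must respect external rate limits and avoid amplifying load. A disciplined retry policy checks error types, respects Retry-After headers, and applies jitter. The function below implements error classification and jittered exponential backoff. It does not read Retry-After, because how that header is exposed depends on your HTTP client.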
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts: number = 3,
  baseDelay: number = 1000
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error: any) {
      const isRetryable =
        error?.status === 429 ||
        error?.status === 503 ||
        error?.code === 'ECONNRESET';
      if (!isRetryable || attempt === maxAttempts) throw error;

      const delay = baseDelay * Math.pow(2, attempt - 1);
      const jitter = Math.random() * 500;
      await new Promise(res => setTimeout(res, delay + jitter));
    }
  }
  throw new Error('Retry logic exhausted');
}
Rationale: Exponential backoff spaces out successive attempts, giving a throttled or overloaded service time to recover. Jitter randomizes delays across concurrent workers, which is what keeps them from retrying in lockstep. The retryable check ensures that client errors (400, 401, 404) fail fast, preserving system resources for recoverable failures.
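One way to compute the wait time with a delay cap and Retry-After support is sketched below. It assumes the caller has already extracted the raw Retry-After header value as a string, since the shape of the error object depends on the HTTP client in use. The names BackoffOptions, parseRetryAfterMs, and computeRetryDelayMs are hypothetical, and the jitter formula (randomly removing up to jitterFactor of the capped delay) is one possible interpretation rather than a standard.

```typescript
// Hypothetical option names for this sketch.
interface BackoffOptions {
  baseDelayMs: number;  // delay before the first retry
  maxDelayMs: number;   // upper bound for the exponential term
  jitterFactor: number; // 0..1, fraction of the delay that may be randomly removed
}

// Retry-After carries either a whole number of seconds or an HTTP-date.
// Returns a delay in milliseconds, or undefined when the value is unusable.
function parseRetryAfterMs(value: string | undefined, nowMs: number): number | undefined {
  if (value === undefined) return undefined;
  const trimmed = value.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  return Number.isNaN(dateMs) ? undefined : Math.max(0, dateMs - nowMs);
}

// `attempt` is 1-based: attempt 1 is the delay before the first retry.
function computeRetryDelayMs(
  attempt: number,
  options: BackoffOptions,
  retryAfter?: string,
  random: () => number = Math.random,
  nowMs: number = Date.now()
): number {
  const hinted = parseRetryAfterMs(retryAfter, nowMs);
  if (hinted !== undefined) return hinted; // server guidance wins over local backoff

  const exponential = options.baseDelayMs * Math.pow(2, attempt - 1);
  const capped = Math.min(options.maxDelayMs, exponential);
  // Remove a random share of the delay so concurrent workers spread out.
  return capped * (1 - options.jitterFactor * random());
}
```

Injecting random and nowMs keeps the function deterministic under test. A caller may still want to give up when the server asks for a wait longer than its own time budget.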
Pitfall Guide
1. The Parallelism Illusion
Explanation: Developers assume Promise.all() creates true parallel execution. JavaScript's event loop schedules I/O concurrently but runs on a single thread. CPU-bound tasks will block the loop, and I/O tasks compete for the same file descriptors.
Fix: Reserve promise aggregation for I/O-bound operations. Offload CPU-heavy work to Worker Threads or child processes. Never mix CPU and I/O tasks in the same concurrency pool.
2. Rate Limit Blindness
Explanation: Third-party APIs enforce strict request quotas. Launching hundreds of requests simultaneously triggers 429 responses, temporary bans, or degraded latency. Local test environments rarely enforce these limits.
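Fix: Implement a token bucket or leaky bucket algorithm. Derive your concurrency limit from the provider's documented rate limit and your typical request latency: a limit of R requests per second with an average latency of L seconds sustains roughly R × L requests in flight (for example, 50 requests per second at 200 ms each is about 10 concurrent requests). Always parse Retry-After headers.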
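A token bucket can be sketched in a few lines. The class below is a minimal illustration, not a production limiter: the names TokenBucket, tryTake, and msUntilNextToken are assumptions, and the clock is injectable purely so the behavior can be checked deterministically.

```typescript
// Minimal token bucket sketch: `capacity` bounds bursts,
// `refillPerSecond` bounds the sustained request rate.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;

  constructor(
    private readonly capacity: number,
    private readonly refillPerSecond: number,
    private readonly now: () => number = Date.now
  ) {
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefillMs = now();
  }

  // Add the tokens earned since the last call, never exceeding capacity.
  private refill(): void {
    const nowMs = this.now();
    const elapsedSec = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSecond);
    this.lastRefillMs = nowMs;
  }

  // Returns true and consumes a token if a request may be sent now.
  tryTake(): boolean {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }

  // How long a caller should wait before the next token becomes available.
  msUntilNextToken(): number {
    this.refill();
    if (this.tokens >= 1) return 0;
    return Math.ceil(((1 - this.tokens) / this.refillPerSecond) * 1000);
  }
}
```

A caller would typically check tryTake() before each request and, on false, sleep for msUntilNextToken() milliseconds before trying again. The bucket limits request rate, while a semaphore limits requests in flight, so the two are complementary.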
3. Memory Accumulation from Large Payloads
Explanation: Storing every resolved value in a single array prevents garbage collection until the entire operation completes. Large JSON responses, file buffers, or parsed datasets can easily exceed container memory limits.
Fix: Stream results to disk or a message broker instead of holding them in memory. Process data in chunks and discard completed payloads immediately. Use Promise.allSettled() only when you need visibility, not when you need memory efficiency.
4. Retry Amplification
Explanation: Blindly retrying failed requests multiplies load on an already stressed system. If 500 requests fail due to throttling and each retries 3 times, you've generated 1,500 additional requests targeting the same bottleneck.
Fix: Implement circuit breaker patterns. Stop retrying when failure rates exceed a threshold. Use exponential backoff with jitter. Never retry non-idempotent operations without verifying server state first.
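The circuit breaker idea can be illustrated with a small state machine. This sketch simplifies the failure-rate threshold to a count of consecutive failures, and all names (CircuitBreaker, canRequest, recordSuccess, recordFailure) are hypothetical. After the cooldown it lets requests through as probes; one failed probe reopens the circuit and one successful probe closes it.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private failures = 0;
  private openedAtMs = 0;
  private state: BreakerState = "closed";

  constructor(
    private readonly failureThreshold: number, // consecutive failures before opening
    private readonly cooldownMs: number,       // how long to stay open before probing
    private readonly now: () => number = Date.now
  ) {}

  // Call before each request (including retries); false means "do not send".
  canRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAtMs >= this.cooldownMs) {
      this.state = "half-open"; // cooldown elapsed: allow probe requests
      return true;
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  recordFailure(): void {
    this.failures++;
    // A failed probe reopens immediately; otherwise open at the threshold.
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAtMs = this.now();
    }
  }

  currentState(): BreakerState {
    return this.state;
  }
}
```

In a retry loop, checking canRequest() before each attempt means a throttled dependency stops receiving retries once the breaker opens, instead of absorbing every worker's full retry budget.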
5. Partial Failure Ambiguity
Explanation: Promise.all() rejects as soon as any input promise rejects. The remaining operations keep running, but their results are discarded, so the caller cannot tell which records were updated and which were not. This creates inconsistent system states with no audit trail.
Fix: Use Promise.allSettled() for visibility, but pair it with concurrency limits. Design idempotent operations that can be safely re-run. Log successful and failed items separately for reconciliation.
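To make the reconciliation step concrete, the sketch below pairs each settled outcome with the input item that produced it and splits the outcomes into two lists. The names Reconciliation, partitionSettled, and updateAll are hypothetical. The usage function calls Promise.allSettled() directly, so it is only appropriate for small inputs.

```typescript
interface Reconciliation<T, R> {
  succeeded: Array<{ item: T; value: R }>;
  failed: Array<{ item: T; reason: unknown }>;
}

// Pair each outcome with its input so failed items can be logged and re-run.
// Assumes `settled[i]` is the outcome for `items[i]`.
function partitionSettled<T, R>(
  items: readonly T[],
  settled: readonly PromiseSettledResult<R>[]
): Reconciliation<T, R> {
  const report: Reconciliation<T, R> = { succeeded: [], failed: [] };
  settled.forEach((outcome, index) => {
    const item = items[index];
    if (outcome.status === "fulfilled") {
      report.succeeded.push({ item, value: outcome.value });
    } else {
      report.failed.push({ item, reason: outcome.reason });
    }
  });
  return report;
}

// Usage sketch for a small input; `update` stands in for any idempotent operation.
async function updateAll(
  ids: readonly number[],
  update: (id: number) => Promise<string>
): Promise<Reconciliation<number, string>> {
  const settled = await Promise.allSettled(ids.map(id => update(id)));
  return partitionSettled(ids, settled);
}
```

Because the failed list keeps the original items, a follow-up job can re-run exactly those inputs, which is safe only when the operation is idempotent.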
6. Connection Pool Starvation
Explanation: Database drivers maintain connection pools (typically 10-50 connections). Unbounded async execution attempts to borrow more connections than available, causing timeouts, deadlocks, or pool exhaustion errors.
Fix: Set your concurrency limit to poolSize - overhead. Reserve 20% of connections for health checks, migrations, and synchronous queries. Monitor pool utilization metrics and alert at 70% capacity.
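The arithmetic behind this fix is small enough to encode directly. The helpers below are a sketch under the 20% reserve and 70% alert figures quoted above; the function names and the choice to round the reserve up are assumptions for illustration.

```typescript
// Derive a task concurrency limit from the pool size, holding back a reserve.
// Integer percentages avoid floating-point surprises such as 15 * 0.2.
function deriveConcurrencyLimit(poolSize: number, reservePercent: number = 20): number {
  if (!Number.isInteger(poolSize) || poolSize < 1) {
    throw new RangeError("poolSize must be a positive integer");
  }
  if (reservePercent < 0 || reservePercent >= 100) {
    throw new RangeError("reservePercent must be in [0, 100)");
  }
  const reserved = Math.ceil((poolSize * reservePercent) / 100);
  return Math.max(1, poolSize - reserved); // always allow at least one task
}

// True when pool utilization has reached the alert threshold.
function shouldAlert(inUse: number, poolSize: number, alertPercent: number = 70): boolean {
  return inUse * 100 >= poolSize * alertPercent;
}
```

For a pool of 10 connections this yields a limit of 8, and an alert once 7 connections are in use.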
7. Treating allSettled as a Concurrency Fix
Explanation: Promise.allSettled() solves visibility, not load management. It still schedules every task immediately and consumes the same resources as Promise.all().
Fix: Never use allSettled as a standalone solution for large datasets. Combine it with a concurrency limiter or queue-based processor to control resource consumption while maintaining result visibility.
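One way to combine the two is a limiter that returns the same result shape as Promise.allSettled() while capping how many tasks run at once. The sketch below uses a fixed set of worker loops pulling from a shared cursor rather than a semaphore; the name allSettledWithLimit is hypothetical.

```typescript
// Runs `task` over `items` with at most `limit` tasks in flight and reports
// every outcome in input order, mirroring the shape of Promise.allSettled().
async function allSettledWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  task: (input: T) => Promise<R>
): Promise<PromiseSettledResult<R>[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }

  const results: PromiseSettledResult<R>[] = new Array(items.length);
  let cursor = 0;

  async function worker(): Promise<void> {
    // Reading and incrementing `cursor` happens synchronously between awaits,
    // so two workers never claim the same index.
    while (cursor < items.length) {
      const index = cursor++;
      try {
        results[index] = { status: "fulfilled", value: await task(items[index]) };
      } catch (reason) {
        results[index] = { status: "rejected", reason };
      }
    }
  }

  const workerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Unlike Promise.allSettled(), tasks are started lazily: at most `limit` tasks exist at any moment, so a failure in one task never prevents the rest from being attempted, and nothing is scheduled ahead of available capacity.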
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Small datasets (<100 items) | Fixed-size batching | Simple implementation, predictable memory, minimal overhead | Low (negligible compute increase) |
| High-throughput external APIs | Token bucket + concurrency limiter | Prevents 429 errors, respects rate limits, maintains steady throughput | Medium (requires rate limit tracking) |
| CPU-heavy transformations | Worker Thread pool | Prevents event loop blocking, utilizes multi-core hardware | High (increased memory & process overhead) |
| Long-running sync jobs | Message queue + dedicated workers | Enables pause/resume, dead letter queues, horizontal scaling | High (infrastructure & monitoring costs) |
| Real-time user requests | Direct async/await with strict limits | Low latency, simple error handling, predictable response times | Low (optimized connection pooling) |
Configuration Template
// concurrency.config.ts
export interface ConcurrencyConfig {
  maxConcurrentTasks: number;
  batchSize: number;
  retry: {
    maxAttempts: number;
    baseDelayMs: number;
    maxDelayMs: number;
    jitterFactor: number;
  };
  memory: {
    maxHeapUsageMB: number;
    gcThresholdPercent: number;
  };
  monitoring: {
    enableMetrics: boolean;
    alertThresholdPercent: number;
  };
}

export const defaultConfig: ConcurrencyConfig = {
  maxConcurrentTasks: 10,
  batchSize: 50,
  retry: {
    maxAttempts: 3,
    baseDelayMs: 1000,
    maxDelayMs: 30000,
    jitterFactor: 0.5
  },
  memory: {
    maxHeapUsageMB: 512,
    gcThresholdPercent: 80
  },
  monitoring: {
    enableMetrics: true,
    alertThresholdPercent: 70
  }
};
Quick Start Guide
- Identify unbounded aggregation points: Search your codebase for Promise.all( and Promise.allSettled(. Flag any call operating on arrays that could exceed 50 items in production.
- Replace with concurrency limiter: Import the AsyncSemaphore and executeInChunks utilities. Wrap your existing async function in the withRetry wrapper. Pass your dataset and concurrency limit to the batch processor.
- Configure downstream limits: Set maxConcurrentTasks to 20% below your database pool size or API rate limit. Adjust batchSize based on payload size to keep heap usage under 512MB.
- Validate under load: Run a staging test with production-equivalent data volume. Monitor connection pool utilization, heap memory, and error rates. Adjust concurrency limits until metrics stabilize below 70% capacity.
- Deploy with observability: Enable metrics collection. Set up alerts for pool exhaustion, memory spikes, and retry storm patterns. Document the concurrency limits in your runbook for incident response.