process like pgbouncer or ProxySQL. Best for multi-tenant apps, serverless environments, or when pooling across multiple languages.
* Cloud Proxy: Managed services like AWS RDS Proxy or Azure Database for PostgreSQL Flexible Server proxy. Reduces operational overhead.
- Determine Pool Sizing:
- Formula: The heuristic Pool Size = (Core Count * 2) + Disk Spindle Count applies to database-side worker threads. For application pools, divide the database's connection budget across instances instead: Max Pool Size = (DB Max Connections / Number of App Instances) * Safety Factor (0.8)
- Example: If the DB allows 500 connections and you run 10 app instances, Max Pool Size = (500 / 10) * 0.8 = 40.
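The sizing arithmetic can be captured in a small helper so the value is derived rather than hard-coded. This is a minimal sketch; the function names are illustrative and not part of any library, and rounding down is an assumption made so the fleet-wide total stays within the budget.

```typescript
// Heuristic from the text for database-side worker threads.
function dbThreadHeuristic(coreCount: number, spindleCount: number): number {
  return coreCount * 2 + spindleCount;
}

// Per-instance application pool cap:
// (DB max connections / app instances) * safety factor.
// Rounds down so that appInstances * result stays within the safety budget.
function maxPoolSize(
  dbMaxConnections: number,
  appInstances: number,
  safetyFactor = 0.8,
): number {
  if (appInstances <= 0) {
    throw new RangeError('appInstances must be positive');
  }
  return Math.floor((dbMaxConnections / appInstances) * safetyFactor);
}
```

With the numbers from the example, `maxPoolSize(500, 10)` evaluates to 40.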
- Configure Lifecycle Parameters:
- max: Hard limit on connections. Prevents DB exhaustion.
- min: Minimum connections to keep warm. Reduces cold-start latency.
- idleTimeout: Time before an idle connection is closed. Reclaims resources during low traffic.
- maxLifetime: Maximum time a connection exists. Critical for cloud environments to handle rotated credentials or network drops.
- acquireTimeout: Max time to wait for a connection from the pool. Prevents request threads from blocking indefinitely.
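The parameter names above are generic, and each library spells them differently. As a rough sketch, the mapping below translates them into the option names node-postgres uses, where (to the best of my knowledge) the lifetime limit is expressed in seconds as `maxLifetimeSeconds` and the acquire timeout is `connectionTimeoutMillis`. The `LifecycleSettings` type and `toPgPoolOptions` function are hypothetical helpers; check the option names against the driver version you have installed.

```typescript
// Generic lifecycle settings, named as in the list above (durations in ms).
interface LifecycleSettings {
  max: number;
  min: number;
  idleTimeoutMs: number;
  maxLifetimeMs: number;
  acquireTimeoutMs: number;
}

// Option names as node-postgres spells them (assumption: a recent pg release).
interface PgStylePoolOptions {
  max: number;
  min: number;
  idleTimeoutMillis: number;
  maxLifetimeSeconds: number;
  connectionTimeoutMillis: number;
}

function toPgPoolOptions(s: LifecycleSettings): PgStylePoolOptions {
  if (s.min > s.max) {
    throw new RangeError('min must not exceed max');
  }
  return {
    max: s.max,
    min: s.min,
    idleTimeoutMillis: s.idleTimeoutMs,
    // The lifetime limit is given in whole seconds, not milliseconds.
    maxLifetimeSeconds: Math.floor(s.maxLifetimeMs / 1000),
    connectionTimeoutMillis: s.acquireTimeoutMs,
  };
}
```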
- Implement Health Checks:
- Configure validationQuery or testOnBorrow to ensure connections are alive before use. This handles network partitions and database restarts gracefully.
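Not every pool offers a built-in validation option. Where it is missing, validate-on-borrow can be approximated by hand: probe each borrowed connection with a cheap query and discard it if the probe fails. The sketch below uses minimal structural types rather than a specific driver; `acquireValidated` is a hypothetical helper, and the `release(true)` call assumes the pool destroys a connection when passed a truthy argument, as node-postgres does.

```typescript
// Minimal structural types so the sketch does not depend on a specific driver.
interface BorrowedClient {
  query(sql: string): Promise<unknown>;
  // Passing `true` asks the pool to destroy the connection instead of reusing it.
  release(destroy?: boolean): void;
}

interface ClientPool {
  connect(): Promise<BorrowedClient>;
}

// Borrows a client, probes it, and retries with a fresh one if the probe fails.
async function acquireValidated(
  pool: ClientPool,
  maxAttempts = 3,
): Promise<BorrowedClient> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const client = await pool.connect();
    try {
      await client.query('SELECT 1'); // cheap liveness probe
      return client; // the caller is responsible for calling release()
    } catch (err) {
      lastError = err;
      client.release(true); // discard the dead connection
    }
  }
  throw new Error(
    `no healthy connection after ${maxAttempts} attempts: ${String(lastError)}`,
  );
}
```

The probe costs one extra round trip per borrow, so it is a trade-off between latency and resilience.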
Code Example: TypeScript with pg
This implementation demonstrates a robust, singleton pool pattern with error handling and graceful shutdown.
```typescript
import { Pool, PoolConfig } from 'pg';

// Singleton pattern to prevent multiple pool instances
let pool: Pool | null = null;

export function getPool(): Pool {
  if (!pool) {
    const config: PoolConfig = {
      host: process.env.DB_HOST,
      port: Number(process.env.DB_PORT) || 5432,
      database: process.env.DB_NAME,
      user: process.env.DB_USER,
      password: process.env.DB_PASSWORD,
      // Sizing
      max: 40, // Hard cap
      min: 5, // Keep warm (older pg releases ignore this option)
      idleTimeoutMillis: 30000, // Recycle idle after 30s
      // 10 min max life (AWS RDS rotation safety).
      // pg expects this limit in seconds; there is no maxLifetimeMillis option.
      maxLifetimeSeconds: 600,
      // Safety
      connectionTimeoutMillis: 2000, // Fail fast if pool exhausted
      statement_timeout: 10000, // Query timeout
    };
    pool = new Pool(config);

    // Error handler for idle connections
    pool.on('error', (err) => {
      console.error('Unexpected error on idle client', err);
      // Client is automatically removed from pool by pg library
    });

    // Metrics hook (optional integration with Prometheus/Datadog)
    pool.on('connect', () => {
      // Increment metric: pool_connections_created_total
    });
  }
  return pool;
}

// Graceful shutdown handler
export async function closePool(): Promise<void> {
  if (pool) {
    await pool.end();
    pool = null;
  }
}
```
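The shutdown handler only helps if something calls it. One way to wire it to process signals is sketched below. `registerGracefulShutdown`, `SignalSource`, and the `onDone` hook are hypothetical names introduced for illustration; the signal source is kept abstract so the helper can be exercised without a real process.

```typescript
// Anything that can notify us of a signal once; Node's `process` has this shape.
interface SignalSource {
  once(signal: string, handler: () => void): unknown;
}

// Registers a one-time handler per signal that drains the pool before exit.
// `close` would be the closePool function above; `onDone` is a hook for the
// caller, for example () => process.exit(0).
function registerGracefulShutdown(
  source: SignalSource,
  close: () => Promise<void>,
  onDone: (err?: unknown) => void,
  signals: string[] = ['SIGTERM', 'SIGINT'],
): void {
  let closing = false;
  for (const signal of signals) {
    source.once(signal, () => {
      if (closing) return; // ignore a second signal while draining
      closing = true;
      close().then(
        () => onDone(),
        (err) => onDone(err),
      );
    });
  }
}

// Usage sketch:
//   registerGracefulShutdown(process, closePool, (err) => process.exit(err ? 1 : 0));
```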
Architecture Decisions and Rationale
- Singleton Pool: Creating a new Pool instance per request defeats the purpose. The pool must be instantiated once per process. In serverless environments, instantiate outside the handler to reuse across invocations in the same execution context.
- Transaction vs. Session Pooling:
- Session Pooling: Holds the connection for the duration of the client session. Safer but consumes more DB connections.
- Transaction Pooling: Returns the connection to the pool after each transaction. Maximizes throughput but breaks session-level state (e.g., temporary tables, prepared statements persisting across transactions). Use transaction pooling only if your workload is stateless per transaction.
- Prepared Statements: Pooling libraries often cache prepared statements client-side. Ensure max is set correctly, as prepared statements consume memory on the database server per connection. If using PgBouncer in transaction mode, client-side prepared statement caching may cause errors; disable it or use PgBouncer's prepared statement support.
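Under transaction pooling, session state can only be relied on within a single transaction, so every statement of a unit of work has to run on the same client between BEGIN and COMMIT. A minimal sketch of such a wrapper follows; `withTransaction` and the structural `TxClient`/`TxPool` types are illustrative and not part of any driver.

```typescript
interface TxClient {
  query(sql: string, params?: unknown[]): Promise<unknown>;
  release(destroy?: boolean): void;
}

interface TxPool {
  connect(): Promise<TxClient>;
}

// Runs `work` inside one transaction on one client, then always releases it.
async function withTransaction<T>(
  pool: TxPool,
  work: (client: TxClient) => Promise<T>,
): Promise<T> {
  const client = await pool.connect();
  try {
    await client.query('BEGIN');
    const result = await work(client);
    await client.query('COMMIT');
    return result;
  } catch (err) {
    try {
      await client.query('ROLLBACK');
    } catch {
      // The connection is probably broken; the original error matters more.
    }
    throw err;
  } finally {
    client.release();
  }
}
```

Inside `work`, transaction-scoped settings such as `SET LOCAL statement_timeout = '5s'` are a safer fit than plain `SET`, because they do not outlive the transaction.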
Pitfall Guide
1. Setting max Too High
Mistake: Setting max equal to the database max_connections or basing it on app threads.
Impact: When multiple app instances connect, the total connections exceed the DB limit, causing too many connections errors.
Fix: Calculate max based on shared DB capacity. max = (DB_Max / App_Instances) * 0.8.
2. Connection Leaks
Mistake: Acquiring a client from the pool but failing to release it in all code paths (e.g., missing finally block or unhandled promise rejection).
Impact: Pool exhaustion. Each leaked client permanently occupies a slot, so the number of available connections shrinks until none remain and every request times out.
Fix: Always use try/finally or the pool.query() shortcut which auto-releases.
```typescript
// Bad
const client = await pool.connect();
await client.query('...');
client.release();
// If an error occurs in the query above, release is never called.
```

```typescript
// Good
const client = await pool.connect();
try {
  await client.query('...');
} finally {
  client.release(); // Runs on success and on error
}
```
3. Ignoring maxLifetime in Cloud Environments
Mistake: Leaving maxLifetime at default (often 0 or infinite).
Impact: Cloud providers (AWS, GCP, Azure) silently drop connections after a period or rotate TLS certificates. Applications hold stale connections, leading to intermittent ECONNRESET errors.
Fix: Set maxLifetime to a value lower than the cloud provider's connection timeout (e.g., 10 minutes for RDS).
4. Pool Starvation from Long Transactions
Mistake: Allowing slow queries or long transactions to hold connections for seconds.
Impact: The pool fills with blocked connections. New requests wait in the queue, increasing latency and potentially timing out.
Fix: Implement query timeouts (statement_timeout). Monitor active vs waiting metrics. Optimize slow queries. Consider a separate pool for read replicas if reporting queries are heavy.
5. Setting idleTimeout Too Low
Mistake: Setting idleTimeout too low (e.g., 1 second).
Impact: The pool constantly creates and destroys connections, negating the benefit of pooling and increasing CPU usage on both app and DB.
Fix: Set idleTimeout to a value that balances resource reclamation and connection reuse (e.g., 30 seconds to 1 minute).
6. Treating Pool Size as a Linear Scaling Factor
Mistake: Increasing max to fix latency spikes.
Impact: Adding more connections increases contention on database locks and CPU. It does not fix slow queries; it just allows more slow queries to run concurrently, worsening DB performance.
Fix: Diagnose the root cause. If the waiting count is high, the pool is too small or queries are too slow. If active is high and latency is also high, the bottleneck is likely DB CPU, locks, or I/O, not pool size.
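The diagnostic rule can be written down as a small classifier over the pool counters. This is a sketch under stated assumptions: the 0.75 utilization threshold, the latency budget parameter, and the label names are all illustrative choices, not values from any library.

```typescript
interface PoolSnapshot {
  totalCount: number;
  idleCount: number;
  waitingCount: number;
}

type Diagnosis = 'healthy' | 'queueing' | 'database-bound';

function diagnosePool(
  snapshot: PoolSnapshot,
  max: number,
  observedLatencyMs: number,
  latencyBudgetMs: number,
  highUtilization = 0.75, // assumed threshold for "active is high"
): Diagnosis {
  const active = snapshot.totalCount - snapshot.idleCount;
  // Requests are waiting: the pool is too small or queries are too slow.
  if (snapshot.waitingCount > 0) return 'queueing';
  // Nobody waits, yet most connections are busy and latency is over budget:
  // look at DB CPU, locks, or I/O rather than raising max.
  if (active / max >= highUtilization && observedLatencyMs > latencyBudgetMs) {
    return 'database-bound';
  }
  return 'healthy';
}
```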
7. Using Pooling with Serverless Without a Proxy
Mistake: Running library pools in serverless functions (AWS Lambda, Vercel) that scale to thousands of concurrent instances.
Impact: Each instance opens its own pool. Thousands of instances can open thousands of connections, overwhelming the database.
Fix: Use a database proxy (RDS Proxy, PgBouncer) or a serverless-aware pooler. Configure the library pool with max: 1 and let the proxy handle pooling, or use a provider-specific solution.
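In a function runtime, the pool should be created once per warm execution context and capped at a single connection, with the proxy doing the real pooling. The sketch below shows the lazy, module-scope part generically. `lazyOnce` and `serverlessPoolOptions` are hypothetical names, and the timeout values are placeholders rather than recommendations.

```typescript
// Generic lazy holder: the factory runs once per execution context
// (one warm container), and later invocations reuse the same instance.
function lazyOnce<T>(factory: () => T): () => T {
  let instance: T | undefined;
  let created = false;
  return () => {
    if (!created) {
      instance = factory();
      created = true;
    }
    return instance as T;
  };
}

// Options for a pool that sits behind a proxy: one connection per instance.
const serverlessPoolOptions = {
  max: 1,
  idleTimeoutMillis: 10_000, // placeholder value
  connectionTimeoutMillis: 2_000, // placeholder value
};

// Usage sketch (assumes the `pg` package and a proxy endpoint in DB_HOST):
//   const getPool = lazyOnce(
//     () => new Pool({ host: process.env.DB_HOST, ...serverlessPoolOptions }),
//   );
//   export async function handler() {
//     return (await getPool().query('SELECT 1')).rows;
//   }
```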
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Monolith / Containerized App | Library Pool (pg.Pool, HikariCP) | Low latency, simple integration, per-process isolation. | Low. No external infrastructure. |
| Serverless / High Scale | Cloud Proxy (RDS Proxy) or PgBouncer | Prevents connection explosion from scaling instances. | Medium. Proxy adds cost but saves DB scaling costs. |
| Multi-Language Stack | PgBouncer / ProxySQL | Centralized pooling logic shared across different drivers/languages. | Medium. Ops overhead for proxy management. |
| Read-Heavy Reporting | Separate Pool for Read Replica | Isolates heavy reporting queries from OLTP pool. | Low. Requires read replica infrastructure. |
| Legacy App Refactor | PgBouncer in Transaction Mode | Allows pooling without code changes; maximizes throughput. | Low. Requires DB config changes. |
Configuration Template
Copy this template for a production-grade PostgreSQL pool in TypeScript. Adjust values based on your sizing calculations.
```typescript
// db/pool.ts
import { Pool, PoolConfig } from 'pg';

const poolConfig: PoolConfig = {
  // Connection
  host: process.env.DB_HOST!,
  port: parseInt(process.env.DB_PORT || '5432', 10),
  database: process.env.DB_NAME!,
  user: process.env.DB_USER!,
  password: process.env.DB_PASSWORD!,

  // Security
  ssl: process.env.NODE_ENV === 'production'
    ? { rejectUnauthorized: true }
    : false,

  // Pool Sizing
  // Formula: (DB_Max / Instances) * 0.8
  // Example: DB=500, Instances=10 -> Max=40
  max: parseInt(process.env.DB_POOL_MAX || '40', 10),
  min: parseInt(process.env.DB_POOL_MIN || '5', 10),

  // Lifecycle
  // Must be < Cloud provider timeout (e.g., RDS drops at 10m).
  // pg expects this limit in seconds (maxLifetimeSeconds), so DB_MAX_LIFETIME is in seconds.
  maxLifetimeSeconds: parseInt(process.env.DB_MAX_LIFETIME || '600', 10),
  // Recycle idle connections to free DB resources during low traffic
  idleTimeoutMillis: parseInt(process.env.DB_IDLE_TIMEOUT || '30000', 10),

  // Safety & Timeouts
  // Fail fast if pool is exhausted; prevents requests from queueing indefinitely
  connectionTimeoutMillis: parseInt(process.env.DB_ACQUIRE_TIMEOUT || '2000', 10),
  // Query timeout to prevent long-running queries from blocking pool
  statement_timeout: parseInt(process.env.DB_STATEMENT_TIMEOUT || '10000', 10),

  // Client Configuration
  application_name: process.env.APP_NAME || 'unknown-app',
};

export const pool = new Pool(poolConfig);

// Global error handler for the pool
pool.on('error', (err) => {
  console.error(`[DB Pool] Unexpected error on idle client: ${err.message}`);
  // The client is automatically removed from the pool by the library
});

// Optional: Log pool stats periodically
setInterval(() => {
  console.log(`[DB Pool] Active: ${pool.totalCount - pool.idleCount}, Idle: ${pool.idleCount}, Waiting: ${pool.waitingCount}`);
}, 60000);
```
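One caveat with reading numeric settings through parseInt: a non-numeric value yields NaN, and input such as "40abc" is silently accepted as 40. A stricter reader that fails at startup is sketched below; `intFromEnv` is a hypothetical helper, and the environment is passed in as a plain object so it can be exercised without touching the real process environment.

```typescript
type Env = Record<string, string | undefined>;

// Reads an integer setting, falling back when unset and failing loudly when invalid.
function intFromEnv(env: Env, name: string, fallback: number, min = 0): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === '') return fallback;
  const value = Number(raw); // Number() rejects trailing garbage, unlike parseInt
  if (!Number.isInteger(value) || value < min) {
    throw new Error(`${name} must be an integer >= ${min}, got "${raw}"`);
  }
  return value;
}

// Usage sketch:
//   max: intFromEnv(process.env, 'DB_POOL_MAX', 40, 1),
```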
Quick Start Guide
- Install Driver: Run npm install pg (or your database driver of choice).
- Create Pool Singleton: Implement the pool initialization code as a singleton module. Ensure it is imported, not re-instantiated, across your application.
- Query via Pool: Use pool.query(sql, params) for simple queries or pool.connect() with try/finally for transactions. Never create a Client instance manually for request handling.
- Configure Environment: Set DB_POOL_MAX and DB_MAX_LIFETIME based on your database limits and cloud provider settings.
- Add Observability: Expose pool metrics (active, idle, waiting) to your monitoring system. Set alerts on waitingCount > 0 to detect pool starvation early.
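For the observability step, the three counters can be turned into gauge lines that a metrics scraper can read. The sketch below renders them in a simple "name value" text form. The metric names and the `renderPoolMetrics` function are hypothetical; the input shape mirrors the totalCount, idleCount, and waitingCount counters used in the template above.

```typescript
interface PoolCounters {
  totalCount: number;
  idleCount: number;
  waitingCount: number;
}

// Renders one "name value" line per gauge, each terminated by a newline.
function renderPoolMetrics(pool: PoolCounters): string {
  const gauges: Array<[string, number]> = [
    ['db_pool_active', pool.totalCount - pool.idleCount],
    ['db_pool_idle', pool.idleCount],
    ['db_pool_waiting', pool.waitingCount],
  ];
  return gauges.map(([name, value]) => `${name} ${value}\n`).join('');
}

// Alert condition from the text: any waiting request signals starvation.
function shouldAlert(pool: PoolCounters): boolean {
  return pool.waitingCount > 0;
}
```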