ore provisioning:
- Read/write ratio (>80% writes favors NoSQL; >60% complex reads favors SQL)
- Query topology (point lookups vs multi-entity joins)
- Consistency tolerance (strong ACID vs eventual/configurable)
Step 2: Select and Initialize the Engine
Use connection pooling and schema validation from day one. Raw drivers outperform heavy ORMs in high-throughput scenarios, but type safety remains critical.
SQL Implementation (PostgreSQL + Drizzle ORM)
import { drizzle } from 'drizzle-orm/node-postgres';
import { pgTable, varchar, integer, timestamp, jsonb } from 'drizzle-orm/pg-core';
import { Pool } from 'pg';
const pool = new Pool({
host: process.env.DB_HOST,
port: Number(process.env.DB_PORT),
database: process.env.DB_NAME,
user: process.env.DB_USER,
password: process.env.DB_PASS,
max: 20,
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 5000,
});
const users = pgTable('users', {
id: integer().primaryKey().generatedAlwaysAsIdentity(),
email: varchar({ length: 255 }).notNull().unique(),
profile: jsonb().$type<{ name: string; preferences: Record<string, unknown> }>(),
createdAt: timestamp().defaultNow().notNull(),
});
export const db = drizzle(pool);
export async function createUser(email: string, profile: { name: string; preferences: Record<string, unknown> }) {
return db.insert(users).values({ email, profile }).returning().execute();
}
NoSQL Implementation (MongoDB + Native Driver)
import { MongoClient, ObjectId } from 'mongodb';
const client = new MongoClient(process.env.MONGO_URI || 'mongodb://localhost:27017');
const db = client.db('app');
const users = db.collection('users');
export async function createUser(email: string, profile: { name: string; preferences: Record<string, unknown> }) {
const result = await users.insertOne({
_id: new ObjectId(),
email,
profile,
createdAt: new Date(),
});
return { insertedId: result.insertedId };
}
Step 3: Architect for Scale
Indexing strategy differs fundamentally:
- SQL: Create composite indexes matching WHERE and ORDER BY clauses. Use EXPLAIN ANALYZE to verify index utilization. Avoid over-indexing; each index adds write overhead and storage bloat.
- NoSQL: Index only query paths. Use covered queries where possible. For MongoDB, create indexes on frequently filtered fields and use TTL indexes for ephemeral data.
Connection handling must include retry logic and circuit breaking. Both engines degrade gracefully under connection exhaustion if pooling is configured correctly. Implement exponential backoff with jitter for transient network failures.
Step 4: Enforce Consistency Boundaries
SQL guarantees ACID by default. NoSQL requires explicit transaction configuration:
// MongoDB multi-document transaction
const session = client.startSession();
try {
await session.withTransaction(async () => {
await users.updateOne({ email: 'a@test.com' }, { $set: { status: 'verified' } }, { session });
await db.collection('audit').insertOne({ action: 'verify', userId: 'a@test.com' }, { session });
});
} finally {
await session.endSession();
}
Architectural rationale: Use transactions only when cross-entity integrity is required. Unnecessary transaction scope increases lock contention and reduces throughput. Design idempotent operations to tolerate eventual consistency in high-write scenarios.
Pitfall Guide
-
Treating JSONB as a full NoSQL replacement
PostgreSQL's JSONB enables flexible schemas, but querying nested fields without GIN indexes triggers sequential scans. Index specific paths using expression indexes: CREATE INDEX ON users ((profile->>'region')); Without this, JSONB performance degrades faster than native document stores.
-
Ignoring transaction boundaries in distributed NoSQL
MongoDB and DynamoDB support multi-document transactions, but they carry significant latency overhead and lock contention. Using them for high-throughput writes creates bottlenecks. Best practice: model data to avoid cross-document writes, or use outbox patterns with message queues for eventual consistency.
-
Over-normalizing document databases
Relational normalization habits leak into NoSQL design, resulting in excessive $lookup or client-side joins. Document stores optimize for data locality. Embed frequently accessed related data; reference only when data is large, rarely queried, or shared across multiple parents.
-
Underestimating join costs in SQL at scale
Joins are cheap on small datasets but become O(n*m) operations on unindexed tables. At scale, materialized views or pre-aggregated tables outperform runtime joins. Use connection pooling and prepared statements to reduce planner overhead, but redesign schema when join depth exceeds 3 tables with >1M rows.
-
Treating eventual consistency as free
Eventual consistency reduces write latency but requires application-level idempotency, conflict resolution, and stale-read handling. Implement version vectors or causal consistency markers for critical paths. Never assume "eventual" means "ignore"; design compensating transactions for financial or inventory systems.
-
Vendor lock-in via proprietary query languages
MongoDB's aggregation pipeline and DynamoDB's PartiQL create migration friction. Abstract data access behind a repository interface. Use TypeScript generics to isolate engine-specific syntax. This preserves the option to swap engines without rewriting business logic.
-
Skipping connection pooling and retry logic
Direct connections exhaust under load. Both engines require pool management (max, idleTimeout, connectionTimeout). Implement exponential backoff with jitter for transient failures. Add circuit breakers to prevent cascade failures during network partitions.
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|
| Financial transactions, inventory, audit logs | SQL (PostgreSQL/MySQL) | Strong ACID compliance, complex joins, mature tooling | Higher initial tuning cost; lower long-term compliance risk |
| User profiles, session storage, content management | NoSQL (MongoDB/DynamoDB) | Flexible schema, high write throughput, automatic scaling | Lower operational overhead; requires application-level consistency design |
| Real-time analytics, time-series telemetry | NoSQL (Cassandra/TimescaleDB) | Optimized for append-heavy workloads, partition tolerance | Horizontal scaling reduces per-node cost; query flexibility limited |
| Complex reporting, BI dashboards, data warehousing | SQL (PostgreSQL/ClickHouse) | Window functions, CTEs, materialized views, SQL standard compliance | Higher storage/compute cost; predictable query performance |
| Graph relationships, recommendation engines | Graph DB (Neo4j/Amazon Neptune) | Native relationship traversal, O(1) neighbor lookups | Specialized licensing; steep learning curve for traversal queries |
| High-velocity IoT, event streaming | NoSQL (Kafka + DynamoDB/MongoDB) | Write-optimized, partition-tolerant, schema evolution | Low per-event cost; requires stream processing pipeline |
Configuration Template
Docker Compose (Local Development)
version: '3.8'
services:
postgres:
image: postgres:16-alpine
environment:
POSTGRES_DB: app_dev
POSTGRES_USER: dev
POSTGRES_PASSWORD: dev_secret
ports: ["5432:5432"]
volumes: [pg_data:/var/lib/postgresql/data]
mongodb:
image: mongo:7
environment:
MONGO_INITDB_ROOT_USERNAME: dev
MONGO_INITDB_ROOT_PASSWORD: dev_secret
ports: ["27017:27017"]
volumes: [mongo_data:/data/db]
volumes:
pg_data:
mongo_data:
TypeScript Environment Configuration
// env.ts
export const config = {
postgres: {
host: process.env.PG_HOST || 'localhost',
port: Number(process.env.PG_PORT) || 5432,
database: process.env.PG_DB || 'app_dev',
user: process.env.PG_USER || 'dev',
password: process.env.PG_PASS || 'dev_secret',
pool: { max: 20, idle: 30000, connectionTimeout: 5000 },
},
mongodb: {
uri: process.env.MONGO_URI || 'mongodb://dev:dev_secret@localhost:27017/app_dev?authSource=admin',
options: {
maxPoolSize: 20,
serverSelectionTimeoutMS: 5000,
socketTimeoutMS: 45000,
},
},
} as const;
Quick Start Guide
- Spin up infrastructure: Run
docker compose up -d to launch PostgreSQL and MongoDB locally. Verify connectivity with pg_isready and mongosh --eval "db.runCommand({ping:1})".
- Initialize schemas: Execute Drizzle migration (
npx drizzle-kit push) for SQL. For MongoDB, create indexes: db.users.createIndex({ email: 1 }, { unique: true }) and db.users.createIndex({ "profile.region": 1 }).
- Run sample queries: Test SQL with
SELECT * FROM users WHERE email = $1; using prepared statements. Test NoSQL with users.find({ email: "test@example.com" }).projection({ profile: 1 }). Measure response times with console.time() or APM instrumentation.
- Validate scaling behavior: Load test with 500 concurrent connections using
autocannon or k6. Monitor connection pool utilization, query latency percentiles, and error rates. Adjust maxPoolSize and index strategy based on results.
- Deploy to staging: Replace local URIs with managed service endpoints (RDS, Atlas, DynamoDB). Enable VPC peering, IAM roles, and encrypted connections. Run integration tests against staging before production rollout.