deterministic migration generator, and a controlled execution engine with locking and transactional safety.
Architecture Decisions
- Schema Definition: Use a TypeScript-based schema definition that mirrors the database structure. This enables compile-time type checking and auto-generation of migration SQL.
- Migration Generator: Generate SQL files deterministically. The generator should produce a numbered sequence of files, each containing the necessary DDL/DML to transition from version N to N+1.
- Execution Engine: A custom runner that:
- Acquires an advisory lock to prevent concurrent execution.
- Wraps migrations in transactions where supported.
- Records execution metadata in a
__migrations table.
- Validates the database state before applying changes.
- Zero-Downtime Pattern: For large tables, implement the Expand/Contract pattern:
- Expand: Add new column, backfill data, update code to write to both columns.
- Migrate: Switch code to read from new column.
- Contract: Remove old column in a subsequent migration.
Step-by-Step Implementation
This example uses a TypeScript ecosystem with drizzle-orm for schema definition and a custom migration runner. The runner ensures safety and idempotency.
1. Schema Definition
Define the schema using a type-safe DSL. This serves as the source of truth.
// schema/users.ts
import { pgTable, varchar, timestamp, integer } from 'drizzle-orm/pg-core';
export const users = pgTable('users', {
id: integer().primaryKey().generatedAlwaysAsIdentity(),
email: varchar({ length: 255 }).notNull().unique(),
displayName: varchar({ length: 100 }),
createdAt: timestamp({ mode: 'string' }).defaultNow().notNull(),
});
2. Migration Generation
Use the tooling CLI to generate migrations based on schema changes. This creates a deterministic SQL file.
# Command to generate migration
npx drizzle-kit generate --name add_user_status_column
This produces migrations/0002_add_user_status_column.sql:
ALTER TABLE "users" ADD COLUMN "status" varchar(20) DEFAULT 'active';
3. Controlled Migration Runner
The runner is the critical component. It handles locking, transactions, and version tracking.
// lib/migration-runner.ts
import { Client } from 'pg';
import { readFileSync, readdirSync } from 'fs';
import { join } from 'path';
const MIGRATIONS_DIR = './migrations';
const LOCK_ID = 20240520; // Unique integer for advisory lock
export class MigrationRunner {
private client: Client;
constructor(connectionString: string) {
this.client = new Client({ connectionString });
}
async connect() {
await this.client.connect();
}
async run() {
try {
// 1. Acquire Advisory Lock
await this.acquireLock();
// 2. Ensure Migrations Table Exists
await this.ensureMigrationsTable();
// 3. Get Pending Migrations
const pending = await this.getPendingMigrations();
if (pending.length === 0) {
console.log('No pending migrations.');
return;
}
console.log(`Applying ${pending.length} migrations...`);
// 4. Execute Migrations
for (const migration of pending) {
await this.executeMigration(migration);
}
console.log('Migrations completed successfully.');
} finally {
await this.releaseLock();
await this.client.end();
}
}
private async acquireLock() {
// Prevents concurrent migrations across multiple app instances
const res = await this.client.query(
`SELECT pg_try_advisory_lock($1)`,
[LOCK_ID]
);
if (!res.rows[0].pg_try_advisory_lock) {
throw new Error('Another migration is already running. Exiting.');
}
console.log('Advisory lock acquired.');
}
private async releaseLock() {
await this.client.query(`SELECT pg_advisory_unlock($1)`, [LOCK_ID]);
}
private async ensureMigrationsTable() {
await this.client.query(`
CREATE TABLE IF NOT EXISTS __migrations (
id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL UNIQUE,
applied_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
`);
}
private async getPendingMigrations() {
const files = readdirSync(MIGRATIONS_DIR)
.filter(f => f.endsWith('.sql'))
.sort(); // Ensure deterministic order
const res = await this.client.query(
'SELECT name FROM __migrations'
);
const applied = new Set(res.rows.map(r => r.name));
return files.filter(f => !applied.has(f));
}
private async executeMigration(file: string) {
const sql = readFileSync(join(MIGRATIONS_DIR, file), 'utf-8');
// Execute within a transaction for atomicity
await this.client.query('BEGIN');
try {
await this.client.query(sql);
await this.client.query(
'INSERT INTO __migrations (name) VALUES ($1)',
[file]
);
await this.client.query('COMMIT');
console.log(`Applied: ${file}`);
} catch (err) {
await this.client.query('ROLLBACK');
console.error(`Failed to apply ${file}:`, err);
throw err;
}
}
}
4. Zero-Downtime Migration Example
For large tables, avoid locking ALTER TABLE. Use the expand/contract pattern.
-- Migration: Add 'new_email' column for zero-downtime email migration
-- Phase 1: Expand
BEGIN;
ALTER TABLE users ADD COLUMN new_email VARCHAR(255);
CREATE INDEX CONCURRENTLY idx_users_new_email ON users(new_email);
COMMIT;
-- Application Code Update: Write to both 'email' and 'new_email'.
-- Backfill script: Update 'new_email' from 'email' in batches.
-- Migration: Switch Read
-- Phase 2: Migrate Read
-- Application Code Update: Read from 'new_email'.
-- Migration: Remove Old Column
-- Phase 3: Contract
BEGIN;
ALTER TABLE users DROP COLUMN email;
ALTER TABLE users RENAME COLUMN new_email TO email;
-- Recreate constraints/indexes on renamed column
COMMIT;
Pitfall Guide
Production migration failures often stem from predictable errors. Avoid these pitfalls to ensure system stability.
1. Non-Transactional DDL in PostgreSQL
PostgreSQL supports transactions for most DDL, but CREATE INDEX CONCURRENTLY and REINDEX cannot run inside a transaction block.
- Risk: If a migration contains these commands, the runner cannot wrap the entire migration in a transaction. A failure after the concurrent index creation leaves the database in a partial state.
- Mitigation: Split migrations. Run non-transactional commands in separate migration files or handle them explicitly with retry logic and state checks.
2. Ignoring Advisory Locks
Running migrations without a distributed lock leads to race conditions when multiple application instances start simultaneously (e.g., during a rolling deployment).
- Risk: Duplicate execution, constraint violations, or corrupted migration metadata.
- Mitigation: Always use
pg_try_advisory_lock (or equivalent in other DBs) before checking for pending migrations. Fail fast if the lock cannot be acquired.
3. Modifying Past Migrations
Once a migration has been applied to any environment, it must never be altered.
- Risk: Developers modifying a past migration file creates divergence between environments. Environments that already applied the migration will not see the change, while new environments will apply the modified version, leading to schema drift.
- Mitigation: Enforce immutability in CI/CD. If a change is needed, create a new migration to correct the schema.
4. Blocking Data Migrations
Running large UPDATE statements without batching locks rows and increases WAL volume, potentially causing replication lag or running out of disk space.
- Risk: Service timeouts, degraded performance, and storage exhaustion.
- Mitigation: Use batched updates with
LIMIT and OFFSET or cursor-based iteration. Commit in chunks and introduce small delays to reduce lock contention.
5. Missing Rollback Strategies
Migrations that only define forward changes cannot be safely rolled back during deployment failures.
- Risk: A failed deployment requires manual database intervention to revert schema changes, increasing MTTR.
- Mitigation: Generate rollback scripts alongside forward migrations. Ensure rollbacks are tested in staging environments. Note that data loss is irreversible; rollback scripts should focus on schema reversion.
6. Schema Drift in CI/CD
Relying on the database state in CI rather than validating schema definitions leads to false positives.
- Risk: Tests pass in CI but fail in production due to environment differences or unapplied migrations.
- Mitigation: Run migrations in a ephemeral database container during CI. Validate that the generated SQL matches the expected schema diff. Fail the build if drift is detected.
7. Over-Reliance on Auto-Migration in Production
ORMs with auto-migration features (e.g., prisma db push) apply schema changes automatically based on the current schema definition.
- Risk: Auto-migration tools may generate destructive changes (e.g., dropping columns) or fail to handle complex transitions safely. They lack the control required for zero-downtime deployments.
- Mitigation: Disable auto-migration in production. Use migration generators to produce explicit SQL scripts that are reviewed and version-controlled.
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|
| Startup / MVP | Declarative ORM Auto-Migrate | Speed of development is priority; low risk due to small data volume. | Low engineering cost; acceptable risk. |
| Mid-Scale App | Schema-as-Code + Generated SQL | Balances DX with safety; explicit migrations allow review and rollback. | Moderate engineering cost; high reliability. |
| Enterprise / High Availability | Zero-Downtime Pattern + Controlled Runner | Prevents downtime during schema changes; supports rolling deployments. | High engineering cost; critical for SLA compliance. |
| Multi-Database Support | Database-Agnostic Tool (e.g., Kysely) | Abstracts dialect differences; simplifies maintenance across engines. | Moderate cost; reduces vendor lock-in. |
| Legacy Database | Imperative Framework (Flyway/Liquibase) | Handles complex legacy schemas with extensive history and validation. | High migration effort; ensures stability. |
Configuration Template
Ready-to-use configuration for a TypeScript migration runner using environment variables and type-safe schema imports.
// migration.config.ts
import { defineConfig } from 'drizzle-kit';
import 'dotenv/config';
export default defineConfig({
schema: './src/db/schema.ts',
out: './migrations',
dialect: 'postgresql',
dbCredentials: {
url: process.env.DATABASE_URL!,
},
strict: true,
// Ensure migrations are generated with explicit types
verbose: true,
});
// src/db/migrate.ts
import { MigrationRunner } from '../lib/migration-runner';
import 'dotenv/config';
async function main() {
const runner = new MigrationRunner(process.env.DATABASE_URL!);
await runner.connect();
try {
await runner.run();
process.exit(0);
} catch (error) {
console.error('Migration failed:', error);
process.exit(1);
}
}
main();
Quick Start Guide
-
Initialize Project:
npm install drizzle-orm drizzle-kit pg dotenv
-
Define Schema:
Create src/db/schema.ts and define tables using drizzle-orm DSL.
-
Configure Migration:
Add migration.config.ts with database credentials and schema path.
-
Generate Migration:
npx drizzle-kit generate --name init_schema
Review the generated SQL in migrations/.
-
Run Migration:
npx tsx src/db/migrate.ts
Verify the __migrations table and schema changes in the database.
-
Integrate CI/CD:
Add migration step to deployment pipeline. Ensure DATABASE_URL is set and advisory locks are functional. Run rollback tests in pre-deployment hooks.