describing, and immutable. Use a strict naming pattern: {version}_{description}.{ts|sql}. Store files in ./migrations/ and maintain a migrations table in the target database to track applied versions and checksums.
// migrations/001_add_users_table.ts
import { Knex } from 'knex';
export async function up(knex: Knex): Promise<void> {
await knex.schema.createTable('users', (table) => {
table.uuid('id').primary().defaultTo(knex.raw('gen_random_uuid()'));
table.string('email').notNullable().unique();
table.timestamp('created_at').defaultTo(knex.fn.now());
});
}
export async function down(knex: Knex): Promise<void> {
await knex.schema.dropTableIfExists('users');
}
Step 2: Idempotent Execution with Checksum Verification
Idempotency prevents duplicate execution and detects drift. Compute a SHA-256 checksum of each migration file and compare it against the stored value in the migrations table. Skip execution if the version exists and the checksum matches.
import crypto from 'crypto';
import fs from 'fs';
import path from 'path';
import { Knex } from 'knex';
async function computeChecksum(filePath: string): Promise<string> {
const content = fs.readFileSync(filePath, 'utf8');
return crypto.createHash('sha256').update(content).digest('hex');
}
async function ensureMigrationsTable(knex: Knex): Promise<void> {
const exists = await knex.schema.hasTable('migrations');
if (!exists) {
await knex.schema.createTable('migrations', (table) => {
table.string('version').primary();
table.string('checksum').notNullable();
table.timestamp('applied_at').defaultTo(knex.fn.now());
});
}
}
export async function runMigrations(knex: Knex, dir: string): Promise<void> {
await ensureMigrationsTable(knex);
const files = fs.readdirSync(dir).filter(f => f.endsWith('.ts'));
for (const file of files) {
const filePath = path.join(dir, file);
const checksum = await computeChecksum(filePath);
const existing = await knex('migrations').where({ version: file }).first();
if (existing && existing.checksum === checksum) continue;
const mod = await import(filePath);
await knex.transaction(async (trx) => {
await mod.up(trx);
await trx('migrations').insert({ version: file, checksum });
});
}
}
Step 3: Transactional Safety & Advisory Locking
Database engines handle DDL differently. PostgreSQL allows DDL inside transactions; MySQL does not. Wrap execution in transactions where supported, and use advisory locks to prevent concurrent migration runs in distributed deployments.
// PostgreSQL advisory lock example
export async function acquireLock(knex: Knex, lockId: number): Promise<void> {
const result = await knex.raw(`SELECT pg_try_advisory_lock(?)`, [lockId]);
if (!result.rows[0].pg_try_advisory_lock) {
throw new Error('Migration lock acquired by another process. Aborting.');
}
}
export async function releaseLock(knex: Knex, lockId: number): Promise<void> {
await knex.raw(`SELECT pg_advisory_unlock(?)`, [lockId]);
}
Step 4: CI/CD Integration & Pre-Flight Validation
Migrations must validate before execution. Implement a dry-run mode that parses SQL/TypeScript files, checks syntax, verifies table/column existence, and simulates transaction boundaries without applying changes.
// pipeline-triggered validation
async function dryRun(knex: Knex, dir: string): Promise<void> {
const files = fs.readdirSync(dir).filter(f => f.endsWith('.ts'));
for (const file of files) {
const filePath = path.join(dir, file);
const mod = await import(filePath);
// Validate up/down exports exist
if (typeof mod.up !== 'function' || typeof mod.down !== 'function') {
throw new Error(`${file} missing required up/down exports`);
}
}
console.log('Dry run validation passed.');
}
Architecture Decisions & Rationale
- Knex over raw SQL: Provides dialect abstraction, query building safety, and transaction chaining. Raw SQL is reserved for engine-specific operations (e.g.,
CREATE INDEX CONCURRENTLY).
- Checksum verification: Detects file tampering and prevents silent drift. Critical for multi-region deployments.
- Separation of schema vs data migrations: Schema changes use DDL-safe patterns; data migrations use batched updates with explicit cursors to avoid table locks.
- Advisory locking: Prevents race conditions in horizontal CI runners or multi-node deployments.
- Deterministic rollback: Every
up must have a corresponding down. Rollback scripts are versioned and tested alongside forward migrations.
Pitfall Guide
- Non-idempotent scripts: Running the same migration twice corrupts state or fails silently. Fix: enforce checksum tracking and version guards. Never use
CREATE TABLE without IF NOT EXISTS in non-versioned contexts.
- Missing transaction boundaries: Partial migrations leave schemas in inconsistent states. Fix: wrap all execution in database transactions. For MySQL, split DDL and DML, and use explicit commit/rollback checkpoints.
- Ignoring foreign key cascade behavior: Dropping or altering referenced columns triggers implicit cascades that delete production data. Fix: audit dependency graphs before migration. Use
RESTRICT instead of CASCADE in staging, and validate with EXPLAIN ANALYZE.
- Skipping dry-run/pre-flight validation: Syntax errors and missing indexes surface only in production. Fix: implement a validation phase that parses migration files, checks dialect compatibility, and simulates transaction scope.
- Hardcoding environment-specific values: Connection strings, timeouts, and batch sizes vary across environments. Fix: inject configuration via environment variables or secret managers. Use connection pooling limits aligned with target database capacity.
- Treating data migration as an afterthought: Schema changes are safe; data backfills are not. Fix: separate schema and data migrations. Use batched updates with explicit
LIMIT/OFFSET or keyset pagination. Monitor replication lag and lock wait times.
- No automated rollback path: Manual rollback during incidents extends MTTR and increases error probability. Fix: version
down scripts alongside up. Implement automated rollback triggers in CI/CD when health checks fail post-migration.
Best practices from production: use expand/contract patterns for column renames, schedule heavy data migrations during low-traffic windows, monitor pg_stat_activity or equivalent for lock contention, and enforce migration code reviews with schema diff tools.
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|
| Small app (<10k users) | Single-node automated runner with checksums | Simplicity outweighs distributed complexity | Low: minimal infra overhead |
| High-traffic SaaS | Expand/contract pattern + advisory locks + CI dry-run | Prevents lock contention and downtime during peak load | Medium: requires schema versioning discipline |
| Multi-tenant database | Tenant-isolated migrations with batched cursors | Avoids cross-tenant lock escalation and data leakage | High: requires tenant-aware routing and monitoring |
| Legacy monolith | Phased migration with feature flags and dual-write | Allows gradual cutover without full rewrite | High: initial duplication cost, reduced risk |
Configuration Template
# .github/workflows/db-migrate.yml
name: Database Migration Pipeline
on:
push:
paths: ['migrations/**']
workflow_dispatch:
env:
DB_HOST: ${{ secrets.DB_HOST }}
DB_PORT: ${{ secrets.DB_PORT }}
DB_NAME: ${{ secrets.DB_NAME }}
DB_USER: ${{ secrets.DB_USER }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD }}
jobs:
validate:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with: { node-version: '20' }
- run: npm ci
- run: npx ts-node ./scripts/dry-run.ts
migrate:
needs: validate
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with: { node-version: '20' }
- run: npm ci
- run: npx ts-node ./scripts/run-migrations.ts
env:
MIGRATION_LOCK_ID: 1001
// scripts/run-migrations.ts
import knex from 'knex';
import { runMigrations, acquireLock, releaseLock } from '../src/migration-runner';
async function main() {
const db = knex({
client: 'pg',
connection: {
host: process.env.DB_HOST,
port: Number(process.env.DB_PORT),
database: process.env.DB_NAME,
user: process.env.DB_USER,
password: process.env.DB_PASSWORD,
},
pool: { min: 2, max: 5 },
});
const lockId = Number(process.env.MIGRATION_LOCK_ID || 1001);
try {
await acquireLock(db, lockId);
await runMigrations(db, './migrations');
console.log('Migrations applied successfully.');
} catch (err) {
console.error('Migration failed:', err);
process.exit(1);
} finally {
await releaseLock(db, lockId);
await db.destroy();
}
}
main();
Quick Start Guide
- Initialize a TypeScript project and install dependencies:
npm i knex pg typescript @types/node ts-node
- Create
./migrations/ directory and add versioned files following {version}_{description}.ts naming
- Add the
run-migrations.ts script and db-migrate.yml workflow to your repository
- Configure database credentials as repository secrets and push a migration file to trigger validation and execution
- Verify the
migrations table in your database; applied versions and checksums confirm successful automation