e correlation, and enforcement gates operate as a single control plane. The architecture follows four phases: policy definition, CI/CD integration, drift detection, and automated remediation.
Step 1: Define Policy-as-Code Boundaries
Policies must be versioned alongside infrastructure code. Use a policy engine that supports multiple IaC formats. Open Policy Agent (OPA) with Conftest handles HCL, YAML, and JSON. AWS CDK integrates natively with cdk-nag. Policies should enforce least-privilege IAM, encryption at rest, network isolation, and tag compliance.
TypeScript example: Custom cdk-nag rule for enforcing encryption on S3 buckets and RDS instances.
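```typescript
import { CfnResource } from 'aws-cdk-lib';
import { CfnDBInstance } from 'aws-cdk-lib/aws-rds';
import { CfnBucket } from 'aws-cdk-lib/aws-s3';
import { NagMessageLevel, NagPack, NagPackProps, NagRuleCompliance, NagRules } from 'cdk-nag';
import { IConstruct } from 'constructs';

const verdict = (ok: boolean) => (ok ? NagRuleCompliance.COMPLIANT : NagRuleCompliance.NON_COMPLIANT);

// A cdk-nag rule maps a CfnResource to a NagRuleCompliance value.
export function encryptedStorageRule(node: CfnResource): NagRuleCompliance {
  if (node instanceof CfnBucket) return verdict(node.bucketEncryption !== undefined);
  if (node instanceof CfnDBInstance) {
    // Resolve the property so a literal `true` is compared, not an unresolved token.
    return verdict(NagRules.resolveIfPrimitive(node, node.storageEncrypted) === true);
  }
  return NagRuleCompliance.NOT_APPLICABLE;
}

// Rules run through a NagPack, which is attached to the app as an Aspect.
export class EncryptedStoragePack extends NagPack {
  constructor(props?: NagPackProps) {
    super(props);
    this.packName = 'EncryptedStorage';
  }
  public visit(node: IConstruct): void {
    if (!(node instanceof CfnResource)) return;
    this.applyRule({
      ruleSuffixOverride: 'EncryptedStorageRule',
      info: 'Storage resources must enforce encryption at rest',
      explanation: 'S3 buckets need bucketEncryption; RDS instances need storageEncrypted: true.',
      level: NagMessageLevel.ERROR,
      rule: encryptedStorageRule,
      node,
    });
  }
}
```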
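Tag compliance, one of the policy boundaries listed above, can also be checked against synthesized output. The following is a minimal sketch under stated assumptions: the `Template` shape is a trimmed-down stand-in for a CloudFormation template, and `findMissingTags` plus the required keys `owner` and `cost-center` are hypothetical names chosen for illustration, not part of cdk-nag or Conftest.

```typescript
// Hypothetical, trimmed-down template shapes for illustration only.
interface Tag { Key: string; Value: string }
interface TemplateResource { Type: string; Properties?: { Tags?: Tag[] } }
interface Template { Resources: { [logicalId: string]: TemplateResource } }
interface TagViolation { logicalId: string; missing: string[] }

// Returns one violation per resource that lacks at least one required tag key.
function findMissingTags(template: Template, required: string[]): TagViolation[] {
  const violations: TagViolation[] = [];
  for (const logicalId of Object.keys(template.Resources)) {
    const tags = template.Resources[logicalId].Properties?.Tags ?? [];
    const missing = required.filter((key) => !tags.some((t) => t.Key === key));
    if (missing.length > 0) violations.push({ logicalId, missing });
  }
  return violations;
}

// Sample input: the bucket lacks `cost-center`, the database has both keys.
const sampleTemplate: Template = {
  Resources: {
    LogsBucket: {
      Type: 'AWS::S3::Bucket',
      Properties: { Tags: [{ Key: 'owner', Value: 'platform' }] },
    },
    AppDb: {
      Type: 'AWS::RDS::DBInstance',
      Properties: {
        Tags: [
          { Key: 'owner', Value: 'data' },
          { Key: 'cost-center', Value: '42' },
        ],
      },
    },
  },
};

const tagViolations = findMissingTags(sampleTemplate, ['owner', 'cost-center']);
console.log(JSON.stringify(tagViolations));
// prints [{"logicalId":"LogsBucket","missing":["cost-center"]}]
```

Not every resource type accepts tags, so a real check would likely restrict itself to a list of taggable types.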
Step 2: Integrate Evaluation into CI/CD
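Policy gates must run before anything is provisioned. Evaluate static policies ahead of terraform plan or pulumi preview; cdk-nag checks run as Aspects during cdk synth, so for CDK a failing synth is the gate. Fail the pipeline on critical violations, and use a job matrix to evaluate policies for each environment.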
GitHub Actions workflow snippet:
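```yaml
name: IaC Security Gate
on:
  pull_request:
    paths:
      - 'infra/**'
jobs:
  policy-evaluation:
    runs-on: ubuntu-latest
    env:
      CONFTEST_VERSION: '0.56.0' # example pin; use a release you have verified
    steps:
      - uses: actions/checkout@v4
      - name: Install Conftest
        run: |
          # Release assets include the version in the file name.
          wget "https://github.com/open-policy-agent/conftest/releases/download/v${CONFTEST_VERSION}/conftest_${CONFTEST_VERSION}_Linux_x86_64.tar.gz"
          tar -xzf "conftest_${CONFTEST_VERSION}_Linux_x86_64.tar.gz"
          sudo mv conftest /usr/local/bin
      - name: Run OPA Policies
        run: conftest test infra/ --policy policies/ --fail-on-warn
      - name: CDK Security Scan
        run: |
          npm ci
          # cdk-nag has no CLI; its checks run as Aspects during synth and fail it on errors.
          npx cdk synth
```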
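To make "fail on critical violations" and per-environment evaluation concrete, here is a small sketch of a severity gate. It assumes findings have already been collected into a common shape; the `Finding` type, the `gate` function, and the per-environment thresholds are hypothetical and would need to match whatever your scanners actually emit.

```typescript
type Severity = 'low' | 'medium' | 'high' | 'critical';
interface Finding { rule: string; severity: Severity; resource: string }

const SEVERITY_RANK: Record<Severity, number> = { low: 0, medium: 1, high: 2, critical: 3 };

// Hypothetical thresholds: each environment blocks at or above its own severity.
const BLOCKING_THRESHOLD: Record<string, Severity> = {
  dev: 'critical',
  staging: 'high',
  prod: 'medium',
};

// Returns whether the gate passes and which findings blocked it.
function gate(findings: Finding[], environment: string): { pass: boolean; blocking: Finding[] } {
  // Unknown environments fall back to the prod threshold.
  const threshold = BLOCKING_THRESHOLD[environment] ?? BLOCKING_THRESHOLD.prod;
  const blocking = findings.filter(
    (f) => SEVERITY_RANK[f.severity] >= SEVERITY_RANK[threshold],
  );
  return { pass: blocking.length === 0, blocking };
}

const sampleFindings: Finding[] = [
  { rule: 'open-ssh', severity: 'high', resource: 'sg-web' },
  { rule: 'missing-tag', severity: 'low', resource: 'bucket-logs' },
];

const devGate = gate(sampleFindings, 'dev');
const prodGate = gate(sampleFindings, 'prod');
console.log(devGate.pass, prodGate.pass); // prints: true false
```

In a CI job, each matrix entry would call `gate` with its own environment name and exit non-zero when `pass` is false.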
Step 3: Implement Drift Detection
IaC security automation fails when it only validates code, not state. Run scheduled drift detection against cloud APIs. Compare actual resource configurations against the IaC model. Flag deviations that bypass the pipeline.
TypeScript drift checker using AWS SDK v3:
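```typescript
import { S3Client, GetBucketEncryptionCommand } from '@aws-sdk/client-s3';

// Reuse one client across checks instead of creating one per call.
const client = new S3Client({ region: process.env.AWS_REGION });

async function verifyBucketEncryption(bucketName: string): Promise<boolean> {
  try {
    const res = await client.send(new GetBucketEncryptionCommand({ Bucket: bucketName }));
    return (res.ServerSideEncryptionConfiguration?.Rules?.length ?? 0) > 0;
  } catch (err) {
    // S3 reports a missing encryption configuration with this error code.
    if ((err as { name?: string }).name === 'ServerSideEncryptionConfigurationNotFoundError') {
      return false;
    }
    // Missing buckets, access errors, and throttling are not drift; surface them.
    throw err;
  }
}
```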
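The comparison of actual configuration against the IaC model generalizes beyond a single encryption check. A minimal sketch, assuming both sides have been flattened into key/value maps (the `Config` shape and `diffConfig` helper are hypothetical, and real resources would need nested comparison):

```typescript
type Config = Record<string, string | number | boolean>;
interface Drift { key: string; desired: unknown; actual: unknown }

// Reports every key declared in the desired model whose live value differs.
// Keys that exist only in the live configuration are ignored in this sketch.
function diffConfig(desired: Config, actual: Config): Drift[] {
  const drift: Drift[] = [];
  for (const key of Object.keys(desired)) {
    if (desired[key] !== actual[key]) {
      drift.push({ key, desired: desired[key], actual: actual[key] });
    }
  }
  return drift;
}

// Desired state as declared in IaC versus what a cloud API returned.
const desiredBucket: Config = { encrypted: true, versioning: true, publicAccessBlocked: true };
const actualBucket: Config = { encrypted: true, versioning: false };

const bucketDrift = diffConfig(desiredBucket, actualBucket);
console.log(bucketDrift.map((d) => d.key).join(','));
// prints versioning,publicAccessBlocked
```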
Drift detection must trigger remediation, not just alerts. Use state-machine workflows to auto-remediate low-risk violations (e.g., missing tags, disabled public access). High-risk violations (e.g., open security groups, disabled logging) require manual approval with pre-filled remediation PRs.
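The split between auto-remediation and manual approval can be sketched as a routing function. The violation kinds below mirror the examples in the paragraph above, but the identifiers, the `Violation` shape, and the action names are assumptions made for illustration.

```typescript
type Risk = 'low' | 'high';
interface Violation { id: string; kind: string; resource: string }
interface RemediationAction {
  type: 'auto-remediate' | 'open-pr-for-approval';
  violation: Violation;
}

// Hypothetical risk classification, taken from the examples in the text.
const RISK_BY_KIND: Record<string, Risk> = {
  'missing-tags': 'low',
  'disabled-public-access': 'low',
  'open-security-group': 'high',
  'disabled-logging': 'high',
};

// Low-risk violations are fixed automatically; everything else waits for a human.
function routeViolation(violation: Violation): RemediationAction {
  const risk: Risk = RISK_BY_KIND[violation.kind] ?? 'high'; // unknown kinds are treated as high risk
  return {
    type: risk === 'low' ? 'auto-remediate' : 'open-pr-for-approval',
    violation,
  };
}

const tagAction = routeViolation({ id: 'v1', kind: 'missing-tags', resource: 'bucket-logs' });
const sgAction = routeViolation({ id: 'v2', kind: 'open-security-group', resource: 'sg-web' });
console.log(tagAction.type, sgAction.type);
// prints: auto-remediate open-pr-for-approval
```

Defaulting unknown kinds to the manual path keeps a new, unclassified violation type from being changed without review.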
Architecture rationale: Pre-provisioning gates prevent violations from entering the environment. Drift detection catches out-of-band changes. Automated remediation closes the loop without engineering overhead. This triad eliminates the traditional security bottleneck while maintaining auditability. Every policy evaluation, drift scan, and remediation action is logged, versioned, and tied to a specific commit SHA.
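One way to tie every action to a commit is to emit a structured audit record per event. The record shape and the `buildAuditRecord` helper below are hypothetical; the SHA check assumes 40-character SHA-1 commit hashes.

```typescript
type AuditKind = 'policy-evaluation' | 'drift-scan' | 'remediation';
interface AuditRecord {
  kind: AuditKind;
  commitSha: string;
  outcome: 'pass' | 'fail';
  details: string;
  timestamp: string; // ISO 8601
}

// Builds a record and rejects anything that is not a full 40-character SHA-1 hash.
function buildAuditRecord(
  kind: AuditKind,
  commitSha: string,
  outcome: 'pass' | 'fail',
  details: string,
  now: Date = new Date(),
): AuditRecord {
  if (!/^[0-9a-f]{40}$/.test(commitSha)) {
    throw new Error(`Not a full commit SHA: ${commitSha}`);
  }
  return { kind, commitSha, outcome, details, timestamp: now.toISOString() };
}

// Placeholder SHA for the example; a pipeline would pass the real commit hash.
const exampleSha = '0123456789abcdef0123456789abcdef01234567';
const auditLine = JSON.stringify(
  buildAuditRecord(
    'drift-scan',
    exampleSha,
    'fail',
    'bucket-logs: encryption disabled',
    new Date('2024-01-01T00:00:00Z'),
  ),
);
console.log(auditLine);
```

Writing one JSON object per line keeps the log easy to append to and to query later.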
Pitfall Guide
- Treating IaC Scans as Compliance Completion
Running a scanner once per sprint creates a false sense of security. Policies change, cloud services evolve, and new violation patterns emerge. Scanning must be continuous, tied to every commit, and versioned alongside infrastructure code.
- Blindly Adopting Default Rule Sets
Out-of-the-box policies (CIS, NIST, AWS Foundational) are starting points, not endpoints. They generate noise when applied without environment context. Tune severity thresholds, suppress known exceptions with documented justifications, and block the pipeline only on the rules you have designated as mandatory.
- Ignoring Runtime Drift
IaC security automation that only validates code misses console changes, CLI bypasses, and third-party integrations. Drift detection must run post-deployment and correlate actual state with desired state. Without this, automation only secures the pipeline, not the environment.
- Hardcoding Secrets in IaC Templates
Embedding credentials, API keys, or private keys in Terraform variables, CDK context, or Pulumi configs breaks security automation. Use secret managers (AWS Secrets Manager, HashiCorp Vault) with dynamic credential generation. IaC should reference secrets, never contain them.
- Bypassing Gates for "Urgent" Deployments
Emergency overrides destroy policy integrity. Implement break-glass workflows with mandatory post-deployment reviews, automated rollbacks, and audit trails. If gates are consistently bypassed, the policies are misaligned with operational reality.
- Lack of Environment-Aware Policy Enforcement
Applying identical rules to development, staging, and production creates friction. Use environment-aware policy evaluation. Allow relaxed networking in dev, enforce strict isolation in prod, and require explicit approvals for cross-environment promotions.
- Not Versioning Policies Alongside Infrastructure
Policies that live in a separate repository or static dashboard create drift between intent and enforcement. Store policies in the same monorepo as IaC. Use Git hooks to validate policy syntax before commit. Treat policy updates as infrastructure changes.
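Environment-aware evaluation, as recommended in the pitfalls above, can be as simple as declaring where each rule is enforced. The rule identifiers and the `rulesFor` helper in this sketch are hypothetical.

```typescript
type Environment = 'dev' | 'staging' | 'prod';
interface PolicyRule { id: string; enforcedIn: Environment[] }

// Hypothetical rule set: encryption everywhere, network rules tightening toward prod.
const policyRules: PolicyRule[] = [
  { id: 'encryption-at-rest', enforcedIn: ['dev', 'staging', 'prod'] },
  { id: 'no-public-ingress', enforcedIn: ['staging', 'prod'] },
  { id: 'strict-network-isolation', enforcedIn: ['prod'] },
];

// Returns the ids of the rules that block deployments in the given environment.
function rulesFor(env: Environment): string[] {
  return policyRules
    .filter((rule) => rule.enforcedIn.indexOf(env) !== -1)
    .map((rule) => rule.id);
}

const devRules = rulesFor('dev');
const prodRules = rulesFor('prod');
console.log(devRules.join(','));  // prints encryption-at-rest
console.log(prodRules.join(',')); // prints encryption-at-rest,no-public-ingress,strict-network-isolation
```

Keeping this mapping in the same repository as the policies means a relaxation for dev is itself a reviewed, versioned change.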
Best Practices from Production:
- Run policy evaluation in isolated containers to prevent dependency conflicts.
- Cache policy engines in CI to reduce pipeline latency below 15 seconds.
- Use policy suppression files with mandatory expiration dates to prevent permanent exceptions.
- Implement policy coverage metrics: track percentage of resources evaluated vs. total deployed.
- Require policy authors to sign commits with GPG/SSH keys for audit integrity.
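Suppressions with mandatory expiration dates can be enforced in code rather than by convention. The sketch below assumes a hypothetical suppression file already parsed into `Suppression` objects; the field names and rule ids are illustrative.

```typescript
interface Suppression {
  ruleId: string;
  resource: string;
  justification: string;
  expires: string; // ISO 8601 timestamp, mandatory
}

// Drops expired entries and rejects entries without a justification or a valid expiry.
function activeSuppressions(all: Suppression[], now: Date): Suppression[] {
  return all.filter((s) => {
    const expiry = Date.parse(s.expires);
    if (isNaN(expiry)) throw new Error(`Suppression for ${s.ruleId} has no valid expiry`);
    if (s.justification.trim() === '') throw new Error(`Suppression for ${s.ruleId} lacks a justification`);
    return expiry > now.getTime();
  });
}

function isSuppressed(ruleId: string, resource: string, all: Suppression[], now: Date): boolean {
  return activeSuppressions(all, now).some((s) => s.ruleId === ruleId && s.resource === resource);
}

const suppressionList: Suppression[] = [
  { ruleId: 'S3-ENC', resource: 'bucket-legacy', justification: 'Migration in progress', expires: '2024-06-30T00:00:00Z' },
  { ruleId: 'SG-OPEN', resource: 'sg-bastion', justification: 'Temporary vendor access', expires: '2024-12-31T00:00:00Z' },
];

const checkDate = new Date('2024-07-15T00:00:00Z');
console.log(isSuppressed('S3-ENC', 'bucket-legacy', suppressionList, checkDate)); // prints false (expired)
console.log(isSuppressed('SG-OPEN', 'sg-bastion', suppressionList, checkDate));   // prints true
```

Once the first suppression expires, its violation resurfaces automatically, so an exception cannot quietly become permanent.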
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Multi-cloud environment (AWS, GCP, Azure) | OPA + Conftest with HCL/YAML parsers | Vendor-agnostic policy language, single evaluation engine across providers | Low (open-source), moderate CI compute |
| AWS-native CDK stack | cdk-nag + custom TypeScript rules | Evaluates the construct tree directly, zero format conversion, direct integration with synth step | Low (SDK overhead), negligible pipeline latency |
| High-compliance regulated workload | Pre-provisioning gate + post-deploy drift scan + automated remediation | Dual-layer enforcement satisfies audit requirements, eliminates manual evidence collection | Moderate (drift detection compute), high ROI on audit labor |
| Fast-moving startup with frequent infra changes | Lightweight pre-commit hooks + CI gate + suppression workflow | Catches violations early, maintains velocity, prevents policy fatigue | Low (developer tooling), minimal cloud overhead |
Configuration Template
Complete GitHub Actions workflow with OPA policy evaluation, CDK security scan, and drift detection trigger:
```yaml
name: IaC Security Automation Pipeline
on:
  push:
    paths:
      - 'infra/**'
      - 'policies/**'
  schedule:
    - cron: '0 2 * * *' # Daily drift detection at 02:00 UTC
env:
  AWS_REGION: us-east-1
  CONFTEST_POLICY_DIR: policies/
  CONFTEST_VERSION: '0.56.0' # example pin; use a release you have verified
  CDK_APP: infra/lib/main.ts
jobs:
  security-gate:
    if: github.event_name == 'push'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: 20
      - name: Install Dependencies
        run: npm ci
      - name: Run OPA Policies
        run: |
          # Release assets include the version in the file name.
          curl -L "https://github.com/open-policy-agent/conftest/releases/download/v${CONFTEST_VERSION}/conftest_${CONFTEST_VERSION}_Linux_x86_64.tar.gz" | tar -xz conftest
          ./conftest test infra/ --policy "${CONFTEST_POLICY_DIR}" --fail-on-warn
      - name: CDK Security Scan
        # cdk-nag has no CLI; its checks run as Aspects during synth and fail it on errors.
        run: npx cdk synth --app "npx ts-node ${CDK_APP}"
  drift-detection:
    if: github.event_name == 'schedule'
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: 20
      - name: Install Dependencies
        run: npm ci
      # AWS credentials must be configured before this step (for example via OIDC).
      - name: Run Drift Scanner
        run: npx ts-node scripts/drift-checker.ts
      - name: Post Remediation PR
        # Opens a PR only if the scanner left changes in the working tree.
        if: failure()
        uses: peter-evans/create-pull-request@v5
        with:
          commit-message: 'fix: auto-remediate detected drift'
          title: 'Automated Drift Remediation'
          body: 'Pipeline detected configuration drift. Remediation applied.'
          branch: auto/drift-remediation
```
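The workflow calls `scripts/drift-checker.ts` and relies on a non-zero exit to trigger the remediation step. The script itself is not shown here, so the following is only a sketch of how its final stage might aggregate check results into an exit code; the `CheckResult` shape and `summarize` helper are hypothetical.

```typescript
interface CheckResult { resource: string; check: string; compliant: boolean }

// Turns individual check results into a report and a process exit code.
function summarize(results: CheckResult[]): { exitCode: number; report: string } {
  const failed = results.filter((r) => !r.compliant);
  const lines = failed.map((r) => `DRIFT ${r.resource}: ${r.check}`);
  return {
    exitCode: failed.length > 0 ? 1 : 0,
    report: lines.length > 0 ? lines.join('\n') : 'No drift detected',
  };
}

// Sample results; a real run would collect these from cloud API checks.
const sampleResults: CheckResult[] = [
  { resource: 'bucket-logs', check: 'encryption at rest', compliant: true },
  { resource: 'bucket-exports', check: 'encryption at rest', compliant: false },
];

const driftSummary = summarize(sampleResults);
console.log(driftSummary.report);
// prints DRIFT bucket-exports: encryption at rest

// In the real script, the exit code would be handed to Node so that the
// workflow's `if: failure()` step runs, for example:
//   process.exitCode = driftSummary.exitCode;
```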
Quick Start Guide
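- Install `cdk-nag` and `conftest` in your infrastructure repository: `npm i -D cdk-nag && curl -L https://github.com/open-policy-agent/conftest/releases/download/v${CONFTEST_VERSION}/conftest_${CONFTEST_VERSION}_Linux_x86_64.tar.gz | sudo tar -xz -C /usr/local/bin conftest` (set `CONFTEST_VERSION` to a pinned release first, since release asset names include the version)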
- Create a `policies/` directory with OPA Rego rules enforcing encryption and public access restrictions
- Add a CI step running `conftest test infra/ --policy policies/ --fail-on-warn` before your IaC plan/synth command
- Commit and push; once the gate is configured as a required status check, the pipeline blocks any merge that violates baseline policies, and each run leaves an audit log