would otherwise bottleneck deployment velocity.
Core Solution
SwiftDeploy operates on a declarative configuration model. Engineers define desired state in manifest.yaml, and the CLI generates all downstream configurations, orchestrates containers, enforces policies, and streams observability data.
manifest.yaml (the only file you edit manually):
services:
  image: swiftdeploy-keeds-api:v1.0.0
  port: 5000
  name: api-service
  mode: stable
nginx:
  image: nginx:alpine
  port: 8080
  proxy_timeout: 30s
network:
  name: swiftdeploy-net
  driver_type: bridge
From this one file, SwiftDeploy generates:
- nginx.conf (web server configuration)
- docker-compose.yml (container orchestration)
- All the settings for monitoring and policy checks
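To make the generation step concrete, here is a minimal Python sketch of how a docker-compose.yml could be rendered from the manifest. It is not SwiftDeploy's actual generator: the template layout, the MODE environment variable, and the render_compose helper are all assumptions made for illustration, and the manifest is written as a plain dict (what a YAML loader would return) so the example needs no third-party packages.

```python
from string import Template

# Parsed form of the manifest.yaml shown above (what a YAML loader would return).
MANIFEST = {
    "services": {"image": "swiftdeploy-keeds-api:v1.0.0", "port": 5000,
                 "name": "api-service", "mode": "stable"},
    "nginx": {"image": "nginx:alpine", "port": 8080, "proxy_timeout": "30s"},
    "network": {"name": "swiftdeploy-net", "driver_type": "bridge"},
}

# Hypothetical template: the real generator's output layout is not shown in the article.
COMPOSE_TEMPLATE = Template("""\
services:
  $svc_name:
    image: $svc_image
    environment:
      - MODE=$svc_mode
    expose:
      - "$svc_port"
    networks:
      - $net_name
  nginx:
    image: $nginx_image
    ports:
      - "$nginx_port:$nginx_port"
    networks:
      - $net_name
networks:
  $net_name:
    driver: $net_driver
""")


def render_compose(manifest: dict) -> str:
    """Fill the compose template with values taken from the manifest."""
    svc, nginx, net = manifest["services"], manifest["nginx"], manifest["network"]
    return COMPOSE_TEMPLATE.substitute(
        svc_name=svc["name"], svc_image=svc["image"], svc_mode=svc["mode"],
        svc_port=svc["port"], nginx_image=nginx["image"], nginx_port=nginx["port"],
        net_name=net["name"], net_driver=net["driver_type"],
    )


if __name__ == "__main__":
    print(render_compose(MANIFEST))
```

Keeping the template as data means a change to the manifest is the only thing that ever alters the generated file, which is the point of the single-source-of-truth approach.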
CLI Workflow:
The CLI tool (swiftdeploy) has several commands:
| Command | What It Does |
|---|---|
| init | Reads manifest.yaml and generates nginx.conf + docker-compose.yml |
| validate | Checks if everything is ready for deployment |
| deploy | Starts all containers and waits for them to be healthy |
| promote canary/stable | Switches between stable and canary modes |
| status | Shows a live dashboard with metrics and policy compliance |
| audit | Generates a report of all events and policy violations |
| teardown | Stops and removes all containers |
Observability & Metrics:
The API service exposes a /metrics endpoint that reports statistics in Prometheus format:
http_requests_total{method="GET",path="/healthz",status_code="200"} 42
http_request_duration_seconds_bucket{le="0.1"} 35
app_uptime_seconds 847
app_mode 0
chaos_active 0
These metrics tell you:
- How many requests have been made
- How fast responses are
- How long the app has been running
- Whether you're in stable or canary mode
- Whether chaos testing is active
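If you want to read these values programmatically, a small parser is enough for output this simple. The sketch below is an assumption about how a client might do it, not the CLI's real code; parse_metrics is a hypothetical helper, and it deliberately ignores comment lines and does not handle the optional trailing timestamp that the Prometheus text format allows.

```python
import re

SAMPLE = '''\
http_requests_total{method="GET",path="/healthz",status_code="200"} 42
http_request_duration_seconds_bucket{le="0.1"} 35
app_uptime_seconds 847
app_mode 0
chaos_active 0
'''

# metric_name, optional {labels}, then a single numeric value
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE comments
            continue
        match = LINE_RE.match(line)
        if not match:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


if __name__ == "__main__":
    for sample in parse_metrics(SAMPLE):
        print(sample)
```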
OPA: The Policy Engine:
OPA (Open Policy Agent) is a separate container that acts like a security guard. Before you can deploy or promote, the CLI asks OPA: "Is it safe?"
Why use OPA instead of checking directly in the CLI?
- Policies are separate from code, so they are easier to update
- If OPA crashes, the CLI still works (just warns you)
- OPA is not accessible from the internet (security)
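Here is a rough sketch of what "asking OPA" can look like from Python, including the warn-and-continue behaviour when OPA is down. OPA's Data API accepts a POST to /v1/data/<path> with an {"input": ...} body and answers with a {"result": ...} document. Everything else here is assumed for illustration: the policy path in the usage note, the deny set name, and the ask_opa and interpret helpers.

```python
import json
import urllib.error
import urllib.request


def interpret(body: dict):
    """Turn an OPA Data API response into (allowed, reasons)."""
    if "result" not in body:
        # OPA omits "result" when the queried path is undefined.
        return (False, ["OPA returned no decision (undefined policy path?)"])
    reasons = list(body["result"].get("deny", []))
    return (not reasons, reasons)


def ask_opa(url: str, payload: dict, timeout: float = 2.0):
    """POST the payload as OPA input; if OPA is unreachable, warn and allow."""
    request = urllib.request.Request(
        url,
        data=json.dumps({"input": payload}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return interpret(json.load(response))
    except (urllib.error.URLError, OSError, ValueError) as exc:
        print(f"warning: OPA unavailable ({exc}); skipping policy check")
        return (True, [])
```

A call might look like ask_opa("http://localhost:8181/v1/data/swiftdeploy/infrastructure", stats), where 8181 is OPA's default port and the package path is a guess. Whether failing open is acceptable is a design decision; a stricter setup would return (False, ...) in the except branch.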
The Two Policies:
- Infrastructure Policy (checks before deploy): Is there enough disk space? (must be > 10GB). Is the CPU overloaded? (must be < 2.0).
- Canary Safety Policy (checks before promoting to canary): Is the error rate too high? (must be < 1%). Is the response time too slow? (P99 must be < 500ms).
Data-Driven Thresholds:
The actual numbers (10GB, 2.0, 1%, 500ms) are stored in a separate JSON file, not in the policy code. This means you can change the limits without modifying the policy logic.
thresholds.json:
{
  "infrastructure": {
    "min_disk_gb": 10,
    "max_cpu_load": 2.0
  },
  "canary": {
    "max_error_rate": 0.01,
    "max_p99_latency_ms": 500
  }
}
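To show why separating limits from logic is useful, here is the same decision logic restated in plain Python. This is not the Rego policy itself, only an illustration of the comparisons described above; the input field names (disk_free_gb, cpu_load, error_rate, p99_latency_ms) are assumptions.

```python
import json

# Same content as thresholds.json above.
THRESHOLDS = json.loads("""
{
  "infrastructure": {"min_disk_gb": 10, "max_cpu_load": 2.0},
  "canary": {"max_error_rate": 0.01, "max_p99_latency_ms": 500}
}
""")


def infrastructure_violations(inp: dict, limits: dict) -> list:
    """Disk must be above the minimum; CPU load must be below the maximum."""
    msgs = []
    if inp["disk_free_gb"] <= limits["min_disk_gb"]:
        msgs.append(f"free disk {inp['disk_free_gb']}GB is not above {limits['min_disk_gb']}GB")
    if inp["cpu_load"] >= limits["max_cpu_load"]:
        msgs.append(f"CPU load {inp['cpu_load']} is not below {limits['max_cpu_load']}")
    return msgs


def canary_violations(inp: dict, limits: dict) -> list:
    """Error rate and P99 latency must both be below their limits."""
    msgs = []
    if inp["error_rate"] >= limits["max_error_rate"]:
        msgs.append(f"error rate {inp['error_rate']:.2%} is not below {limits['max_error_rate']:.2%}")
    if inp["p99_latency_ms"] >= limits["max_p99_latency_ms"]:
        msgs.append(f"P99 {inp['p99_latency_ms']}ms is not below {limits['max_p99_latency_ms']}ms")
    return msgs


if __name__ == "__main__":
    print(infrastructure_violations({"disk_free_gb": 8, "cpu_load": 0.4}, THRESHOLDS["infrastructure"]))
    print(canary_violations({"error_rate": 0.0, "p99_latency_ms": 5}, THRESHOLDS["canary"]))
```

Changing a limit means editing one number in the JSON; neither function needs to be touched.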
Status Dashboard & Audit:
The swiftdeploy status command shows a live dashboard:
╔═══════════════════════════════════════╗
║  SwiftDeploy Status Dashboard         ║
╠═══════════════════════════════════════╣
║  Mode:           canary               ║
║  Chaos:          none                 ║
║  Req/s:          0.98                 ║
║  P99 Latency:    5ms                  ║
║  Error Rate:     0.00%                ║
║  Uptime:         133s                 ║
╠═══════════════════════════════════════╣
║  Policy Compliance                    ║
║  Infrastructure: PASS                 ║
║  Canary Safety:  PASS                 ║
╚═══════════════════════════════════════╝
Every time the dashboard refreshes, it saves the data to history.jsonl for the audit trail. The swiftdeploy audit command reads history.jsonl and generates audit_report.md with a timeline of all events and a list of policy violations.
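The append-then-summarise pattern behind the audit trail is easy to sketch. The snapshot fields (timestamp, mode, policies) and the report layout below are assumptions, since the article does not show the real schema of history.jsonl or audit_report.md.

```python
import json
from pathlib import Path


def append_snapshot(path, snapshot: dict) -> None:
    """Append one dashboard refresh as a single JSON line."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(snapshot) + "\n")


def build_report(path) -> str:
    """Read the JSONL history and return a Markdown report."""
    timeline, violations = [], []
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        if not raw.strip():
            continue
        snap = json.loads(raw)
        timeline.append(f"- {snap['timestamp']}: mode={snap['mode']}")
        for policy, status in snap.get("policies", {}).items():
            if status != "PASS":
                violations.append(f"- {snap['timestamp']}: {policy} {status}")
    lines = ["# Audit Report", "", "## Timeline", *timeline,
             "", "## Policy Violations", *(violations or ["- none"])]
    return "\n".join(lines) + "\n"
```

One JSON object per line keeps writes append-only, so a crash mid-refresh can corrupt at most the last line rather than the whole history.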
Architecture Flow:
User runs: swiftdeploy deploy
        │
        ▼
CLI gets host stats (disk, CPU)
        │
        ▼
CLI asks OPA: "Is it safe to deploy?"
        │
        ▼
OPA checks infrastructure policy
        │
        ├── If safe → Start containers
        │
        └── If not safe → Block with reason
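The "CLI gets host stats" step in the deploy flow could be implemented with the standard library alone. In this sketch the field names are hypothetical, and treating "CPU" as the 1-minute load average is my assumption about what the 2.0 limit refers to.

```python
import os
import shutil
from datetime import datetime, timezone


def collect_host_stats(path: str = "/") -> dict:
    """Gather the host facts an infrastructure policy would need as input."""
    usage = shutil.disk_usage(path)
    try:
        load_1min = os.getloadavg()[0]  # not available on every platform
    except (AttributeError, OSError):
        load_1min = None
    return {
        "disk_free_gb": round(usage.free / 1024**3, 2),
        "cpu_load": load_1min,
        # Timestamp included so time-aware policies always have it.
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    print(collect_host_stats())
```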
User runs: swiftdeploy promote canary
        │
        ▼
CLI scrapes /metrics endpoint
        │
        ▼
CLI calculates error rate and P99 latency
        │
        ▼
CLI asks OPA: "Is it safe to promote?"
        │
        ▼
OPA checks canary safety policy
        │
        ├── If safe → Switch to canary mode
        │
        └── If not safe → Block with reason
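The "calculates error rate and P99 latency" step in the promote flow is where most of the arithmetic lives. One common approach, sketched below, is to divide 5xx responses by total requests and to estimate the quantile from cumulative histogram buckets with linear interpolation, similar in spirit to Prometheus's histogram_quantile. Counting only 5xx codes as errors, and both function names, are assumptions.

```python
import math


def error_rate(counts_by_status: dict) -> float:
    """counts_by_status maps a status code string (e.g. "200") to a request count."""
    total = sum(counts_by_status.values())
    if total == 0:
        return 0.0
    errors = sum(n for code, n in counts_by_status.items() if code.startswith("5"))
    return errors / total


def histogram_quantile(q: float, buckets: list) -> float:
    """Estimate a quantile from (upper_bound_seconds, cumulative_count) pairs.

    Buckets must be sorted by upper bound and end with an infinite bound.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                return prev_bound  # cannot interpolate into the +Inf bucket
            if count == prev_count:
                return bound
            # Linear interpolation inside the bucket that contains the rank.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return prev_bound


if __name__ == "__main__":
    buckets = [(0.05, 90), (0.1, 98), (0.5, 100), (math.inf, 100)]
    print(error_rate({"200": 990, "500": 10}))
    print(histogram_quantile(0.99, buckets) * 1000, "ms")
```

The estimate is only as precise as the bucket boundaries, so a 500ms limit is best served by a histogram that has a bucket edge at or near 0.5 seconds.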
Pitfall Guide
- OPA Rule Syntax Conflicts: Defining default deny := [] alongside deny contains msg if { ... } makes OPA reject the policy, because a complete rule (the default) and a partial set rule (contains) cannot share one name. A contains rule already evaluates to an empty set when no body matches. Best Practice: Remove the explicit default assignment and rely on contains for set-based policy logic.
- OPA Data Path Resolution: OPA loads data files based on strict directory-to-path mapping. Placing thresholds.json in the root directory breaks policy evaluation. Best Practice: Always align JSON data files with OPA's expected namespace structure (e.g., swiftdeploy/thresholds.json maps to data.swiftdeploy.thresholds).
- Missing Input Context for Policy Evaluation: OPA policies requiring input.timestamp will fail silently or default to FAIL if the CLI omits the field. Best Practice: Always inject a timestamp field in every CLI-to-OPA payload to satisfy temporal policy constraints.
- Nginx Startup DNS Resolution: Nginx resolves upstream hostnames at startup. If the backend container isn't ready, Nginx caches the failure and returns 502s. Best Practice: Use Docker's internal DNS resolver (127.0.0.11) and variable-based proxy directives to force runtime hostname resolution per request.
- Container Restart vs. Recreation: docker compose restart only restarts the existing containers; it does not pick up changed environment variables or other edits to docker-compose.yml. Best Practice: When switching modes or updating configs, use docker compose up -d --no-deps <service>, which recreates the container if its configuration has changed.
- Over-Restrictive Container Security Context: Explicitly setting user: nginx and dropping all Linux capabilities can break official images that handle privilege dropping internally. Best Practice: Rely on the base image's default user/permission model unless specific hardening is required. Test capability drops in isolation before applying to production stacks.
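For the Nginx DNS pitfall in particular, the fix is easier to see as generated config. The sketch below renders a location block that uses Docker's embedded resolver and a variable in proxy_pass, which makes Nginx resolve the hostname at request time instead of at startup. The render_location helper, the valid=10s cache lifetime, and mapping the manifest's proxy_timeout to proxy_read_timeout are assumptions, not SwiftDeploy's actual template.

```python
def render_location(upstream_host: str, upstream_port: int, timeout: str = "30s") -> str:
    """Render an Nginx fragment that resolves the upstream per request."""
    return (
        "resolver 127.0.0.11 valid=10s;\n"
        "location / {\n"
        # Using a variable in proxy_pass defers DNS resolution to request time.
        f"    set $upstream http://{upstream_host}:{upstream_port};\n"
        "    proxy_pass $upstream;\n"
        f"    proxy_read_timeout {timeout};\n"
        "}\n"
    )


if __name__ == "__main__":
    print(render_location("api-service", 5000))
```

The fragment is meant to sit inside a server block; with a literal hostname in proxy_pass, Nginx would look the name up once while loading its configuration.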
Deliverables
- Deployment Blueprint: A complete architectural diagram and workflow specification detailing how manifest.yaml drives container generation, OPA policy evaluation, metrics scraping, and audit logging. Includes network topology, service dependencies, and data flow between CLI, OPA, and Nginx.
- Pre-Deployment Checklist: A validated sequence for safe rollouts: verify manifest.yaml syntax → run swiftdeploy validate → confirm OPA connectivity → check host resources → execute swiftdeploy deploy → monitor swiftdeploy status dashboard → verify audit trail generation.
- Configuration Templates: Production-ready templates including manifest.yaml (service/network definitions), thresholds.json (data-driven policy limits), OPA Rego policy skeletons (infrastructure.rego, canary.rego), and Nginx upstream proxy configurations with Docker DNS resolution patterns.