n within a deterministic control flow. The solution consists of three layers: Context Engineering, Generation Orchestration, and Verification.
Architecture Decisions
-
Retrieval-Augmented Generation (RAG) for Code: Never query an LLM with a task description alone. You must inject relevant context. This includes:
- AST Analysis: Parse the codebase to extract function signatures, interfaces, and dependency graphs.
- Semantic Search: Use vector embeddings to retrieve relevant code snippets based on the task intent.
- Project Rules: Inject style guides, security policies, and architectural constraints as system prompts.
-
Structured Output Enforcement: LLMs must return code wrapped in structured formats (e.g., JSON with schema validation) to enable programmatic parsing and injection into files.
-
Sandboxed Verification: Generated code must be executed in a sandboxed environment with automated tests before being merged.
Technical Implementation (TypeScript)
The following implementation demonstrates a CodeGenerationStrategy that integrates context retrieval, structured generation, and validation.
import { VectorStore } from './vector-store';
import { LLMClient } from './llm-client';
import { ASTParser } from './ast-parser';
import { z } from 'zod';
// Schema for structured output
const CodeGenerationSchema = z.object({
code: z.string(),
explanation: z.string(),
dependencies: z.array(z.string()),
securityFlags: z.array(z.string()).optional(),
confidence: z.number().min(0).max(1)
});
export interface CodeGenerationConfig {
llmClient: LLMClient;
vectorStore: VectorStore;
astParser: ASTParser;
maxContextTokens: number;
temperature: number;
}
export class CodeGenerationStrategy {
private config: CodeGenerationConfig;
constructor(config: CodeGenerationConfig) {
this.config = config;
}
async generate(
task: string,
targetFile: string,
constraints: string[]
): Promise<z.infer<typeof CodeGenerationSchema>> {
// 1. Context Retrieval
const context = await this.buildContext(task, targetFile);
// 2. Prompt Construction
const prompt = this.buildPrompt(task, context, constraints);
// 3. Generation with Structured Output
const rawOutput = await this.config.llmClient.generate({
prompt,
temperature: this.config.temperature,
response_format: { type: 'json_schema', schema: CodeGenerationSchema }
});
// 4. Validation
const result = CodeGenerationSchema.parse(rawOutput);
if (result.confidence < 0.75) {
throw new Error(`Low confidence generation: ${result.confidence}. Manual review required.`);
}
return result;
}
private async buildContext(task: string, targetFile: string): Promise<string> {
// Semantic search for relevant code
const semanticMatches = await this.config.vectorStore.search(task, {
limit: 5,
filter: { file_type: ['ts', 'tsx'] }
});
// AST-based context for dependencies and interfaces
const astContext = await this.config.astParser.getRelevantContext(
targetFile,
semanticMatches.map(m => m.file_path)
);
// Token-aware truncation
return this.truncateContext([
`## Target File: ${targetFile}`,
`## Relevant Interfaces/Types: ${astContext.types}`,
`## Similar Implementations: ${semanticMatches.map(m => m.content).join('\n')}`,
`## Project Constraints: ${this.config.constraints}`
].join('\n'), this.config.maxContextTokens);
}
private buildPrompt(task: string, context: string, constraints: string[]): string {
return `
You are an expert senior developer. Generate code based on the following task.
## Task
${task}
## Context
${context}
## Constraints
${constraints.join('\n')}
## Instructions
1. Output must strictly follow the JSON schema.
2. Code must adhere to existing patterns in the context.
3. Identify any security risks in the 'securityFlags' field.
4. Rate confidence based on context completeness.
`;
}
private truncateContext(text: string, maxTokens: number): string {
// Implementation of token-aware truncation preserving structure
// Prioritizes keeping type definitions and constraints intact
return text.slice(0, maxTokens * 4); // Rough approximation for demo
}
}
Rationale
- Zod Schema: Enforces type safety on LLM output, preventing runtime errors from malformed JSON.
- Confidence Threshold: The model self-assesses confidence. Low confidence triggers a fallback to manual review, preventing silent failures.
- AST Integration: Semantic search alone is insufficient for code. AST parsing ensures the model understands type boundaries and function signatures, reducing hallucination of non-existent methods.
- Token Management: Explicit context truncation prevents prompt overflow and cost spikes.
Pitfall Guide
1. Hallucination of APIs and Libraries
Mistake: The LLM generates code using methods or libraries that do not exist or are deprecated.
Explanation: LLMs predict tokens based on training data, which may include outdated documentation or similar-sounding APIs.
Mitigation: Always inject current dependency versions and API documentation into the context. Use AST parsing to validate symbol existence post-generation.
2. Context Window Saturation
Mistake: Flooding the prompt with irrelevant code, causing the model to lose focus on the specific task.
Explanation: LLMs suffer from "lost in the middle" phenomena where information in the middle of the context window is ignored.
Mitigation: Use RAG to retrieve only the top-k relevant chunks. Place critical instructions and constraints at the beginning and end of the prompt.
3. Security Pattern Leakage
Mistake: Generated code includes hardcoded secrets, insecure defaults, or vulnerable patterns learned from training data.
Explanation: Training data contains vulnerable code from public repositories. The model may replicate these patterns if not constrained.
Mitigation: Implement security scanning (SAST) in the verification pipeline. Inject security policies into the system prompt. Use models fine-tuned for security or with safety filters.
4. Dependency Version Mismatch
Mistake: AI generates code compatible with a newer version of a library than the project uses.
Explanation: The model may not be aware of the project's specific package.json or requirements.txt constraints.
Mitigation: Inject the dependency manifest into the context. Validate generated code against the current lockfile during the verification step.
5. Architectural Drift
Mistake: AI generates code that works but violates architectural boundaries (e.g., accessing the database directly from a controller).
Explanation: Without explicit architectural constraints, the model optimizes for functional correctness over structural integrity.
Mitigation: Define architectural rules as machine-readable constraints. Use linters and custom rules to enforce boundaries. Review generated code for architectural compliance.
Mistake: If AI code generation is exposed to user input (e.g., dynamic code gen features), attackers can inject prompts.
Explanation: Malicious input can override system instructions, causing the model to execute unintended actions.
Mitigation: Sanitize all user inputs before inclusion in prompts. Use separate models for instruction following and code generation. Implement strict output validation.
7. License Compliance Violations
Mistake: Generated code contains snippets copyrighted or licensed in ways incompatible with your project.
Explanation: LLMs may reproduce training data verbatim.
Mitigation: Use models trained on permissive licenses. Implement license scanning for generated code. Add disclaimers and legal review for critical paths.
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|
| Greenfield Project | AI-Assist + Template Gen | Accelerates setup; context is fresh; low risk of drift. | Low (High ROI) |
| Legacy Refactoring | AI-Assist + Manual Review | High risk of breaking changes; requires deep context understanding. | Medium (Review overhead) |
| Security-Critical Module | Manual Coding + AI Audit | AI should not write security logic; use AI only for vulnerability scanning. | Low (Audit only) |
| Prototyping / PoC | AI-Generated + Quick Verify | Speed is priority; quality constraints are relaxed. | Low (Fast iteration) |
| Complex Business Logic | AI-Assist + Step-by-Step Verify | Break logic into small steps; verify each step; avoid end-to-end generation. | High (Verification cost) |
Configuration Template
Use this template to configure your AI code generation pipeline. Adapt values based on your stack and risk tolerance.
{
"pipeline": {
"model": {
"provider": "anthropic",
"name": "claude-3-5-sonnet-20240620",
"temperature": 0.2,
"max_tokens": 4096
},
"context": {
"max_tokens": 8000,
"retrieval": {
"strategy": "hybrid",
"semantic_limit": 5,
"ast_enabled": true,
"inject_dependencies": true,
"inject_style_guide": true
}
},
"verification": {
"structured_output": true,
"schema_path": "./schemas/code_gen.json",
"confidence_threshold": 0.75,
"auto_test": true,
"security_scan": true,
"lint_check": true
},
"safety": {
"block_secrets": true,
"allowed_licenses": ["MIT", "Apache-2.0", "BSD-3-Clause"],
"review_required_for": ["security", "auth", "payment"]
},
"cost_control": {
"max_daily_tokens": 10000000,
"alert_threshold": 0.8
}
}
}
Quick Start Guide
-
Install CLI and Dependencies:
npm install -g @codcompass/ai-codegen-cli
npm install zod openai @langchain/core
-
Initialize Configuration:
ai-codegen init --config .ai-codegen.json
Update the config with your API keys and context settings.
-
Index Codebase:
ai-codegen index --path ./src --output ./vector-store
This builds the vector embeddings and AST cache for context retrieval.
-
Run Generation:
ai-codegen generate --task "Create a user registration endpoint with validation" --target ./src/api/user.ts
The CLI will retrieve context, generate code, validate against schema, and run verification checks.
-
Review and Merge:
Inspect the generated output. If confidence is high and verification passes, merge the changes. If low confidence, the CLI will flag the output for manual review.
Conclusion
AI-powered code generation is a force multiplier, not a replacement for engineering discipline. The competitive advantage lies in building robust pipelines that combine AI generation with automated context, structured validation, and rigorous verification. Teams that treat AI as a stochastic component within a deterministic workflow will achieve sustainable velocity gains without compromising code quality or security. Implement the architecture outlined here to transform AI code generation from a productivity risk into a production asset.