anceβ16x cheaper than Claude Opus. Better context compounds with better models, but context alone closes the gap that parameter scaling cannot.
Core Solution
Building a structural context layer requires shifting from probabilistic text matching to deterministic graph indexing. The architecture consists of three components: a repository indexer, a context router, and an MCP-compatible service layer.
Step 1: Repository Indexing
The indexer parses the codebase to extract structural relationships. Instead of chunking files and generating embeddings, it builds a directed graph where nodes represent modules, classes, functions, and interfaces. Edges represent imports, inheritance, composition, and runtime dependencies.
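```typescript
import { promises as fs } from 'node:fs';
import { extname, join } from 'node:path';
import { parse } from '@babel/parser';
import traverse, { NodePath } from '@babel/traverse';
import * as t from '@babel/types';

interface StructuralNode {
  id: string;
  type: 'module' | 'class' | 'function' | 'interface';
  name: string;
  filePath: string;
  dependencies: string[];
  inheritances: string[];
  exports: string[];
}

class RepoGraphBuilder {
  private nodes: Map<string, StructuralNode> = new Map();

  async indexRepository(rootDir: string): Promise<void> {
    const files = await this.collectSourceFiles(rootDir);
    for (const file of files) {
      // The 'typescript' plugin lets Babel parse .ts sources as well as plain JavaScript.
      const ast = parse(await this.readFile(file), {
        sourceType: 'module',
        plugins: ['typescript']
      });
      this.traverseAST(ast, file);
    }
    this.resolveCrossReferences();
  }

  private traverseAST(ast: t.File, filePath: string): void {
    traverse(ast, {
      // Each import becomes a dependency edge from the importing file.
      ImportDeclaration: (path: NodePath<t.ImportDeclaration>) => {
        this.addNodeDependency(filePath, path.node.source.value);
      },
      // Each class becomes a node; `extends Base` becomes an inheritance edge.
      ClassDeclaration: (path: NodePath<t.ClassDeclaration>) => {
        const className = path.node.id?.name;
        const superClass = path.node.superClass;
        if (className) {
          this.nodes.set(className, {
            id: className,
            type: 'class',
            name: className,
            filePath,
            dependencies: [],
            // superClass can be any expression; only plain identifiers carry a name.
            inheritances: t.isIdentifier(superClass) ? [superClass.name] : [],
            exports: []
          });
        }
      }
    });
  }

  private resolveCrossReferences(): void {
    for (const [, node] of this.nodes) {
      node.dependencies = node.dependencies.map(dep =>
        this.nodes.has(dep) ? dep : this.resolveModuleAlias(dep)
      );
    }
  }

  // The helpers below are minimal placeholders (assumptions) so the class is complete.
  private async collectSourceFiles(dir: string): Promise<string[]> {
    const files: string[] = [];
    for (const entry of await fs.readdir(dir, { withFileTypes: true })) {
      const full = join(dir, entry.name);
      if (entry.isDirectory()) {
        if (entry.name !== 'node_modules') files.push(...(await this.collectSourceFiles(full)));
      } else if (['.js', '.ts'].includes(extname(full))) {
        files.push(full);
      }
    }
    return files;
  }

  private readFile(file: string): Promise<string> {
    return fs.readFile(file, 'utf8');
  }

  // Registers a module node for the importing file and records the import on it.
  private addNodeDependency(filePath: string, source: string): void {
    const node: StructuralNode = this.nodes.get(filePath) ?? {
      id: filePath, type: 'module', name: filePath, filePath,
      dependencies: [], inheritances: [], exports: []
    };
    node.dependencies.push(source);
    this.nodes.set(filePath, node);
  }

  // Placeholder: a real resolver would map path aliases to node ids.
  private resolveModuleAlias(dep: string): string {
    return dep;
  }
}
```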
The indexer runs once during initialization and hooks into version control to incrementally update the graph. This eliminates stale context without requiring full re-indexing.
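The incremental update can be sketched as a pure function over the graph. This is a minimal sketch, assuming the list of changed paths comes from something like `git diff --name-only` in a commit hook; `applyIncrementalUpdate` and the `reindexFile` callback are hypothetical names, not part of the indexer shown above.

```typescript
// Minimal node shape, mirroring the StructuralNode fields used here.
interface GraphNode {
  id: string;
  filePath: string;
  dependencies: string[];
}

// Hypothetical callback: re-parses one file and returns the nodes it defines.
type ReindexFile = (filePath: string) => GraphNode[];

// Drops every node that came from a changed file, then re-indexes only those files.
// Deleted files are covered by a reindexFile that returns [] for them.
function applyIncrementalUpdate(
  graph: Map<string, GraphNode>,
  changedFiles: string[],
  reindexFile: ReindexFile
): void {
  const changed = new Set(changedFiles);
  graph.forEach((node, id) => {
    if (changed.has(node.filePath)) graph.delete(id);
  });
  changed.forEach((file) => {
    for (const node of reindexFile(file)) graph.set(node.id, node);
  });
}
```

Nodes from untouched files are never re-parsed, which is what keeps the per-commit cost proportional to the size of the change rather than the size of the repository.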
Step 2: Context Routing
The router translates natural language task descriptions into structural queries. It maps user intent to graph traversal paths instead of vector similarity searches.
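```typescript
interface ContextQuery {
  target: string;
  intent: 'ownership' | 'impact' | 'inheritance' | 'contract';
  scope: 'local' | 'module' | 'system';
}

class ContextRouter {
  constructor(private graph: Map<string, StructuralNode>) {}

  resolve(query: ContextQuery): StructuralNode[] {
    switch (query.intent) {
      case 'ownership':
        return this.findModuleOwners(query.target);
      case 'impact':
        return this.traceDownstreamConsumers(query.target);
      case 'inheritance':
        return this.resolveClassHierarchy(query.target);
      case 'contract':
        return this.extractInterfaceContracts(query.target);
      default:
        return this.fallbackToTextSearch(query.target);
    }
  }

  // Breadth-first walk over reverse dependency edges. Each consumer is marked
  // as visited when first discovered, so it appears in the result exactly once.
  private traceDownstreamConsumers(targetId: string): StructuralNode[] {
    const consumers: StructuralNode[] = [];
    const queue = [targetId];
    const visited = new Set<string>([targetId]);
    while (queue.length > 0) {
      const current = queue.shift()!;
      for (const [, node] of this.graph) {
        if (!visited.has(node.id) && node.dependencies.includes(current)) {
          visited.add(node.id);
          consumers.push(node);
          queue.push(node.id);
        }
      }
    }
    return consumers;
  }

  // The remaining strategies are minimal placeholders (assumptions).
  private findModuleOwners(target: string): StructuralNode[] {
    return [...this.graph.values()].filter(
      node => node.id === target || node.exports.includes(target)
    );
  }

  // Returns the target followed by every ancestor reachable via inheritance edges.
  private resolveClassHierarchy(target: string): StructuralNode[] {
    const chain: StructuralNode[] = [];
    const pending = [target];
    const seen = new Set<string>();
    while (pending.length > 0) {
      const node = this.graph.get(pending.pop()!);
      if (!node || seen.has(node.id)) continue;
      seen.add(node.id);
      chain.push(node);
      pending.push(...node.inheritances);
    }
    return chain;
  }

  private extractInterfaceContracts(target: string): StructuralNode[] {
    return this.resolveClassHierarchy(target).filter(node => node.type === 'interface');
  }

  private fallbackToTextSearch(target: string): StructuralNode[] {
    const needle = target.toLowerCase();
    return [...this.graph.values()].filter(node => node.name.toLowerCase().includes(needle));
  }
}
```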
The router prioritizes deterministic graph traversal. When structural data is incomplete, it falls back to localized text search, but never as the primary navigation mechanism.
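How a task description becomes a `ContextQuery` intent is not shown above. One simple possibility is an ordered list of keyword rules, sketched below. The `inferIntent` function and the keyword lists are hypothetical; a production router might instead ask the model to choose the intent.

```typescript
type Intent = 'ownership' | 'impact' | 'inheritance' | 'contract';

// Hypothetical keyword rules, checked in order; the first match wins.
const INTENT_RULES: Array<[RegExp, Intent]> = [
  [/\b(break|breaks|affect|affects|impact|downstream)\b/i, 'impact'],
  [/\b(extends|inherit|inherits|subclass|base class|parent class)\b/i, 'inheritance'],
  [/\b(interface|contract|signature|implements)\b/i, 'contract'],
];

// Maps a task description to a traversal intent.
// Falls back to 'ownership' when no rule matches.
function inferIntent(task: string): Intent {
  for (const [pattern, intent] of INTENT_RULES) {
    if (pattern.test(task)) return intent;
  }
  return 'ownership';
}
```

Because the rules are explicit and ordered, the same task description always yields the same traversal strategy, which keeps the routing step deterministic.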
Step 3: MCP Service Layer
The context layer exposes itself via the Model Context Protocol. This allows any compatible agent to query architectural relationships without modifying its core execution loop.
```typescript
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';

// `router` is assumed to be a ContextRouter built from the indexed graph (see Step 2).

const server = new McpServer({
  name: 'arch-context-provider',
  version: '1.0.0'
});

// Tool parameters are declared as a Zod shape; the SDK validates them before the handler runs.
server.tool(
  'arch_get_context',
  'Retrieve structural context for a target component',
  { target: z.string(), scope: z.enum(['local', 'module', 'system']) },
  async ({ target, scope }) => {
    const query: ContextQuery = { target, intent: 'ownership', scope };
    const results = router.resolve(query);
    return {
      content: [{ type: 'text', text: JSON.stringify(results, null, 2) }]
    };
  }
);

server.tool(
  'arch_trace_impact',
  'Identify downstream dependencies and potential breakage points',
  { target: z.string() },
  async ({ target }) => {
    const impact = router.resolve({ target, intent: 'impact', scope: 'system' });
    return {
      content: [{ type: 'text', text: JSON.stringify(impact, null, 2) }]
    };
  }
);

// Serve over stdin/stdout so any MCP-compatible agent can spawn the process.
const transport = new StdioServerTransport();
await server.connect(transport);
```
Architecture Decisions & Rationale
- Graph over Vectors: Embeddings measure how similar two pieces of text are. Graphs measure structural coupling. Agents need to know what breaks when a file changes, not what sounds similar.
- Incremental Indexing: Full repository scans are expensive. Hooking into git commits ensures the graph stays synchronized with minimal overhead.
- MCP Standardization: Tying the context layer to MCP decouples it from specific agent frameworks. The same service works across Claude Code, Cursor, Windsurf, and custom pipelines.
- Intent-Based Routing: Natural language queries are mapped to explicit traversal strategies. This prevents the agent from guessing relationships and forces deterministic navigation.
Pitfall Guide
1. Treating Embeddings as Architecture Maps
Embeddings return textually similar code, not structurally related code. An agent searching for a cache implementation might retrieve file storage utilities because both mention "write" and "read". The fix is to separate lexical search from structural traversal. Use embeddings only for fallback when graph data is missing.
2. Ignoring Inheritance & Mixin Chains
Complex frameworks rely on deep inheritance hierarchies and mixin compositions. A bug in a derived class often originates in a base implementation. Agents that only examine the immediate file miss the root cause. The fix is to index inheritance edges explicitly and require agents to traverse the full chain before patching.
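Traversing the full chain before patching can be sketched as a breadth-first walk over inheritance edges. This is a minimal sketch under the assumption that each node lists its direct parents in `inheritances`, as in the indexer above; `ancestorsOf` is a hypothetical helper name.

```typescript
interface ClassNode {
  id: string;
  inheritances: string[]; // direct parents (base classes or mixins)
}

// Returns every ancestor of classId, nearest first, following all inheritance edges.
// Parents that are not in the graph (for example, external classes) are still listed.
// The visited set guards against cycles in a malformed graph.
function ancestorsOf(graph: Map<string, ClassNode>, classId: string): string[] {
  const ancestors: string[] = [];
  const visited = new Set<string>([classId]);
  const queue = [classId];
  while (queue.length > 0) {
    const current = graph.get(queue.shift()!);
    for (const parent of current?.inheritances ?? []) {
      if (visited.has(parent)) continue;
      visited.add(parent);
      ancestors.push(parent);
      queue.push(parent);
    }
  }
  return ancestors;
}
```

An agent that receives this list alongside the derived class can check each base implementation before deciding where the fix belongs.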
3. Stale Index Synchronization
Codebases evolve. An index built once and never updated becomes a liability. The agent navigates using outdated relationships, leading to false positives and broken assumptions. The fix is to implement incremental graph updates triggered by commit hooks or CI pipeline events.
4. Context Window Bloat from Over-Indexing
Including every internal utility, test helper, and generated file in the context layer floods the agent's working memory. The fix is to apply architectural filtering: index only public interfaces, core modules, and cross-cutting concerns. Exclude test scaffolding and build artifacts unless explicitly requested.
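The exclusion half of this filter can be sketched as a path predicate applied before a file reaches the indexer. The patterns below are hypothetical and would need adjusting to the repository's layout; deciding which modules count as "core" is a separate step that this sketch does not cover.

```typescript
// Hypothetical patterns for dependencies, build output, and generated files.
const ARTIFACT_PATTERNS: RegExp[] = [
  /(^|\/)(node_modules|dist|build|coverage)\//,
  /\.(generated|min)\.[jt]s$/,
];

// Hypothetical patterns for test scaffolding.
const TEST_PATTERNS: RegExp[] = [
  /(^|\/)(__tests__|__mocks__|fixtures)\//,
  /\.(test|spec)\.[jt]sx?$/,
];

// Returns true when the file should be added to the structural graph.
// Tests are skipped unless the caller explicitly asks for them.
function shouldIndex(filePath: string, includeTests = false): boolean {
  const normalized = filePath.replace(/\\/g, '/'); // tolerate Windows separators
  if (ARTIFACT_PATTERNS.some((pattern) => pattern.test(normalized))) return false;
  if (!includeTests && TEST_PATTERNS.some((pattern) => pattern.test(normalized))) return false;
  return true;
}
```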
5. Missing Cross-Cutting Concerns
Middleware, plugins, and event systems create implicit dependencies that don't appear in static imports. An agent tracing direct imports will miss runtime hooks. The fix is to parse configuration files, plugin registries, and event dispatchers to capture dynamic coupling.
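One way to capture event-based coupling is to turn subscriptions into ordinary dependency edges once the registry has been parsed. The sketch below assumes two hypothetical inputs extracted from configuration: a map from event name to the module that emits it, and a map from event name to the handlers subscribed to it.

```typescript
interface DepNode {
  id: string;
  dependencies: string[];
}

// Adds an edge from each handler to the module that emits the event it listens to,
// so impact tracing from the emitter reaches handlers that have no static import.
// Returns the number of edges added.
function addEventEdges(
  graph: Map<string, DepNode>,
  emitters: Record<string, string>,      // event name -> emitting module id (assumed shape)
  subscribers: Record<string, string[]>  // event name -> handler ids (assumed shape)
): number {
  let added = 0;
  for (const event of Object.keys(subscribers)) {
    const emitter = emitters[event];
    if (emitter === undefined) continue; // no known emitter: nothing to link
    for (const handlerId of subscribers[event]) {
      const handler = graph.get(handlerId);
      if (handler && !handler.dependencies.includes(emitter)) {
        handler.dependencies.push(emitter);
        added++;
      }
    }
  }
  return added;
}
```

After this pass, a downstream-consumer trace from the emitting module also returns its event handlers.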
6. Applying Full Structural Indexing to Simple Codebases
Not all codebases require deep structural mapping. Simple CRUD applications with isolated modules benefit less from graph indexing. The fix is to implement a complexity heuristic: enable full structural indexing only when module depth exceeds three levels or inheritance chains surpass two nodes.
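The heuristic reduces to a two-condition check using the thresholds stated above. How "module depth" is measured is not specified; the sketch assumes directory nesting relative to the repository root, and both function names are hypothetical.

```typescript
interface RepoStats {
  maxModuleDepth: number;      // deepest directory nesting of any source module
  maxInheritanceChain: number; // longest class chain, counted in nodes
}

// Enables full structural indexing only for repositories that exceed either threshold.
function needsStructuralIndexing(stats: RepoStats): boolean {
  return stats.maxModuleDepth > 3 || stats.maxInheritanceChain > 2;
}

// Directory depth of a path relative to the repository root (assumed definition).
function moduleDepth(relativePath: string): number {
  return relativePath.split('/').filter(Boolean).length - 1;
}
```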
7. Over-Reliance on Single-File Context
Agents often request context for one file at a time, missing the system-wide contract. A cache backend fix might violate base class expectations if the agent doesn't see the parent interface. The fix is to enforce contract-aware queries: always retrieve the interface definition alongside the implementation.
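A contract-aware query can be sketched as a lookup that never returns an implementation alone. The sketch assumes the node's `inheritances` field lists the interfaces and base classes it declares; `withContracts` is a hypothetical helper name.

```typescript
interface TypedNode {
  id: string;
  type: 'module' | 'class' | 'function' | 'interface';
  inheritances: string[];
}

// Returns the implementation followed by each directly declared parent found in the graph.
// Parents missing from the graph (for example, external types) are skipped.
function withContracts(graph: Map<string, TypedNode>, targetId: string): TypedNode[] {
  const target = graph.get(targetId);
  if (!target) return [];
  const contracts = target.inheritances
    .map((id) => graph.get(id))
    .filter((node): node is TypedNode => node !== undefined);
  return [target, ...contracts];
}
```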
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Simple CRUD app with isolated modules | Embedding search + lightweight indexing | Low architectural coupling; full graph adds unnecessary overhead | Baseline cost |
| Framework with deep inheritance chains | Structural context layer with inheritance tracing | Bugs often reside in base classes; agents need full chain visibility | +15% infra, -20% tokens |
| Plugin-driven architecture | Dynamic dependency parsing + event hook indexing | Runtime coupling isn't visible in static imports; requires config analysis | +20% infra, -25% tokens |
| Legacy monolith with mixed patterns | Hybrid indexing (graph + semantic fallback) | Inconsistent architecture requires flexible navigation; pure graph may miss undocumented patterns | +10% infra, -15% tokens |
| High-frequency CI/CD pipeline | Incremental graph updates + cached context responses | Full re-indexing blocks pipelines; delta updates maintain speed | Neutral infra, -30% agent latency |
Configuration Template
```json
{
  "mcpServers": {
    "arch-context-provider": {
      "command": "node",
      "args": ["./dist/context-server.js"],
      "env": {
        "REPO_ROOT": "/workspace/target-repo",
        "INDEX_STRATEGY": "incremental",
        "FILTER_LEVEL": "core_only"
      }
    }
  },
  "agentConfig": {
    "contextRouting": {
      "enabled": true,
      "fallbackToEmbeddings": true,
      "maxTraversalDepth": 4,
      "contractAware": true
    },
    "tokenOptimization": {
      "maxExplorationTokens": 1500,
      "autoTerminateOnContextMatch": true
    }
  }
}
```
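On the server side, the three environment variables from the template can be validated at startup so a typo fails fast instead of silently changing behavior. This is a sketch only: the template shows just the values `incremental` and `core_only`, so the alternatives `full` and `all` below are assumed for illustration, as are the function names.

```typescript
interface ServerEnv {
  repoRoot: string;
  indexStrategy: 'incremental' | 'full'; // 'full' is an assumed alternative
  filterLevel: 'core_only' | 'all';      // 'all' is an assumed alternative
}

// Returns value as T when it is one of the allowed options, otherwise throws.
function oneOf<T extends string>(value: string, allowed: readonly T[], name: string): T {
  if ((allowed as readonly string[]).includes(value)) return value as T;
  throw new Error(`${name} must be one of: ${allowed.join(', ')}`);
}

// Reads and validates the settings passed through the "env" block of the MCP config.
function readServerEnv(env: Record<string, string | undefined>): ServerEnv {
  const repoRoot = env.REPO_ROOT;
  if (!repoRoot) throw new Error('REPO_ROOT must be set');
  return {
    repoRoot,
    indexStrategy: oneOf(env.INDEX_STRATEGY ?? 'incremental', ['incremental', 'full'] as const, 'INDEX_STRATEGY'),
    filterLevel: oneOf(env.FILTER_LEVEL ?? 'core_only', ['core_only', 'all'] as const, 'FILTER_LEVEL'),
  };
}
```

In the server entry point this would typically be called once as `readServerEnv(process.env)` before indexing begins.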
Quick Start Guide
- Initialize the indexer: Run `npx arch-indexer init --repo ./target --strategy incremental` to parse the codebase and generate the structural graph.
- Start the MCP service: Execute `node ./dist/context-server.js` or deploy via Docker. The service exposes architectural tools over standard input/output.
- Connect your agent: Add the MCP configuration block to your agent's tool registry. No code changes required; the agent discovers tools automatically.
- Validate context routing: Run a test task with `arch_get_context` and verify the response returns module boundaries, dependencies, and inheritance chains instead of raw file text.
- Monitor and tune: Track token consumption during the first 50 tasks. Adjust `maxTraversalDepth` and `FILTER_LEVEL` if context window usage exceeds thresholds.
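The monitoring step can be sketched as a summary over the first 50 recorded tasks, compared against the `maxExplorationTokens` budget from the configuration template. The record shape and the function name are hypothetical, and what ratio should trigger a change is left to the operator.

```typescript
interface TaskUsage {
  taskId: string;
  explorationTokens: number; // tokens spent navigating before the first edit
}

interface UsageSummary {
  tasks: number;
  meanTokens: number;
  overBudgetRatio: number; // share of tasks that exceeded the budget
}

// Summarizes the first `window` tasks against a per-task token budget.
function summarizeUsage(usages: TaskUsage[], budget: number, window = 50): UsageSummary {
  const sample = usages.slice(0, window);
  if (sample.length === 0) return { tasks: 0, meanTokens: 0, overBudgetRatio: 0 };
  const total = sample.reduce((sum, usage) => sum + usage.explorationTokens, 0);
  const over = sample.filter((usage) => usage.explorationTokens > budget).length;
  return {
    tasks: sample.length,
    meanTokens: total / sample.length,
    overBudgetRatio: over / sample.length,
  };
}
```

A high `overBudgetRatio` suggests lowering `maxTraversalDepth` or tightening `FILTER_LEVEL`; a low one leaves room to widen them.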
Context quality determines agent efficiency. Models provide reasoning; structural context provides navigation. When the agent knows where to look, it stops guessing and starts solving.