sions
- Abstraction Layer: Direct endpoint consumption creates tight coupling. A client class normalizes responses, handles missing fields, and provides consistent method signatures.
- Local Caching: OSM data updates asynchronously. Repeated identical queries waste bandwidth and increase latency. An in-memory TTL cache reduces redundant network calls while preserving data freshness.
- Dynamic Radius Scaling: Fixed radius queries in dense urban areas return hundreds of results, triggering timeouts. The client caps the search radius for a known list of dense metros, a simple stand-in for a population density heuristic.
- Graceful Degradation: Not all OSM entries contain phone numbers, emails, or websites. The response schema uses optional fields with explicit fallbacks to prevent downstream type errors.
Implementation
import { LRUCache } from 'lru-cache';

interface LocationQuery {
  city: string;
  businessType: string;
  searchRadiusKm?: number;
  maxResults?: number;
}

interface BusinessRecord {
  id: string;
  name: string;
  address: string;
  coordinates: { lat: number; lng: number };
  contact?: {
    phone?: string;
    website?: string;
    email?: string;
  };
  operatingHours?: string;
  source: 'osm_registry';
}

interface RegistryResponse {
  results: BusinessRecord[];
  totalAvailable: number;
  queryMetadata: {
    location: string;
    category: string;
    radiusUsed: number;
  };
}
class PointOfInterestRegistry {
  private baseUrl: string;
  private cache: LRUCache<string, RegistryResponse>;

  constructor(baseUrl: string, cacheTtlMs: number = 300_000) {
    this.baseUrl = baseUrl;
    // In-memory cache: at most 500 entries, each expiring after cacheTtlMs.
    this.cache = new LRUCache({
      max: 500,
      ttl: cacheTtlMs,
      allowStale: false
    });
  }

  async queryLocations(query: LocationQuery): Promise<RegistryResponse> {
    // Serve repeated identical queries from the cache.
    const cacheKey = this.generateCacheKey(query);
    const cached = this.cache.get(cacheKey);
    if (cached) return cached;

    // Cap the radius in dense metros and clamp the result limit to 500.
    const adjustedRadius = this.calculateOptimalRadius(query.city, query.searchRadiusKm || 5);
    const limit = Math.min(query.maxResults || 100, 500);
    const params = new URLSearchParams({
      location: query.city,
      category: query.businessType,
      radius_km: adjustedRadius.toString(),
      limit: limit.toString()
    });

    const response = await fetch(`${this.baseUrl}/api/businesses?${params}`);
    if (!response.ok) {
      throw new Error(`Registry query failed: ${response.status} ${response.statusText}`);
    }

    const raw = await response.json();
    const normalized = this.normalizePayload(raw, query, adjustedRadius);
    this.cache.set(cacheKey, normalized);
    return normalized;
  }
  // Returns only the number of matches, without downloading the records.
  async aggregateCounts(query: LocationQuery): Promise<number> {
    const params = new URLSearchParams({
      location: query.city,
      category: query.businessType
    });
    const response = await fetch(`${this.baseUrl}/api/count?${params}`);
    if (!response.ok) throw new Error(`Count endpoint failed: ${response.status}`);
    const data = await response.json();
    return data.total ?? 0;
  }

  // Returns the category list, whether the endpoint sends a bare array
  // or wraps it in a { categories } object.
  async fetchTaxonomy(): Promise<string[]> {
    const response = await fetch(`${this.baseUrl}/api/categories`);
    if (!response.ok) throw new Error(`Taxonomy fetch failed: ${response.status}`);
    const data = await response.json();
    return Array.isArray(data) ? data : data.categories ?? [];
  }
  private normalizePayload(
    raw: any,
    query: LocationQuery,
    radiusUsed: number
  ): RegistryResponse {
    // Tolerate a missing or malformed results array.
    const items: any[] = Array.isArray(raw?.results) ? raw.results : [];
    const results: BusinessRecord[] = items.map((item: any, idx: number) => ({
      // Ids may arrive as numbers; the schema promises a string.
      id: String(item.osm_id || `synthetic_${idx}`),
      name: item.name || 'Unnamed Location',
      address: item.address || item.formatted_address || '',
      // Missing coordinates fall back to 0, so treat (0, 0) as unknown.
      coordinates: {
        lat: parseFloat(item.lat || item.latitude || 0),
        lng: parseFloat(item.lon || item.longitude || 0)
      },
      // Contact fields stay undefined when the entry has none.
      contact: {
        phone: item.phone || item.contact_phone,
        website: item.website || item.url,
        email: item.email || item.contact_email
      },
      operatingHours: item.opening_hours || item.hours,
      source: 'osm_registry' as const
    }));
    return {
      results,
      totalAvailable: raw?.total_available || results.length,
      queryMetadata: {
        location: query.city,
        category: query.businessType,
        radiusUsed
      }
    };
  }
  // Caps the radius at 3 km for cities on a hard-coded dense-metro list.
  private calculateOptimalRadius(city: string, requested: number): number {
    const denseMetros = ['paris', 'london', 'tokyo', 'new york', 'berlin', 'seoul'];
    const isDense = denseMetros.some(m => city.toLowerCase().includes(m));
    return isDense ? Math.min(requested, 3) : requested;
  }

  // Builds the key from the query as requested, with defaults filled in.
  private generateCacheKey(query: LocationQuery): string {
    return `${query.city}:${query.businessType}:${query.searchRadiusKm || 5}:${query.maxResults || 100}`;
  }
}

export default PointOfInterestRegistry;
Rationale
The PointOfInterestRegistry class isolates network I/O from business logic. Field normalization handles OSM's inconsistent tagging conventions (e.g., phone vs contact_phone, lat vs latitude). The LRU cache prevents redundant calls during batch processing or UI re-renders. Dynamic radius calculation prevents timeout failures in high-density zones. The aggregateCounts method enables market research without downloading full payloads, reducing bandwidth by up to 90% for analytical workflows.
Pitfall Guide
1. Category Mapping Assumption
Explanation: Commercial APIs expose 100+ granular categories. The OSM registry provides a fixed taxonomy of 37 categories. Assuming a 1:1 mapping causes missing results or misclassified data.
Fix: Implement a translation layer that maps internal categories to the nearest OSM equivalent. Maintain a fallback list that queries multiple OSM categories when precision is critical.
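A translation layer along these lines could look like the sketch below. The internal category names, the registry category names, and the resolveOsmCategories helper are all hypothetical. The real category list should be loaded from the /api/categories endpoint rather than hard-coded.

```typescript
// Hypothetical mapping from internal categories to registry categories.
// The first entry is the nearest equivalent; the rest are fallbacks.
const CATEGORY_MAP: Record<string, string[]> = {
  coffee_shop: ['cafe'],
  specialty_coffee: ['cafe', 'restaurant'],
  gym: ['fitness_centre', 'sports_centre'],
};

// Returns the registry categories to query for an internal category.
// With includeFallbacks = false, only the nearest equivalent is returned.
function resolveOsmCategories(internal: string, includeFallbacks = false): string[] {
  const candidates = CATEGORY_MAP[internal.trim().toLowerCase()];
  if (!candidates || candidates.length === 0) {
    return []; // unmapped: let the caller decide instead of guessing
  }
  return includeFallbacks ? [...candidates] : candidates.slice(0, 1);
}
```

Returning an empty array for unmapped categories keeps the gap visible to the caller, which can then log it or skip the query.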
2. Data Freshness Blind Spots
Explanation: OSM relies on community contributions. Business closures, relocations, or phone number changes may lag by weeks or months.
Fix: Add a last_verified timestamp to your internal schema. Schedule periodic re-validation jobs that flag records older than 90 days for manual review or supplemental commercial API calls.
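The 90-day check might be sketched as follows. The VerifiedRecord shape and the helper names are assumptions for illustration; the lastVerified field is a camelCase stand-in for the last_verified timestamp, stored as an ISO 8601 string.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical internal record carrying a verification timestamp.
interface VerifiedRecord {
  id: string;
  lastVerified: string | null; // ISO 8601 timestamp; null if never verified
}

// True when the record was never verified, has an unparseable timestamp,
// or was last verified more than maxAgeDays before `now`.
function needsRevalidation(record: VerifiedRecord, now: Date, maxAgeDays = 90): boolean {
  if (record.lastVerified === null) return true;
  const verifiedAt = Date.parse(record.lastVerified);
  if (Number.isNaN(verifiedAt)) return true;
  return now.getTime() - verifiedAt > maxAgeDays * DAY_MS;
}

// Selects the records a periodic job should send for review.
function selectStale(records: VerifiedRecord[], now: Date): VerifiedRecord[] {
  return records.filter(r => needsRevalidation(r, now));
}
```

Passing the current time in as a parameter keeps the check deterministic, which makes the re-validation job easier to test.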
3. Unbounded Radius Queries
Explanation: Default 5km searches in urban centers return 300–500 results, triggering gateway timeouts or memory pressure.
Fix: Enforce dynamic radius scaling based on population density. Implement client-side pagination or result chunking when limit approaches 500.
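Result chunking can be as small as the generic helper below. This is a minimal sketch: chunkResults is a hypothetical name, and the page size is left to the caller.

```typescript
// Splits a result list into fixed-size pages for incremental processing.
function chunkResults<T>(items: readonly T[], pageSize: number): T[][] {
  if (!Number.isInteger(pageSize) || pageSize <= 0) {
    throw new RangeError('pageSize must be a positive integer');
  }
  const pages: T[][] = [];
  for (let i = 0; i < items.length; i += pageSize) {
    pages.push(items.slice(i, i + pageSize));
  }
  return pages;
}
```

A caller could then render or persist one page at a time instead of holding a 500-record response in a single pass.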
4. Incomplete Contact Fields
Explanation: Not all OSM entries contain email, phone, or website data. Downstream systems expecting complete contact objects will throw type errors.
Fix: Use optional chaining and explicit fallbacks in your normalization layer. Document which fields are guaranteed vs. conditional in your API contracts.
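On the consuming side, optional chaining with an explicit fallback might look like this sketch. The ContactableRecord type is a trimmed-down stand-in for BusinessRecord, and the priority order (phone, then email, then website) and the fallback label are assumptions.

```typescript
interface ContactInfo {
  phone?: string;
  website?: string;
  email?: string;
}

interface ContactableRecord {
  name: string;
  contact?: ContactInfo;
}

// Returns the first value that is defined and not blank.
function firstNonEmpty(...values: Array<string | undefined>): string | undefined {
  return values.find(v => v !== undefined && v.trim() !== '');
}

// Picks a display contact in the order phone, email, website, and
// falls back to a fixed label when none is present.
function primaryContact(record: ContactableRecord, fallback = 'No contact listed'): string {
  const c = record.contact;
  return firstNonEmpty(c?.phone, c?.email, c?.website) ?? fallback;
}
```

Because the function always returns a string, downstream code never has to handle an undefined contact.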
5. MCP Stream Integration Errors
Explanation: The registry's MCP endpoint expects the streamable-http transport for tool calling. A client configured for a different transport, such as legacy SSE, may fail to connect or stall partway through a tool execution loop.
Fix: Configure the MCP client with type: "streamable-http" explicitly. Validate tool response schemas against the AI framework's expected payload structure before deployment.
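A pre-deployment schema check could start from a structural guard like the one below. This sketch assumes the tool returns a payload shaped like RegistryResponse. The actual MCP tool output format is not specified here, so looksLikeRegistryResponse is a hypothetical helper to adapt to the payload your framework receives.

```typescript
function isObject(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Minimal structural check: `results` must be an array of objects that each
// carry a string id, a string name, and numeric lat/lng coordinates.
function looksLikeRegistryResponse(value: unknown): boolean {
  if (!isObject(value)) return false;
  const results = value['results'];
  if (!Array.isArray(results)) return false;
  return results.every((item: unknown) => {
    if (!isObject(item)) return false;
    const coords = item['coordinates'];
    return (
      typeof item['id'] === 'string' &&
      typeof item['name'] === 'string' &&
      isObject(coords) &&
      typeof coords['lat'] === 'number' &&
      typeof coords['lng'] === 'number'
    );
  });
}
```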
6. Geographic Bias in US Coverage
Explanation: EU coverage is excellent due to strong OSM contributor density. US coverage is good but fragmented, especially in rural or suburban zones.
Fix: Cross-reference US queries with municipal open data portals or supplement with targeted commercial API calls for high-value regions. Maintain a coverage quality score per region.
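One possible proxy for a per-region quality score is field completeness across a sample of records. The metric below is an assumption for illustration, not something the registry defines; ScoredRecord is a trimmed-down stand-in for BusinessRecord.

```typescript
interface ScoredRecord {
  address: string;
  contact?: { phone?: string; website?: string; email?: string };
}

// Fraction of records (0 to 1) that have a non-empty address and at least
// one contact channel. Returns 0 for an empty sample.
function completenessScore(records: ScoredRecord[]): number {
  if (records.length === 0) return 0;
  const complete = records.filter(r => {
    const c = r.contact;
    const hasContact = Boolean(c?.phone || c?.website || c?.email);
    return r.address.trim() !== '' && hasContact;
  }).length;
  return complete / records.length;
}
```

Regions that score low on a sampled query are candidates for supplementing with municipal data or a commercial API.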
7. Ignoring Fair-Use Throttling
Explanation: "No hard caps" does not mean unlimited. Aggressive polling triggers IP-level rate limiting or temporary blocks.
Fix: Implement client-side rate limiting (e.g., 10 requests/second). Use exponential backoff on 429 responses. Queue batch jobs instead of firing parallel requests.
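The backoff part of this fix could be sketched as below; it does not cover request queuing or the per-second rate limit. The request function is injected so the sketch stays independent of any HTTP library. The HttpResult shape, the helper names, and the 500 ms base and 30 s cap are all assumptions.

```typescript
interface HttpResult {
  status: number;
}

type Requester = (url: string) => Promise<HttpResult>;
type Sleeper = (ms: number) => Promise<void>;

// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const defaultSleep: Sleeper = ms =>
  new Promise<void>(resolve => {
    setTimeout(resolve, ms);
  });

// Retries only on HTTP 429, waiting longer after each attempt. After
// maxRetries retries the last response is returned to the caller as-is.
async function requestWithBackoff(
  url: string,
  request: Requester,
  maxRetries = 4,
  sleep: Sleeper = defaultSleep
): Promise<HttpResult> {
  for (let attempt = 0; ; attempt++) {
    const result = await request(url);
    if (result.status !== 429 || attempt >= maxRetries) return result;
    await sleep(backoffDelayMs(attempt));
  }
}
```

Running batch jobs through this helper one request at a time, rather than in parallel, also keeps polling from becoming aggressive in the first place.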
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Internal dashboards & market research | OSM-backed registry | No reviews needed, high query volume, zero budget | $0/month |
| Customer-facing discovery app | Commercial POI API | Requires photos, ratings, and real-time popularity | ~$275+/month |
| AI agent tool calling | OSM-backed registry (MCP) | Structured JSON, streamable transport, no auth friction | $0/month |
| High-volume lead generation | Hybrid routing | OSM for bulk, commercial for enriched contact verification | Variable, optimized |
| Prototyping & MVP development | OSM-backed registry | Immediate deployment, no billing setup, fast iteration | $0/month |
Configuration Template
// registry.config.ts
import PointOfInterestRegistry from './PointOfInterestRegistry';
const registry = new PointOfInterestRegistry(
'https://bizdata-web.vercel.app',
300_000 // 5-minute cache TTL
);
// MCP client configuration for AI assistants
const mcpConfig = {
mcpServers: {
poi_registry: {
type: 'streamable-http',
url: 'https://bizdata-web.vercel.app/api/mcp'
}
}
};
export { registry, mcpConfig };
Quick Start Guide
- Initialize the client: Import PointOfInterestRegistry and instantiate with the base URL and desired cache TTL.
- Query a category: Call registry.queryLocations({ city: 'Berlin', businessType: 'cafe', maxResults: 50 }) to retrieve structured results.
- Validate responses: Check queryMetadata.radiusUsed and iterate through results to verify coordinate accuracy and contact field presence.
- Integrate with AI tools: Add the mcpConfig block to your AI assistant's configuration file. Verify tool execution by requesting location-based queries through the chat interface.
- Monitor & scale: Track cache hit rates and request velocity. Adjust TTL or implement Redis-backed caching if throughput exceeds 50 queries/second.
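Cache hit rate tracking, mentioned in the last step, needs little more than two counters. The CacheStats class below is a hypothetical helper, not part of PointOfInterestRegistry. It assumes you call recordHit or recordMiss next to the cache lookup in queryLocations.

```typescript
class CacheStats {
  private hits = 0;
  private misses = 0;

  recordHit(): void {
    this.hits += 1;
  }

  recordMiss(): void {
    this.misses += 1;
  }

  // Hit rate in the range 0 to 1; 0 when nothing has been recorded yet.
  hitRate(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}
```

A persistently low hit rate suggests the TTL is too short or queries rarely repeat, which helps decide whether a shared cache would pay off.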
This architecture removes billing friction while preserving the structural integrity required for production location services. By normalizing OSM data through a dedicated client layer, teams gain predictable scaling, explicit error boundaries, and a clear migration path when premium features become necessary.