n cling to Flat-Rate, leaving money on the table because their architecture cannot support metering.
Core Solution
Implementing a robust pricing engine requires decoupling metering from billing. The metering service records usage; the billing service calculates charges. This separation allows you to change pricing logic without altering data collection.
Step-by-Step Implementation
1. Define the Metering Schema
Create a flexible schema that supports multiple metrics per tenant. Avoid hard-coding metric names in the database; use a key-value approach or a dedicated metrics table.
// types/billing.ts
export interface BillingPlan {
  id: string;
  name: string;
  currency: string;
  interval: 'monthly' | 'yearly' | 'pay_as_you_go';
  tiers: PlanTier[];
  features: Record<string, FeatureLimit>;
}

export interface PlanTier {
  upTo: number | null; // null for unlimited
  unitPrice: number;
}

export interface FeatureLimit {
  quota: number;
  resetCycle: 'monthly' | 'never';
}

export interface MeterEvent {
  eventId: string;
  tenantId: string;
  metric: string; // e.g., 'api_calls', 'storage_gb', 'seats'
  value: number;
  timestamp: Date;
  properties?: Record<string, string>; // For filtering/aggregation
}
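To make the tiers field concrete, here is a minimal sketch of how a charge could be computed from PlanTier entries. It assumes graduated pricing (each unit is priced by the tier it falls into), that upTo is a cumulative upper bound, that tiers are sorted in ascending order, and that unitPrice is expressed in the smallest currency unit. The schema above does not specify any of this, and calculateTieredCharge is a hypothetical helper.

```typescript
interface PlanTier {
  upTo: number | null; // null for unlimited
  unitPrice: number;   // assumption: minor currency units (e.g., cents)
}

// Hypothetical helper: graduated pricing over tiers sorted by ascending upTo.
function calculateTieredCharge(tiers: PlanTier[], usage: number): number {
  let remaining = usage;
  let previousBound = 0;
  let total = 0;

  for (const tier of tiers) {
    if (remaining <= 0) break;
    // Capacity of this tier; the final (null) tier is unbounded.
    const capacity = tier.upTo === null ? Infinity : tier.upTo - previousBound;
    const unitsInTier = Math.min(remaining, capacity);
    total += unitsInTier * tier.unitPrice;
    remaining -= unitsInTier;
    if (tier.upTo !== null) previousBound = tier.upTo;
  }
  return total;
}

const tiers: PlanTier[] = [
  { upTo: 1000, unitPrice: 0 },  // first 1,000 units included
  { upTo: 10000, unitPrice: 2 }, // next 9,000 units at 2 cents each
  { upTo: null, unitPrice: 1 },  // everything beyond at 1 cent each
];

const charge = calculateTieredCharge(tiers, 12000);
```

If you use volume pricing instead (all units priced at the rate of the tier the total falls into), the loop would be replaced by a single tier lookup.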
2. Event Ingestion Service
Implement an ingestion endpoint that is idempotent and high-throughput. Use a message queue (Kafka, RabbitMQ, or SQS) to decouple ingestion from processing.
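// services/metering-ingestion.ts
import { Kafka } from 'kafkajs';
import Redis from 'ioredis';
import type { MeterEvent } from '../types/billing';

const kafka = new Kafka({ clientId: 'metering-service', brokers: ['localhost:9092'] });
const producer = kafka.producer();
const redis = new Redis();

// Connect the producer once and reuse the connection for every send
const producerReady = producer.connect();

export async function recordUsage(event: MeterEvent): Promise<void> {
  // Idempotency check to prevent double-counting on retries
  const isDuplicate = await checkIdempotencyKey(event.eventId);
  if (isDuplicate) return;

  try {
    await producerReady;
    await producer.send({
      topic: 'usage-events',
      messages: [
        {
          key: event.tenantId, // keeps a tenant's events on one partition
          value: JSON.stringify(event)
        }
      ]
    });
  } catch (err) {
    // Release the key so a retry of this event is not dropped as a duplicate
    await redis.del(`idempotency:${event.eventId}`);
    throw err;
  }

  // updateRealtimeQuota is not shown; it is assumed to update the Redis counters from step 3
  await updateRealtimeQuota(event);
}

async function checkIdempotencyKey(eventId: string): Promise<boolean> {
  // SET with NX claims the key atomically. A separate GET followed by SET
  // would let two concurrent retries both pass the check.
  // The TTL (in seconds) should match the client retry window.
  const result = await redis.set(`idempotency:${eventId}`, '1', 'EX', 3600, 'NX');
  return result === null; // null means the key already existed
}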
3. Real-Time Quota Enforcement
For hybrid models, enforce limits in real-time to prevent unexpected overages. Use a distributed cache with sliding window or fixed window counters.
// services/quota-manager.ts
import Redis from 'ioredis';

const redis = new Redis();

export async function checkQuota(
  tenantId: string,
  metric: string,
  limit: number,
  increment: number
): Promise<{ allowed: boolean; current: number }> {
  const key = `quota:${tenantId}:${metric}`;

  // INCRBY is atomic, so concurrent requests each observe a distinct running total
  const current = await redis.incrby(key, increment);

  if (current > limit) {
    // Roll back the increment so rejected usage is not counted
    await redis.decrby(key, increment);
    return { allowed: false, current: current - increment };
  }

  // Set expiry if not already set (monthly reset logic).
  // TTL returns -1 for a key that exists but has no expiry.
  const ttl = await redis.ttl(key);
  if (ttl === -1) {
    await redis.expireat(key, getNextMonthBoundary());
  }

  return { allowed: true, current };
}

// Unix timestamp (seconds) of the first instant of next month, computed in UTC
// so the reset does not depend on the server's local timezone
function getNextMonthBoundary(): number {
  const now = new Date();
  const nextMonth = Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + 1, 1);
  return Math.floor(nextMonth / 1000);
}
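The checkQuota function above is a fixed-window counter: usage resets at the month boundary. For comparison, a sliding-window variant could look roughly like the sketch below. It is an in-memory illustration only; SlidingWindowQuota is a hypothetical class, and in a multi-instance deployment this state would need to live in a shared store such as Redis. The clock is passed in as a parameter so the behavior is easy to test.

```typescript
interface QuotaResult {
  allowed: boolean;
  current: number;
}

// Hypothetical in-memory sliding-window counter (single process only).
class SlidingWindowQuota {
  private readonly windowMs: number;
  private readonly entries = new Map<string, Array<{ at: number; value: number }>>();

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  check(key: string, limit: number, increment: number, now: number): QuotaResult {
    const cutoff = now - this.windowMs;
    // Drop usage that has aged out of the window.
    const live = (this.entries.get(key) ?? []).filter((e) => e.at > cutoff);
    const used = live.reduce((sum, e) => sum + e.value, 0);

    if (used + increment > limit) {
      this.entries.set(key, live);
      return { allowed: false, current: used };
    }

    live.push({ at: now, value: increment });
    this.entries.set(key, live);
    return { allowed: true, current: used + increment };
  }
}

const quota = new SlidingWindowQuota(60_000); // 60-second window
const first = quota.check('tenant-1:api_calls', 100, 60, 0);
const second = quota.check('tenant-1:api_calls', 100, 60, 30_000);
const third = quota.check('tenant-1:api_calls', 100, 60, 61_000);
```

Here the second call is rejected because 120 would exceed the limit of 100, while the third is allowed because the first entry has left the window. The trade-off is memory: a sliding window stores one entry per event, whereas a fixed window stores one integer per key.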
4. Aggregation and Billing Sync
A background worker consumes events, aggregates them per billing cycle, and syncs with the billing provider (Stripe, Chargebee, or custom).
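// workers/billing-sync.ts
// dataWarehouse, calculateInvoice, and billingProvider, along with the DateRange
// and BillingRecord types, are assumed to be defined elsewhere in the codebase.
export async function aggregateAndBill(tenantId: string, period: DateRange): Promise<BillingRecord> {
  // Query aggregated metrics from the data warehouse (ClickHouse/BigQuery)
  // for high-volume accuracy, rather than raw Redis counters.
  // The half-open range [start, end) keeps an event that lands exactly on a
  // period boundary from being billed in two consecutive periods.
  const metrics = await dataWarehouse.query(`
    SELECT metric, SUM(value) AS total
    FROM usage_events
    WHERE tenant_id = ? AND timestamp >= ? AND timestamp < ?
    GROUP BY metric
  `, [tenantId, period.start, period.end]);

  const invoice = calculateInvoice(metrics, tenantId);
  await billingProvider.createInvoice(tenantId, invoice);
  return invoice;
}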
Architecture Decisions and Rationale
- Event Sourcing for Auditability: Store every metering event. This allows reconstruction of billing history if calculations change or disputes arise.
- CQRS Pattern: Apply Command Query Responsibility Segregation (CQRS). Writes go to the ingestion service; reads for quota checks go to Redis; reads for billing go to the data warehouse. Each path can then be tuned for its own throughput and consistency requirements.
- UTC Normalization: All timestamps must be stored in UTC. Billing cycles should be calculated based on UTC to avoid timezone drift errors, which cause revenue recognition issues.
- Idempotency Keys: Every metering event must have a unique ID. Network retries are inevitable; without idempotency, usage counts will inflate, leading to customer disputes.
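As a small illustration of the UTC normalization point, the sketch below derives a calendar-month billing period entirely in UTC. billingPeriodFor is a hypothetical helper, and treating the period as half-open (start inclusive, end exclusive) is an assumption chosen so that adjacent periods never overlap.

```typescript
interface DateRange {
  start: Date; // inclusive (assumption)
  end: Date;   // exclusive (assumption)
}

// Hypothetical helper: the UTC calendar month that contains `at`.
function billingPeriodFor(at: Date): DateRange {
  const year = at.getUTCFullYear();
  const month = at.getUTCMonth();
  return {
    start: new Date(Date.UTC(year, month, 1)),
    // Date.UTC normalizes month 12 to January of the following year.
    end: new Date(Date.UTC(year, month + 1, 1)),
  };
}

const period = billingPeriodFor(new Date('2024-12-31T23:59:59Z'));
const nextPeriod = billingPeriodFor(new Date('2025-01-01T00:00:00Z'));
```

Because only the getUTC* accessors and Date.UTC are used, the result is the same regardless of the timezone the worker runs in. An event stamped exactly at midnight UTC on the first of the month falls into the new period, not the old one.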
Pitfall Guide
Common Mistakes
- Tying Pricing to UI Components: Hard-coding plan limits in frontend components or controllers makes pricing changes require a code deployment.
- Fix: Fetch limits from a configuration service or API at runtime.
- Race Conditions in Quota Checks: Checking the quota and updating the counter in separate operations allows concurrent requests to exceed limits.
- Fix: Use atomic operations in Redis or database transactions.
- Ignoring Timezone Edge Cases: Billing cycles that span DST changes or timezone shifts can cause double-billing or missed billing.
- Fix: Normalize all cycle calculations to UTC. Store customer timezone separately for display purposes only.
- Data Consistency Gaps: Discrepancies between the metering service and the billing provider due to failed syncs.
- Fix: Implement a reconciliation job that compares internal aggregates with billing provider records daily.
- Performance Bottlenecks: Synchronous database lookups for quota checks on every API request.
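- Fix: Cache quotas in Redis. If exact precision is not required for distinct-count metrics (for example, monthly active users), use a probabilistic data structure such as HyperLogLog, which estimates cardinality rather than summing values.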
- Lack of Dunning Management: Failing to handle payment failures gracefully.
- Fix: Integrate with billing provider webhooks to handle invoice.payment_failed events, and retry payment with backoff before revoking access.
- Currency Fluctuation Exposure: Storing prices in a single currency without hedging or dynamic conversion for global customers.
- Fix: Store base prices in USD and apply FX rates at invoice generation, or use multi-currency support from the billing provider.
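The reconciliation job suggested above for data consistency gaps boils down to a comparison of two sets of totals. The following is a minimal sketch of that comparison step only; findDiscrepancies, the Totals shape, and the tolerance parameter are hypothetical, and fetching the totals from your warehouse and billing provider is left out.

```typescript
type Totals = Record<string, number>; // metric name -> quantity for the period

interface Discrepancy {
  metric: string;
  internal: number;
  provider: number;
}

// Hypothetical helper: lists metrics whose totals differ between the two systems.
function findDiscrepancies(internal: Totals, provider: Totals, tolerance = 0): Discrepancy[] {
  const metrics = Array.from(new Set([...Object.keys(internal), ...Object.keys(provider)]));
  const result: Discrepancy[] = [];

  for (const metric of metrics) {
    // A metric missing on one side is treated as zero on that side.
    const a = internal[metric] ?? 0;
    const b = provider[metric] ?? 0;
    if (Math.abs(a - b) > tolerance) {
      result.push({ metric, internal: a, provider: b });
    }
  }
  return result;
}

const discrepancies = findDiscrepancies(
  { api_calls: 52000, storage_gb: 12 },
  { api_calls: 51000, storage_gb: 12 },
);
```

In this example only api_calls is reported, with 52000 recorded internally against 51000 at the provider. A daily job could alert on any non-empty result rather than attempting an automatic correction.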
Best Practices
- Feature Flags for Pricing Experiments: Wrap new pricing models in feature flags to roll out to specific tenant segments.
- Circuit Breakers: Protect the billing sync worker from cascading failures if the billing provider API is down.
- Granular Metrics: Design metrics to be composable. Instead of api_calls_pro, use api_calls with a plan property. This allows flexibility in defining new tiers without schema changes.
- Customer Self-Service: Provide tenants with a usage dashboard. Transparency reduces churn and support tickets related to billing surprises.
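For the circuit breaker recommendation, the core is a small state machine that the billing sync worker consults before each provider call. The sketch below is one simplified way to write it; the CircuitBreaker class, its thresholds, and its method names are hypothetical, and the clock is passed in explicitly so the logic can be tested without timers.

```typescript
// Hypothetical, simplified circuit breaker (not tied to any library).
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null; // null means the circuit is closed
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;

  constructor(failureThreshold: number, cooldownMs: number) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
  }

  // True if a call may be attempted at time `now` (milliseconds).
  canRequest(now: number): boolean {
    if (this.openedAt === null) return true;
    // Open: block calls until the cooldown elapses, then allow trial calls.
    return now - this.openedAt >= this.cooldownMs;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure(now: number): void {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) {
      this.openedAt = now; // open the circuit, or restart the cooldown after a failed trial
    }
  }
}

const breaker = new CircuitBreaker(3, 30_000);
breaker.recordFailure(0);
breaker.recordFailure(1_000);
breaker.recordFailure(2_000); // third consecutive failure opens the circuit
const blocked = breaker.canRequest(10_000);
const trialAllowed = breaker.canRequest(32_000);
```

The worker would check canRequest before calling the provider, skip or requeue the invoice when it returns false, and report the outcome with recordSuccess or recordFailure. Unlike a strict half-open state, this version may let several trial calls through once the cooldown has elapsed.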
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Early Stage MVP | Flat-Rate with Stripe Subscriptions | Minimal engineering overhead; focus on product validation. | Low infra cost; high manual ops. |
| High-Volume API | Pure Usage with Async Aggregation | Customers expect pay-per-use; aligns cost with value. | High infra cost; requires robust pipeline. |
| Enterprise SaaS | Hybrid Model with Custom Contracts | Supports base revenue + variable usage; negotiable terms. | Medium infra cost; complex billing logic. |
| Marketplace SaaS | Revenue Share with Split Payments | Handles multi-party transactions; automates payouts. | High complexity; requires specialized billing provider. |
Configuration Template
Use this JSON structure to define dynamic pricing plans that can be loaded by the pricing engine.
{
  "planId": "pro_v2",
  "name": "Pro Plan",
  "currency": "USD",
  "interval": "monthly",
  "baseFee": 4900,
  "metrics": [
    {
      "key": "api_calls",
      "displayName": "API Requests",
      "included": 50000,
      "overage": {
        "unit": "1000",
        "price": 50
      }
    },
    {
      "key": "storage_gb",
      "displayName": "Storage",
      "included": 10,
      "overage": {
        "unit": "1",
        "price": 200
      }
    }
  ],
  "features": {
    "sso_enabled": true,
    "support_level": "priority"
  }
}
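To show how a pricing engine might consume this template, here is a rough sketch that turns a plan config and a period's usage into an invoice total. It assumes baseFee and price are in the smallest currency unit (cents), that overage.unit is the block size for overage billing, and that a partial block is rounded up to a full one. The template does not state any of these rules, and calculateInvoiceTotal is a hypothetical helper.

```typescript
interface MetricConfig {
  key: string;
  included: number;
  overage: { unit: string; price: number };
}

interface PlanConfig {
  baseFee: number;
  metrics: MetricConfig[];
}

// Hypothetical helper: base fee plus overage for usage beyond the included amount.
function calculateInvoiceTotal(plan: PlanConfig, usage: Record<string, number>): number {
  let total = plan.baseFee;
  for (const metric of plan.metrics) {
    const used = usage[metric.key] ?? 0;
    const excess = Math.max(0, used - metric.included);
    const blockSize = Number(metric.overage.unit); // the template stores unit as a string
    // Assumption: partial blocks are billed as a full block.
    total += Math.ceil(excess / blockSize) * metric.overage.price;
  }
  return total;
}

const proPlan: PlanConfig = {
  baseFee: 4900,
  metrics: [
    { key: 'api_calls', included: 50000, overage: { unit: '1000', price: 50 } },
    { key: 'storage_gb', included: 10, overage: { unit: '1', price: 200 } },
  ],
};

const invoiceTotal = calculateInvoiceTotal(proPlan, { api_calls: 62500, storage_gb: 12 });
```

With 12,500 excess API calls (13 blocks at 50) and 2 GB of excess storage (2 blocks at 200), the total is 4900 + 650 + 400 = 5950, or $59.50 under the cents assumption.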
Quick Start Guide
- Initialize Metering Service: Deploy the ingestion service with Kafka and Redis dependencies. Configure environment variables for broker and cache connections.
- Define Plan Config: Create a plans.json file using the configuration template and load it into your configuration service.
- Instrument Application: Add middleware to your API routes to call recordUsage for billable actions. Ensure eventId is generated for idempotency.
- Test Quota Enforcement: Simulate requests exceeding the quota limit. Verify that the API returns 429 Too Many Requests and that Redis counters reflect accurate usage.
- Verify Billing Sync: Trigger the aggregation worker and confirm that an invoice is generated in your billing provider with correct calculations.
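One detail worth getting right when instrumenting the application is how eventId is generated. A random ID created on every call would give a retried request a new eventId and defeat the idempotency check, so the sketch below derives it from a stable request identifier instead. buildMeterEvent is a hypothetical helper, and the request ID is assumed to come from the client or an upstream gateway.

```typescript
interface MeterEvent {
  eventId: string;
  tenantId: string;
  metric: string;
  value: number;
  timestamp: Date;
}

// Hypothetical helper: a retried request yields the same eventId,
// so the ingestion service can drop the duplicate.
function buildMeterEvent(
  requestId: string, // assumption: stable across retries of the same request
  tenantId: string,
  metric: string,
  value: number,
  now: Date,
): MeterEvent {
  return {
    eventId: `${tenantId}:${requestId}:${metric}`,
    tenantId,
    metric,
    value,
    timestamp: now,
  };
}

const firstAttempt = buildMeterEvent('req-123', 'tenant-1', 'api_calls', 1, new Date('2024-05-01T10:00:00Z'));
const retry = buildMeterEvent('req-123', 'tenant-1', 'api_calls', 1, new Date('2024-05-01T10:00:02Z'));
const otherRequest = buildMeterEvent('req-124', 'tenant-1', 'api_calls', 1, new Date('2024-05-01T10:00:03Z'));
```

The first attempt and its retry share an eventId even though their timestamps differ, while a genuinely new request gets a different one. Including the metric in the ID lets a single request emit events for several metrics without collisions.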