vendor --> Politely end call
+-- Support question --> AI resolves it
+-- Sales lead --> Transfer to you
+-- Urgent issue --> SMS alert + transfer
### 1. VoIPBin Provisioning
Initialize the telephony endpoint and retrieve authentication credentials:

    curl -X POST https://api.voipbin.net/v1.0/auth/signup \
      -H "Content-Type: application/json" \
      -d '{"username": "your-email@example.com", "password": "your-password", "name": "Your Name"}'

This returns an `accesskey.token` immediately, with no OTP and no waiting. Next, rent a phone number and point it at your webhook URL, and you are ready.
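
For readers who prefer to script provisioning, the signup call can also be made from Python. The following is a minimal sketch using only the standard library. The helper names `build_signup_request` and `extract_token` are hypothetical, and the assumption that the token sits under an `accesskey` object in the response is inferred from the `accesskey.token` wording above, not from VoIPBin documentation.

```python
import json
import urllib.request

# Endpoint taken from the curl command above.
SIGNUP_URL = "https://api.voipbin.net/v1.0/auth/signup"


def build_signup_request(username: str, password: str, name: str) -> urllib.request.Request:
    """Build (but do not send) the POST request equivalent to the curl command."""
    body = json.dumps({"username": username, "password": password, "name": name})
    return urllib.request.Request(
        SIGNUP_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_token(payload: dict) -> str:
    """Pull the token out of a decoded signup response.

    ASSUMPTION: the response nests the token as {"accesskey": {"token": ...}}.
    """
    try:
        return payload["accesskey"]["token"]
    except (KeyError, TypeError) as exc:
        raise ValueError("signup response did not contain accesskey.token") from exc
```

Sending the request would then look like `with urllib.request.urlopen(req) as resp: token = extract_token(json.load(resp))`. Store the token in an environment variable or secret manager rather than in source code.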
### 2. Core Screener Implementation (FastAPI + OpenAI)
The webhook server maintains lightweight session state, streams transcriptions to GPT-4o, and executes routing actions via VoIPBin's action API.

    import json

    import httpx
    from fastapi import FastAPI, Request
    from openai import AsyncOpenAI

    app = FastAPI()
    # Async client so the LLM call does not block the event loop.
    # Reads OPENAI_API_KEY from the environment.
    client = AsyncOpenAI()

    VOIPBIN_TOKEN = "YOUR_VOIPBIN_TOKEN"
    MY_PHONE = "+14155551234"
    BASE_URL = "https://api.voipbin.net/v1.0"

    # Per-call conversation state keyed by call_id (in-memory, prototype only)
    sessions = {}

    SCREENER_PROMPT = """
    You are an AI call screener. Your job:
    - Greet the caller and ask who they are and why they are calling
    - Classify the call as one of:
      - SPAM: robocalls, solicitations, irrelevant vendors
      - SUPPORT: tech questions, how-to, existing customers
      - SALES: potential new customers, partnership inquiries
      - URGENT: production issues, emergencies

    Respond with JSON:
    {"classification": "SALES", "summary": "Jane from Acme, wants enterprise pricing", "response": "What you said to the caller"}
    """


    @app.post("/webhook/call")
    async def handle_call(request: Request):
        event = await request.json()
        call_id = event["call_id"]
        event_type = event["type"]

        if event_type == "call.started":
            # New call: create a session and greet the caller
            sessions[call_id] = {"history": [], "turn": 0}
            await speak(call_id, "Hi, thanks for calling. Could you tell me your name and what you are calling about today?")

        elif event_type == "call.transcription":
            # Caller said something: record it and ask the model what to do
            caller_text = event["text"]
            session = sessions.get(call_id, {"history": [], "turn": 0})
            session["history"].append({"role": "user", "content": caller_text})
            session["turn"] += 1

            result = await screen_call(session["history"])

            if "classification" in result:
                await handle_classification(call_id, result)
            else:
                # No decision yet: keep the conversation going
                response_text = result.get("response", "Could you tell me a bit more?")
                session["history"].append({"role": "assistant", "content": response_text})
                await speak(call_id, response_text)

            sessions[call_id] = session

        return {"status": "ok"}


    async def screen_call(history: list) -> dict:
        """Send the conversation so far to the model and parse its JSON reply."""
        messages = [{"role": "system", "content": SCREENER_PROMPT}] + history
        response = await client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            response_format={"type": "json_object"}
        )
        return json.loads(response.choices[0].message.content)


    async def handle_classification(call_id: str, result: dict):
        """Route the call according to the model's classification."""
        classification = result["classification"]
        summary = result.get("summary", "")

        if classification == "SPAM":
            await speak(call_id, "Thanks for calling. We are not interested at this time. Have a great day!")
            await end_call(call_id)
        elif classification == "SUPPORT":
            await speak(call_id, "Let me help you with that directly.")
            # Add RAG over your docs here
        elif classification == "SALES":
            await speak(call_id, "This sounds like a great conversation. Let me connect you with our team.")
            await transfer_call(call_id, MY_PHONE)
        elif classification == "URGENT":
            await speak(call_id, "I understand this is urgent. Connecting you right away.")
            await send_sms_alert(summary)
            await transfer_call(call_id, MY_PHONE)


    async def speak(call_id: str, text: str):
        """Play text-to-speech on the live call."""
        async with httpx.AsyncClient() as http:
            await http.post(
                f"{BASE_URL}/calls/{call_id}/actions",
                headers={"Authorization": f"Bearer {VOIPBIN_TOKEN}"},
                json={"action": "speak", "text": text, "language": "en-US"}
            )


    async def transfer_call(call_id: str, phone: str):
        """Transfer the live call to another phone number."""
        async with httpx.AsyncClient() as http:
            await http.post(
                f"{BASE_URL}/calls/{call_id}/actions",
                headers={"Authorization": f"Bearer {VOIPBIN_TOKEN}"},
                json={"action": "transfer", "destination": phone}
            )


    async def end_call(call_id: str):
        """Hang up the call."""
        async with httpx.AsyncClient() as http:
            await http.delete(
                f"{BASE_URL}/calls/{call_id}",
                headers={"Authorization": f"Bearer {VOIPBIN_TOKEN}"}
            )


    async def send_sms_alert(summary: str):
        """Text the call summary to MY_PHONE."""
        async with httpx.AsyncClient() as http:
            await http.post(
                f"{BASE_URL}/messages",
                headers={"Authorization": f"Bearer {VOIPBIN_TOKEN}"},
                json={"to": MY_PHONE, "text": f"URGENT CALL: {summary}"}
            )

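
The `sessions` dictionary in the server above never evicts entries, so abandoned calls accumulate. As a rough illustration of expiry semantics, here is a minimal in-process sketch. `TTLSessionStore` is a hypothetical helper, not part of VoIPBin or FastAPI, and because it lives in process memory it still does not survive restarts or work across multiple workers.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLSessionStore:
    """In-memory session map whose entries expire ttl_seconds after the last put."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._data: Dict[str, Tuple[float, dict]] = {}

    def get(self, call_id: str) -> Optional[dict]:
        """Return the session, or None if it is missing or expired."""
        entry = self._data.get(call_id)
        if entry is None:
            return None
        expires_at, session = entry
        if self._clock() >= expires_at:
            del self._data[call_id]  # lazily evict on read
            return None
        return session

    def put(self, call_id: str, session: dict) -> None:
        """Store the session and reset its expiry."""
        self._data[call_id] = (self._clock() + self._ttl, session)

    def drop(self, call_id: str) -> None:
        """Remove a session explicitly, for example when the call ends."""
        self._data.pop(call_id, None)

    def purge_expired(self) -> int:
        """Evict every expired entry and return how many were removed."""
        now = self._clock()
        stale = [cid for cid, (expires_at, _) in self._data.items() if now >= expires_at]
        for cid in stale:
            del self._data[cid]
        return len(stale)
```

In the webhook, `sessions.get(call_id, default)` would become `store.get(call_id) or default`, and `sessions[call_id] = session` would become `store.put(call_id, session)`.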
### 3. Production Extensions
Enhance the screener with CRM enrichment, temporal routing rules, and audit logging:
**Caller ID enrichment:**

    # `crm` is a placeholder for your own CRM client
    async def enrich_caller(phone_number: str) -> dict:
        existing = await crm.lookup(phone_number)
        if existing:
            return {"known": True, "name": existing.name, "tier": existing.tier}
        return {"known": False}

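
One way to use the enrichment result is to pass it to the model as extra context. The sketch below assumes the dictionary shape returned by `enrich_caller` above. `build_messages` is a hypothetical helper, and the wording of the context line is illustrative only.

```python
from typing import Dict, List


def build_messages(
    system_prompt: str,
    history: List[Dict[str, str]],
    caller: dict,
) -> List[Dict[str, str]]:
    """Prepend the system prompt and a short caller-context note to the chat history."""
    if caller.get("known"):
        context = (
            f"Caller context: existing customer {caller['name']} "
            f"(tier: {caller['tier']})."
        )
    else:
        context = "Caller context: number not found in the CRM."
    return [
        {"role": "system", "content": system_prompt},
        {"role": "system", "content": context},
    ] + list(history)
```

The returned list has the same format as the `messages` argument of a chat completion call, so it could replace the list built at the top of `screen_call`.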
**Time-based rules:**

    import datetime

    def is_after_hours():
        # Uses the server's local time; hours 9 through 18 count as business hours
        hour = datetime.datetime.now().hour
        return hour < 9 or hour > 18

    # After hours: only URGENT gets through

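
To show how the after-hours rule might combine with classification, here is a small sketch. The action names (`end_call`, `ai_resolve`, `sms_and_transfer`, `transfer`, `take_message`) are hypothetical labels for illustration, and `is_after_hours` takes the current time as an argument so the rule can be tested deterministically.

```python
import datetime


def is_after_hours(now: datetime.datetime) -> bool:
    # Same rule as the snippet above: hours 9 through 18 count as business hours.
    return now.hour < 9 or now.hour > 18


def route_action(classification: str, now: datetime.datetime) -> str:
    """Return a hypothetical action name for a classification at a given time."""
    if classification == "SPAM":
        return "end_call"
    if classification == "SUPPORT":
        return "ai_resolve"
    if classification == "URGENT":
        # URGENT reaches a human at any hour.
        return "sms_and_transfer"
    if classification == "SALES" and not is_after_hours(now):
        # SALES reaches a human only during business hours.
        return "transfer"
    # After-hours SALES and unrecognized labels fall back to taking a message.
    return "take_message"
```

Passing `now` explicitly also makes it easy to switch to a timezone-aware clock later without touching the routing logic.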
**Screening summary log:**
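
    import datetime

    # `db` is a placeholder for your own database client
    async def log_screening(call_id, result):
        await db.insert("screened_calls", {
            "call_id": call_id,
            "classification": result["classification"],
            "summary": result["summary"],
            # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated since Python 3.12)
            "timestamp": datetime.datetime.now(datetime.timezone.utc)
        })
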
## Pitfall Guide
1. **In-Memory Session State Volatility**: The `sessions = {}` dictionary works for prototyping but fails under load or process restarts. Migrate to Redis or a persistent key-value store with TTLs matching call duration to prevent state leaks and memory bloat.
2. **LLM JSON Parsing Fragility**: Even with `response_format={"type": "json_object"}`, network timeouts or token limits can truncate responses. Implement a retry wrapper with regex fallback extraction and explicit schema validation (Pydantic) before routing.
3. **Telephony Latency Mismatch**: STT → LLM → TTS pipelines introduce 1.5 to 3 s of latency. Without Voice Activity Detection (VAD) or streaming TTS, callers experience awkward silences. Configure VoIPBin's streaming endpoints and implement turn-taking guards to prevent AI interruption.
4. **Over-Aggressive Classification Thresholds**: Hard routing rules may block legitimate vendors or early-stage prospects. Introduce a `confidence_score` field in the prompt and implement a fallback path: if confidence < 0.75, route to a human or schedule a callback instead of terminating.
5. **Webhook Security & Idempotency**: Public `/webhook/call` endpoints are vulnerable to replay attacks and duplicate event processing. Validate request signatures, implement idempotency keys per `call_id`, and rate-limit transcription events to prevent LLM API quota exhaustion.
6. **Ignoring Business Context & Compliance**: Pure intent classification misses caller history and regulatory requirements (e.g., TCPA, GDPR). Always enrich calls with CRM data, respect do-not-call lists, and log consent states before transferring or recording.
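
Pitfalls 2 and 4 can be addressed together by validating the model's reply before routing. The following is a minimal sketch using only the standard library (Pydantic would serve the same purpose, and no regex fallback is attempted here). The `Screening` dataclass, the `HUMAN_REVIEW` label, and the `confidence_score` key are assumptions for illustration; the prompt would need to be extended to request that field.

```python
import json
from dataclasses import dataclass

ALLOWED = {"SPAM", "SUPPORT", "SALES", "URGENT"}
CONFIDENCE_FLOOR = 0.75  # threshold named in pitfall 4
FALLBACK_RESPONSE = "Let me connect you with someone who can help."


@dataclass
class Screening:
    classification: str
    summary: str
    response: str
    confidence: float


def parse_screening(raw: str) -> Screening:
    """Parse model output; return a HUMAN_REVIEW result when anything looks off."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        # Truncated or non-JSON output: hand off to a human.
        return Screening("HUMAN_REVIEW", "", FALLBACK_RESPONSE, 0.0)
    if not isinstance(data, dict):
        return Screening("HUMAN_REVIEW", "", FALLBACK_RESPONSE, 0.0)

    label = str(data.get("classification", "")).upper()
    summary = str(data.get("summary", ""))
    try:
        confidence = float(data.get("confidence_score", 0.0))
    except (TypeError, ValueError):
        confidence = 0.0

    # Unknown labels and low-confidence answers are never auto-routed.
    if not (label in ALLOWED and confidence >= CONFIDENCE_FLOOR):
        return Screening("HUMAN_REVIEW", summary, FALLBACK_RESPONSE, confidence)
    return Screening(label, summary, str(data.get("response", "")), confidence)
```

With this in place, a routing function only ever sees one of five labels, and a `HUMAN_REVIEW` result can be mapped to a transfer or a scheduled callback instead of ending the call.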
## Deliverables
- **Architecture Blueprint**: Complete system flow diagram detailing SIP signaling, webhook event lifecycle, LLM prompt chaining, and action routing matrix. Includes data persistence recommendations for production scale.
- **Deployment Checklist**: Step-by-step validation protocol covering VoIPBin credential rotation, webhook TLS verification, STT/TTS latency benchmarking, LLM rate-limit configuration, and failover routing tests.
- **Configuration Templates**: Ready-to-use `.env` scaffolding, production-hardened prompt templates with confidence scoring, routing rule YAML, and database schema for call audit logging.