ing this with python-dotenv for secure credential management and structured error handling creates a production-ready foundation.
Core Solution
The following implementation covers environment isolation, secure credential management, core request patterns, stateful conversation routing, real-time streaming, and production error handling.
Environment & SDK Setup
mkdir claude-project
cd claude-project
python -m venv venv
# Mac/Linux
source venv/bin/activate
# Windows
venv\Scripts\activate
pip install anthropic python-dotenv
Secure API Key Management
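# .env (keep this file out of version control)
ANTHROPIC_API_KEY=your-key-here
# Shell: append to .gitignore so existing entries are not overwritten
echo .env >> .gitignore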
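A missing key otherwise surfaces only when the first request fails. The following is a minimal sketch of a fail-fast check, assuming the key is stored under ANTHROPIC_API_KEY as above. The helper name require_api_key is hypothetical and not part of the SDK or python-dotenv.

```python
import os


def require_api_key(var_name: str = "ANTHROPIC_API_KEY") -> str:
    """Return the key from the environment, or raise a clear error if it is missing."""
    # Hypothetical helper: it only reads os.environ, so it works whether the
    # variable was set by load_dotenv() or exported in the shell.
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Add it to .env and call load_dotenv() first."
        )
    return key
```

Calling require_api_key() once at startup, right after load_dotenv(), turns a confusing authentication failure into an explicit configuration error.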
Core Request Pattern & Response Parsing
from dotenv import load_dotenv
from anthropic import Anthropic
load_dotenv()
client = Anthropic()
message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "What is a REST API?"
        }
    ]
)
print(message.content[0].text)      # Claude's response
print(message.stop_reason)          # Why it stopped, usually "end_turn"
print(message.usage.input_tokens)   # Tokens in your message
print(message.usage.output_tokens)  # Tokens in Claude's reply
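Indexing content[0] assumes the first content block is text. Because content is a list of blocks, a slightly more defensive approach is to join every block whose type is "text". The sketch below assumes that shape. The helper response_text and the stand-in object fake are hypothetical and exist only so the example runs without a network call.

```python
from types import SimpleNamespace


def response_text(message) -> str:
    """Join the text of every text block in message.content, skipping other block types."""
    return "".join(
        block.text
        for block in message.content
        if getattr(block, "type", None) == "text"
    )


# Stand-in object shaped like an SDK response (assumption for offline use).
fake = SimpleNamespace(content=[
    SimpleNamespace(type="text", text="A REST API "),
    SimpleNamespace(type="tool_use", name="lookup"),
    SimpleNamespace(type="text", text="exposes resources over HTTP."),
])
print(response_text(fake))  # prints: A REST API exposes resources over HTTP.
```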
Contextual Conversations & History Management
message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="You are a Python code reviewer. Be direct. Point out issues first, then explain why.",
    messages=[
        {"role": "user", "content": "Review this: for i in range(len(my_list)): print(my_list[i])"}
    ]
)
print(message.content[0].text)
from dotenv import load_dotenv
from anthropic import Anthropic
load_dotenv()
client = Anthropic()
history = []
def chat(message: str) -> str:
    history.append({"role": "user", "content": message})
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system="You are a helpful programming assistant.",
        messages=history
    )
    reply = response.content[0].text
    history.append({"role": "assistant", "content": reply})
    return reply
print(chat("What is a decorator in Python?"))
print(chat("Show me a real example."))
print(chat("How would that work in Flask?"))
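A module-level history list is shared by every caller, and if the request raises, the unanswered user turn stays in it. One possible refinement, sketched here under assumptions, keeps the history per conversation and rolls back the user turn on failure. The class Conversation and its send parameter are hypothetical names. send stands for any callable that takes the message list and returns the reply text.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class Conversation:
    """Holds one chat history; `send` maps a message list to the reply text."""

    def __init__(self, send: Callable[[List[Message]], str]) -> None:
        self._send = send
        self.history: List[Message] = []

    def chat(self, text: str) -> str:
        self.history.append({"role": "user", "content": text})
        try:
            reply = self._send(self.history)
        except Exception:
            # Drop the unanswered user turn so roles still alternate on retry.
            self.history.pop()
            raise
        self.history.append({"role": "assistant", "content": reply})
        return reply
```

With the SDK, send could be a small function that calls client.messages.create with messages set to the list it receives and returns response.content[0].text. Passing it in as a parameter also lets the class be tested with a stub.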
Real-Time Streaming Implementation
from dotenv import load_dotenv
from anthropic import Anthropic
load_dotenv()
client = Anthropic()
with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain recursion simply."}
    ]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
print()
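Printing chunks as they arrive discards the full reply, which is often still needed afterwards, for example to append it to a conversation history. A minimal sketch, assuming the stream yields plain string chunks as text_stream does above. The helper name print_and_collect is hypothetical.

```python
from typing import Iterable


def print_and_collect(chunks: Iterable[str]) -> str:
    """Echo each chunk immediately and return the concatenated text."""
    parts = []
    for chunk in chunks:
        print(chunk, end="", flush=True)
        parts.append(chunk)
    print()  # finish the line once the stream is exhausted
    return "".join(parts)


# With the SDK (assumed usage, mirroring the streaming call in this section):
# with client.messages.stream(...) as stream:
#     full_text = print_and_collect(stream.text_stream)
```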
Production-Ready Use Case & Error Handling
from dotenv import load_dotenv
from anthropic import Anthropic
load_dotenv()
client = Anthropic()
def summarize(text: str, sentences: int = 3) -> str:
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=512,
        system=f"Summarize the following text in {sentences} sentences. Return only the summary.",
        messages=[{"role": "user", "content": text}]
    )
    return response.content[0].text
article = """
The James Webb Space Telescope has captured the deepest infrared image
of the universe ever taken. The image covers a patch of sky approximately
the size of a grain of sand held at arm's length. It contains thousands
of galaxies, some of which formed less than a billion years after the
Big Bang. Scientists believe this data will reshape our understanding
of how the earliest galaxies formed and evolved.
"""
print(summarize(article, sentences=2))
from dotenv import load_dotenv
from anthropic import Anthropic, APIError, RateLimitError, APIConnectionError
load_dotenv()
client = Anthropic()
def ask(question: str) -> str:
    try:
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=1024,
            messages=[{"role": "user", "content": question}]
        )
        return response.content[0].text
    except RateLimitError:
        return "Rate limit reached. Wait a moment and try again."
    except APIConnectionErr
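Returning a fixed message is one fallback. Retrying with exponential backoff is another. The sketch below is a generic retry wrapper, not the SDK's own retry mechanism. The name with_backoff and all of its parameters are hypothetical, and the exception types to retry are passed in by the caller (for this section, that would presumably be RateLimitError and APIConnectionError).

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_backoff(
    call: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(); on a retryable error wait 1s, 2s, 4s, ... (plus jitter) and try again."""
    attempt = 0
    while True:
        try:
            return call()
        except retry_on:
            attempt += 1
            if attempt >= max_attempts:
                raise  # out of attempts: surface the last error to the caller
            delay = base_delay * 2 ** (attempt - 1)
            # Up to 25% random jitter so many clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.25))


# Assumed usage with the SDK objects from this section:
# reply = with_backoff(
#     lambda: client.messages.create(model="claude-sonnet-4-6", max_tokens=1024,
#                                    messages=[{"role": "user", "content": "Hi"}]),
#     retry_on=(RateLimitError, APIConnectionError),
# )
```

The sleep parameter exists so the wrapper can be exercised in tests without real waiting. In production code the default time.sleep applies.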
Pitfall Guide
- Virtual Environment Isolation Failure: Installing packages globally leads to dependency conflicts and broken imports. Always create an environment with python -m venv venv and activate it before running pip install. The (venv) prefix in your terminal prompt is the quickest visual confirmation that the environment is active.
- API Key Exposure in Version Control: Committing .env files to a public GitHub repository exposes the key to automated credential scanners, which can lead to unauthorized usage and billing spikes within hours. Always add .env to .gitignore and load credentials via python-dotenv.
- Stateless Context Loss: Forgetting to append the assistant response to the history list breaks conversational continuity. The API is stateless; you must maintain the messages array yourself with alternating user and assistant roles, or the model will answer each prompt as an isolated query.
- Token Limit Truncation: Setting max_tokens too low (e.g., <128) causes mid-sentence cutoffs. Start with 1024 for general tasks, and monitor message.usage.output_tokens to right-size limits for cost control without sacrificing output completeness.
- Unhandled Rate Limits & Network Failures: LLM APIs enforce rate limits and experience transient network issues. Wrap calls in try/except blocks for RateLimitError and APIConnectionError, with fallback logic or exponential backoff, to keep production services stable.
- Ignoring System Prompt Optimization: Treating the system parameter as optional limits model control. Use it to enforce output formats, define roles, and constrain behavior. A well-scoped system prompt can reduce off-target output and the amount of post-processing required.
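For the token-limit pitfall, a cutoff can be detected programmatically, because a response that stops at the cap reports a stop_reason of "max_tokens". A minimal sketch under that assumption. The helper truncation_report and the stand-in response objects are hypothetical.

```python
from types import SimpleNamespace


def truncation_report(message, max_tokens: int) -> str:
    """Describe whether a response hit the output cap and how much of it was used."""
    used = message.usage.output_tokens
    if message.stop_reason == "max_tokens":
        return f"truncated: hit the {max_tokens}-token cap, consider raising max_tokens"
    return f"complete: used {used} of {max_tokens} output tokens ({used / max_tokens:.0%})"


# Stand-in objects shaped like SDK responses (assumption for offline use).
done = SimpleNamespace(stop_reason="end_turn", usage=SimpleNamespace(output_tokens=256))
cut = SimpleNamespace(stop_reason="max_tokens", usage=SimpleNamespace(output_tokens=128))
print(truncation_report(done, 1024))  # prints: complete: used 256 of 1024 output tokens (25%)
print(truncation_report(cut, 128))
```

Logging this line per request gives a simple basis for right-sizing max_tokens over time.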
Deliverables
- Claude API Integration Blueprint: A step-by-step architectural guide covering environment setup, secure credential management, stateful conversation routing, streaming UX patterns, and production error handling strategies.
- Production Readiness Checklist: Validation steps for venv activation, .gitignore configuration, token usage monitoring, error handling coverage, rate limit fallback strategies, and system prompt effectiveness testing.
- Configuration Templates: Ready-to-use .env structure, requirements.txt (anthropic, python-dotenv), and modular Python snippets for single-turn inference, multi-turn stateful chats, and real-time streaming implementations.