enabling real-time progress tracking.
Core Solution
LDR operates as a self-hosted AI research assistant that orchestrates multi-source retrieval, iterative synthesis, and structured report generation. The architecture bundles three core components:
- Ollama: Local LLM inference engine
- SearXNG: Self-hosted meta-search engine
- SQLCipher: Encrypted SQLite database for persistent knowledge storage
Deployment Architecture
Option 1: Docker (Recommended)
Handles dependency resolution, encryption initialization, and service wiring automatically.
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml
docker compose up -d
Wait about 30 seconds, then open http://localhost:5000.
With NVIDIA GPU acceleration (Linux only):
First install the NVIDIA Container Toolkit:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor \
-o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update && sudo apt-get install nvidia-container-toolkit -y
sudo systemctl restart docker
nvidia-smi # verify it worked
Then bring up the stack with GPU support:
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.gpu.override.yml
docker compose -f docker-compose.yml -f docker-compose.gpu.override.yml up -d
Option 2: pip (For developers / Python integration)
# Install the package
pip install local-deep-research
# Run SearXNG in Docker for search
docker run -d -p 8080:8080 --name searxng searxng/searxng
# Install Ollama from https://ollama.ai, then pull a model
ollama pull gemma3:12b
# Start the web UI
python -m local_deep_research.web.app
Important note on encryption: The pip install does not automatically set up SQLCipher (the AES-256 encrypted database LDR uses for storing your data and API keys). If you hit errors during setup, bypass it for now with:
export LDR_ALLOW_UNENCRYPTED=true
This stores data in plain SQLite. Fine for local dev, not recommended for production or shared setups. Docker handles encryption out of the box.
Programmatic Integration
Python API:
from local_deep_research.api import LDRClient, quick_query
# One-liner research
summary = quick_query("username", "password", "What is the current state of Rust async runtimes?")
print(summary)
# Client for more control
client = LDRClient()
client.login("username", "password")
result = client.quick_research("Compare FAISS vs Hnswlib for vector search at scale")
print(result["summary"])
HTTP API:
LDR exposes a REST API with session-based authentication and CSRF protection. The auth flow is a bit verbose but works reliably:
import requests
from bs4 import BeautifulSoup
session = requests.Session()
# Get CSRF token from login page
login_page = session.get("http://localhost:5000/auth/login")
soup = BeautifulSoup(login_page.text, "html.parser")
csrf = soup.find("input", {"name": "csrf_token"}).get("value")
# Authenticate
session.post("http://localhost:5000/auth/login", data={
"username": "user",
"password": "pass",
"csrf_token": csrf
})
# Get API CSRF token
api_csrf = session.get("http://localhost:5000/auth/csrf-token").json()["csrf_token"]
# Submit a research query
response = session.post(
"http://localhost:5000/api/start_research",
json={"query": "What are the tradeoffs between gRPC and REST for internal microservices?"},
headers={"X-CSRF-Token": api_csrf}
)
print(response.json())
The repository includes ready-to-run HTTP examples under examples/api_usage/http/ that handle authentication, retry logic, and progress polling.
Enterprise / RAG Integration
LDR integrates with existing vector stores via LangChain retrievers, enabling hybrid search across live web results and internal knowledge bases:
from local_deep_research.api import quick_summary
result = quick_summary(
query="What are our current deployment procedures for the payments service?",
retrievers={"internal_kb": your_langchain_retriever},
search_tool="internal_kb"
)
Supported backends include FAISS, Chroma, Pinecone, Weaviate, Elasticsearch, and any LangChain-compatible retriever.
Search Sources & LLM Configuration
Free (no API key needed): arXiv, PubMed, Semantic Scholar, Wikipedia, SearXNG, GitHub, The Guardian, Wikinews, Wayback Machine.
Premium (API key required): Tavily, Google (SerpAPI/Programmable Search), Brave Search.
Local LLMs: Llama 3, Mistral, Gemma, DeepSeek, and any Ollama-supported model.
Cloud LLMs: OpenAI (GPT-4, GPT-4.1-mini), Anthropic (Claude 3), Google (Gemini), 100+ models via OpenRouter.
Pitfall Guide
- SQLCipher Encryption Bypass in pip Install: The
pip install path does not auto-configure SQLCipher. Setting LDR_ALLOW_UNENCRYPTED=true drops data into plain SQLite, which is acceptable for local development but violates security compliance in production or multi-user environments. Always validate encryption status before deploying to shared infrastructure.
- Hardware Constraints for Local LLM Execution: Running Ollama + SearXNG + LDR concurrently demands significant RAM and VRAM. CPU-only setups will experience severe latency during synthesis loops. A dedicated GPU (8GB+ VRAM recommended) is required for acceptable throughput, and NVIDIA Container Toolkit misconfiguration is a common deployment blocker.
- CSRF Token Handling in HTTP API: The REST API enforces strict CSRF protection. Failing to extract the token from the login page HTML and attach it to subsequent requests will result in
403 Forbidden errors. Always parse the <input name="csrf_token"> value and include it in both authentication and research submission headers.
- Over-Provisioning for Simple Q&A Workloads: LDR is engineered for multi-step research synthesis, not conversational Q&A. Routing simple factual queries through the iterative search loop introduces unnecessary latency and resource consumption. Reserve LDR for literature reviews, competitive analysis, and complex technical investigations.
- GPU Passthrough & NVIDIA Container Toolkit Configuration: Docker GPU acceleration requires explicit
--gpus all flags and proper NVIDIA driver/container toolkit alignment. Mismatched driver versions or missing nvidia-container-toolkit packages will cause silent fallback to CPU execution, degrading performance by 10-50x. Verify with nvidia-smi inside the container.
- Ignoring Zero-Knowledge Password Recovery Limits: LDR's zero-knowledge architecture means there is no password recovery mechanism. If you lose credentials, the SQLCipher database becomes permanently inaccessible. Implement secure credential management (e.g., HashiCorp Vault, Bitwarden) and maintain encrypted backups of the database volume.
Deliverables
π Deployment & Architecture Blueprint
A structured reference covering the LDR service topology (Ollama β SearXNG β LDR Core β SQLCipher), network port mapping, GPU passthrough requirements, and hybrid RAG integration patterns. Includes environment variable matrices for cloud vs. local LLM routing.
β
Pre-Flight & Integration Checklist