s compared to single-pass generation.
- SQLCipher AES-256 encryption with zero-knowledge design ensures that even infrastructure administrators cannot access research data or API keys.
- Local knowledge base indexing enables future queries to leverage historically collected sources, creating a compounding research asset.
Core Solution
LDR operates on a deterministic research loop: query ingestion β strategy selection β sub-query decomposition β multi-source retrieval β iterative synthesis β citation-backed report generation β local library indexing. The architecture supports both local and cloud LLMs via Ollama or direct API integration, while SearXNG handles privacy-preserving meta-search.
Installation & Deployment
Option 1: Docker (Recommended)
Handles dependencies, encryption, and service wiring automatically.
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml
docker compose up -d
Wait about 30 seconds, then open http://localhost:5000.
With NVIDIA GPU acceleration (Linux only):
First install the NVIDIA Container Toolkit:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor \
-o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update && sudo apt-get install nvidia-container-toolkit -y
sudo systemctl restart docker
nvidia-smi # verify it worked
Then bring up the stack with GPU support:
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.gpu.override.yml
docker compose -f docker-compose.yml -f docker-compose.gpu.override.yml up -d
The Docker Compose setup bundles Ollama (local LLM runner) and SearXNG (self-hosted meta-search engine) together with LDR. Everything runs locally.
Option 2: pip (For developers / Python integration)
# Install the package
pip install local-deep-research
# Run SearXNG in Docker for search
docker run -d -p 8080:8080 --name searxng searxng/searxng
# Install Ollama from https://ollama.ai, then pull a model
ollama pull gemma3:12b
# Start the web UI
python -m local_deep_research.web.app
Important note on encryption: The pip install does not automatically set up SQLCipher (the AES-256 encrypted database LDR uses for storing your data and API keys). If you hit errors during setup, bypass it for now with:
export LDR_ALLOW_UNENCRYPTED=true
This stores data in plain SQLite. Fine for local dev, not recommended for production or shared setups. Docker handles encryption out of the box.
API Integration
Python API
from local_deep_research.api import LDRClient, quick_query
# One-liner research
summary = quick_query("username", "password", "What is the current state of Rust async runtimes?")
print(summary)
# Client for more control
client = LDRClient()
client.login("username", "password")
result = client.quick_research("Compare FAISS vs Hnswlib for vector search at scale")
print(result["summary"])
HTTP API
import requests
from bs4 import BeautifulSoup
session = requests.Session()
# Get CSRF token from login page
login_page = session.get("http://localhost:5000/auth/login")
soup = BeautifulSoup(login_page.text, "html.parser")
csrf = soup.find("input", {"name": "csrf_token"}).get("value")
# Authenticate
session.post("http://localhost:5000/auth/login", data={
"username": "user",
"password": "pass",
"csrf_token": csrf
})
# Get API CSRF token
api_csrf = session.get("http://localhost:5000/auth/csrf-token").json()["csrf_token"]
# Submit a research query
response = session.post(
"http://localhost:5000/api/start_research",
json={"query": "What are the tradeoffs between gRPC and REST for internal microservices?"},
headers={"X-CSRF-Token": api_csrf}
)
print(response.json())
The repository includes ready-to-run HTTP examples under examples/api_usage/http/ that handle authentication, retry logic, and progress polling.
Enterprise / RAG Integration
LDR integrates with existing vector stores via LangChain retrievers, enabling hybrid search across live web results and internal knowledge bases.
from local_deep_research.api import quick_summary
result = quick_summary(
query="What are our current deployment procedures for the payments service?",
retrievers={"internal_kb": your_langchain_retriever},
search_tool="internal_kb"
)
It supports FAISS, Chroma, Pinecone, Weaviate, Elasticsearch, and anything LangChain-compatible. This is where the tool gets interesting for teams β you can combine live web search with your own internal documents in a single research pass.
Pitfall Guide
- SQLCipher Encryption Bypass in pip Install: The
LDR_ALLOW_UNENCRYPTED=true environment variable forces plain SQLite storage. While useful for local debugging, it exposes API keys and research data in plaintext. Always use Docker for production or manually configure SQLCipher when deploying via pip.
- Hardware Resource Contention: Running Ollama, SearXNG, and the LDR application stack simultaneously demands significant RAM and CPU/GPU resources. Systems with <16GB RAM or without GPU acceleration will experience severe latency during iterative synthesis and local model inference.
- Misaligned Use Cases: LDR is engineered for deep, multi-source research workflows. Using it for simple Q&A or quick fact-checking introduces unnecessary orchestration overhead. Reserve it for literature reviews, technical analysis, and compounding knowledge discovery.
- Zero-Knowledge Password Recovery Limitation: The security model intentionally omits password recovery mechanisms to maintain zero-knowledge architecture. If credentials are lost, the encrypted SQLCipher database becomes permanently inaccessible. Implement secure credential management (e.g., hardware keys, password managers) before deployment.
- GPU Passthrough Configuration Gaps: NVIDIA acceleration is Linux-only and requires the NVIDIA Container Toolkit, proper Docker daemon configuration, and correct
docker-compose.gpu.override.yml mounting. Missing steps will cause silent fallback to CPU inference, drastically slowing research loops.
- Data Leakage via Search Engines: Selecting a local LLM does not guarantee zero network traffic. SearXNG and premium search APIs (Tavily, Google, Brave) still route queries externally. Review your search source configuration to ensure compliance with internal data handling policies.
Deliverables
- Deployment Blueprint: Architecture diagram detailing the Ollama + SearXNG + LDR container orchestration, SQLCipher encryption flow, and LangChain retriever integration paths for hybrid RAG.
- Pre-Flight Checklist: Hardware requirements (CPU/GPU/RAM), dependency validation (Docker, NVIDIA Toolkit, Ollama models), security configuration (CSRF tokens, API keys, encryption status), and network port mapping verification.
- Configuration Templates: Production-ready
docker-compose.yml, GPU override manifests, .env templates for API key rotation, and LangChain retriever configuration snippets for FAISS/Chroma/Pinecone integration.