Welcome to local-ai.run
A self-hosted AI workspace that runs entirely on your own hardware. Chat with your files, swap model engines, transcribe audio — all without sending a single byte to a cloud provider.
Start here
Pick a guide depending on what you want to do.
Getting Started
Prerequisites, the one-command installer, and the manual Docker Compose path.
Configuration
Every .env variable explained — secrets, ports, Ollama, Whisper, image tags. (A generic .env sketch follows these links.)
Troubleshooting
Port conflicts, GPU detection, disk space, large uploads, model errors.
Source on GitHub
Browse the code, file an issue, or open a PR. MIT licensed end-to-end.
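Since the Configuration guide centers on the .env file, here is the general shape such a file takes. Every variable name below is hypothetical, so check the guide for the real ones:

```sh
# Hypothetical .env sketch; every variable name here is illustrative only.
# App secret (normally generated by the installer)
SECRET_KEY=change-me
# Host port the Caddy reverse proxy listens on
HTTP_PORT=80
# Where the API reaches Ollama (11434 is Ollama's default port)
OLLAMA_BASE_URL=http://ollama:11434
# Whisper model size to load for transcription
WHISPER_MODEL=base
```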
What local-ai.run is
A complete local AI workspace that runs in Docker on your own hardware. Six services plus an updater on one Docker network, and zero external dependencies after install:
- Chat with your files — drop in PDFs, DOCX, XLSX, CSV, TXT, MD. Ask questions and get answers with citations (RAG).
- Multiple model engines — Ollama out of the box, with support for LM Studio, vLLM, and llama.cpp endpoints.
- Speech-to-text — Whisper running locally, no API calls.
- Air-gappable — bundle the install with `docker save` tarballs and run with zero network access (see the sketch after this list).
- Single-user optimized — no complex auth, no SaaS, no telemetry, no analytics.
- One-command updates — built-in updater pulls new versions from the registry and restarts cleanly.
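As a rough sketch of the air-gap workflow mentioned above (the image names are placeholders, not the project's actual tags):

```sh
# On a connected machine: pull the stack's images, then export them to a tarball.
# The image names below are hypothetical placeholders.
docker pull ollama/ollama:latest
docker pull postgres:16
docker save -o local-ai-images.tar ollama/ollama:latest postgres:16

# On the air-gapped host: load the tarball, then start the stack offline.
docker load -i local-ai-images.tar
docker compose up -d
```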
Architecture at a glance
Seven containers — three application services, three infrastructure services, one updater — all behind a single Caddy reverse proxy:
```
┌──────────┐      ┌────────────┐      ┌─────────────┐
│ Next.js  │ ───▶ │   Django   │ ───▶ │ PostgreSQL  │
│    UI    │      │  REST API  │      └─────────────┘
└──────────┘      └────────────┘
                        │
                        ├──▶ Ollama (LLM inference)
                        ├──▶ RAG (FastAPI + vector store)
                        └──▶ Whisper (speech-to-text)
```
All traffic is routed through Caddy on port 80.
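To make the routing concrete, here is a minimal Caddyfile sketch for this kind of layout; the service names (ui, django) and internal ports are assumptions for illustration, not the project's actual configuration:

```sh
# Hypothetical Caddyfile: send API traffic to the Django container,
# everything else to the Next.js UI. Names and ports are assumptions.
cat > Caddyfile <<'EOF'
:80 {
    handle /api/* {
        reverse_proxy django:8000
    }
    handle {
        reverse_proxy ui:3000
    }
}
EOF
```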
Why it exists
Cloud AI services route your conversations and uploaded files through third-party infrastructure. For privacy-sensitive work — legal documents, medical records, internal company data, compliance-bound industries — that's a non-starter.
local-ai.run gives you the same experience as a hosted chat tool, but every byte stays on your machine:
- No outbound API calls during chat or document processing.
- No cloud LLM dependency — Ollama, llama.cpp, vLLM, LM Studio, your choice.
- No vendor lock-in — MIT licensed, plain Docker Compose, swap any component.
- Built-in update system pulls signed images from the registry on your schedule.
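If you prefer to update by hand, the effect of the updater is roughly the standard Compose flow (a generic sketch, not the updater's actual implementation):

```sh
# Pull newer tags for every image referenced in the compose file,
# then recreate only the containers whose images changed.
docker compose pull
docker compose up -d
```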
Need help?
Join the Discord community for live install help and feature discussions, or open an issue on GitHub.