Welcome to local-ai.run
A self-hosted AI workspace that runs entirely on your own hardware. Chat with your files, swap model engines, transcribe audio — all without sending a single byte to a cloud provider.
Start here
Pick a guide depending on what you want to do.
Getting Started
Prerequisites, the one-command installer, and the manual Docker Compose path.
Configuration
Every .env variable explained — secrets, ports, Ollama, Whisper, image tags. (A generic .env sketch follows these links.)
Troubleshooting
Port conflicts, GPU detection, disk space, large uploads, model errors.
Source on GitHub
Browse the code, file an issue, or open a PR. MIT licensed end-to-end.
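Since the Configuration guide centers on the .env file, here is the general shape such a file takes. Every variable name below is hypothetical, so check the guide for the real ones:

```sh
# Hypothetical .env sketch; every variable name here is illustrative only.
# App secret (normally generated by the installer)
SECRET_KEY=change-me
# Host port the Caddy reverse proxy listens on
HTTP_PORT=80
# Where the API reaches Ollama (11434 is Ollama's default port)
OLLAMA_BASE_URL=http://ollama:11434
# Whisper model size to load for transcription
WHISPER_MODEL=base
```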
What local-ai.run is
A complete local AI workspace that runs in Docker on your own hardware. Six services plus an updater on one Docker network, and zero external dependencies after install:
- Chat with your files — drop in PDFs, DOCX, XLSX, CSV, TXT, MD. Ask questions and get answers with citations (RAG).
- Multiple model engines — Ollama out of the box, with support for LM Studio, vLLM, and llama.cpp endpoints.
- Speech-to-text — Whisper running locally, no API calls.
- Air-gappable — bundle the install with `docker save` tarballs and run with zero network access (see the sketch after this list).
- Single-user optimized — no complex auth, no SaaS, no telemetry, no analytics.
- One-command updates — built-in updater pulls new versions from the registry and restarts cleanly.
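As a rough sketch of the air-gap workflow mentioned above (the image names are placeholders, not the project's actual tags):

```sh
# On a connected machine: pull the stack's images, then export them to a tarball.
# The image names below are hypothetical placeholders.
docker pull ollama/ollama:latest
docker pull postgres:16
docker save -o local-ai-images.tar ollama/ollama:latest postgres:16

# On the air-gapped host: load the tarball, then start the stack offline.
docker load -i local-ai-images.tar
docker compose up -d
```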
Architecture at a glance
Seven containers — three application services, three infrastructure services, one updater — all behind a single Caddy reverse proxy:
```
┌──────────┐      ┌────────────┐      ┌─────────────┐
│ Next.js  │ ───▶ │   Django   │ ───▶ │ PostgreSQL  │
│    UI    │      │  REST API  │      └─────────────┘
└──────────┘      └────────────┘
                        │
                        ├──▶ Ollama (LLM inference)
                        ├──▶ RAG (FastAPI + vector store)
                        └──▶ Whisper (speech-to-text)
```
All traffic is routed through Caddy on port 80.
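To make the routing concrete, here is a minimal Caddyfile sketch for this kind of layout; the service names (ui, django) and internal ports are assumptions for illustration, not the project's actual configuration:

```sh
# Hypothetical Caddyfile: send API traffic to the Django container,
# everything else to the Next.js UI. Names and ports are assumptions.
cat > Caddyfile <<'EOF'
:80 {
    handle /api/* {
        reverse_proxy django:8000
    }
    handle {
        reverse_proxy ui:3000
    }
}
EOF
```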
Why it exists
Cloud AI services route your conversations and uploaded files through third-party infrastructure. For privacy-sensitive work — legal documents, medical records, internal company data, compliance-bound industries — that's a non-starter.
local-ai.run gives you the same experience as a hosted chat tool, but every byte stays on your machine:
- No outbound API calls during chat or document processing.
- No cloud LLM dependency — Ollama, llama.cpp, vLLM, LM Studio, your choice.
- No vendor lock-in — MIT licensed, plain Docker Compose, swap any component.
- Built-in update system pulls signed images from the registry on your schedule.
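If you prefer to update by hand, the effect of the updater is roughly the standard Compose flow (a generic sketch, not the updater's actual implementation):

```sh
# Pull newer tags for every image referenced in the compose file,
# then recreate only the containers whose images changed.
docker compose pull
docker compose up -d
```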
Need help?
Join the Discord community for live install help and feature discussions, or open an issue on GitHub.