Maintained by Remsy Schmilinsky
Most AI apps start simple, then quickly need retrieval, auth, model routing, file upload, observability, quotas, and a way to connect real business systems. ORBIT packages those pieces into one self-hosted gateway you can run locally, on-prem, or in your own cloud.
Use ORBIT when you want to:
- Build private RAG apps without sending sensitive data to a hosted AI platform by default.
- Switch between local and cloud models without rewriting your frontend.
- Connect LLMs to databases, vector stores, files, APIs, Elasticsearch, GraphQL, DuckDB, and MCP tools.
- Ship an OpenAI-compatible API with API keys, quotas, rate limits, moderation, audit logs, and an admin dashboard.
- Prototype quickly, then keep the same architecture for production.
If that matches a project you are building, starring the repo helps more developers find it.
| Capability | What ORBIT provides |
|---|---|
| OpenAI-compatible gateway | A unified chat API for local, self-hosted, and cloud-backed models. |
| Private RAG infrastructure | File chat, vector search, SQL/NoSQL retrieval, REST/GraphQL adapters, DuckDB analytics, and Elasticsearch query translation. |
| Model flexibility | Run with Ollama, llama.cpp, vLLM, and external model providers behind one gateway contract. |
| Agentic tool loops | Connect to Model Context Protocol servers and let models execute multi-step tool workflows inside chat sessions. |
| Production controls | API key validation, request limits, token quotas, content moderation, circuit breakers, fallback routing, metrics, and audit logging. |
| Ready-to-use clients | A React chat client, Node.js SDK, admin UI, Docker Compose setup, examples, and cookbook recipes. |
Install ORBIT directly into a local Python environment on Linux or macOS:
```bash
curl -LO https://github.com/schmitech/orbit/releases/download/v2.7.6/orbit-2.7.6.tar.gz
tar -xzf orbit-2.7.6.tar.gz
cd orbit-2.7.6
./install/setup.sh
./bin/orbit.sh start
tail -f ./logs/orbit.log
```

Use `./install/setup.sh --wizard` for interactive setup. See the Getting Started Tutorial for configuration and customization.
Clone the repo and boot ORBIT with Ollama and a lightweight local model:
```bash
git clone https://github.com/schmitech/orbit.git
cd orbit/docker
docker compose up -d
```

This starts ORBIT with a local Ollama instance and the SmolLM2 model. For NVIDIA GPU acceleration:

```bash
docker compose -f docker-compose.yml -f docker-compose.gpu.yml up -d
```

See the Docker Guide for GPU setup, model configuration, volumes, and troubleshooting.
Once the server is running, send a test request to the chat endpoint:

```bash
curl -X POST http://localhost:3000/v1/chat \
  -H 'Content-Type: application/json' \
  -H 'X-API-Key: default-key' \
  -H 'X-Session-ID: local-test' \
  -d '{
    "messages": [{"role": "user", "content": "Summarize ORBIT in one sentence."}],
    "stream": false
  }'
```

Open the admin dashboard at http://localhost:3000/admin:
- Username: `admin`
- Password: `admin123`
The dashboard shows API metrics, latency, active sessions, configured adapters, and system health.
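
The chat request from the curl example can also be issued from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint, headers, and payload shown in the curl command; the helper names `build_chat_request` and `send_chat` are hypothetical, and the reply is assumed to be JSON, since the response schema is not described here.

```python
import json
import urllib.request

ORBIT_URL = "http://localhost:3000/v1/chat"  # endpoint taken from the curl example


def build_chat_request(prompt, api_key="default-key", session_id="local-test"):
    """Build (without sending) a POST request that mirrors the curl example."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        ORBIT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-API-Key": api_key,
            "X-Session-ID": session_id,
        },
        method="POST",
    )


def send_chat(prompt, timeout=30):
    """Send the request to a running ORBIT server and decode the reply.

    Assumption: the server answers with a JSON body. The parsed object is
    returned unchanged because its exact fields are not documented here.
    """
    request = build_chat_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `send_chat("Summarize ORBIT in one sentence.")` returns the decoded reply. Building the request separately from sending it keeps the payload easy to inspect without a live server.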
Multi-source RAG and file chat
Demo video: `multimodal-image-generation.mp4`
Upload PDFs, spreadsheets, and images, then query them together in a unified thread. ORBIT chunks, embeds, and retrieves documents locally.
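
To make the chunk, embed, and retrieve sequence concrete, here is a toy sketch. It is not ORBIT's implementation: the word-window chunker, the bag-of-words "embedding", and every function name below are illustrative assumptions that stand in for a real embedding model and vector store.

```python
import math
import re
from collections import Counter


def chunk_words(text, size=40, overlap=10):
    """Split text into overlapping windows of at most `size` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy embedding: lowercase term counts (placeholder for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, top_k=2):
    """Return the `top_k` chunks most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:top_k]
```

A real pipeline would swap `embed` for a local embedding model and `retrieve` for a vector store query; the overlap between windows is there so that a sentence split across a chunk boundary still appears whole in at least one chunk.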
Natural language to database queries
Demo video: `hr-github.mp4`
Translate plain English into SQL, query structured databases, and generate dynamic visualizations directly in chat.
Agentic MCP and tool-calling loops
Demo video: `mcp-tool-demo.mp4`
Expose local filesystem commands, Slack APIs, Postgres tools, and other MCP servers to multi-step model workflows.
Elasticsearch log translation
Demo video: `es-logs.mp4`
Ask operational questions in natural language and let ORBIT compile Elasticsearch Query DSL for logs, error rates, and latency analysis.
Media and video generation
Demo video: `puppy.mp4`
Generate videos with provider-backed generation adapters while keeping orchestration inside the chat workflow.
Demo video: `image-skills.mp4`
Generate images as a cross-adapter skill using DALL-E or Stability AI with conversation context.
Admin panel and monitoring dashboard
Demo video: `orbit-admin.mp4`
Monitor health, logs, adapter status, tokens, sessions, and query latency from the web dashboard.
Additional demos
- `sensitive-data.mp4`: Analyze sensitive PII offline using local llama.cpp models.
- `svg-rendering.mp4`: Render LLM-generated SVGs inline in the chat.
- `second-opinion.mp4`: Switch inference models mid-conversation without breaking chat history.
- `business-analytics-demo.mp4`: Use sub-conversation threading and document caching for faster retrieval.
- Internal knowledge assistants: Query policies, PDFs, spreadsheets, tickets, and documentation from a private chat UI.
- Database copilots: Convert natural language into SQL, DuckDB, Elasticsearch, REST, or GraphQL-backed answers.
- Local-first AI labs: Develop against Ollama, llama.cpp, or vLLM before moving selected workloads to cloud models.
- Tool-using agents: Give models controlled access to MCP tools while keeping auth, logs, and policies in one gateway.
- Customer-facing AI products: Put stable API keys, quotas, rate limits, and fallback routing in front of model providers.
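
Per-key rate limits like those mentioned in the last use case are commonly built on a token bucket. The sketch below shows that general technique only; it is not taken from ORBIT's code, and the class names `TokenBucket` and `KeyedLimiter` and their parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` per second."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = float(capacity)
        self.rate = float(rate)
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost=1.0):
        """Spend `cost` tokens and return True if the request fits the budget."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill for the time that has passed, never exceeding the capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class KeyedLimiter:
    """Keep one bucket per API key, created lazily on first use."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self._args = (capacity, rate, clock)
        self._buckets = {}

    def allow(self, api_key):
        bucket = self._buckets.get(api_key)
        if bucket is None:
            bucket = self._buckets[api_key] = TokenBucket(*self._args)
        return bucket.allow()
```

For example, `KeyedLimiter(capacity=10, rate=1)` lets each key burst to 10 requests and then sustain about one request per second. A token quota could reuse the same bucket by passing the number of tokens consumed as `cost`.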
| Client | Path / package | Description |
|---|---|---|
| ORBIT Chat | clients/orbitchat/ | React web chat client for ORBIT-backed conversations. |
| Node.js SDK | clients/node-api/ | Node library for integrating ORBIT backend features into apps. |
Run the chat client against a local ORBIT adapter:
```bash
ORBIT_ADAPTER_KEYS='{"simple-chat":"default-key"}' npx orbitchat
```

| Problem | Custom setup | ORBIT |
|---|---|---|
| Provider lock-in | One SDK and request shape per provider. | One OpenAI-compatible gateway across local and cloud providers. |
| Glue code sprawl | Separate auth, RAG, model routing, file handling, metrics, and storage code. | Integrated gateway, adapters, session management, safety controls, and admin UI. |
| Narrow retrieval | Vector search over static text chunks only. | Structured data, files, APIs, vector stores, Elasticsearch, DuckDB, and hybrid workflows. |
| Privacy gaps | Sensitive data often flows through hosted services by default. | Self-hosted deployment with local models, local embeddings, API keys, RBAC, and audit logs. |
| Operational fragility | Slow providers or broken adapters can affect the whole app. | Circuit breakers, fallback routing, rate limits, queues, and observability. |
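
The last row pairs circuit breakers with fallback routing. As a rough illustration of how those two ideas combine, here is a generic sketch. It does not reflect ORBIT's internals: `CircuitBreaker`, `call_with_fallback`, and the thresholds are assumptions chosen for the example.

```python
import time


class CircuitBreaker:
    """Open after `max_failures` consecutive errors; retry after `reset_after` seconds."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def available(self):
        if self.opened_at is None:
            return True
        # Half-open: permit a trial call once the cool-down has elapsed.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()


def call_with_fallback(providers, prompt):
    """Try each (name, func, breaker) in order, skipping providers whose circuit is open."""
    errors = []
    for name, func, breaker in providers:
        if not breaker.available():
            errors.append(f"{name}: circuit open")
            continue
        try:
            result = func(prompt)
        except Exception as exc:  # broad on purpose in this sketch; narrow it in real code
            breaker.record_failure()
            errors.append(f"{name}: {exc}")
            continue
        breaker.record_success()
        return name, result
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Once a provider trips its breaker, later requests skip it immediately instead of waiting for another timeout, which is what keeps one slow provider from dragging down every request.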
Roadmap items and active development tasks are tracked in GitHub Issues. Requests for new adapters, model providers, deployment patterns, or examples are welcome.
Contributions are welcome, especially:
- New retrievers, adapters, and provider integrations.
- Better examples and deployment guides.
- Tests, bug fixes, and documentation improvements.
- Real-world feedback from teams running private RAG or model gateway workloads.
Start with CONTRIBUTING.md, open an issue, or send a pull request.
If ORBIT is useful to you, please star the repository. It is the simplest way to support the project and helps other developers discover it.
ORBIT is licensed under the Apache 2.0 License. See LICENSE for details.