ORBIT

Open Retrieval-Based Inference Toolkit

A self-hosted AI gateway for private RAG, tool-calling agents, and multi-model applications.

Quick Start  •  Demos  •  Tutorial  •  Docker  •  Cookbook  •  Docs

Maintained by Remsy Schmilinsky


Why ORBIT?

Most AI apps start simple, then quickly need retrieval, auth, model routing, file upload, observability, quotas, and a way to connect real business systems. ORBIT packages those pieces into one self-hosted gateway you can run locally, on-prem, or in your own cloud.

Use ORBIT when you want to:

  • Build private RAG apps without sending sensitive data to a hosted AI platform by default.
  • Switch between local and cloud models without rewriting your frontend.
  • Connect LLMs to databases, vector stores, files, APIs, Elasticsearch, GraphQL, DuckDB, and MCP tools.
  • Ship an OpenAI-compatible API with API keys, quotas, rate limits, moderation, audit logs, and an admin dashboard.
  • Prototype quickly, then keep the same architecture for production.

If that matches a project you are building, starring the repo helps more developers find it.


What You Get

| Capability | What ORBIT provides |
| --- | --- |
| OpenAI-compatible gateway | A unified chat API for local, self-hosted, and cloud-backed models. |
| Private RAG infrastructure | File chat, vector search, SQL/NoSQL retrieval, REST/GraphQL adapters, DuckDB analytics, and Elasticsearch query translation. |
| Model flexibility | Run with Ollama, llama.cpp, vLLM, and external model providers behind one gateway contract. |
| Agentic tool loops | Connect to Model Context Protocol servers and let models execute multi-step tool workflows inside chat sessions. |
| Production controls | API key validation, request limits, token quotas, content moderation, circuit breakers, fallback routing, metrics, and audit logging. |
| Ready-to-use clients | A React chat client, Node.js SDK, admin UI, Docker Compose setup, examples, and cookbook recipes. |

Quick Start

Option A: Release Tarball

Install ORBIT directly into a local Python environment on Linux or macOS:

curl -LO https://github.com/schmitech/orbit/releases/download/v2.7.6/orbit-2.7.6.tar.gz
tar -xzf orbit-2.7.6.tar.gz
cd orbit-2.7.6

./install/setup.sh
./bin/orbit.sh start

tail -f ./logs/orbit.log

Use ./install/setup.sh --wizard for interactive setup. See the Getting Started Tutorial for configuration and customization.

Option B: Docker Compose

Clone the repo and boot ORBIT with Ollama and a lightweight local model:

git clone https://github.com/schmitech/orbit.git
cd orbit/docker
docker compose up -d

This starts ORBIT with a local Ollama instance and the SmolLM2 model. For NVIDIA GPU acceleration:

docker compose -f docker-compose.yml -f docker-compose.gpu.yml up -d

See the Docker Guide for GPU setup, model configuration, volumes, and troubleshooting.
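
Because `docker compose up -d` returns once the containers have been started, the gateway may not be accepting connections yet. The following is a minimal Python sketch of a readiness wait, assuming the gateway listens on localhost port 3000 as in the verification request below. `wait_for_port` is a hypothetical helper, not part of ORBIT, and an open port only shows that something is listening, not that a model is ready to answer.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling `wait_for_port("localhost", 3000)` before sending the first request avoids mistaking a slow start for a broken install.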

Verify the Gateway

curl -X POST http://localhost:3000/v1/chat \
  -H 'Content-Type: application/json' \
  -H 'X-API-Key: default-key' \
  -H 'X-Session-ID: local-test' \
  -d '{
    "messages": [{"role": "user", "content": "Summarize ORBIT in one sentence."}],
    "stream": false
  }'
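
The same check can be made from Python. This is a minimal sketch using only the standard library. It mirrors the URL, headers, and body of the curl command above; the function names `build_chat_request` and `send` are illustrative, not an ORBIT API. The response schema is not described here, so the sketch returns the raw body rather than parsing fields.

```python
import json
import urllib.request


def build_chat_request(prompt: str,
                       base_url: str = "http://localhost:3000",
                       api_key: str = "default-key",
                       session_id: str = "local-test") -> urllib.request.Request:
    """Build the same POST /v1/chat request as the curl command."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-API-Key": api_key,
            "X-Session-ID": session_id,
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the gateway running, `print(send(build_chat_request("Summarize ORBIT in one sentence.")))` prints whatever the server returns. A non-2xx status raises `urllib.error.HTTPError`.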

Open the admin dashboard at http://localhost:3000/admin:

  • Username: admin
  • Password: admin123

The dashboard shows API metrics, latency, active sessions, configured adapters, and system health.


Demos

Multi-source RAG and file chat

Upload PDFs, spreadsheets, and images, then query them together in a unified thread. ORBIT chunks, embeds, and retrieves documents locally.

Natural language to database queries

Translate plain English into SQL, query structured databases, and generate dynamic visualizations directly in chat.

Agentic MCP and tool-calling loops

Expose local filesystem commands, Slack APIs, Postgres tools, and other MCP servers to multi-step model workflows.

Elasticsearch log translation

Ask operational questions in natural language and let ORBIT compile Elasticsearch Query DSL for logs, error rates, and latency analysis.

Media and video generation

Generate videos with provider-backed generation adapters while keeping orchestration inside the chat workflow.

Generate images as a cross-adapter skill using DALL-E or Stability AI with conversation context.

Admin panel and monitoring dashboard

Monitor health, logs, adapter status, tokens, sessions, and query latency from the web dashboard.

Additional demos

  • Analyze sensitive data containing PII offline with local llama.cpp models.
  • Render dynamic SVGs generated by LLMs inline.
  • Switch inference models mid-conversation without losing chat history.
  • Use sub-conversation threading and document caching for faster retrieval.


Common Use Cases

  • Internal knowledge assistants: Query policies, PDFs, spreadsheets, tickets, and documentation from a private chat UI.
  • Database copilots: Convert natural language into SQL, DuckDB, Elasticsearch, REST, or GraphQL-backed answers.
  • Local-first AI labs: Develop against Ollama, llama.cpp, or vLLM before moving selected workloads to cloud models.
  • Tool-using agents: Give models controlled access to MCP tools while keeping auth, logs, and policies in one gateway.
  • Customer-facing AI products: Put stable API keys, quotas, rate limits, and fallback routing in front of model providers.

Client Integrations

| Client | Path / package | Description |
| --- | --- | --- |
| ORBIT Chat | clients/orbitchat/ | React web chat client for ORBIT-backed conversations. |
| Node.js SDK | clients/node-api/ | Node library for integrating ORBIT backend features into apps. |

Run the chat client against a local ORBIT adapter:

ORBIT_ADAPTER_KEYS='{"simple-chat":"default-key"}' npx orbitchat
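
JSON inside shell quotes is easy to get wrong once there is more than one entry. The sketch below builds the value programmatically, assuming (as the one-liner suggests) that `ORBIT_ADAPTER_KEYS` takes a JSON object mapping adapter names to API keys. `launch_env` is a hypothetical helper, not part of the client.

```python
import json
import os
from typing import Dict


def launch_env(adapter_keys: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the current environment with ORBIT_ADAPTER_KEYS set."""
    env = dict(os.environ)
    # Compact separators reproduce the shape used in the shell one-liner.
    env["ORBIT_ADAPTER_KEYS"] = json.dumps(adapter_keys, separators=(",", ":"))
    return env


env = launch_env({"simple-chat": "default-key"})
print(env["ORBIT_ADAPTER_KEYS"])  # prints {"simple-chat":"default-key"}
```

The resulting mapping can be passed to `subprocess.run(["npx", "orbitchat"], env=env, check=True)` to start the client with those keys.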

Compared to a Custom Stack

| Problem | Custom setup | ORBIT |
| --- | --- | --- |
| Provider lock-in | One SDK and request shape per provider. | One OpenAI-compatible gateway across local and cloud providers. |
| Glue code sprawl | Separate auth, RAG, model routing, file handling, metrics, and storage code. | Integrated gateway, adapters, session management, safety controls, and admin UI. |
| Narrow retrieval | Vector search over static text chunks only. | Structured data, files, APIs, vector stores, Elasticsearch, DuckDB, and hybrid workflows. |
| Privacy gaps | Sensitive data often flows through hosted services by default. | Self-hosted deployment with local models, local embeddings, API keys, RBAC, and audit logs. |
| Operational fragility | Slow providers or broken adapters can affect the whole app. | Circuit breakers, fallback routing, rate limits, queues, and observability. |
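
To make the last row concrete, here is a generic Python sketch of how a circuit breaker and fallback routing fit together. It illustrates the pattern only and is not ORBIT's implementation; every class, function, and parameter name is hypothetical.

```python
import time
from typing import Callable, Optional, Sequence, Tuple


class CircuitBreaker:
    """Skip a provider for `cooldown` seconds after `max_failures` consecutive errors."""

    def __init__(self, max_failures: int = 3, cooldown: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allows_call(self) -> bool:
        if self.opened_at is None:
            return True
        # Once the cooldown has elapsed, calls are allowed again as a trial.
        return self.clock() - self.opened_at >= self.cooldown

    def record(self, ok: bool) -> None:
        if ok:
            self.failures, self.opened_at = 0, None
            return
        self.failures += 1
        if self.failures >= self.max_failures:
            # Open (or re-open) the breaker and restart the cooldown.
            self.opened_at = self.clock()


Provider = Tuple[str, Callable[[str], str], CircuitBreaker]


def call_with_fallback(providers: Sequence[Provider], prompt: str) -> Tuple[str, str]:
    """Try providers in order, skipping any whose breaker is open."""
    last_error: Optional[Exception] = None
    for name, call, breaker in providers:
        if not breaker.allows_call():
            continue
        try:
            result = call(prompt)
        except Exception as exc:
            breaker.record(ok=False)
            last_error = exc
            continue
        breaker.record(ok=True)
        return name, result
    raise RuntimeError("no provider available") from last_error
```

The design point is that a failing provider costs at most `max_failures` slow attempts before requests go straight to the next provider in the list, so one broken backend does not stall every request.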


Roadmap

Roadmap items and active development tasks are tracked in GitHub Issues. Requests for new adapters, model providers, deployment patterns, or examples are welcome.


Contributing

Contributions are welcome, especially:

  • New retrievers, adapters, and provider integrations.
  • Better examples and deployment guides.
  • Tests, bug fixes, and documentation improvements.
  • Real-world feedback from teams running private RAG or model gateway workloads.

Start with CONTRIBUTING.md, open an issue, or send a pull request.

If ORBIT is useful to you, please star the repository. It is the simplest way to support the project and helps other developers discover it.


License

ORBIT is licensed under the Apache 2.0 License. See LICENSE for details.