Huskyauto/VisionClaw-Agent-Public-Release

VisionClaw Agent

CI Stars Live demo Docs Roadmap

CodeFlow card — scale, structure, activity

Open-Source Multi-Tenant AI Agent Workspace — Documents, Research & Workflows

Built for agencies, operators, and founders who want an always-on AI operations team they own and host themselves.

Created by Robert Washburn | huskyauto@gmail.com | Live demo: agenticcorporation.net

Not affiliated with the unrelated AR/wearable "VisionClaw" project. This is the AI agent platform.


⚠️ Autonomous Pipelines Disclaimer (please read before forking)

VisionClaw ships several autonomous decision pipelines — the multi-model jury (jury_triage), the Agentic CI Self-Healer, and downstream implementer hooks. In this public release these pipelines are safe-by-default:

  • Auto-apply is OFF. The jury votes and writes a human-readable decision log for your review, but it does not automatically queue fixes for code-mutation or apply verdicts to your codebase.
  • To opt in, set JURY_AUTOAPPLY=1 in your environment. This enables the implementer-pickup seam (data/jury-decisions/queue.json). Only enable this if you have read the code, understand the risk model, and have your own test/rollback discipline in place — autonomous code-mutation can corrupt data, break auth, leak secrets, or commit broken code if your guardrails aren't set up.
  • No warranty. This project is provided AS-IS under its open-source license. The maintainers, contributors, and any platform hosting the source (including Replit) are not responsible for outcomes when you enable autonomous pipelines in your own deployment. The defaults are deliberately conservative; if you change them, you own the consequences.
  • Reporting: if you find a safety bug in the gated-off default path (i.e. something autonomous happens without JURY_AUTOAPPLY=1), please open a GitHub issue or email huskyauto@gmail.com.
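
The safe-by-default gate described above can be pictured with a minimal sketch. It assumes the flag is read as the exact string "1" and that decisions are logged before any queueing; `isAutoApplyEnabled`, `routeDecision`, and the `JuryDecision` shape are hypothetical names for illustration, not the project's actual code.

```typescript
// Hypothetical types and helpers: the README only states that
// JURY_AUTOAPPLY=1 opts in and that decisions are always logged.
type EnvVars = Record<string, string | undefined>;

interface JuryDecision {
  id: string;
  verdict: "fix" | "ignore";
  summary: string;
}

/** Auto-apply is on only for the exact value "1"; anything else keeps the safe default. */
function isAutoApplyEnabled(env: EnvVars): boolean {
  return env.JURY_AUTOAPPLY === "1";
}

/** Always write the human-readable log line; queue for the implementer only when opted in. */
function routeDecision(
  decision: JuryDecision,
  env: EnvVars,
  log: string[],
  queue: JuryDecision[],
): void {
  log.push(`${decision.id}: ${decision.verdict} (${decision.summary})`);
  if (isAutoApplyEnabled(env) && decision.verdict === "fix") {
    queue.push(decision);
  }
}
```

Treating only the exact value "1" as opt-in means an unset, empty, or misspelled variable leaves the gate closed.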

What Is This?

VisionClaw Agent is an open-source, multi-tenant AI platform where 16 specialized agents work together to produce real deliverables — research reports, legal documents, financial models, marketing campaigns, slide decks, spreadsheets, and PDFs.

Instead of a single chatbot, you get a full agent workforce. Give it a task. The right agent picks it up, selects the right tools, coordinates with other agents when needed, and delivers a finished result. Every decision is traceable, every action is governed, and every integration degrades gracefully when not configured.

Fork it. Configure your API keys. Deploy. You have an AI operations team.

The app runs with just one LLM key and a Postgres database. Everything else — email, payments, voice, Drive — is optional and appears automatically when you add the key.

Roughly 180k lines of TypeScript across 460+ files. 40+ pages. 393 tools · 126 active capabilities · 16 personas · 210 tables · 41 governance rules · 616 production indexes · 6 AI providers · 6 deployment targets. Live, always-current counts: docs/CURRENT_PLATFORM_TOTALS.md. Browsable indexes: docs/tools.md · docs/personas.md.


Who is this for?

  • Agencies — give every client a multi-agent operations team without burning headcount.
  • Operators & founders — replace 5-10 SaaS subscriptions with one agent workforce you own.
  • Solo entrepreneurs — run a research / content / sales / finance team of one.
  • Research teams — a real testbed for multi-agent orchestration, AHB safety layers, and tool governance.

What can it actually do?

Out of the box, ask it to:

  • Research a market, build the comparison spreadsheet, ship the PDF
  • Draft a proposal, contract, or pitch deck (with vision-graded slide art)
  • Analyze a contract for risk across 9 regulatory frameworks
  • Run a deep research sweep across arXiv / HN / Reddit / press releases
  • Build a financial forecast with charts and exec summary
  • Plan a content calendar, generate the posts, schedule them
  • Send invoices, track CRM, run an outreach sequence
  • Produce a YouTube long-form or Short, end to end (script → render → upload)

Every deliverable lands as a real file (PDF, XLSX, PPTX, MP4, MP3) in Google Drive or local storage — not a chat transcript.

60-second quickstart

git clone https://github.com/Huskyauto/VisionClaw-Agent-Public-Release.git
cd VisionClaw-Agent-Public-Release && npm install
export DATABASE_URL="postgresql://localhost/visionclaw"
export SESSION_SECRET="$(openssl rand -hex 32)"
export OPENAI_API_KEY="sk-..."   # or ANTHROPIC_API_KEY
npm run dev                      # http://localhost:5000

That's it. The platform auto-creates the schema, seeds 16 personas + 41 governance rules, and redirects you to /setup. First account becomes admin.

Don't want to run it locally? Look at the live deployment: agenticcorporation.net.

Architecture at a glance

                          ┌────────────────────────────────────────┐
                          │  AHB safety layer (intent + policy)    │
                          │  • intent gate (fail-OPEN, logged)     │
                          │  • destructive-tool policy (fail-CLOSE)│
                          │  • outbound redaction gate             │
                          │  • tenant-context (AsyncLocalStorage)  │
                          └─────────────┬──────────────────────────┘
                                        │  every call passes through
                                        ▼
   User ──▶ Chat Engine (SSE) ──▶ CEO (Felix) ──▶ Agent Router ──┐
                                                                 │
        ┌────────────────────────────────────────────────────────┤
        ▼                ▼                ▼                ▼     ▼
   Specialist       Specialist       Specialist        ...   Imported
   persona          persona          persona                 Claude Code
   (Forge,          (Scribe,         (Neptune,               agents (with
    Engineer)        Writer)          Research)              runtime adapter)
        │                │                │                       │
        └──────┬─────────┴────────────────┴───────────────────────┘
               │  tool dispatch (393 tools)
               ▼
   ┌──────────────────────────────────────────────────────────────┐
   │  Tool layer: file I/O · web · email · LLM · payments · drive │
   │  • TNR snapshots (R100) on irreversible calls — undoable     │
   │  • Tracing spans (R101) parent-linked into causality tree    │
   │  • Admission control (R102) priority pool + rate limit       │
   └──────────────────────────────────────────────────────────────┘
               │
               ▼
   PostgreSQL + pgvector  (tenant-isolated; every WHERE filters tenant_id)
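
The tenant-context box in the diagram can be illustrated with a small sketch built on Node's `AsyncLocalStorage`. The names `runWithTenant` and `assertTenantContext` appear elsewhere in this README, but their real signatures are not shown, so the versions below are assumptions; `scopedQuery` is a hypothetical helper.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

interface TenantContext {
  tenantId: number;
}

// One store per process; each request runs inside its own context.
const tenantStore = new AsyncLocalStorage<TenantContext>();

/** Run fn with a tenant bound to the current async call chain (assumed signature). */
function runWithTenant<T>(tenantId: number, fn: () => T): T {
  if (!Number.isInteger(tenantId) || tenantId <= 0) {
    throw new Error(`invalid tenant id: ${tenantId}`);
  }
  return tenantStore.run({ tenantId }, fn);
}

/** Fail closed: throw instead of silently defaulting to some tenant (assumed signature). */
function assertTenantContext(): TenantContext {
  const ctx = tenantStore.getStore();
  if (!ctx) {
    throw new Error("no tenant context: refusing to run unscoped");
  }
  return ctx;
}

/** Hypothetical storage helper: every query carries a tenant_id filter. */
function scopedQuery(table: string): { text: string; values: number[] } {
  const { tenantId } = assertTenantContext();
  return { text: `SELECT * FROM ${table} WHERE tenant_id = $1`, values: [tenantId] };
}
```

Because the store follows the async call chain, code deep inside a tool call can recover the tenant without it being passed as a parameter, and a call made outside any context throws rather than falling back to a default tenant.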

Browse the full lineup: tools index · personas index.


Revision History

Brief one-liners. Full play-by-play in REVISIONS.md.

Round Date Headline
R125+52.47+sec 2026-06-18 Whole-app + 72h code review (3rd pass) — 4 findings closed (architect PASS). A cost-cap backstop added the two most expensive autonomous tools (second_opinion, venture_discovery) to the dispatcher's hardcoded expensive-tool set so the per-call spend throttle still fires even if the rate-limiter config fails to load; a projects lookup in the auto-transcript path is now tenant-scoped and fails closed so a poisoned conversation project id can't redirect downstream file writes onto a foreign project; the Token Efficiency probe import on the ecosystem-health dashboard moved inside its per-probe try with a full default shape so a probe-module load error degrades just that one card; and a founder-quote tool count was corrected 392 → 393. Wiring audit CLEAN (393 tools), tsc + esbuild build green. No new tools/tables/personas/capabilities.
R125+52.44 → +52.46 2026-06-18 Token Efficiency telemetry + two whole-app security/correctness reviews. A new Token Efficiency card on the ecosystem-health dashboard surfaces three READ-ONLY per-request overhead metrics so wasted spend is measurable, not a vibe: cache-hit starvation (cache-hit % on large ≥5000-token prompts over 30 days), instruction bloat (the fixed system-prompt token tax, measured live), and MCP tool bloat (the serialized tool-catalog token tax) — tenant-scoped, fail-soft, purely additive (no writes, no schema change). Two whole-app + 72h code reviews (architect PASS): 2 HIGH closed in the Venture Discovery loop (a budget reservation that settled even on $0 real spend now releases instead of settling; a non-atomic stage-advance that two concurrent approvals could double-execute now uses an atomic compare-and-set) and 2 MEDIUM closed (a linked-conversation backfill now joins through projects with a tenant guard so a poisoned cross-tenant link can't stamp a foreign project onto a conversation; the briefings routes gained Zod validation plus a 0-coordinate fix). No new tools/tables/personas/capabilities.
R125+52.43+sec 2026-06-17 Tenant-isolation nightly-audit hardening + triage. The nightly cross-tenant security audit was silently skipping oversized source files (they blew the model's input cap), so the green/exit-0 result wasn't honest; it now splits big files into overlapping windows and only counts a file audited when every window passes. A triage of 62 flagged findings closed 8 genuine cross-tenant isolation defects — the skill-synthesizer's writes and several skill-library reads now require an explicit tenant id and fail closed — with the remainder verified as false positives (intentional platform-global tables, guards enforced at another layer, stale source-drift). 3 architect passes → PASS, typecheck green.
R125+52.42 2026-06-17 A new agent second-opinion / cross-check (the second_opinion tool, wired to all 16 personas, no count changes): when the native multi-model ensemble returns a low-confidence answer (concordance κ < 0.5 or a single proposer), the platform automatically fetches an independent cross-check verdict before escalating to a human. That spend routes through a managed panel → judge → synthesize backend behind a hard $25/day owner-only cost cap, which this round hardened against cost-drift overshoot (architect HIGH → accepted LOW): a deterministic worst-case reservation clamp (every call reserves a deliberately pessimistic estimate before any money moves), a fail-closed cost-drift latch (the first time a real bill exceeds its reservation the whole feature disables itself and pages the owner instead of quietly spending more), and a dynamic reserve floor (each later reservation rises to the highest real cost seen that day). +11 query-free guard tests (20/20), whole-app + 72h code review (architect PASS).
R125+52.39 2026-06-15 A nine-round security + reliability hardening sprint (+1 skill, no other count changes): the run-completion judge now runs on a model distinct from the worker set, an SSRF DNS-rebinding TOCTOU was closed by pinning the high-risk public-fetch socket to the already-validated IPs, the multi-model jury's proposer set now fails OPEN to the default pool when caller ids dedupe to empty, every health probe now carries a degraded marker (a failed probe shows "telemetry unavailable" instead of healthy zeros), self-heal attempt updates are tenant-scoped, a mid-run budget-adaptive strategy controller was added (advisory, fail-open), plus a new engineering-discipline skill — three of the nine rounds were whole-app + 72h code reviews (architect PASS, agent-wiring audit exit 0).
R125+52.25 2026-06-13 Whole-app code review closed 2 HIGH + 2 MEDIUM (cost-governance / tenant-isolation correctness, no count changes): the multi-model jury's premium spend no longer pollutes the daily metered-Anthropic circuit breaker (exempted at the source while still billed at its real cost), the chat engine can never persist a blank reply on a background/scheduled/webhook turn, and two tenant-scope checks were tightened to reject invalid ids.
R125+52.22 2026-06-11 Live Instant AI Readiness Audit on the /audit wedge — point it at any public site and get a real scored report (/100 across AI Access, Structured Data, Metadata, Social, Technical → A–F grade + recommendations). Plus a whole-project security review that closed cross-tenant read leaks and hardened the SSRF jail against DNS-rebinding (validated resolved addresses pinned per redirect hop).
R98.20 2026-05-05 CI concurrency group — killed the cancellation-email storm (one cancellation per editing session, not one per job).
R98.19 + 98.19+sec 2026-05-05 Memory v2 (confidence-scored facts, debounced writes, dedup, 8K recall cap) + whole-app code-review sweep that closed 5 silent-degradation bugs from CommonJS require() in ESM try/catch.
R94 2026-05-03 Tenant cost-attribution integrity — every auth path wraps in runWithTenant(); AsyncLocalStorage propagation closes 9 distinct High-severity findings.
R83-R95 2026-05-02 → 03 24-hour security sweep — outbound redaction gate, prompt-injection scanner, streaming-aware prompt cache, orphan tool-call repair, tenant-context end-to-end.
R80 2026-05-02 Claude Code subagent importer — paste a .claude/agents URL, get a fully-governed multi-tenant runtime adapter with tier-aware policies and autonomy rules.
R79.3 2026-05-02 Full HITL approval surface — one-click HMAC-signed approve/deny emails, owner-self-email auto-approve, multi-day red CI gate closed (195/195 hard-gate green).
R74.13y 2026-04-28 Felix autonomous loop (dry-run) + SWD-inspired verification rail + two-pass architect security sweep.
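
The cost-cap hardening in the R125+52.42 entry (worst-case reservation, fail-closed drift latch, dynamic reserve floor) can be sketched roughly as follows. This is an illustration under stated assumptions, not the platform's code: the class name, method names, and result shape are all hypothetical, and paging the owner is left to the caller.

```typescript
interface ReserveResult {
  ok: boolean;
  reserved?: number;
  reason?: string;
}

class DailyCostCap {
  private readonly capUsd: number;
  private spent = 0; // settled real cost so far today
  private outstanding = 0; // reservations not yet settled
  private floor = 0; // highest real cost seen today
  private latched = false; // set once a real bill exceeds its reservation

  constructor(capUsd: number) {
    this.capUsd = capUsd;
  }

  /** Reserve a pessimistic estimate before any money moves. */
  reserve(worstCaseUsd: number): ReserveResult {
    if (this.latched) return { ok: false, reason: "cost-drift latch tripped" };
    // Dynamic floor: never reserve less than the costliest call seen today.
    const amount = Math.max(worstCaseUsd, this.floor);
    if (this.spent + this.outstanding + amount > this.capUsd) {
      return { ok: false, reason: "daily cap reached" };
    }
    this.outstanding += amount;
    return { ok: true, reserved: amount };
  }

  /** Settle a reservation with the real bill; latch closed on overshoot. */
  settle(reservedUsd: number, actualUsd: number): void {
    this.outstanding -= reservedUsd;
    this.spent += actualUsd;
    this.floor = Math.max(this.floor, actualUsd);
    if (actualUsd > reservedUsd) this.latched = true;
  }

  get disabled(): boolean {
    return this.latched;
  }
}
```

A caller would check `disabled` after each `settle` and alert the owner when it flips, since every later `reserve` call is refused from that point on.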

Tested & CI-protected

This is not vibe-coded. Every push runs against a real test suite gated by GitHub Actions. Note: the public-mirror CI workflow at .github/workflows/ci.yml is preserved on every snapshot since 2026-05-06 (HyperAgent review followup) — earlier mirrors stripped .github/ entirely, which made the badge above 404 and the "every push" claim incorrect for the public mirror specifically. Counts and badges below now reflect what actually runs on the public mirror; for the live source-of-truth platform metrics, see docs/CURRENT_PLATFORM_TOTALS.md (auto-regenerated from live registries).

Gate What it proves Status
build Production bundle compiles end-to-end ✅ hard gate
security-tests 158 tests across 16 files in 6 categories — SSRF/DNS-rebinding, admin auth, recipe validation + atomic writes, webhook signatures, trigger rate-limit, tenant isolation, conversation IDOR, background-queue durability + reclaim boundaries, destructive-command rails (the deny-list that blocks db:push --force, DROP TABLE, git push --force, rm -rf /, etc.), LLM cost-rate-card correctness, tool-dispatch contract, tenant-context hardening (41 tests — the strict tenantScope() storage helper that rejects every fail-open shape, the STRICT_TENANT_CONTEXT env flag with assertTenantContext() runtime guard, and end-to-end propagation through chat → assertTenantContext → step-ledger → AsyncLocalStorage → recordExecution with a live-DB persist round-trip; full source paths and call-site line numbers in docs/EVIDENCE.md) ✅ hard gate
docker Multi-stage image builds, container boots clean, /healthz returns 200 ✅ hard gate
typecheck tsc --noEmit — tracked and burning down informational
silent-failure-hunter Greps server/ + shared/ for the silent-failure shapes that bit us before (tenantId ?? 1, default-1 params, log-and-swallow catches, literal-secret fallbacks); uploads the full scan report as a 30-day workflow artifact on every PR informational

Run locally with bash tests/run.sh. The full receipts (CI history, the destructive-command deny-list, the silent-failure baseline) are in docs/.
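
To make the destructive-command rails concrete, here is a minimal sketch of a deny-list check covering the four commands named in the table above. The patterns and the `checkCommand` helper are illustrative assumptions; the project's real deny-list lives in docs/ and may be stricter or shaped differently.

```typescript
// Hypothetical patterns for the commands the README names as blocked.
const DENY_PATTERNS: { name: string; pattern: RegExp }[] = [
  { name: "db:push --force", pattern: /\bdb:push\b.*--force\b/i },
  { name: "DROP TABLE", pattern: /\bdrop\s+table\b/i },
  { name: "git push --force", pattern: /\bgit\s+push\b.*--force\b/i },
  // Only the literal root target; a fuller list would also cover flag variants.
  { name: "rm -rf /", pattern: /\brm\s+-rf\s+\/(\s|$)/ },
];

interface CommandCheck {
  allowed: boolean;
  matched?: string;
}

/** Return the first deny-list entry the command matches, if any. */
function checkCommand(command: string): CommandCheck {
  for (const { name, pattern } of DENY_PATTERNS) {
    if (pattern.test(command)) {
      return { allowed: false, matched: name };
    }
  }
  return { allowed: true };
}
```

For example, `checkCommand("git push origin main")` is allowed, while `checkCommand("git push --force origin main")` is rejected with `matched` set to `"git push --force"`.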


⚡ Deploy your own copy

Platform One-click
Replit Open in Replit →
Render Deploy to Render
Railway Deploy on Railway
Docker docker compose up — see FORK-SETUP.md

After deploy, you'll need a Postgres database with the vector extension (Render and Railway can provision one for you), one LLM key (OPENAI_API_KEY or ANTHROPIC_API_KEY), and a SESSION_SECRET. Everything else is optional.
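
A quick pre-flight check for those three requirements might look like the sketch below. The variable names come from this README; the `checkRequiredEnv` function itself is a hypothetical helper and is not part of the codebase.

```typescript
type EnvMap = Record<string, string | undefined>;

/** Return a list of human-readable problems; an empty list means ready to boot. */
function checkRequiredEnv(env: EnvMap): string[] {
  const problems: string[] = [];
  if (!env.DATABASE_URL) problems.push("DATABASE_URL is not set");
  if (!env.SESSION_SECRET) problems.push("SESSION_SECRET is not set");
  if (!env.OPENAI_API_KEY && !env.ANTHROPIC_API_KEY) {
    problems.push("set OPENAI_API_KEY or ANTHROPIC_API_KEY");
  }
  return problems;
}
```

Calling `checkRequiredEnv(process.env)` before starting the server would surface a missing variable as a readable message instead of a failure later in startup.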

VisionClaw Landing Page

Landing page with live agent activity feed and command center stats

VisionClaw Setup Dashboard

First-run setup dashboard — real-time status of every integration


Try These Prompts

Once you're set up, paste any of these into the chat to see the platform in action:

Prompt What Happens
"Research the top 5 competitors in [your industry] and build me a comparison spreadsheet" Radar researches, Atlas structures data, exports a formatted .xlsx to Google Drive
"Draft a professional proposal for [client name] based on our last conversation" Scribe pulls context from memory, writes a styled PDF, Proof reviews it for quality
"Analyze this contract for risks" (attach a PDF) Luna scans for 20 risk patterns across 9 regulatory frameworks, scores compliance
"Create a weekly content calendar for our social media" Teagan builds a structured plan with post ideas, hashtags, and optimal timing
"Give me a financial forecast for Q3 based on current revenue trends" Cassandra models projections, generates charts, delivers an executive summary
"What happened in AI news this week?" Neptune runs a deep research sweep across arXiv, HN, Reddit, and tech blogs

📸 Tour — what you actually see

Real screenshots from the live instance at agenticcorporation.net.

Landing hero — Hire an AI corporation, not another chatbot

Landing hero — value prop in one line, with three real CTAs.

Command Center — 16 agents, 393 tools, 41 curated models + 1000+ daily catalog, live workflows

Command Center — live counts, recent ops with status pills, capability chips.

Agent Activity Feed — Neptune, Luna, Cassandra, Scribe, Felix shipping work

Agent Activity Feed — Neptune ships research, Luna runs compliance, Cassandra builds the Excel model, Felix delivers the styled PDF.

Mixture of Agents — 4 frontier proposers + Opus aggregator

Mixture of Agents — 4 frontier proposers (Sonnet · GPT-4.1 · Gemini 2.5 Pro · DeepSeek Reasoner) feed a Claude Opus aggregator for ensemble-quality answers.

Gated Code Proposals — shadow verifier in a git worktree

Gated Code Proposals (R25) — nightly research generates real edits; a shadow verifier compiles each in an isolated git worktree before any human reviewer sees it.

Glasses Gateway — Meta Ray-Ban + Gemini Live

Glasses Gateway — Meta Ray-Ban smart glasses stream POV video and audio to a tenant-isolated gateway with sub-second voice replies via Gemini Live.


Platform at a Glance

Authoritative counts live in docs/CURRENT_PLATFORM_TOTALS.md. If any number on this page conflicts with that doc, the truth doc wins.

Metric Count
AI Agents (Personas) 16
Built-in Tools 393
AI Models in Core Registry 41 curated
Daily Catalog Discovery 1000+ models scanned on OpenRouter
AI Providers 6 (OpenAI, Anthropic, Google, xAI, OpenRouter, Perplexity)
Governance Rules 41
Corporate Operation Scaffolds 75
Corporate Departments 12
Agent Skills 62
Frontend Pages 40+
API Endpoints 300+
Database Tables 210

How It Works

  User Request
       │
       ▼
┌──────────────────┐     ┌─────────────────────────────┐
│   Chat Engine    │────▶│   Agent Router              │
│  (SSE streaming) │     │   picks best agent for task │
└──────────────────┘     └──────────┬──────────────────┘
                                   │
                    ┌──────────────┼──────────────┐
                    ▼              ▼              ▼
              ┌──────────┐  ┌──────────┐  ┌──────────┐
              │  Felix   │  │ Minerva  │  │ Neptune  │  ... 16 agents
              │  (CEO)   │  │(Strategy)│  │(Research)│
              └────┬─────┘  └────┬─────┘  └────┬─────┘
                   │             │              │
                   ▼             ▼              ▼
            ┌─────────────────────────────────────────┐
            │          393 Tools                      │
            │  Search · Write · Build · Analyze ·     │
            │  Email · Pay · Generate · Research      │
            └──────────────────┬──────────────────────┘
                               │
              ┌────────────────┼────────────────┐
              ▼                ▼                ▼
       ┌───────────┐   ┌────────────┐   ┌────────────┐
       │ PostgreSQL│   │ Google     │   │ 6 AI       │
       │ + pgvector│   │ Drive      │   │ Providers  │
       │ 210 tables│   │ Storage    │   │ 41 curated │
       └───────────┘   └────────────┘   └────────────┘

Example flow: You say "Research competitor pricing and build me a comparison spreadsheet."

  1. The Chat Engine routes to Felix (CEO) who sees this needs research + document production
  2. Felix spawns Radar (Intelligence) to research competitors and Atlas (Metrics) to structure the data
  3. Radar uses web search and scraping tools, deposits findings into the knowledge base
  4. Atlas pulls findings, builds a formatted Excel spreadsheet, uploads to Google Drive
  5. You get back a summary with a download link — no manual steps
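
The example flow above is essentially a small dependency graph: Atlas cannot structure data until Radar has findings. A minimal sketch of that idea, assuming a simple task shape with `dependsOn` edges (the `Task` type and `planWaves` function are hypothetical, not the orchestrator's real API), groups tasks into waves that could run in parallel:

```typescript
interface Task {
  id: string;
  agent: string;
  dependsOn: string[];
}

/** Group tasks into waves; every task in a wave has all dependencies already done. */
function planWaves(tasks: Task[]): string[][] {
  const remaining = new Map<string, Task>();
  for (const task of tasks) remaining.set(task.id, task);
  const done = new Set<string>();
  const waves: string[][] = [];

  while (remaining.size > 0) {
    const ready = Array.from(remaining.values())
      .filter((task) => task.dependsOn.every((dep) => done.has(dep)))
      .map((task) => task.id);
    if (ready.length === 0) {
      throw new Error("cycle or missing dependency in task graph");
    }
    for (const id of ready) {
      remaining.delete(id);
      done.add(id);
    }
    waves.push(ready);
  }
  return waves;
}

// The README's example: research first, then the spreadsheet, then the summary.
const exampleWaves = planWaves([
  { id: "research", agent: "Radar", dependsOn: [] },
  { id: "spreadsheet", agent: "Atlas", dependsOn: ["research"] },
  { id: "summary", agent: "Felix", dependsOn: ["spreadsheet"] },
]);
```

Here `exampleWaves` is `[["research"], ["spreadsheet"], ["summary"]]`; two independent research tasks would instead share the first wave.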

The 16-Agent Team

Every agent has a defined role, personality, skill set, and operating rules. They work independently or collaborate through orchestration engines.

Agent Role What They Do
VisionClaw Personal Assistant Default conversational agent — handles general tasks, delegates complex ones
Felix CEO / COO Revenue strategy, task orchestration, multi-agent DAG decomposition
Forge Staff Engineer Code execution, engineering standards, infrastructure, security review
Teagan Content Marketing Social media strategy, content calendars, brand voice, ad copy
Blueprint Innovation Lead Skill creation, tool learning, self-improvement, capability expansion
Chief of Staff Operations Director System health monitoring, task routing, scheduling, daily operations
Scribe Content Creator Long-form writing, editing, SEO content, documentation, blog posts
Proof Quality Reviewer Proofreading, fact-checking, QA, content review, accuracy scoring
Radar Intelligence Analyst Market intelligence, competitive analysis, trend tracking, OSINT
Neptune Deep Research Academic analysis, overnight autonomous research, multimedia deep dives
Apollo Revenue & Pipeline Sales outreach, lead qualification, pipeline management, CRM
Atlas Metrics & Reporting Analytics, dashboards, KPI tracking, data visualization
Cassandra CFO Budgets, forecasting, P&L modeling, financial analysis
Luna Legal & Compliance Contract review, regulatory compliance, risk assessment, legal drafting
Minerva Strategic Planner Plan-of-record drafting, decision-theory analysis, Felix approval-loop partner; closes the auto-apply → strategic-plan loop for the proactive self-healing engine (R63)

Feature Overview

AI & Intelligence

  • 41 Curated AI Models in the Core Registry with cost-aware auto-routing across OpenAI, Anthropic, Google Gemini, xAI Grok, OpenRouter, and Perplexity. Bring your own provider credentials — API billing is governed by each provider's own terms.
  • Adaptive Model Discovery (R73) — a daily background task scans OpenRouter's full catalog of 1000+ models, tier-classifies each by completion price (reasoning / powerful / balanced / fast), probes the Replit gateway for new releases, and emails the owner a ranked digest of new models worth adding to the registry. Hard caps prevent inbox spam (10 alerts/run max), lifetime dedupe prevents re-alerts, and silent days mean nothing changed.
  • Streaming Responses via Server-Sent Events (SSE) — real-time token-by-token output
  • Thinking Mode — explainable reasoning with decision traces for complex problems
  • Model Failover — automatic fallback to healthy providers when one goes down
  • Context Window Management — automatic conversation compaction that preserves every fact before summarizing
  • Cost Ledger with Token Telemetry — every byte across every provider boundary is tracked in a per-tenant cost ledger; orchestrator bills callers automatically via onTokenUsage callbacks with row-level pg_advisory locks preventing double-counting
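
For readers unfamiliar with the SSE wire format behind the streaming bullet above, here is a minimal sketch of how token events could be framed. The event names (`token`, `done`) and the JSON payload shape are assumptions for illustration; only the `event:` / `data:` line format followed by a blank line comes from the SSE standard.

```typescript
/** Frame one Server-Sent Event: an event line, a data line, then a blank line. */
function formatSseEvent(event: string, data: unknown): string {
  // JSON.stringify never emits raw newlines, so a single data line is enough.
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

/** Yield one frame per token, then a final frame so the client knows to stop. */
function* streamTokens(tokens: string[]): Generator<string> {
  for (const token of tokens) {
    yield formatSseEvent("token", { text: token });
  }
  yield formatSseEvent("done", {});
}
```

In an Express handler, each yielded frame would be passed to `res.write()` after setting the `Content-Type: text/event-stream` header.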

Document & Content Production

  • PDF Reports — executive-quality styled PDFs with cover pages, branded headers/footers, charts, and tables
  • Word Documents (.docx) — professional documents with formatting, headers, and styles
  • Excel Spreadsheets (.xlsx) — auto-formatted workbooks with formulas and conditional formatting
  • Google Slides — automated presentation generation delivered to Google Drive
  • Charts & Diagrams — Recharts visualizations and Mermaid.js diagrams rendered to PNG
  • PDF Form Filling — fill existing PDF forms programmatically
  • Invoices — professional invoices with line items, taxes, and branding

Research & Intelligence

  • Autonomous Overnight Research — configurable research programs that run autonomously, with LLM-judged experiment scoring and auto-deposit of findings into your knowledge base
  • Web Search — powered by Perplexity with Wikipedia and Jina fallbacks
  • Deep Web Scraping — Firecrawl integration for full-site crawling and markdown extraction
  • Trend Research — parallel scanning across Reddit, Hacker News, Polymarket, and X/Twitter
  • Competitive Intelligence — automated competitor analysis with structured output

Memory & Knowledge

  • Semantic Memory Palace — hierarchical memory organized by Wing and Room with three-tier recall (Hot/Warm/Cold)
  • Zero-Loss Compaction — full pre-compaction transcripts archived and recoverable; every fact extracted before conversation summarization
  • Vector Knowledge Base — RAG-powered knowledge retrieval with MMR diversity re-ranking
  • Temporal Knowledge — subject-predicate-object facts with time validity tracking
  • Dialectic User Modeling — three internal agents (Deriver, Dialectic, Dreamer) progressively build a profile of each user from conversations
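
MMR (maximal marginal relevance) re-ranking, mentioned under Vector Knowledge Base, trades relevance to the query against similarity to results already chosen. The sketch below is the textbook greedy form over plain number arrays; the function names and the default `lambda` of 0.5 are assumptions, and the platform's actual retrieval code is not shown in this README.

```typescript
type Vec = number[];

function cosine(a: Vec, b: Vec): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

/** Greedy MMR: returns indexes into docs, most useful first. */
function mmr(query: Vec, docs: Vec[], k: number, lambda = 0.5): number[] {
  const selected: number[] = [];
  const candidates = docs.map((_, i) => i);

  while (selected.length < k && candidates.length > 0) {
    let bestPos = 0;
    let bestScore = -Infinity;
    candidates.forEach((docIdx, pos) => {
      const relevance = cosine(query, docs[docIdx]);
      // Redundancy is the closest match among documents already picked.
      const redundancy =
        selected.length === 0
          ? 0
          : Math.max(...selected.map((s) => cosine(docs[docIdx], docs[s])));
      const score = lambda * relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestPos = pos;
      }
    });
    selected.push(candidates.splice(bestPos, 1)[0]);
  }
  return selected;
}
```

With `lambda = 1` this reduces to plain relevance ranking; lower values push near-duplicate chunks down so the retrieved context covers more ground.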

Multi-Agent Orchestration

  • Crews — agent teams with defined roles, goals, and backstories working toward a shared objective
  • Flows — event-driven workflow pipelines that chain agent actions
  • Minds — 4-role deliberation system (Proposer, Critic, Synthesizer, Judge) for complex decisions
  • Auto-Orchestration — the COO automatically decomposes complex requests into DAG task graphs and delegates to specialists
  • Subagent Spawning — agents can spawn child agents for sub-tasks with full tool access
  • Chain of Debates — multi-persona deliberation where 3-6 specialists argue complex questions from different perspectives

Communication & Integrations

  • Email — built-in email server with tenant-specific inboxes, send/reply, and notification handling
  • WhatsApp — full bot integration for sending/receiving messages and approval workflows
  • Telegram — bot integration for external interaction
  • Discord — bot integration for team communication
  • Google Workspace — Gmail, Calendar, Sheets, Docs, Slides, and Contacts integration
  • Google Drive — primary storage for generated deliverables; every project gets a dedicated Drive folder with automatic backup

Payment Processing

  • Stripe — subscription management, checkout sessions, usage billing, and customer portal
  • Stripe Connect — tenants can connect their own Stripe accounts for white-label payment processing
  • Coinbase Commerce — cryptocurrency payments via hosted checkout
  • Coinbase CDP — on-chain wallet management and balance checks
  • Usage Metering — token tracking and feature access limits tied to billing tiers

Voice & Media

  • Text-to-Speech — ElevenLabs integration with 23+ voice profiles
  • Voice Conversations — real-time voice input/output with configurable wake words
  • Image Generation — DALL-E and Replit AI image generation
  • Video Production — scene-based MP4 pipeline with parallel TTS, Ken Burns motion, 25+ transitions, and background music

Project Management

  • Project Brain — filing cabinet system linking conversations, files, notes, and Google Drive assets to projects
  • Scheduled Tasks — cron-like automation for recurring agent work
  • Activity Logging — comprehensive system-wide activity tracking
  • Agent Board — visual overview of all agent activities and status

Governance & Safety

  • 41 Governance Rules — built-in rules controlling agent autonomy and behavior
  • Process Governor — enforces execution limits and approval requirements
  • Trust Engine — evaluates safety and reliability of tool calls; high-risk actions require human approval
  • Prompt Injection Scanner — detects and blocks malicious injection attempts
  • 3-Layer Failure Recovery, plus a transparency report:
    1. Self-correction retry with adjusted parameters
    2. Lean mode fallback to a lighter model on overload
    3. Backup agent reroute to mapped specialist
    Plus 5-part failure transparency (what failed, why, what was tried, what succeeded, what the user should know)
  • Critique Agent — every response auto-evaluated on accuracy, completeness, relevance, and clarity (scored 1-10); low scores trigger auto-refinement
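
The failure-recovery ladder and its 5-part transparency report can be sketched as a loop over fallback layers. Everything here is a hedged illustration: `RecoveryLayer`, `FailureReport`, and `runWithRecovery` are hypothetical names, the steps are synchronous for brevity, and the report wording is invented.

```typescript
interface RecoveryLayer {
  name: string;
  run: () => string;
}

// The five parts named in the README, as one record.
interface FailureReport {
  whatFailed: string;
  why: string;
  whatWasTried: string[];
  whatSucceeded: string | null;
  userNote: string;
}

function runWithRecovery(
  primary: () => string,
  layers: RecoveryLayer[],
): { output: string | null; report: FailureReport | null } {
  try {
    return { output: primary(), report: null };
  } catch (err) {
    const report: FailureReport = {
      whatFailed: "primary attempt",
      why: err instanceof Error ? err.message : String(err),
      whatWasTried: [],
      whatSucceeded: null,
      userNote: "All recovery layers failed; manual follow-up is needed.",
    };
    for (const layer of layers) {
      report.whatWasTried.push(layer.name);
      try {
        const output = layer.run();
        report.whatSucceeded = layer.name;
        report.userNote = `Recovered via ${layer.name}; the result may differ from a normal run.`;
        return { output, report };
      } catch {
        // This layer failed too; fall through to the next one.
      }
    }
    return { output: null, report };
  }
}
```

Passing the layers in order (self-correction retry, lean mode fallback, backup agent reroute) reproduces the ladder, and the returned report records which rung finally held.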

Multi-Tenant Architecture

  • Full Tenant Isolation — each tenant has separate conversations, memory, projects, files, settings, and billing; isolation extends to background services so the heartbeat engine's working memory, knowledge writes, daily notes, task list, and activity logs are all tenant-scoped at the storage layer
  • Per-Tenant WhatsApp/Email/Payment — communication and payment channels isolated by tenant
  • Encryption at Rest — Telegram bot tokens and WhatsApp/Baileys session credentials are encrypted with AES-256-GCM via a key derived from SESSION_SECRET; backward-compatible with legacy plaintext rows so existing installs auto-upgrade on first write
  • Team Management — invite users, manage roles, and control access
  • API Keys — per-tenant API key management for external integrations
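
The encryption-at-rest bullet above (AES-256-GCM, key derived from SESSION_SECRET, legacy plaintext still readable) can be sketched with Node's built-in `crypto` module. The `enc:v1:` prefix, the scrypt salt string, and the byte layout are assumptions made for this example; the project's real storage format is not documented here.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

const ENC_PREFIX = "enc:v1:"; // hypothetical marker separating ciphertext from legacy rows

/** Derive a 32-byte key from the session secret (the salt string is an assumption). */
function deriveKey(sessionSecret: string): Buffer {
  return scryptSync(sessionSecret, "visionclaw-credential-key", 32);
}

function encryptSecret(plaintext: string, sessionSecret: string): string {
  const iv = randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = createCipheriv("aes-256-gcm", deriveKey(sessionSecret), iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag(); // 16 bytes by default
  return ENC_PREFIX + Buffer.concat([iv, tag, ciphertext]).toString("base64");
}

function decryptSecret(stored: string, sessionSecret: string): string {
  // Legacy plaintext rows pass through unchanged and get encrypted on the next write.
  if (!stored.startsWith(ENC_PREFIX)) return stored;
  const raw = Buffer.from(stored.slice(ENC_PREFIX.length), "base64");
  const iv = raw.subarray(0, 12);
  const tag = raw.subarray(12, 28);
  const ciphertext = raw.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", deriveKey(sessionSecret), iv);
  decipher.setAuthTag(tag);
  // final() throws if the key is wrong or the data was tampered with.
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

One consequence of deriving the key from SESSION_SECRET is that rotating the secret makes previously encrypted rows unreadable unless they are re-encrypted first.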

Developer & Admin Tools

  • Settings Dashboard — comprehensive admin panel with tabs for General, Payments, Integrations, Voice, Tools, Security, Data, and Tenants
  • Diagnostics — stuck task detection, health monitoring, provider latency testing
  • Heartbeat Engine — system health monitoring with configurable check intervals
  • Auto-Tuner — autonomous performance optimization that runs daily
  • Webhook System — inbound/outbound webhook triggers for external automation
  • MCP Server — Model Context Protocol server for AI tool integration
  • Backup & Restore — automated daily backups to Google Drive with manual export/import
  • Vault — secure credential storage for sensitive data

Technical Stack

Layer Technology
Frontend React 18, TypeScript, Vite, TailwindCSS, shadcn/ui, Wouter, TanStack Query v5
Backend Express.js, TypeScript, Node.js 20+
Database PostgreSQL with pgvector extension, Drizzle ORM
AI Routing OpenAI, Anthropic, Google Gemini, xAI Grok, OpenRouter, Perplexity
Real-time Server-Sent Events (SSE) for streaming
Auth Email/Password with HMAC-SHA256, Admin PIN, Replit OAuth, Google OAuth
Validation Zod schemas with drizzle-zod integration
Security Helmet, CSRF protection, rate limiting, injection scanning
File Storage Google Drive (primary), local uploads (fallback)
Payments Stripe, Coinbase Commerce, Coinbase CDP
Voice ElevenLabs TTS (23+ voices)
Search Perplexity, Firecrawl, Jina, Wikipedia

Repository Structure

client/                       # React frontend
  src/
    pages/                    # 40+ route pages
    components/               # Reusable UI components (shadcn/ui)
    hooks/                    # Custom React hooks
    lib/                      # Utilities, query client, API helpers
server/                       # Express backend
  chat-engine.ts              # Core AI conversation engine with streaming
  tools.ts                    # 393 tool definitions and execution handlers
  routes.ts                   # 300+ API endpoints
  site-config.ts              # Centralized env-driven configuration
  seed.ts                     # Database seeding (210 tables, 41 rules, 16 personas)
  heartbeat.ts                # Background task scheduler with model-catalog sync (R73)
  model-catalog.ts            # Daily OpenRouter catalog scan + gateway probe (R73)
  orchestrator-ledger.ts      # Per-tenant cost ledger with pg_advisory locks (R73.B)
  agent-manager.ts            # Autonomous agent orchestration
  subagents.ts                # Hierarchical agent spawning
  agent-channels.ts           # Internal agent messaging system
  google-drive.ts             # Google Drive integration
  stripe-connect.ts           # Stripe payment processing
  coinbase-commerce.ts        # Crypto payment processing
  whatsapp.ts                 # WhatsApp bot integration
  email.ts                    # Email server and tenant inboxes
  scaffolding.ts              # 75 corporate operation scaffolds
shared/
  schema.ts                   # Drizzle ORM schema (210 tables)
scripts/
  clean-for-release.sh        # Sanitize codebase for public release
FORK-SETUP.md                 # Detailed setup instructions

Getting Started

Prerequisites

  • Node.js 20+ (or a Replit account)
  • PostgreSQL database
  • At least one AI provider API key (OpenAI, Anthropic, Google, or xAI)

Quick Start

# 1. Clone the repo
git clone https://github.com/Huskyauto/VisionClaw-Agent-Public-Release.git
cd VisionClaw-Agent-Public-Release

# 2. Install dependencies
npm install

# 3. Set required environment variables
export DATABASE_URL="postgresql://user:pass@host:5432/dbname"
export SESSION_SECRET="$(openssl rand -hex 32)"
export OPENAI_API_KEY="sk-..."   # Or ANTHROPIC_API_KEY, XAI_API_KEY, etc.

# 4. Start the platform
npm run dev

# 5. Open your browser
# Visit http://localhost:5000
# Fresh deploys auto-redirect to /setup

What Happens on First Run

In under 10 minutes, you go from git clone to a live dashboard with 16 agents, seeded governance, and a /setup checklist that tells you exactly what's configured and what's missing.

  1. The database auto-creates all 210 tables and the full index set
  2. 41 governance rules and 16 AI personas are seeded automatically
  3. You're redirected to the Setup Checklist at /setup showing what's configured
  4. Click Create Account — the first account becomes the admin
  5. Start chatting — the AI is ready to work

Environment Variables

See FORK-SETUP.md for the complete list. Here's the quick reference:

Required

| Variable | What It Does |
| --- | --- |
| DATABASE_URL | PostgreSQL connection string |
| SESSION_SECRET | Random string for session encryption |
| One AI key | OPENAI_API_KEY, ANTHROPIC_API_KEY, XAI_API_KEY, or OPENROUTER_API_KEY |
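
A startup check for the three required settings above could look roughly like the following. This is a minimal sketch, not the project's actual validation code: the function name `missingRequiredConfig` and the rule that whitespace-only values count as unset are assumptions made for illustration. The environment is passed in as a plain object so the check stays easy to test.

```typescript
// Hypothetical helper: reports which required settings are absent.
// The variable names come from the Required table; everything else is illustrative.
const AI_KEY_NAMES = [
  "OPENAI_API_KEY",
  "ANTHROPIC_API_KEY",
  "XAI_API_KEY",
  "OPENROUTER_API_KEY",
];

function missingRequiredConfig(
  env: Record<string, string | undefined>,
): string[] {
  // Assumption: an empty or whitespace-only value is treated as "not set".
  const isSet = (key: string): boolean => (env[key] ?? "").trim() !== "";

  const problems: string[] = [];
  for (const key of ["DATABASE_URL", "SESSION_SECRET"]) {
    if (!isSet(key)) problems.push(`${key} is not set`);
  }
  // Any single provider key satisfies the "one AI key" requirement.
  if (!AI_KEY_NAMES.some(isSet)) {
    problems.push(`set at least one of: ${AI_KEY_NAMES.join(", ")}`);
  }
  return problems;
}
```

In a Node.js process you would typically call this with `process.env` before starting the server and exit early if the returned list is non-empty.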

Recommended (Branding)

| Variable | What It Does | Default |
| --- | --- | --- |
| SITE_PLATFORM_NAME | Your platform's display name everywhere | VisionClaw |
| SITE_COMPANY_NAME | Company name for branding | Your Company |
| SITE_OWNER_EMAIL | Admin contact email | (empty) |
| SITE_WEBSITE_URL | Your public URL | (empty) |

Optional (Unlock More Features)

| Variable | What It Unlocks |
| --- | --- |
| ELEVENLABS_API_KEY | Voice synthesis (23+ voices, text-to-speech) |
| FIRECRAWL_API_KEY | Advanced web scraping and full-site crawling |
| BROWSERLESS_API_KEY | PDF generation and browser automation |
| STRIPE_LIVE_SECRET_KEY + STRIPE_LIVE_PUBLISHABLE_KEY | Payment processing and subscriptions |
| COINBASE_COMMERCE_API_KEY | Cryptocurrency payments |
| GOOGLE_DRIVE_ROOT_FOLDER_ID | Google Drive file storage and backups |
| AGENTMAIL_API_KEY + AGENTMAIL_INBOX | Email sending/receiving |
| TELEGRAM_BOT_TOKEN | Telegram bot integration |
| DISCORD_BOT_TOKEN | Discord bot integration |
| X_API_KEY + X_API_SECRET + X_ACCESS_TOKEN + X_ACCESS_TOKEN_SECRET | X/Twitter posting and search |

Graceful Degradation

Features that aren't configured don't break the app — they gracefully disappear:

| Missing Config | What Happens |
| --- | --- |
| No email key | Email, WhatsApp pages hidden from sidebar |
| No Telegram token | Telegram page hidden |
| No Stripe keys | Payments page hidden from admin panel |
| No Drive folder | Files saved locally; Drive tools show "not configured" |
| No ElevenLabs key | Voice tools return "not configured" |
| No Firecrawl/Browserless | Scraping tools fall back gracefully |
| No Coinbase keys | Crypto payment features disabled |
| No OAuth client IDs | OAuth connection buttons hidden |

The /setup page gives you a real-time checklist showing exactly what's configured and what's not.
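
One way to picture this behavior is a set of feature flags derived from the environment, which the UI then consults before showing a page. The sketch below is illustrative only: `detectFeatures`, `visibleOptionalPages`, and the `FeatureFlags` shape are hypothetical names, and the mapping of "email key" to the two AGENTMAIL variables is an assumption based on the Optional table, not a description of the real code.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical flag set covering a few rows of the degradation table.
interface FeatureFlags {
  email: boolean;
  telegram: boolean;
  payments: boolean;
  driveStorage: boolean;
  voice: boolean;
}

// True only when every listed variable has a non-empty value.
function hasAll(env: Env, keys: string[]): boolean {
  return keys.every((key) => (env[key] ?? "").trim() !== "");
}

function detectFeatures(env: Env): FeatureFlags {
  return {
    // Assumption: email needs both AgentMail variables.
    email: hasAll(env, ["AGENTMAIL_API_KEY", "AGENTMAIL_INBOX"]),
    telegram: hasAll(env, ["TELEGRAM_BOT_TOKEN"]),
    payments: hasAll(env, [
      "STRIPE_LIVE_SECRET_KEY",
      "STRIPE_LIVE_PUBLISHABLE_KEY",
    ]),
    driveStorage: hasAll(env, ["GOOGLE_DRIVE_ROOT_FOLDER_ID"]),
    voice: hasAll(env, ["ELEVENLABS_API_KEY"]),
  };
}

// Optional pages that would be shown for a given flag set.
// Unconfigured features are simply left out rather than raising errors.
function visibleOptionalPages(flags: FeatureFlags): string[] {
  const pages: string[] = [];
  if (flags.email) pages.push("Email", "WhatsApp");
  if (flags.telegram) pages.push("Telegram");
  if (flags.payments) pages.push("Payments");
  return pages;
}
```

Computing the flags once and passing them around keeps the "is this configured?" decision in one place, so a missing key hides a page instead of surfacing as a runtime failure deep inside a tool.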


Admin Settings

Once logged in as admin, the Settings page (/settings) gives you control over everything:

| Tab | What You Configure |
| --- | --- |
| General | Agent name, personality, default AI model, API keys, OAuth connections, billing |
| Payments | Stripe/Coinbase integration, pricing plans, subscription tiers |
| Integrations | Discord bot, public chat settings, webhooks, system hooks |
| Voice | Wake words, text-to-speech provider, voice profiles |
| Tools | Browser/search settings, code sandbox, safety limits, rate limiting |
| Security | Access PIN, auth health monitoring |
| Data | Backup to Google Drive (manual + automated at 3 AM UTC), export/import |
| Tenants | Multi-tenant management for agency deployments |

Pages & Navigation

The platform includes 40+ pages organized by function:

Core: Home, Chat, Inbox, Email, Projects, Files, Documents

AI Management: Personas, Memory, Knowledge, Skills, Skills Marketplace, Agent Board, Agentic Operations

Intelligence: Research, Insights, Content Writing, Scheduled Tasks

Communication: WhatsApp, Telegram, Discord (with approval workflows)

Admin: Settings, Analytics, Activity Logs, Heartbeat, Team, API Keys, MCP, Webhooks, Channel Routing, Payments

Public: Landing Page, Architecture Overview, Login/Signup, Legal Pages (Terms, Privacy, About, Contact, Refund)


Agentic Design Patterns

These are the patterns we actually use in daily production — not just research papers:

  1. Parallel Tool Execution — read-only tools run concurrently via Promise.all; mutating tools execute sequentially for causal ordering
  2. Critique Agent / Self-Correction — every response auto-evaluated across 4 dimensions (accuracy, completeness, relevance, clarity); scores below 6/10 trigger auto-refinement
  3. Chain of Debates — 3-6 specialist agents argue complex questions from their own domain expertise, and the system then synthesizes a recommendation with a consensus level
  4. Tree-of-Thought Reasoning — 2-5 distinct analytical branches evaluated by a meta-reasoning judge for optimal answers
  5. Auto-Orchestration — complex requests decomposed into DAG task graphs with dependency tracking and parallel execution
  6. Dialectic User Modeling — three agents (Deriver, Dialectic, Dreamer) progressively understand user preferences and behavior
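
To make the first pattern concrete, here is a minimal sketch of splitting a batch of tool calls into a concurrent read-only group and a sequential mutating group. The `ToolCall` and `ToolResult` shapes, the `readOnly` flag, and the function name `executeToolCalls` are assumptions for illustration; the real definitions in server/tools.ts may look quite different. The sketch also assumes that all reads may run before any write, which is a simplification.

```typescript
// Hypothetical shapes, not the project's actual types.
interface ToolCall {
  name: string;
  readOnly: boolean; // assumed flag marking side-effect-free tools
  run: () => Promise<string>;
}

interface ToolResult {
  name: string;
  output: string;
}

// Runs read-only calls concurrently, then mutating calls one at a time
// in the order they were requested.
async function executeToolCalls(calls: ToolCall[]): Promise<ToolResult[]> {
  const readOnly = calls.filter((call) => call.readOnly);
  const mutating = calls.filter((call) => !call.readOnly);

  // All read-only calls start immediately; Promise.all returns results
  // in input order regardless of which call finishes first.
  const readResults = await Promise.all(
    readOnly.map(async (call) => ({
      name: call.name,
      output: await call.run(),
    })),
  );

  const writeResults: ToolResult[] = [];
  for (const call of mutating) {
    // Awaiting inside the loop ensures each write completes
    // before the next one begins, preserving causal order.
    writeResults.push({ name: call.name, output: await call.run() });
  }

  return [...readResults, ...writeResults];
}
```

Note that `Promise.all` rejects as soon as any read fails, so a production version would probably want per-call error handling (for example with `Promise.allSettled`) so that one failed lookup does not discard the other results.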

Deployment

Self-hosted only. You must deploy on your own infrastructure — your own Replit account, your own server, your own Docker host. We do not provide hosting, shared instances, or managed deployments. Every fork runs independently with its own database, API keys, and configuration.

The platform works on any Node.js hosting:

  • Replit: Create your own Replit account, import the repo, set secrets in the Secrets panel, hit Run
  • Railway/Render: Connect your repo, set env vars, deploy
  • Docker: docker-compose up -d — includes PostgreSQL with pgvector, ready out of the box
  • VPS: Clone, npm install, set env vars, npm run dev
  • Port: Serves frontend and backend on a single port (default: 5000)

git clone https://github.com/Huskyauto/VisionClaw-Agent-Public-Release.git
cd VisionClaw-Agent-Public-Release
cp .env.example .env   # edit with your API keys
docker-compose up -d   # or: npm install && npm run dev

About the Name

VisionClaw Agent is an independent AI agent platform — not related to the Intent-Lab/VisionClaw project (a smart glasses AI assistant for Meta Ray-Ban). This repo is a standalone, self-hosted multi-tenant operations platform. It works with just an LLM provider and PostgreSQL — no external ecosystem required.


Roadmap

Areas under active development:

  • Modularization — Breaking down large server files (routes, tools) into domain-specific modules for easier navigation and community contribution
  • Type safety — Incremental migration from any types to strict TypeScript interfaces
  • CI/CD — GitHub Actions pipeline for lint, typecheck, and automated testing
  • Plugin architecture — Making it easier to add custom tools and agents without modifying core files
  • API documentation — OpenAPI/Swagger spec for the 300+ endpoints

Community contributions welcome — see CONTRIBUTING.md.


Built With

VisionClaw Agent was originally built and hosted on Replit — a collaborative cloud development platform that makes it easy to build, deploy, and share full-stack applications. Replit's integrated environment, managed PostgreSQL, one-click deployments, and AI-assisted development made it possible to go from idea to production-ready platform without managing infrastructure. If you're looking for the fastest way to fork and run your own instance, Replit is a great place to start.


License

MIT License — free to fork, modify, and deploy for any purpose. See LICENSE.


Created by Robert Washburn | huskyauto@gmail.com

About

Open-source multi-tenant AI agent workspace · 16 personas · 393 tools · 210 tables · 126 active capabilities · 616 production indexes · self-hosted, BYO-keys
