title: CLI Agents
description: Spawn external CLI tools (Claude Code, Codex, Pi, …) and pipe them through the workflow runtime.

CLI-backed agent classes wrap external AI command-line tools and implement the AI SDK Agent interface. Use them anywhere Smithers accepts an agent, including <Task>. Reach for them when you need a vendor's full CLI surface (sessions, sandboxes, slash commands, MCP). For API-billed provider wrappers, see SDK Agents.

Quick Start

import { ClaudeCodeAgent, Task, Workflow, createSmithers } from "smithers-orchestrator";
import { z } from "zod";

const { smithers, outputs } = createSmithers({
  analysis: z.object({ summary: z.string() }),
});

const claude = new ClaudeCodeAgent({
  model: "claude-fable-5",
  systemPrompt: "You are a careful code reviewer.",
  timeoutMs: 30 * 60 * 1000,
});

export default smithers((ctx) => (
  <Workflow name="review">
    <Task id="analysis" output={outputs.analysis} agent={claude}>
      {`Analyze the codebase and identify potential improvements.`}
    </Task>
  </Workflow>
));

Available agents

agents[10]{class,cli,modelDefault,hijack,notes}:
  ClaudeCodeAgent,claude,CLI default,native session id,Anthropic Claude Code CLI
  CodexAgent,codex,CLI default,native thread id,OpenAI Codex CLI (codex exec via stdin + JSON stream)
  AntigravityAgent,agy,CLI default,native session id,Google Antigravity CLI
  GeminiAgent,gemini,CLI default,legacy session id,Deprecated legacy Gemini CLI wrapper
  PiAgent,pi,CLI default,native session id,Pi CLI (text/json/rpc modes + extension UI hooks)
  KimiAgent,kimi,CLI default,native session id,Moonshot Kimi CLI (auto-isolates KIMI_SHARE_DIR)
  ForgeAgent,forge,CLI default,conversation id,Forge CLI (300+ models)
  AmpAgent,amp,CLI default,thread id,Amp CLI (--execute headless mode)
  VibeAgent,vibe,CLI default,headless session id,Mistral Vibe CLI
  OpenCodeAgent,opencode,CLI default,not yet,OpenCode CLI (opencode run --format json)

CLI binaries must be on PATH: claude, codex, agy, gemini, pi, kimi, forge, amp, vibe, opencode.

Codex CLI Agent

CodexAgent is the Smithers wrapper for OpenAI's codex CLI. It runs codex exec in non-interactive mode, sends the task prompt over stdin, forces --json so Smithers can stream structured progress, and captures the final assistant message via --output-last-message.

import { CodexAgent, Task, Workflow, createSmithers } from "smithers-orchestrator";
import { z } from "zod";

const { smithers, outputs } = createSmithers({
  patch: z.object({ summary: z.string(), files: z.array(z.string()) }),
});

const codex = new CodexAgent({
  model: "gpt-5.5",
  sandbox: "workspace-write",
  skipGitRepoCheck: true,
  yolo: true,
});

export default smithers(() => (
  <Workflow name="codex-implementation">
    <Task id="implement" output={outputs.patch} agent={codex}>
      Implement the requested change and summarize the edited files.
    </Task>
  </Workflow>
));

With ChatGPT-account Codex auth, use a ChatGPT model id such as gpt-5.5; that auth surface rejects gpt-5.5-codex.

Authentication

  • Subscription login: run codex login once. For isolated accounts, pass configDir; Smithers sets CODEX_HOME for that invocation.
  • API billing: pass apiKey or set OPENAI_API_KEY; Smithers forwards it to the spawned codex process.
  • Account registry: bunx smithers-orchestrator agents add --provider codex ... registers a subscription config directory, while --provider openai-api registers API-key billing for Codex-compatible providers.

Structured output

  • If the Smithers task has an output schema and outputSchema is not set, Smithers writes a temporary OpenAI-compatible JSON Schema file and passes it as --output-schema.
  • Resume attempts use codex exec resume <thread-id> and skip --output-schema, matching the Codex CLI's resume command surface.
  • Hijack opens native Codex with codex resume <thread-id> -C <cwd>.
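The first-attempt and resume behavior described above can be pictured as a small argument builder. This is a sketch only, not Smithers' implementation: `buildCodexExecArgs` and `CodexInvocation` are hypothetical names, and the argument order (including whether `--json` and `--output-last-message` follow `resume <thread-id>`) is an assumption. Only the flags themselves come from the text above.

```typescript
// Hypothetical shape for one codex invocation; not part of smithers-orchestrator.
type CodexInvocation = {
  lastMessagePath: string; // file that receives the final assistant message
  schemaPath?: string; // temporary JSON Schema file, if the task has an output schema
  resumeThreadId?: string; // set on resume attempts
};

// Sketch: assemble argv for `codex` (the prompt itself is sent over stdin).
function buildCodexExecArgs(inv: CodexInvocation): string[] {
  const args = ["exec"];
  if (inv.resumeThreadId) {
    // Resume attempts use `codex exec resume <thread-id>`.
    args.push("resume", inv.resumeThreadId);
  }
  // Structured progress stream plus final-message capture.
  args.push("--json", "--output-last-message", inv.lastMessagePath);
  if (inv.schemaPath && !inv.resumeThreadId) {
    // --output-schema is skipped on resume.
    args.push("--output-schema", inv.schemaPath);
  }
  return args;
}
```

Keeping the builder pure makes the "resume skips --output-schema" rule easy to check without spawning the CLI.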

Claude Code CLI Agent

ClaudeCodeAgent is the Smithers wrapper for Anthropic's claude CLI. It runs the CLI non-interactively, captures the final assistant message, and (by default) forces --output-format stream-json so Smithers can stream structured progress.

Authentication

  • Subscription billing (default): ClaudeCodeAgent clears ANTHROPIC_API_KEY from the spawned process so the CLI bills your Claude Pro/Max subscription instead of the API. No key is required. The agent logs a one-time warning when it unsets an inherited ANTHROPIC_API_KEY.
  • Subscription login: the claude CLI stores credentials per config directory. To set up an isolated subscription, run CLAUDE_CONFIG_DIR=<dir> claude once and complete /login interactively. The credentials land at <dir>/.credentials.json.
  • Pinning a subscription: pass configDir to use that directory's credentials. Smithers sets CLAUDE_CONFIG_DIR=<configDir> for that invocation, so you can run several subscriptions side by side. Omit it to use the default ~/.claude/.
  • API billing: pass apiKey to bill the Anthropic API instead. When apiKey is set, Smithers stops clearing ANTHROPIC_API_KEY and forwards your key as ANTHROPIC_API_KEY to the spawned claude process.

import { ClaudeCodeAgent } from "smithers-orchestrator";

// Subscription billing, pinned to an isolated login dir (no API key):
const claude = new ClaudeCodeAgent({
  model: "claude-fable-5",
  configDir: "/home/me/.claude-work", // CLAUDE_CONFIG_DIR for this invocation
});

// API billing (switches off subscription auth):
const apiClaude = new ClaudeCodeAgent({
  model: "claude-fable-5",
  apiKey: process.env.ANTHROPIC_API_KEY,
});

This mirrors the Codex authentication surface (codex login / CODEX_HOME via configDir, OPENAI_API_KEY via apiKey).
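The billing rules above boil down to how the child process environment is assembled. The following is a minimal sketch of that documented behavior, not Smithers' source: `resolveClaudeEnv` and `ClaudeAuthOptions` are hypothetical names used for illustration.

```typescript
// Hypothetical helper modelling the documented env handling.
type ClaudeAuthOptions = { configDir?: string; apiKey?: string };

function resolveClaudeEnv(
  parentEnv: Record<string, string | undefined>,
  opts: ClaudeAuthOptions = {},
): Record<string, string> {
  const env: Record<string, string> = {};
  for (const [key, value] of Object.entries(parentEnv)) {
    if (value === undefined) continue;
    // Subscription default: an inherited API key is never passed through.
    if (key === "ANTHROPIC_API_KEY") continue;
    env[key] = value;
  }
  if (opts.apiKey) {
    // API billing: forward the explicitly configured key.
    env["ANTHROPIC_API_KEY"] = opts.apiKey;
  }
  if (opts.configDir) {
    // Pin the credentials directory for this invocation only.
    env["CLAUDE_CONFIG_DIR"] = opts.configDir;
  }
  return env;
}
```

Calling `resolveClaudeEnv(process.env, { configDir: "/home/me/.claude-work" })` would yield an environment without ANTHROPIC_API_KEY and with CLAUDE_CONFIG_DIR set, which is the subscription-pinned case from the bullets above.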

Subscription-mode structured completion

You do not need a Workflow or Task graph to call a model once and get a typed object back. Construct a CLI agent and call agent.generate({ prompt, outputSchema, timeout, abortSignal }) directly. With no apiKey, the call bills the host subscription, returns a single completion, and is bounded by timeout and abortSignal.

generate() resolves to an AI SDK GenerateTextResult. Read the full text from .text; when the response is valid JSON, Smithers parses it into .output (also .experimental_output).

Claude Code (subscription, no API key)

ClaudeCodeAgent.generate() does not auto-inject the schema into the prompt (that injection happens inside <Task>), so when you call it standalone, ask for JSON in the prompt yourself. outputSchema drives .output parsing and validation. Setting outputFormat: "json" and tools: "" keeps the run to a single quiet completion with no tool use.

import { ClaudeCodeAgent } from "smithers-orchestrator";
import { z } from "zod";

const schema = z.object({
  sentiment: z.enum(["positive", "neutral", "negative"]),
  summary: z.string(),
});

const claude = new ClaudeCodeAgent({
  model: "claude-fable-5",
  outputFormat: "json",
  tools: "", // no tools, pure completion
  // no apiKey → bills your Claude Pro/Max subscription
});

const controller = new AbortController();
const result = await claude.generate({
  prompt:
    "Classify the sentiment of this review and summarize it in one sentence. " +
    'Respond with ONLY a raw JSON object: {"sentiment": "...", "summary": "..."}. ' +
    "First character `{`, last character `}`, no prose or code fences.\n\n" +
    "Review: Shipping was slow but the product is excellent.",
  outputSchema: schema,
  timeout: { totalMs: 2 * 60 * 1000, idleMs: 30 * 1000 },
  abortSignal: controller.signal,
});

const parsed = schema.parse(result.output ?? JSON.parse(result.text));
console.log(parsed.sentiment, parsed.summary);

Codex (subscription, strict JSON)

CodexAgent with nativeStructuredOutput: true forwards the schema to the CLI as codex exec --output-schema, so the model is constrained to emit JSON matching the schema. No API key is needed; it bills your ChatGPT subscription via ~/.codex/auth.json (or CODEX_HOME when configDir is set).

import { CodexAgent } from "smithers-orchestrator";
import { z } from "zod";

const schema = z.object({
  sentiment: z.enum(["positive", "neutral", "negative"]),
  summary: z.string(),
});

const codex = new CodexAgent({
  model: "gpt-5.5",
  nativeStructuredOutput: true, // forwards outputSchema as --output-schema
  // no apiKey → bills your ChatGPT subscription
});

const result = await codex.generate({
  prompt:
    "Classify the sentiment of this review and summarize it in one sentence.\n\n" +
    "Review: Shipping was slow but the product is excellent.",
  outputSchema: schema,
  timeout: { totalMs: 2 * 60 * 1000, idleMs: 30 * 1000 },
});

const parsed = schema.parse(result.output);
console.log(parsed.sentiment, parsed.summary);

Notes:

  • No API key is required for either agent; the call bills the host subscription.
  • outputSchema is honored differently per agent: Codex constrains decoding via --output-schema (strict JSON); Claude Code relies on the JSON you ask for in the prompt and parses it into .output. Both validate against the schema.
  • A single generate() call returns one completion. There is no graph, no durability, and no retry loop unless you add one. For schema-validation retries, durability, and multi-step orchestration, wrap the agent in a <Task>.
  • timeout: { totalMs, idleMs } caps wall-clock and idle time; pass an AbortSignal to cancel from the outside.
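If you stay outside <Task> but still want a bounded retry on invalid output, a thin wrapper is enough. The sketch below is illustrative only: `GenerateFn`, `asSentiment`, and `generateWithRetry` are hypothetical names, the result shape includes just the `.text` and `.output` fields mentioned above, and a hand-written validator stands in for the zod schema so the block has no dependencies.

```typescript
// Hypothetical shapes; only `text` and `output` mirror fields named above.
type GenerateResult = { text: string; output?: unknown };
type GenerateFn = (args: { prompt: string }) => Promise<GenerateResult>;

type Sentiment = { sentiment: "positive" | "neutral" | "negative"; summary: string };

// Stand-in for schema.parse(): returns null instead of throwing on mismatch.
function asSentiment(value: unknown): Sentiment | null {
  if (typeof value !== "object" || value === null) return null;
  const { sentiment, summary } = value as { sentiment?: unknown; summary?: unknown };
  const allowed: readonly string[] = ["positive", "neutral", "negative"];
  if (typeof sentiment !== "string" || !allowed.includes(sentiment)) return null;
  if (typeof summary !== "string") return null;
  return { sentiment: sentiment as Sentiment["sentiment"], summary };
}

function safeJsonParse(text: string): unknown {
  try {
    return JSON.parse(text);
  } catch {
    return undefined; // not JSON; the caller treats this as a failed attempt
  }
}

// Calls generate() up to maxAttempts times until the output validates.
async function generateWithRetry(
  generate: GenerateFn,
  prompt: string,
  maxAttempts = 3,
): Promise<Sentiment> {
  let lastText = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await generate({ prompt });
    lastText = result.text;
    const parsed = asSentiment(result.output ?? safeJsonParse(result.text));
    if (parsed) return parsed;
  }
  throw new Error(`No valid object after ${maxAttempts} attempts; last text: ${lastText}`);
}
```

With a real agent you would pass something like `(args) => claude.generate({ ...args, outputSchema: schema })` as the `generate` argument. Each retry is a fresh, separately billed completion, and nothing is persisted between attempts.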

Common options

All CLI agents accept the same base option surface:

type BaseCliAgentOptions = {
  id?: string;                       // Agent instance id (default: random UUID)
  model?: string;                    // Model name passed to --model
  systemPrompt?: string;             // Prepended to the user prompt
  instructions?: string;             // Alias for systemPrompt
  cwd?: string;                      // Working directory (default: tool ctx rootDir or process.cwd())
  env?: Record<string, string>;      // Extra env vars merged with process.env
  yolo?: boolean;                    // Skip permission prompts (default: true)
  timeoutMs?: number;                // Hard wall-clock cap
  idleTimeoutMs?: number;            // Inactivity cap; resets on any stdout/stderr
  maxOutputBytes?: number;           // Truncate captured output
  extraArgs?: string[];              // Additional CLI flags appended to the command
};

Per-call timeout override:

await agent.generate({
  prompt: "do the thing",
  timeout: { totalMs: 15 * 60 * 1000, idleMs: 2 * 60 * 1000 },
});
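The two caps behave differently: the total cap counts from process start, while the idle cap restarts on every stdout/stderr chunk. A minimal sketch of that bookkeeping follows. `TimeoutTracker` is a hypothetical class written for illustration (Smithers' internals may differ), and time is passed in explicitly so the logic can be followed without real timers.

```typescript
type TimeoutConfig = { totalMs?: number; idleMs?: number };
type TimeoutReason = "total" | "idle" | null;

// Hypothetical tracker: callers feed it timestamps in milliseconds.
class TimeoutTracker {
  private readonly config: TimeoutConfig;
  private readonly startedAt: number;
  private lastActivityAt: number;

  constructor(config: TimeoutConfig, now: number) {
    this.config = config;
    this.startedAt = now;
    this.lastActivityAt = now;
  }

  // Call whenever the child process writes to stdout or stderr.
  recordActivity(now: number): void {
    this.lastActivityAt = now;
  }

  // Reports which cap, if any, has been reached at time `now`.
  check(now: number): TimeoutReason {
    const { totalMs, idleMs } = this.config;
    if (totalMs !== undefined && now - this.startedAt >= totalMs) return "total";
    if (idleMs !== undefined && now - this.lastActivityAt >= idleMs) return "idle";
    return null;
  }
}
```

A chatty process can therefore run right up to totalMs, while a silent one is stopped after idleMs even if the total budget is far from spent.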

Per-agent extras

ClaudeCodeAgent extends the base with Claude Code-specific session and permission flags. Key additions: permissionMode, sessionId, mcpConfig, resume.

import { ClaudeCodeAgent } from "smithers-orchestrator";
new ClaudeCodeAgent({
  permissionMode?: "acceptEdits" | "bypassPermissions" | "default" | "delegate" | "dontAsk" | "plan";
  allowedTools?: string[]; disallowedTools?: string[]; disableSlashCommands?: boolean;
  addDir?: string[]; file?: string[]; fromPr?: string; fallbackModel?: string;
  appendSystemPrompt?: string; agents?: Record<string, { description?: string; prompt?: string }> | string;
  agent?: string; tools?: string[] | "default" | "";
  betas?: string[]; pluginDir?: string[]; resume?: string; sessionId?: string;
  mcpConfig?: string[]; mcpDebug?: boolean; maxBudgetUsd?: number; jsonSchema?: string;
  configDir?: string; apiKey?: string;
  dangerouslySkipPermissions?: boolean; allowDangerouslySkipPermissions?: boolean; chrome?: boolean; noChrome?: boolean;
  continue?: boolean; forkSession?: boolean; noSessionPersistence?: boolean; replayUserMessages?: boolean;
  debug?: boolean | string; debugFile?: string; ide?: boolean; includePartialMessages?: boolean;
  inputFormat?: "text" | "stream-json"; settingSources?: string; settings?: string; strictMcpConfig?: boolean; verbose?: boolean;
  outputFormat?: "text" | "json" | "stream-json"; // default stream-json
});

CodexAgent extends the base with OpenAI Codex-specific flags. Key additions: sandbox, config, outputSchema.

import { CodexAgent } from "smithers-orchestrator";
new CodexAgent({
  sandbox?: "read-only" | "workspace-write" | "danger-full-access";
  fullAuto?: boolean; dangerouslyBypassApprovalsAndSandbox?: boolean;
  config?: Record<string, string | number | boolean | object | null> | string[];
  enable?: string[]; disable?: string[];
  oss?: boolean; localProvider?: string;
  image?: string[]; profile?: string; cd?: string; addDir?: string[];
  skipGitRepoCheck?: boolean; color?: "always" | "never" | "auto";
  outputSchema?: string; outputLastMessage?: string; json?: boolean;
  configDir?: string; apiKey?: string;
});

AntigravityAgent wraps the Google agy CLI. Key additions: allowedMcpServerNames, geminiDir, conversation, continue, and resume.

import { AntigravityAgent } from "smithers-orchestrator";
new AntigravityAgent({
  model?: string; sandbox?: boolean;
  yolo?: boolean; dangerouslySkipPermissions?: boolean;
  allowedMcpServerNames?: string[]; allowedTools?: string[];
  conversation?: string; continue?: boolean; resume?: string;
  includeDirectories?: string[];
  extensions?: string[]; listExtensions?: boolean;
  listSessions?: boolean; deleteSession?: string;
  screenReader?: boolean; outputFormat?: "text" | "json" | "stream-json";
  debug?: boolean;
  binary?: string; configDir?: string; geminiDir?: string; apiKey?: string;
  // Deprecated and rejected at runtime: debug, screenReader, outputFormat,
  // extensions, listExtensions, listSessions, deleteSession.
});

Current agy builds changed several Gemini-era flags. Smithers treats that as a runtime contract, not a best-effort pass-through:

| Smithers option | Emitted agy surface |
| --- | --- |
| includeDirectories | --add-dir |
| conversation / resume | --conversation <id> |
| continue | --continue |
| configDir / geminiDir | --gemini_dir <dir> and GEMINI_DIR=<dir> |
| prompt text | -p <prompt> |

Smithers does not emit --output-format, --include-directories, --resume, --screen-reader, --debug, extension flags, session-list flags, or --prompt for Antigravity. Options that would require those removed flags fail fast with AGENT_CONFIG_INVALID and a replacement hint. Plugins are managed outside workflow launch through agy plugin.
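As a rough picture of that contract, the mapping and the fail-fast check could look like the sketch below. Everything named here is hypothetical (`buildAgyInvocation`, `AgyOptions`, `AgentConfigError`, and the error wording). Emitting one --add-dir per directory, and letting conversation win over resume and configDir win over geminiDir, are assumptions. Only the flag names and the AGENT_CONFIG_INVALID code come from the text above.

```typescript
type AgyOptions = {
  includeDirectories?: string[];
  conversation?: string;
  resume?: string;
  continue?: boolean;
  configDir?: string;
  geminiDir?: string;
  // Options that map to flags removed from current agy builds:
  outputFormat?: string;
  screenReader?: boolean;
  debug?: boolean;
  extensions?: string[];
  listExtensions?: boolean;
  listSessions?: boolean;
  deleteSession?: string;
};

class AgentConfigError extends Error {
  code = "AGENT_CONFIG_INVALID";
}

function buildAgyInvocation(prompt: string, opts: AgyOptions) {
  const removed: Array<keyof AgyOptions> = [
    "outputFormat", "screenReader", "debug", "extensions",
    "listExtensions", "listSessions", "deleteSession",
  ];
  for (const key of removed) {
    if (opts[key] !== undefined) {
      // Fail fast instead of passing a flag agy no longer accepts.
      throw new AgentConfigError(`${key} is not supported by current agy builds`);
    }
  }

  const args: string[] = [];
  const env: Record<string, string> = {};
  for (const dir of opts.includeDirectories ?? []) args.push("--add-dir", dir);
  const conversationId = opts.conversation ?? opts.resume;
  if (conversationId) args.push("--conversation", conversationId);
  if (opts.continue) args.push("--continue");
  const geminiDir = opts.configDir ?? opts.geminiDir;
  if (geminiDir) {
    // The directory is passed both as a flag and as an env var.
    args.push("--gemini_dir", geminiDir);
    env["GEMINI_DIR"] = geminiDir;
  }
  args.push("-p", prompt);
  return { args, env };
}
```

Rejecting removed options up front surfaces a configuration problem at launch rather than as an opaque CLI usage error mid-run.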

GeminiAgent is the deprecated legacy wrapper for the older gemini CLI. Prefer AntigravityAgent for new Google CLI integrations, but existing workflows can still use GeminiAgent.

import { GeminiAgent } from "smithers-orchestrator";
new GeminiAgent({
  debug?: boolean; model?: string; sandbox?: boolean; yolo?: boolean;
  approvalMode?: "default" | "auto_edit" | "yolo" | "plan";
  experimentalAcp?: boolean;
  allowedMcpServerNames?: string[]; allowedTools?: string[];
  extensions?: string[]; listExtensions?: boolean;
  resume?: string; listSessions?: boolean; deleteSession?: string;
  includeDirectories?: string[]; screenReader?: boolean;
  outputFormat?: "text" | "json" | "stream-json";
  configDir?: string; apiKey?: string;
});

PiAgent wraps the Pi CLI and adds extension UI hook support. Key additions: provider, model, mode, onExtensionUiRequest, extension, thinking.

import { PiAgent, type PiExtensionUiRequest, type PiExtensionUiResponse } from "smithers-orchestrator";
new PiAgent({
  provider?: string; model?: string; apiKey?: string; appendSystemPrompt?: string; mode?: "text" | "json" | "rpc";
  print?: boolean; continue?: boolean; resume?: boolean; session?: string;
  sessionDir?: string; noSession?: boolean;
  models?: string | string[]; listModels?: boolean | string;
  extension?: string[]; skill?: string[]; promptTemplate?: string[]; theme?: string[];
  noExtensions?: boolean; noSkills?: boolean; noPromptTemplates?: boolean; noThemes?: boolean;
  tools?: string[]; noTools?: boolean; files?: string[];
  thinking?: "off" | "minimal" | "low" | "medium" | "high" | "xhigh";
  export?: string; verbose?: boolean;
  onExtensionUiRequest?: (req: PiExtensionUiRequest) => Promise<PiExtensionUiResponse | null> | PiExtensionUiResponse | null;
});

KimiAgent wraps the Moonshot Kimi CLI with automatic session isolation. Key additions: thinking, agent, maxRalphIterations.

import { KimiAgent } from "smithers-orchestrator";
new KimiAgent({
  thinking?: boolean; outputFormat?: "text" | "stream-json";
  finalMessageOnly?: boolean; quiet?: boolean;
  agent?: "default" | "okabe"; agentFile?: string;
  workDir?: string; session?: string; continue?: boolean;
  skillsDir?: string; mcpConfigFile?: string[]; mcpConfig?: string[];
  maxStepsPerTurn?: number; maxRetriesPerStep?: number; maxRalphIterations?: number;
  verbose?: boolean; debug?: boolean; configDir?: string;
});

ForgeAgent wraps the Forge CLI and supports 300+ models via provider/model strings. Key additions: conversationId, provider, workflow.

import { ForgeAgent } from "smithers-orchestrator";
new ForgeAgent({
  directory?: string; provider?: string; agent?: string;
  conversationId?: string; sandbox?: string; restricted?: boolean;
  verbose?: boolean; workflow?: string; event?: string; conversation?: string;
});

AmpAgent wraps the Amp CLI in --execute headless mode. Key additions: visibility, mcpConfig, dangerouslyAllowAll.

import { AmpAgent } from "smithers-orchestrator";
new AmpAgent({
  visibility?: "private" | "public" | "workspace" | "group";
  mcpConfig?: string; settingsFile?: string;
  logLevel?: "error" | "warn" | "info" | "debug" | "audit"; logFile?: string;
  dangerouslyAllowAll?: boolean; ide?: boolean; jetbrains?: boolean;
});

VibeAgent wraps Mistral's vibe CLI with streaming JSON output. Key additions: agent, maxTurns, maxPrice, maxTokens, enabledTools, sessionId, continueSession.

import { VibeAgent } from "smithers-orchestrator";
new VibeAgent({
  agent?: string;
  maxTurns?: number; maxPrice?: number; maxTokens?: number;
  enabledTools?: string[];
  sessionId?: string; continueSession?: boolean;
});

OpenCodeAgent wraps the OpenCode CLI via opencode run --format json. Key additions: agentName, continueSession, sessionId. Note: native hijack support is not yet shipped.

import { OpenCodeAgent } from "smithers-orchestrator";
new OpenCodeAgent({
  model?: string; agentName?: string;
  attachFiles?: string[];
  continueSession?: boolean; sessionId?: string;
  variant?: string;
});

Hijack handoff

Most built-in CLI agents support bunx smithers-orchestrator hijack RUN_ID, which relaunches the agent in its native CLI session for interactive takeover.

Smithers persists the native session or conversation id on each task event. On hijack, it waits for a safe boundary between blocking tool calls, then reopens the session via the vendor's resume flag:

| Agent class | Resume flag |
| --- | --- |
| ClaudeCodeAgent | claude --resume |
| CodexAgent | codex resume |
| AntigravityAgent | agy --conversation |
| PiAgent | pi --session |
| KimiAgent | kimi --session |
| ForgeAgent | forge --conversation-id |
| AmpAgent | amp threads continue |

On clean exit, the workflow resumes in detached mode. VibeAgent and OpenCodeAgent support stream capture and headless session continuation (documented above), but native bunx smithers-orchestrator hijack support for them has not shipped yet. The deprecated GeminiAgent has no native hijack launcher; use AntigravityAgent (agy --conversation) for Google CLI takeover. See How it works → Durability and resume.
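To make the handoff concrete, here is a small lookup that turns an agent class and a persisted native id into a resume command. It is a sketch under assumptions: `nativeResumeCommand` is a hypothetical helper, and placing the id directly after the resume flag is assumed for every CLI except Codex, where the text above shows codex resume <thread-id>.

```typescript
// Resume prefixes taken from the hijack table; the id position is an assumption.
const RESUME_PREFIX = new Map<string, string[]>([
  ["ClaudeCodeAgent", ["claude", "--resume"]],
  ["CodexAgent", ["codex", "resume"]],
  ["AntigravityAgent", ["agy", "--conversation"]],
  ["PiAgent", ["pi", "--session"]],
  ["KimiAgent", ["kimi", "--session"]],
  ["ForgeAgent", ["forge", "--conversation-id"]],
  ["AmpAgent", ["amp", "threads", "continue"]],
]);

// Returns argv for the native CLI, or null when hijack is not supported.
function nativeResumeCommand(agentClass: string, nativeId: string): string[] | null {
  const prefix = RESUME_PREFIX.get(agentClass);
  return prefix ? [...prefix, nativeId] : null;
}
```

Agents without an entry (VibeAgent, OpenCodeAgent, GeminiAgent) return null, matching the "not shipped" and "no native launcher" cases.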

Notes

  • Yolo defaults. yolo: true (default) maps to each CLI's "skip approvals" flag (--dangerously-skip-permissions, --dangerously-bypass-approvals-and-sandbox, --yolo, --dangerously-allow-all). Set yolo: false or use the agent-specific approval option for tighter control.
  • Pi rpc mode sends prompts as JSON over stdin and is required for onExtensionUiRequest callbacks; text/json modes pass the prompt as a positional arg with files emitted as @path.
  • Kimi share dir. KimiAgent auto-creates an isolated KIMI_SHARE_DIR per invocation to prevent kimi.json corruption under concurrent runs. Set env.KIMI_SHARE_DIR to opt out.
  • Antigravity config. AntigravityAgent launches the agy binary and passes configDir/geminiDir as both --gemini_dir and GEMINI_DIR, matching Antigravity's ~/.gemini/antigravity-cli config root. Current agy prompts use -p, extra directories use --add-dir, and native resume uses --conversation.
  • Non-idempotent retries. When a <Task> retries, Smithers prepends a warning listing previously-called side-effect tools so the agent can verify external state before re-invoking them.
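The non-idempotent retry note can be illustrated with a tiny prompt decorator. This is a sketch only: `withRetryWarning` is a hypothetical function, and the warning wording is invented for illustration rather than copied from Smithers.

```typescript
// Hypothetical: prepend a warning naming side-effect tools used by earlier attempts.
function withRetryWarning(prompt: string, priorSideEffectTools: string[]): string {
  // De-duplicate while keeping first-seen order.
  const unique = Array.from(new Set(priorSideEffectTools));
  if (unique.length === 0) return prompt; // first attempt or no side effects: unchanged
  const warning =
    "WARNING: a previous attempt already called these side-effect tools: " +
    unique.join(", ") +
    ". Verify external state before calling them again.";
  return `${warning}\n\n${prompt}`;
}
```

Putting the warning ahead of the task text means the agent sees it before it plans any tool calls on the retried attempt.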