You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Google target: Claude-on-Vertex spike, mixed-model seam, gap workarounds, honest reframe
Spike (offline-verified, live blocked on Model Garden enablement):
- experiments/claude-on-vertex/: ADK 1.34.3 resolves Claude on Vertex and the
mixed-model shape composes (web sub-agents stay Gemini); guarded live probe + RESULTS.md.
No new dependency (1.34.3 is already our floor).
Code (implemented + tested, no user-facing passthrough until a live receipt exists):
- google_codegen: web_model() pins wrapped web tool-agents to Gemini by construction
(Search grounding / URL Context are Gemini built-ins) -- robust to a future Claude parent.
- google_plan: refuse a Claude --google-model (google.deploy_model.claude_unsupported)
rather than silently encode an unverified deploy path.
Docs (reframe gaps as non-equivalences / non-goals, not parity TODOs):
- deploy-google.md: document the two Google gaps + workarounds (sandbox -> URL MCP server,
:ask -> gate client-side or keep on Anthropic).
- README/provider-matrix/limitations/CLAUDE: roadmap reframe, audit reason de-undersold,
coverage-matrix footnote notes the web tools' separate live receipt.
Tests: pin the web_model invariant + the deploy-model guard. 115 passed (offline suite).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Copy file name to clipboardExpand all lines: CLAUDE.md
+16-6Lines changed: 16 additions & 6 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -82,10 +82,10 @@ A bare ref resolves to the agent's **own** resource first, then `shared/`.
82
82
| Skills | uploaded, shared by id (skill-bearing agents auto-get `read` — Managed Agents needs it to open `SKILL.md`) | ✅ embedded in source package, loaded via ADK `load_skill_from_dir` (update = redeploy) | export comment only |
83
83
| Remote MCP | mapped | ✅ URL → ADK `McpToolset` + `tool_filter`; inline auth → Agent Engine `env_vars` (resolved at deploy, never inlined) | export comment only |
84
84
| Built-in web tools (`web_search`/`web_fetch`) | mapped | ✅ `web_search`→Google Search grounding, `web_fetch`→URL Context, each a wrapped single-tool ADK sub-agent (`AgentTool`, `propagate_grounding_metadata=True`); always-wrap so they coexist with `transfer_to_agent`; pins `google-adk>=1.34.3`|`WebSearchTool` / self-host fetch |
| Model | Claude (native) | 🔁 mapped to Gemini (`gemini-2.5-flash`) | 🔁 mapped to `gpt-*`|
88
+
| Model | Claude (native) | 🔁 mapped to Gemini (`gemini-2.5-flash`); Claude-on-Vertex is an offline-verified **spike, not shipped** (`experiments/claude-on-vertex/`) — a Claude `--google-model` is refused (`google.deploy_model.claude_unsupported`)| 🔁 mapped to `gpt-*`|
89
89
90
90
**Live-verified (6/6 both):** one neutral fixture (`tests/live/fixtures/coverage-matrix`) was deployed
91
91
+ queried on **both** Anthropic and Google; all six portability dimensions (agents · subagents ·
@@ -111,9 +111,19 @@ so the objective signal is the tool-call + its response content, not citation ch
111
111
`deploy --target google` reports *agentlift's current implementation*. These now agree on
112
112
skills, URL MCP, and the built-in **web** tools (all mapped). They still diverge on the
113
113
built-in **sandbox** tools and `:ask` (`audit` rates them `degraded`/`unsupported` for
114
-
Google; `deploy` skips a stdio MCP server / sandbox-tool-only folder). Pipeline for Google
115
-
mirrors Anthropic's *plan-is-the-contract* discipline: `google_plan.py` is pure and
116
-
offline-tested, only `google_target.py` touches the network.
114
+
Google; `deploy` skips a stdio MCP server / sandbox-tool-only folder). Those two are framed
115
+
as **non-goals with workarounds**, not parity TODOs (sandbox → expose via a URL MCP server;
116
+
`:ask` → gate client-side or keep on Anthropic — see
117
+
[docs/deploy-google.md](docs/deploy-google.md)). Pipeline for Google mirrors Anthropic's
118
+
*plan-is-the-contract* discipline: `google_plan.py` is pure and offline-tested, only
119
+
`google_target.py` touches the network.
120
+
121
+
**Claude-on-Vertex (spike, not shipped):** ADK 1.34.3 resolves Claude on Vertex and the
122
+
mixed-model shape composes (web sub-agents must stay Gemini — Search/URL-Context are Gemini
123
+
built-ins, encoded by `web_model()` in `google_codegen.py`). Offline-verified in
124
+
`experiments/claude-on-vertex/`; no live receipt yet, so `build_google_plan`**refuses** a
125
+
Claude `--google-model` (`google.deploy_model.claude_unsupported`) rather than silently
126
+
shipping it (the *confirm-live-before-encoding* rule).
reason: hosted sandbox is Python/JS only - no bash, no persistent workspace filesystem
188
188
unsupported:
189
189
x Per-tool approval gate (:ask / human-in-the-loop)
190
190
reason: not enforced with VertexAiSessionService on the deployed runtime
@@ -225,7 +225,7 @@ One neutral [coverage fixture](tests/live/fixtures/coverage-matrix/) — a coord
225
225
226
226
> `*`**OpenAI is an export target, not a live deploy** — there is no code-define-and-OpenAI-host path. agents + subagents are real (the `as_tool` composition is trace-verified in [`experiments/subagent-composition`](experiments/subagent-composition/)); MCP and skills compile to guided self-host scaffolding (`HostedMCPTool` / Skills-API call sites), since the orchestration loop runs in *your* app. `◐` = compiled to a self-host stub, not live-exercised.
227
227
>
228
-
> `EXERCISED` = an objective runtime event proved it. **Both Anthropic and Google exercised all six dimensions server-side (6/6)** on a real, billable deploy. Anthropic's subagents cell keys on the native delegation event (`session.thread_created` + `agent.thread_message_sent`) because coordinator delegation is async (the worker's reply lands after a one-shot query returns). The **wired** layer is pinned offline in [`tests/test_coverage_matrix_plan.py`](tests/test_coverage_matrix_plan.py) (CI); the EXERCISED column comes from committed live receipts under [`tests/live/receipts/`](tests/live/receipts/). Full evidence + reproduce steps: [`docs/tested-platforms.md`](docs/tested-platforms.md#live-coverage-matrix--receipt-evidence-not-a-capability-ranking) and [`tests/live/README.md`](tests/live/README.md).
228
+
> `EXERCISED` = an objective runtime event proved it. **Both Anthropic and Google exercised all six dimensions server-side (6/6)** on a real, billable deploy. Anthropic's subagents cell keys on the native delegation event (`session.thread_created` + `agent.thread_message_sent`) because coordinator delegation is async (the worker's reply lands after a one-shot query returns). The **wired** layer is pinned offline in [`tests/test_coverage_matrix_plan.py`](tests/test_coverage_matrix_plan.py) (CI); the EXERCISED column comes from committed live receipts under [`tests/live/receipts/`](tests/live/receipts/). The built-in **web tools** (`web_search`/`web_fetch`) are not part of this six-dimension fixture — they were exercised on a **separate** Google live deploy (both tool-agents fired server-side), receipted independently. Full evidence + reproduce steps: [`docs/tested-platforms.md`](docs/tested-platforms.md#live-coverage-matrix--receipt-evidence-not-a-capability-ranking) and [`tests/live/README.md`](tests/live/README.md).
229
229
230
230
## Isolation: each agent gets only its folder
231
231
@@ -392,7 +392,15 @@ Everything is here or one click away:
392
392
393
393
## Roadmap
394
394
395
-
-**Google deploy parity** — the live `deploy --target google` now ships prompts + coordinator/`sub_agents` + **skills + URL MCP (with inline-auth-via-env-vars) + built-in web tools (`web_search`→Google Search grounding, `web_fetch`→URL Context)** + model, idempotent via a spec hash. Remaining for full parity: the built-in **sandbox** tools (bash/files/glob-grep — Vertex's sandbox is Python/JS only), `:ask`/per-tool approval (not enforced on `VertexAiSessionService`), Claude-on-Vertex models (today Claude→Gemini), and per-agent IDs via A2A.
395
+
The live `deploy --target google` ships prompts + coordinator/`sub_agents` + **skills + URL MCP (with inline-auth-via-env-vars) + built-in web tools (`web_search`→Google Search grounding, `web_fetch`→URL Context)** + model, idempotent via a spec hash. The remaining Google differences are **known gaps and non-equivalences, not all roadmap items** — two are deliberate non-goals:
396
+
397
+
-**Built-in sandbox tools (bash/files/glob-grep) — emulating them in-engine is a non-goal, not a TODO.** Vertex's hosted sandbox is Python/JS only with no persistent workspace filesystem; pretending it's a shell+FS *inside* Agent Engine would be exactly the silent degradation agentlift exists to surface. If a Google-hosted agent needs filesystem/shell/code-search, expose that environment deliberately through a URL MCP server, which *does* deploy — see [the workaround](docs/deploy-google.md#two-known-gaps-and-how-to-work-around-them).
398
+
-**`:ask` / per-tool approval — gate it in your caller, don't fake it in the runtime.** Not enforced on `VertexAiSessionService`, so enforce approval client-side in the loop that calls the engine, or keep `:ask` agents on Anthropic where the gate is native. Surfaced as a diagnostic, never lost.
399
+
-**Claude-on-Vertex models — offline-verified spike, not shipped.** Today Claude folder models map to Gemini. ADK 1.34.3 *can* resolve Claude on Vertex and the mixed-model shape composes (web sub-agents stay Gemini) — proven offline in [`experiments/claude-on-vertex/`](experiments/claude-on-vertex/) — but with no live receipt yet, a Claude `--google-model` is **refused**, not silently shipped.
400
+
-**Per-agent IDs via A2A.** Google deploys the roster as one `reasoningEngine`; per-agent addressability would need the A2A protocol across deployments.
401
+
402
+
Genuinely on the roadmap:
403
+
396
404
-**`export openai-chatkit`** — wrap the `openai-agents` script in a self-hostable ChatKit server (the Agents SDK export already ships)
Copy file name to clipboardExpand all lines: docs/limitations.md
+1-1Lines changed: 1 addition & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -60,7 +60,7 @@ same folder reaches every target. What differs is how far each runtime takes it:
60
60
| Target | Status | Limits |
61
61
|---|---|---|
62
62
| Anthropic Managed Agents | Live deploy | Reference target; most complete mapping (skills, MCP, `:ask`, coordinator). |
63
-
| Google Vertex AI Agent Engine | Live deploy, preview | Deployed as a real `reasoningEngine`; maps **skills** (embedded + ADK `load_skill_from_dir`), **URL MCP** (`McpToolset` + `tool_filter`, inline auth → Agent Engine `env_vars`), and the **built-in web tools** (`web_search` → Google Search grounding, `web_fetch` → URL Context, each a wrapped tool-agent), idempotent via a spec hash. **All six portability dimensions exercised live** (delegation, both MCP servers, both skills — see the [coverage matrix](tested-platforms.md#live-coverage-matrix--receipt-evidence-not-a-capability-ranking)); the web tools were separately exercised live (both tool-agents fired on a deployed engine). Not mapped: the built-in **sandbox** tools (`bash/files/glob-grep` — Vertex's sandbox is Python/JS only) and `:ask`/per-tool approval; stdio MCP refused; Claude models map to Gemini. |
63
+
| Google Vertex AI Agent Engine | Live deploy, preview | Deployed as a real `reasoningEngine`; maps **skills** (embedded + ADK `load_skill_from_dir`), **URL MCP** (`McpToolset` + `tool_filter`, inline auth → Agent Engine `env_vars`), and the **built-in web tools** (`web_search` → Google Search grounding, `web_fetch` → URL Context, each a wrapped tool-agent), idempotent via a spec hash. **All six portability dimensions exercised live** (delegation, both MCP servers, both skills — see the [coverage matrix](tested-platforms.md#live-coverage-matrix--receipt-evidence-not-a-capability-ranking)); the web tools were separately exercised live (both tool-agents fired on a deployed engine). Not mapped, each with a workaround in [deploy-google.md](deploy-google.md#two-known-gaps-and-how-to-work-around-them): the built-in **sandbox** tools (`bash/files/glob-grep` — Vertex's sandbox is Python/JS only; expose equivalents via a URL MCP server, an explicit non-goal to emulate in-engine) and `:ask`/per-tool approval (gate it client-side); stdio MCP refused; Claude models map to Gemini (Claude-on-Vertex is an [offline-verified spike](../experiments/claude-on-vertex/), not shipped). |
64
64
| OpenAI Agents SDK | Export / self-host | Subagents via agent-as-tool; the delegation loop runs in your app — no hosted-deploy target. |
0 commit comments