Commit e113225

phuryn and claude committed
Claude-on-Vertex spike: make the live probe turn-key on claude-sonnet-4-6
The Model Garden precondition is now satisfied (claude-sonnet-4-6 enabled), so the deploy probe is no longer blocked on enablement -- only on being run.

- deploy probe: default CLAUDE_VERTEX_MODEL=claude-sonnet-4-6, engine region us-central1; add optional CLAUDE_VERTEX_REGION -> injected as a GOOGLE_CLOUD_LOCATION Agent Engine env var so the in-engine Claude client can target the global model endpoint while the engine resource stays regional (the one live unknown).
- construct probe: default to the bare claude-sonnet-4-6 id (verified to resolve in ADK 1.34.3 without an @version suffix).
- RESULTS.md: refresh console output, correct the "id must be @versioned" claim (bare resolves), and reframe the live-half preconditions around the region question.

Still offline-only: no live receipt, planner still refuses a Claude --google-model.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
1 parent 2108eb4 commit e113225

3 files changed

Lines changed: 58 additions & 28 deletions

experiments/claude-on-vertex/RESULTS.md

Lines changed: 24 additions & 15 deletions
@@ -17,19 +17,19 @@ probe that would graduate it.

 ```
 registry:
-'claude-sonnet-4-5@20250929' -> google.adk.models.anthropic_llm.Claude
+'claude-sonnet-4-6' -> google.adk.models.anthropic_llm.Claude
 Claude.supported_models() = ['claude-3-.*', 'claude-.*-4.*']

 constructed (offline, no ADC):
-parent : lead model=claude-sonnet-4-5@20250929 -> Claude
+parent : lead model=claude-sonnet-4-6 -> Claude
 web sub : lead_web_search model=gemini-2.5-flash -> Gemini

 OK: Claude main agent + Gemini-pinned web sub-agent compose. Mixed-model invariant holds.
 ```

 Two facts established:

-1. **ADK natively resolves Claude on Vertex.** `LLMRegistry.resolve("claude-sonnet-4-5@20250929")`
+1. **ADK natively resolves Claude on Vertex.** `LLMRegistry.resolve("claude-sonnet-4-6")`
 returns `google.adk.models.anthropic_llm.Claude`, backed by `AsyncAnthropicVertex`
 Claude served through Vertex AI, no extra package. An `LlmAgent(model="claude-…")` is a
 valid ADK agent.
@@ -38,27 +38,35 @@ Two facts established:
 `GoogleSearchTool`/`url_context`, which are **Gemini built-ins** — they cannot run on a
 Claude model. So a Claude parent must pin its wrapped web sub-agents to Gemini.

-### The Vertex Claude id is `@versioned`
+### The Vertex Claude id resolves bare (a `@version` suffix is optional)

-A Vertex Claude model id carries an `@version` suffix (`claude-sonnet-4-5@20250929`),
-unlike the Anthropic API ids the folder uses (`claude-haiku-4-5`). Any future passthrough
-would have to map folder ids → the `@versioned` Vertex ids, per region availability.
+ADK resolves the **bare** Vertex Claude id (`claude-sonnet-4-6`) — confirmed above against
+`Claude.supported_models() = ['claude-3-.*', 'claude-.*-4.*']` — and an `@versioned` form
+(`claude-sonnet-4-5@20250929`) resolves too. So a future passthrough maps folder ids →
+Vertex Claude ids subject to **per-region/Model-Garden availability**, not a mandatory
+version-pinning step.

 ## What this does NOT prove (the live half — blocked)

 `claude_on_vertex_deploy.py` is the live probe that would close the loop: it deploys ONE
 `reasoningEngine` with a Claude-on-Vertex root + Gemini web sub-agent, queries it (the
 instruction prepends a literal `CLAUDEVTX` token so the reply confirms which brain
 answered), and tears it down. It is **env-driven and committed without identifiers**, and
-has **not been run** — it needs preconditions that aren't satisfiable in this session:
+has **not been run yet**. Preconditions:

 - **Claude enabled in the project's Vertex AI Model Garden** — a one-time console action;
-Claude on Vertex is an enable-per-project, region-gated partner model.
-- **A region that serves the chosen Claude model** (e.g. `us-east5` — not every region
-serves every Claude model).
+Claude on Vertex is an enable-per-project, region-gated partner model. *Now satisfied:*
+`claude-sonnet-4-6` was enabled in this project (2026-06-04), so this is no longer the
+blocker — only running the probe is.
+- **The model-call region (the one live unknown).** An Agent Engine *resource* deploys to a
+real region; at runtime the in-engine ADK Claude client calls `AsyncAnthropicVertex` with
+`GOOGLE_CLOUD_LOCATION`. If the model is served only at the **global** endpoint (the
+Vertex quickstart uses `region="global"`), the probe injects `GOOGLE_CLOUD_LOCATION=global`
+as an engine env var (`CLAUDE_VERTEX_REGION=global`) while the engine stays in a real
+region. Whether one knob or the override is needed is exactly what the live run settles.
 - **A billable project + staging bucket + ADC**, exactly like a normal Google deploy.

-Until that runs green, "Agent Engine will deploy *and run* a Claude-on-Vertex engine
+Until the probe runs green, "Agent Engine will deploy *and run* a Claude-on-Vertex engine
 end-to-end" is **NOT-PROVEN** — distinct from the offline-verified construction.

 ## What shipped in agentlift as a result of this spike
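
Reviewer sketch (not part of the commit): the "bare id resolves" claim in the hunk above can be sanity-checked offline against the two patterns printed by `Claude.supported_models()`. The helper name `matches_supported` is hypothetical, and the sketch assumes the registry applies full-match semantics to these patterns; that is an assumption about ADK internals, not something the diff shows.

```python
import re

# Patterns copied from the console output in RESULTS.md.
SUPPORTED_PATTERNS = ["claude-3-.*", "claude-.*-4.*"]


def matches_supported(model_id: str) -> bool:
    """True if model_id fully matches any supported-model pattern.

    Assumption: full-match semantics; this is a stand-in for the real
    registry lookup, not ADK code.
    """
    return any(re.fullmatch(pattern, model_id) for pattern in SUPPORTED_PATTERNS)
```

Under that assumption, both `claude-sonnet-4-6` and `claude-sonnet-4-5@20250929` match the second pattern, which is consistent with the corrected heading. A pattern match says nothing about Model Garden availability in a given region.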
@@ -85,10 +93,11 @@ user-facing passthrough flag. Concretely:
 1. Run `claude_on_vertex_deploy.py` against a project with Claude enabled in Model Garden;
 capture the `CLAUDEVTX`-prefixed reply as a receipt (the unforgeable signal that the
 Claude brain — not the Gemini default — answered).
-2. Encode the wire behavior: the folder-id → `@versioned`-Vertex-id map (per region), and
-whatever `requirements`/region constraints the live deploy revealed.
+2. Encode the wire behavior: the folder-id → Vertex-Claude-id map (per Model-Garden/region
+availability; bare id ok), and whatever `requirements`/region constraints the live
+deploy revealed (notably whether the model call needs the `global` endpoint).
 3. Replace the planner guard with a real passthrough (e.g. `--google-model claude-…` or a
 per-agent opt-in), keeping `web_model()` pinning the web sub-agents to Gemini.

 *Offline half confirmed 2026-06-04 with google-adk 1.34.3. Live half: NOT-PROVEN (Model
-Garden enablement required).*
+Garden now enabled for `claude-sonnet-4-6`; deploy probe not yet run).*
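
Reviewer sketch (not part of the commit): step 1 treats a `CLAUDEVTX`-prefixed reply as the receipt. A minimal check might look like the following. The function name `is_claude_receipt` is hypothetical, and the sketch assumes the token appears at the very start of the reply text, with leading whitespace tolerated; the probe's actual reply handling is not shown in this diff.

```python
RECEIPT_TOKEN = "CLAUDEVTX"  # literal token the probe's instruction asks the model to prepend


def is_claude_receipt(reply: str) -> bool:
    """True if the reply starts with the receipt token (leading whitespace ignored)."""
    return reply.lstrip().startswith(RECEIPT_TOKEN)
```

A prefix check is stricter than a substring check: a reply that merely mentions the token later in the text would not count as a receipt.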

experiments/claude-on-vertex/claude_on_vertex_construct.py

Lines changed: 3 additions & 3 deletions
@@ -22,9 +22,9 @@
 """
 from __future__ import annotations

-# A Vertex Claude model id carries an @version suffix (unlike the Anthropic API ids the
-# folder uses, e.g. claude-haiku-4-5). Pick one your project has enabled for the deploy half.
-CLAUDE_VERTEX_MODEL = "claude-sonnet-4-5@20250929"
+# ADK resolves the bare Vertex Claude id (an @version suffix also works). Pick one your
+# project has enabled for the deploy half; this matches claude_on_vertex_deploy.py's default.
+CLAUDE_VERTEX_MODEL = "claude-sonnet-4-6"
 GEMINI_WEB_MODEL = "gemini-2.5-flash"  # web grounding / URL Context are Gemini built-ins

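
Reviewer sketch (not part of the commit): the mixed-model invariant the construct probe reports ("Claude main agent + Gemini-pinned web sub-agent") can be stated as a small tree check. `AgentSpec` and `invariant_violations` are hypothetical stand-ins written in plain Python; they are not ADK types, and the `uses_gemini_builtins` flag is an assumed simplification of "wraps `GoogleSearchTool`/`url_context`".

```python
from dataclasses import dataclass, field


@dataclass
class AgentSpec:
    """Hypothetical stand-in for an agent node; not an ADK class."""
    name: str
    model: str
    uses_gemini_builtins: bool = False
    sub_agents: list = field(default_factory=list)


def invariant_violations(agent: AgentSpec) -> list:
    """Names of agents that use Gemini built-in tools on a non-Gemini model."""
    found = []
    if agent.uses_gemini_builtins and not agent.model.startswith("gemini-"):
        found.append(agent.name)
    for sub in agent.sub_agents:
        found.extend(invariant_violations(sub))
    return found
```

With the names from the console output, a `lead` agent on `claude-sonnet-4-6` whose `lead_web_search` sub-agent runs on `gemini-2.5-flash` yields no violations, while moving the web sub-agent onto the Claude id would report it.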

experiments/claude-on-vertex/claude_on_vertex_deploy.py

Lines changed: 31 additions & 10 deletions
@@ -9,18 +9,29 @@

 PRECONDITIONS (all on you, the deployer):
 * Claude models ENABLED in your Vertex AI Model Garden (a one-time console action;
-Claude on Vertex is an enable-per-project, region-gated partner model).
-* A region that offers the chosen Claude model (e.g. us-east5 -- NOT every region
-serves every Claude model; check Model Garden for availability).
+Claude on Vertex is an enable-per-project, region-gated partner model). Confirmed
+available for claude-sonnet-4-6 in this project (2026-06-04).
 * A billable GCP project + a Cloud Storage staging bucket + ADC, exactly like a
 normal Google deploy (see docs/deploy-google.md).

+THE REGION QUESTION (the one thing the live run is here to settle):
+An Agent Engine *resource* deploys to a real region (GOOGLE_CLOUD_LOCATION). At
+runtime the in-engine ADK Claude client builds AsyncAnthropicVertex(region=that same
+location) -- so the model is called in the engine's region. If your Claude model is
+served *regionally* (e.g. the engine region also serves it), one knob is enough. If
+it is served only at the **global** endpoint (the Vertex quickstart uses
+region="global"), set CLAUDE_VERTEX_REGION=global: the engine still deploys to a real
+region, but we inject GOOGLE_CLOUD_LOCATION=global as an engine env var so the model
+call targets the global endpoint. Try the one-knob path first; reach for the override
+only if the run fails to find the model.
+
 Environment:
 GOOGLE_CLOUD_PROJECT=your-project
-GOOGLE_CLOUD_LOCATION=us-east5 # a region where your Claude model is served
+GOOGLE_CLOUD_LOCATION=us-central1 # where the ENGINE deploys (Agent Engine region)
 GOOGLE_GENAI_USE_VERTEXAI=TRUE
 AGENTLIFT_GCP_STAGING_BUCKET=gs://your-bucket
-CLAUDE_VERTEX_MODEL=claude-sonnet-4-5@20250929 # the @versioned Vertex Claude id
+CLAUDE_VERTEX_MODEL=claude-sonnet-4-6 # the Vertex Claude id you enabled (bare id ok)
+# CLAUDE_VERTEX_REGION=global # OPTIONAL: model-call region if != engine region
 # ADC from `gcloud auth application-default login`, or GOOGLE_APPLICATION_CREDENTIALS

 Run:
@@ -87,19 +98,29 @@ def deploy() -> None:
     import vertexai
     from vertexai import agent_engines

-    env = _require("GOOGLE_CLOUD_PROJECT", "AGENTLIFT_GCP_STAGING_BUCKET", "CLAUDE_VERTEX_MODEL")
-    location = os.environ.get("GOOGLE_CLOUD_LOCATION", "us-east5")
+    env = _require("GOOGLE_CLOUD_PROJECT", "AGENTLIFT_GCP_STAGING_BUCKET")
+    model = os.environ.get("CLAUDE_VERTEX_MODEL", "claude-sonnet-4-6")
+    location = os.environ.get("GOOGLE_CLOUD_LOCATION", "us-central1")
     os.environ.setdefault("GOOGLE_GENAI_USE_VERTEXAI", "TRUE")

+    # The model-call region. If the Claude model is only served at the global endpoint
+    # (the Vertex quickstart uses region="global"), set CLAUDE_VERTEX_REGION=global: the
+    # engine still deploys to `location`, but we inject GOOGLE_CLOUD_LOCATION as an engine
+    # env var so the in-engine AsyncAnthropicVertex client targets that endpoint instead.
+    model_region = os.environ.get("CLAUDE_VERTEX_REGION")
+    env_vars = {"GOOGLE_CLOUD_LOCATION": model_region} if model_region else None
+
     vertexai.init(
         project=env["GOOGLE_CLOUD_PROJECT"],
         location=location,
         staging_bucket=env["AGENTLIFT_GCP_STAGING_BUCKET"],
     )
-    print(f"deploying Claude-on-Vertex engine: model={env['CLAUDE_VERTEX_MODEL']} region={location}")
+    region_note = f" model_region={model_region}" if model_region else ""
+    print(f"deploying Claude-on-Vertex engine: model={model} region={location}{region_note}")
     remote = agent_engines.create(
-        agent_engine=_build_app(env["CLAUDE_VERTEX_MODEL"]),
+        agent_engine=_build_app(model),
         requirements=["google-cloud-aiplatform[adk,agent_engines]", "google-adk>=1.34.3"],
+        env_vars=env_vars,
     )
     with open(STATE, "w", encoding="utf-8") as fh:
         fh.write(remote.resource_name)
@@ -129,7 +150,7 @@ def teardown() -> None:
         raise SystemExit("no state file; nothing to tear down (or delete the engine in the console).")
     resource_name = open(STATE, encoding="utf-8").read().strip()
     env = _require("GOOGLE_CLOUD_PROJECT")
-    location = os.environ.get("GOOGLE_CLOUD_LOCATION", "us-east5")
+    location = os.environ.get("GOOGLE_CLOUD_LOCATION", "us-central1")
     vertexai.init(project=env["GOOGLE_CLOUD_PROJECT"], location=location)
     print(f"deleting {resource_name} ...")
     agent_engines.get(resource_name).delete(force=True)
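
Reviewer sketch (not part of the commit): the two-knob region logic added to `deploy()` can be isolated as a pure function, which makes the "one knob vs. override" behavior checkable without any GCP access. `resolve_regions` and `DEFAULT_ENGINE_REGION` are hypothetical names; the defaults and the env-var keys are taken from the diff above, and the sketch makes no Vertex calls.

```python
from __future__ import annotations

from typing import Mapping, Optional

DEFAULT_ENGINE_REGION = "us-central1"  # default engine region used in the diff


def resolve_regions(environ: Mapping[str, str]) -> tuple[str, Optional[dict[str, str]]]:
    """Return (engine_location, engine_env_vars) from an environment mapping.

    engine_location is where the Agent Engine resource deploys; engine_env_vars
    is None unless CLAUDE_VERTEX_REGION is set to a non-empty value, in which
    case it overrides GOOGLE_CLOUD_LOCATION inside the engine only.
    """
    location = environ.get("GOOGLE_CLOUD_LOCATION", DEFAULT_ENGINE_REGION)
    model_region = environ.get("CLAUDE_VERTEX_REGION")
    env_vars = {"GOOGLE_CLOUD_LOCATION": model_region} if model_region else None
    return location, env_vars
```

Passing `os.environ` reproduces the probe's behavior; passing a plain dict keeps the check hermetic. Note that an empty `CLAUDE_VERTEX_REGION` is treated the same as unset, mirroring the truthiness test in the diff.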
