Commit 02ad8ec

phuryn and claude committed
Release 0.5.0: full Google Skills + URL MCP mapping, 6/6 live coverage verification
Google (--target google) deploy now maps the same portability surface as Anthropic for skills and remote MCP, and both providers are verified live, end to end.

Google target:
- Skills embedded in the engine's source package, loaded via ADK load_skill_from_dir
- URL MCP -> ADK McpToolset + tool_filter allowlist
- Inline MCP auth resolved from the deployer's local env -> Agent Engine env_vars (never inlined into source, plan, or lockfile; only env-var NAMES are written down)
- google_plan.py (pure) / google_codegen.py / google_lock.py spec-hash idempotency (create / update / skip), mirroring Anthropic's plan-is-the-contract discipline

Live coverage matrix (tests/live/): one neutral 6-dimension fixture deployed to BOTH runtimes and queried; classification keyed on objective runtime events, not answer text. Both Anthropic and Google exercised all six dimensions server-side (6/6). For async Anthropic subagents the proof is the native delegation event (session.thread_created + agent.thread_message_sent), not a completed worker round-trip. Receipts committed as evidence under tests/live/receipts/ (GCP project id redacted; engines torn down).

Portability fix: the planner auto-enables `read` for skill-bearing agents (Managed Agents needs it to open SKILL.md) with a skills.read_enabled warning.

Docs: README 3-provider coverage matrix (OpenAI export-only, asterisked); tested-platforms, limitations, deploy-google updated to the 6/6 live state.

Offline contract pinned in tests/test_coverage_matrix_plan.py (CI); live wrapper gated behind AGENTLIFT_LIVE_COVERAGE=1.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
1 parent 4a92bdb commit 02ad8ec

47 files changed

Lines changed: 4114 additions & 122 deletions

.github/workflows/ci.yml

Lines changed: 3 additions & 1 deletion
@@ -17,7 +17,9 @@ jobs:
       - uses: actions/setup-python@v5
         with:
           python-version: ${{ matrix.python-version }}
-      - run: python -m pip install -e ".[dev]"
+      # install the Google ADK stack too, so the ADK-dependent offline tests
+      # (generated-package build + import) actually run on GitHub instead of skipping
+      - run: python -m pip install -e ".[dev,google]"
       - name: run offline suite (no API key)
         run: pytest -m "not live" -q

.gitignore

Lines changed: 15 additions & 1 deletion
@@ -16,8 +16,22 @@ venv/
 src/*.egg-info/

 # agentlift deploy state (per-project lockfile is meant to be committed by USERS,
-# but in THIS repo's examples we keep generated lockfiles out)
+# but in THIS repo's examples + test fixtures we keep generated state out — it
+# carries real resource IDs and is regenerated on every deploy)
 examples/**/.agentlift-lock.json
+tests/live/**/.agentlift-lock.json
+tests/live/**/.agentlift-google.json
+# live-harness operational state (deploy bookkeeping; points at real/short-lived
+# resources). The committed *receipts* are the evidence; this scratch is not.
+tests/live/receipts/_state-*.json
+tests/live/receipts/_preflight-*.json
+
+# generated Google Agent Engine source package (rebuilt on every deploy)
+.agentlift-build/
+**/.agentlift-build/
+
+# local agent runtime state (Claude Code scheduled-task locks, etc.)
+.claude/

 # scratch
 Temp/

AGENTS.md

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+# AGENTS.md
+
+**Read [CLAUDE.md](CLAUDE.md) first — it is the canonical guide for this repository.**
+
+It covers the architecture (`parse → plan → apply → run`), the module map, the
+`.managed-agents/` folder convention, per-provider status (Anthropic / Google /
+OpenAI), the commands, and the dev workflow + ground rules.
+
+## Quick orientation
+
+agentlift compiles one neutral agent folder to multiple managed-agent runtimes.
+The front half is **pure** (`parser.py`, `planner.py`, `capabilities.py`,
+`export.py`); only the `*_target.py` and `runtime.py` modules touch the network.
+
+```bash
+python -m pip install -e ".[dev]"
+pytest -m "not live"      # deterministic suite CI runs — start here
+```
+
+Non-negotiables (full version in [CLAUDE.md](CLAUDE.md)):
+
+- Keep `parser.py` and `planner.py` pure — no network, clock, or randomness.
+- Every translation rule gets an offline test asserting the plan. The plan is the contract.
+- Surface untranslatable things as `Diagnostic`s; never drop silently.
+- `capabilities.py` is the single source of truth for provider support tiers.
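Reviewer sketch (not part of the commit): the "single source of truth" rule in the last bullet can be pictured as one provider × feature → tier table that both `audit` and `export` read. The tier names come from the text; the feature keys, the tier values and the `audit` helper below are illustrative assumptions, not the contents of the real `capabilities.py`.

```python
# Tier names as listed for `audit`; everything else here is hypothetical.
TIERS = ("native", "emulated", "degraded", "unsupported")

# Illustrative values only; the real map lives in capabilities.py.
CAPABILITIES = {
    "anthropic": {"subagents": "native", "ask_permission": "native"},
    "google": {"subagents": "emulated", "ask_permission": "unsupported"},
}


def audit(features, providers):
    """Look up each requested feature per provider; unknown names raise KeyError."""
    report = {}
    for provider in providers:
        tiers = CAPABILITIES[provider]
        # Failing loudly on an unknown feature mirrors "never drop silently".
        report[provider] = {feature: tiers[feature] for feature in features}
    return report


if __name__ == "__main__":
    print(audit(["subagents"], ["anthropic", "google"]))
```

Because every consumer reads the same dictionary, a change in provider support is a one-line edit rather than a hunt for ad-hoc strings.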

CLAUDE.md

Lines changed: 142 additions & 0 deletions
@@ -0,0 +1,142 @@
+# CLAUDE.md
+
+Guidance for Claude Code (and any AI agent) working in this repository.
+
+## What agentlift is
+
+A compiler with a CLI. You define an agent **once** as a neutral folder
+(`.managed-agents/` — system prompt + skills + MCP servers + tool allowlist +
+subagent roster). agentlift then treats each managed-agent runtime as a back-end:
+
+- `audit` — report, per provider, what is `native` / `emulated` / `degraded` / `unsupported` (offline).
+- `export` — compile the folder to a provider-native artifact: `anthropic-yaml` (for the `ant` CLI), `google-adk`, `openai-agents` (offline).
+- `deploy` — push to a live managed runtime via API: **Anthropic** (full) and **Google `--target google`** (preview).
+
+Tagline: *Own the definition. Rent the runtime.*
+
+## The pipeline: `parse → plan → apply → run`
+
+```
+folder ──parse──▶ Project ──plan──▶ DeployPlan ──apply──▶ live IDs ──▶ lockfile
+         (pure)            (pure)              (network)
+```
+
+- **parse** ([parser.py](src/agentlift/parser.py)) — read the folder into a `Project` of `AgentSpec`s. Pure file IO.
+- **plan** ([planner.py](src/agentlift/planner.py)) — `Project → DeployPlan`: a deterministic list of API ops with **symbolic refs** (`@skill:<hash8>`, `@agent:<name>`), skill dedup, validation, diagnostics. **No network.** This is what `agentlift plan` prints and what offline tests assert against — *the plan is the contract.*
+- **apply** ([anthropic_target.py](src/agentlift/anthropic_target.py)) — the only Anthropic networking. Resolves symbolic refs to real IDs, uploads skills (deduped via lockfile), creates agents in dependency order, writes `.agentlift-lock.json` for idempotent re-deploys.
+- **run** ([runtime.py](src/agentlift/runtime.py)) — invoke a deployed agent by ID, or run the same folder locally (`--local`).
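Reviewer sketch (not part of the commit): the **plan** step above can be pictured as a pure function that emits ops carrying symbolic refs and deduplicates skills by content. The `Op` record, the op names and the choice of SHA-256 truncated to 8 hex characters are assumptions for illustration; the text only states that refs look like `@skill:<hash8>` and `@agent:<name>`. Subagent dependency ordering is left out.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    # Hypothetical op record; the real DeployPlan ops may look different.
    kind: str
    ref: str
    uses: tuple = ()


def skill_ref(content: str) -> str:
    """Symbolic ref built from 8 hex chars of a content hash (assumed scheme)."""
    return "@skill:" + hashlib.sha256(content.encode("utf-8")).hexdigest()[:8]


def plan(agents: dict) -> list:
    """agents maps agent name -> list of skill contents. No network, clock, or randomness."""
    ops = []
    seen = set()
    for name in sorted(agents):  # sorted so the plan is deterministic
        refs = []
        for content in agents[name]:
            ref = skill_ref(content)
            if ref not in seen:  # identical skill content is planned once
                seen.add(ref)
                ops.append(Op("upload_skill", ref))
            refs.append(ref)
        ops.append(Op("create_agent", f"@agent:{name}", tuple(refs)))
    return ops


if __name__ == "__main__":
    for op in plan({"writer": ["style guide"], "editor": ["style guide", "checklist"]}):
        print(op)
```

Since the output depends only on the input, an offline test can assert the full op list, and only the apply step needs to swap symbolic refs for real IDs.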
+
+## Module map (`src/agentlift/`)
+
+| File | Role | Pure? |
+|---|---|---|
+| [model.py](src/agentlift/model.py) | dataclasses: `Project`, `AgentSpec`, `SkillSpec`, `McpServerSpec`; `BUILTIN_TOOL_MAP` ||
+| [parser.py](src/agentlift/parser.py) | folder → `Project` (frontmatter, skills, MCP, knowledge, shared/local refs) ||
+| [planner.py](src/agentlift/planner.py) | `Project` → `DeployPlan` (Anthropic wire shape, symbolic refs) ||
+| [capabilities.py](src/agentlift/capabilities.py) | the provider capability map (`anthropic`/`google`/`openai` × feature → tier) — **single source of truth** for `audit` and `export` annotations ||
+| [audit.py](src/agentlift/audit.py) | cross-reference folder features against `capabilities` ||
+| [export.py](src/agentlift/export.py) | `Project`/`DeployPlan` → text artifact (anthropic-yaml, google-adk, openai-agents) ||
+| [anthropic_target.py](src/agentlift/anthropic_target.py) | `DeployPlan` → Anthropic API (skills + agents + multiagent) | ❌ network |
+| [google_plan.py](src/agentlift/google_plan.py) | `Project` → `GoogleDeployPlan` (ADK recipe: agents, skills, URL MCP, env-var names, model map, spec hash, diagnostics) ||
+| [google_codegen.py](src/agentlift/google_codegen.py) | `GoogleDeployPlan` → source package (`agentlift_engine/agent.py` + `requirements` + embedded skill bundles) ||
+| [google_lock.py](src/agentlift/google_lock.py) | `.agentlift-google.json` spec-hash state + pure `decide_action` → create/update/skip ||
+| [google_target.py](src/agentlift/google_target.py) | `GoogleDeployPlan` → built source package → live `reasoningEngine` via `agent_engines.create/update()` (source-deploy as a relative `ModuleAgent`; resolves MCP auth env vars) | ❌ network |
+| [lockfile.py](src/agentlift/lockfile.py) | `.agentlift-lock.json` idempotency state (Anthropic) ||
+| [diff.py](src/agentlift/diff.py) | plan vs lockfile (and optional `--remote`) | mostly |
+| [runtime.py](src/agentlift/runtime.py) | run managed / run local | ❌ network |
+| [cost.py](src/agentlift/cost.py), [graders.py](src/agentlift/graders.py) | token→USD estimate; substring + LLM graders | mixed |
+| [cli.py](src/agentlift/cli.py) | argparse entry point (`python -m agentlift.cli`) ||
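Reviewer sketch (not part of the commit): the `google_lock.py` row describes a pure `decide_action` that maps spec-hash state to create/update/skip. A minimal version might look like the following, assuming the lock is a dict with hypothetical `engine_id` and `spec_hash` keys and that the hash is taken over a canonical JSON dump; the real file layout and signature are not shown in this diff.

```python
import hashlib
import json
from typing import Optional


def spec_hash(spec: dict) -> str:
    """Hash a canonical JSON form so key order cannot change the result."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def decide_action(lock: Optional[dict], new_hash: str) -> str:
    """Pure decision: nothing deployed -> create, same spec -> skip, else update."""
    if not lock or not lock.get("engine_id"):
        return "create"
    if lock.get("spec_hash") == new_hash:
        return "skip"
    return "update"


if __name__ == "__main__":
    h = spec_hash({"agents": ["coordinator"], "model": "gemini-2.5-flash"})
    print(decide_action(None, h))                                   # create
    print(decide_action({"engine_id": "e1", "spec_hash": h}, h))    # skip
    print(decide_action({"engine_id": "e1", "spec_hash": "old"}, h))  # update
```

Keeping the decision free of IO means the three branches can be covered offline, while the network module only acts on the returned string.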
+
+## The folder convention (the input)
+
+```
+.managed-agents/
+  shared/
+    skills/<name>/SKILL.md     # skill shared across agents (uploaded once on Anthropic)
+    mcp.json                   # MCP servers shared across agents
+  <agent>/
+    agent.md                   # YAML frontmatter + system prompt (CLAUDE.md also accepted)
+    skills/<name>/SKILL.md     # private skill (this agent only)
+    mcp.json / .mcp.json       # private MCP servers
+    knowledge/*.md             # folded into the system prompt
+```
+
+Also accepted: a **single agent dir** passed directly (must contain `agent.md` or `CLAUDE.md`),
+including an existing `.claude/agents/<name>/` embedded folder. `.claude/agents/` is **never
+auto-scanned** — those are local subagents, not deploy targets.
+
+`agent.md` frontmatter: `name`, `model`, `description`, `tools: [read, glob, bash:ask, ...]`
+(built-in allowlist; `:ask`/`:allow` permission suffix), `skills: [name, shared/name]`,
+`mcp: [name, shared/name]`, `subagents: [a, b]` (makes it a coordinator), `knowledge: skip`.
+A bare ref resolves to the agent's **own** resource first, then `shared/`.
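Reviewer sketch (not part of the commit): the two frontmatter rules above, the `:ask`/`:allow` suffix on tool entries and the "own first, then `shared/`" lookup for bare refs, can be expressed in a few lines. Function names and return shapes are hypothetical, and errors are raised directly here where the real parser would presumably emit a `Diagnostic`.

```python
from typing import Optional, Tuple

PERMISSIONS = ("ask", "allow")  # the two suffixes named in the text


def parse_tool(entry: str) -> Tuple[str, Optional[str]]:
    """Split 'bash:ask' into ('bash', 'ask'); a bare name yields (name, None)."""
    name, sep, suffix = entry.partition(":")
    if not sep:
        return name, None
    if suffix not in PERMISSIONS:
        raise ValueError(f"unknown permission suffix in {entry!r}")
    return name, suffix


def resolve_ref(ref: str, own: set, shared: set) -> Tuple[str, str]:
    """Return (scope, name). An explicit 'shared/x' skips the agent's own resources."""
    if ref.startswith("shared/"):
        name = ref[len("shared/"):]
        if name in shared:
            return "shared", name
        raise KeyError(ref)
    if ref in own:          # bare ref: the agent's own resource wins
        return "own", ref
    if ref in shared:       # then fall back to shared/
        return "shared", ref
    raise KeyError(ref)


if __name__ == "__main__":
    print(parse_tool("bash:ask"))
    print(resolve_ref("summarize", own={"summarize"}, shared={"summarize", "cite"}))
    print(resolve_ref("cite", own={"summarize"}, shared={"summarize", "cite"}))
```

The explicit `shared/` prefix matters when an agent has a private resource with the same name as a shared one and still wants the shared copy.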
+
+## Provider status (keep honest — see [IMPLEMENTATION-STATUS], external)
+
+| | Anthropic | Google (`--target google`) | OpenAI |
+|---|---|---|---|
+| Handoff | `deploy` (live, **full**) | `deploy` (live, **preview**) | `export` + self-host only |
+| Subagents | native, per-agent IDs | emulated (one `reasoningEngine`, server-side delegation) | `as_tool`, loop in your app |
+| Skills | uploaded, shared by id (skill-bearing agents auto-get `read` — Managed Agents needs it to open `SKILL.md`) | ✅ embedded in source package, loaded via ADK `load_skill_from_dir` (update = redeploy) | export comment only |
+| Remote MCP | mapped | ✅ URL → ADK `McpToolset` + `tool_filter`; inline auth → Agent Engine `env_vars` (resolved at deploy, never inlined) | export comment only |
+| Built-in tools | mapped | 🚧 skipped (sandbox is Python/JS only) | self-host runner |
+| `:ask` | permission policy | 🚧 unsupported on `VertexAiSessionService` | client-side |
+| Idempotency | lockfile + content hashes | `.agentlift-google.json` spec hash → create/update/skip | n/a |
+| Model | Claude (native) | 🔁 mapped to Gemini (`gemini-2.5-flash`) | 🔁 mapped to `gpt-*` |
+
+**Live-verified (6/6 both):** one neutral fixture (`tests/live/fixtures/coverage-matrix`) was deployed
++ queried on **both** Anthropic and Google; all six portability dimensions (agents · subagents ·
+shared MCP · individual MCP · shared skill · individual skill) were **EXERCISED server-side**, keyed on
+objective runtime events, not answer text. Anthropic's subagents cell keys on the native delegation
+event (`session.thread_created` + `agent.thread_message_sent`) since coordinator delegation is async.
+Committed receipts: `tests/live/receipts/20260604-012428-anthropic` + `20260604-004318-google`. The
+WIRED layer is pinned offline in `tests/test_coverage_matrix_plan.py` (CI); the live harness is
+`tests/live/coverage_matrix.py` (gated pytest wrapper: `tests/live/test_coverage_matrix.py`). See
+[docs/tested-platforms.md](docs/tested-platforms.md). OpenAI stays `export`-only (no hosted engine).
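Reviewer sketch (not part of the commit): "keyed on objective runtime events, not answer text" suggests a check along these lines for the Anthropic subagents cell. The two event type strings are the ones named above; the event shape (a mapping with a `type` key) and the `exercised` helper are assumptions, and the real harness in `tests/live/coverage_matrix.py` may work differently.

```python
from typing import Iterable, Mapping

# Event types named in the text for Anthropic's async subagent delegation.
SUBAGENT_DELEGATION_EVENTS = frozenset(
    {"session.thread_created", "agent.thread_message_sent"}
)


def exercised(required: frozenset, events: Iterable[Mapping[str, str]]) -> bool:
    """True only if every required event type appears in the runtime event stream."""
    seen = {event.get("type") for event in events}
    return required <= seen


if __name__ == "__main__":
    stream = [
        {"type": "session.thread_created"},
        {"type": "agent.thread_message_sent"},
    ]
    print(exercised(SUBAGENT_DELEGATION_EVENTS, stream))
```

A model reply that merely claims to have delegated never enters this check, which is the point of classifying on events.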
+
+**The Google divergence to remember:** `audit` reports each *platform's* capability;
+`deploy --target google` reports *agentlift's current implementation*. These now agree on
+skills + URL MCP (both mapped). They still diverge on the built-in sandbox and `:ask`
+(`audit` rates them `degraded`/`unsupported` for Google; `deploy` refuses or skips a
+stdio MCP server / built-in-tool-only folder). Pipeline for Google mirrors Anthropic's
+*plan-is-the-contract* discipline: `google_plan.py` is pure and offline-tested, only
+`google_target.py` touches the network.
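Reviewer sketch (not part of the commit): the split between a pure plan and a networked target also explains the MCP auth rule in the table above, where the plan records only env-var names and values are resolved at deploy time. The plan key and helper name below are hypothetical; only the behaviour (names in the plan, values from the deployer's environment) comes from the text.

```python
import json
import os
from typing import Iterable, Mapping


def resolve_env_vars(names: Iterable[str], environ: Mapping[str, str] = os.environ) -> dict:
    """Deploy-time step: look up values for the names the plan recorded."""
    names = list(names)
    missing = [name for name in names if name not in environ]
    if missing:
        # Fail before any network call instead of deploying without auth.
        raise LookupError(f"missing environment variables: {', '.join(missing)}")
    return {name: environ[name] for name in names}


if __name__ == "__main__":
    # Hypothetical plan fragment: only the NAME is written down.
    plan = {"mcp_env_var_names": ["EXAMPLE_MCP_TOKEN"]}
    fake_env = {"EXAMPLE_MCP_TOKEN": "s3cret"}
    env_vars = resolve_env_vars(plan["mcp_env_var_names"], fake_env)
    assert "s3cret" not in json.dumps(plan)  # the serialized plan stays secret-free
    print(sorted(env_vars))
```

Passing the environment as a parameter keeps the helper testable offline without touching the real process environment.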
+
+## Commands
+
+```bash
+agentlift validate <path>                 # parse + plan, report problems (exit 1 on errors)
+agentlift plan <path> [--json] [--target anthropic|google] [--google-model M]   # deterministic deploy plan, no network
+agentlift audit <path> --targets anthropic,google,openai
+agentlift export <target> <path> [--out DIR]   # anthropic-yaml | google-adk | openai-agents
+agentlift diff <path> [--remote]
+agentlift deploy <path> [--target anthropic|google] [--build-only] [--prune] [-y]
+agentlift run <agent> --project <path> --task "..." [--local]
+agentlift list/destroy/bench ...
+```
+
+Not on PATH? `python -m agentlift.cli <cmd>` always works.
+
+## Dev workflow & ground rules
+
+```bash
+python -m pip install -e ".[dev]"
+pytest -m "not live"                      # fast, deterministic, no API key — what CI runs
+ANTHROPIC_API_KEY=... pytest -m live      # hits the real API, costs cents
+```
+
+- **Keep `parser.py` and `planner.py` pure.** No network, no clock, no randomness. If a behavior can be tested offline, it lives there and gets an offline test in `tests/`.
+- **Every translation rule needs an offline test asserting the plan** ([tests/test_planner.py](tests/test_planner.py)). The plan is the contract.
+- **New API behavior gets confirmed live first, then encoded.** Don't guess wire format from docs alone — the betas move. Anthropic wire format notes live in [anthropic_target.py](src/agentlift/anthropic_target.py) docstring + [docs/anthropic-mapping.md](docs/anthropic-mapping.md).
+- **Surface, don't swallow.** Anything agentlift can't translate becomes a `Diagnostic` (error/warning), visible in `agentlift plan` — never a silent drop.
+- **`capabilities.py` is the single source of truth** for what each provider supports. `audit` and `export` annotations both read it; update it (not ad-hoc strings) when provider support changes.
+- **Adding a provider target:** implement the same `apply(plan)` contract as `anthropic_target.Deployer`; the planner already emits provider-agnostic ops. Keep the convention identical so one folder deploys anywhere.
+- Windows shell is PowerShell; Bash tool is available for POSIX scripts. The repo ships both `demo/*.ps1` and `demo/*.sh`.
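Reviewer sketch (not part of the commit): the "surface, don't swallow" and "offline test asserting the plan" rules fit the portability fix named in the commit message, where the planner auto-enables `read` for skill-bearing agents and emits a `skills.read_enabled` warning. The classes and the function below are trimmed-down, hypothetical stand-ins; only the warning code and the behaviour are taken from the commit text.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Diagnostic:
    # Hypothetical shape; the real Diagnostic may carry more fields.
    level: str
    code: str
    message: str


@dataclass
class AgentSpec:
    # Hypothetical stand-in for the real AgentSpec in model.py.
    name: str
    tools: List[str] = field(default_factory=list)
    skills: List[str] = field(default_factory=list)


def ensure_read_for_skills(agent: AgentSpec) -> Tuple[List[str], List[Diagnostic]]:
    """Return the effective tool list plus diagnostics, without mutating the input."""
    # Drop a permission suffix such as ':ask' before comparing tool names.
    has_read = any(tool.split(":", 1)[0] == "read" for tool in agent.tools)
    if not agent.skills or has_read:
        return list(agent.tools), []
    warning = Diagnostic(
        level="warning",
        code="skills.read_enabled",
        message=f"agent '{agent.name}' has skills; enabling 'read' so SKILL.md can be opened",
    )
    return [*agent.tools, "read"], [warning]


if __name__ == "__main__":
    tools, diagnostics = ensure_read_for_skills(AgentSpec("writer", ["glob"], ["style"]))
    print(tools, [d.code for d in diagnostics])
```

The change to the tool list is never silent: the caller gets the adjusted tools and the warning together, so a plan printout can show both.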
+
+## Key docs
+
+- [docs/convention.md](docs/convention.md) — the `.managed-agents/` spec
+- [docs/anthropic-mapping.md](docs/anthropic-mapping.md) — exact local → Managed Agents field mapping
+- [docs/deploy-google.md](docs/deploy-google.md) — Google ADC/credentials/setup
+- [docs/tested-platforms.md](docs/tested-platforms.md) — per-platform live test receipts
+- [docs/how-it-works.md](docs/how-it-works.md), [docs/deploying.md](docs/deploying.md), [docs/limitations.md](docs/limitations.md)
+- **External single source of truth for "real vs roadmap":** the author's `IMPLEMENTATION-STATUS.md` (kept in sync with README/article). Version is **0.5.0**.
