[P1] validate_setup must fail fast when claude-code main/fallback provider can't spawn in a nested Claude session

## Problem

When the `main` (or `fallback`) model role is configured with the `claude-code` provider and the engine runs *inside* a Claude Code / MCP session, `validate_setup` (the SETUP gate, namespaced `mcp__atlas-engine__validate_setup`) returns `ready=true` and the pipeline advances SETUP -> DISCOVER. The failure only surfaces much later, at `parse_prd`, as an opaque `Claude Code process exited with code 1` with empty stderr. The mechanism: the `claude-code` provider is a CLI-borrowing provider — for the model-spawning roles it shells out `claude -p "<prompt>"`, and a nested Claude session refuses the recursive launch (the child inherits `CLAUDECODE=1`). Setup should fail fast here with a clear diagnostic and a remediation, before any spawn is attempted.

## Current Behavior

`validate_setup()` lives at `mcp-server/capabilities.py:298-450` and runs EXACTLY 6 checks: `binary`, `version`, `project`, `config`, `provider_main`, `provider_research`.

- Check 5 (`provider_main`, lines 386-413) parses `.taskmaster/config.json` (loaded at line 393) and reads only `models.<role>.modelId`:
  - `main_model = models.get("main", {}).get("modelId")` (line 395)
  - `research_model = models.get("research", {}).get("modelId")` (line 396)
  - `fallback_model = models.get("fallback", {}).get("modelId")` (line 397)
  - `provider_ok = bool(main_model)` (line 398)
- It **never reads `models.main.provider`**. A config with `provider="claude-code"` + `modelId="sonnet"` passes `provider_main` (modelId is truthy), so `critical_failures=0` and `ready=true`. The word "provider" in the check id is cosmetic — no provider VALUE is inspected.
- There is **no nested-session detection** anywhere in the plugin: a grep over `mcp-server/` finds no reference to `CLAUDECODE` or `CLAUDE_CODE_CHILD_SESSION`, and `capabilities.py` does not even import `os`. The only `os.environ` touch is `taskmaster.py:14` (`env = os.environ.copy()`, used to strip `TASK_MASTER_PROJECT_ROOT` before spawning the CLI), which is unrelated. Nothing in the plugin asks "am I a nested Claude session?".
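The gap can be reproduced in isolation. The sketch below mirrors only the four assignments quoted above (lines 395-398); the config literal and its model ids are hypothetical examples, not copied from a real project.

```python
import json

# Hypothetical .taskmaster/config.json content, for illustration only.
config = json.loads("""
{
  "models": {
    "main": {"provider": "claude-code", "modelId": "sonnet"},
    "research": {"provider": "perplexity", "modelId": "sonar-pro"},
    "fallback": {"provider": "claude-code", "modelId": "sonnet"}
  }
}
""")

models = config.get("models", {})

# The same reads Check 5 performs today: modelId only, never provider.
main_model = models.get("main", {}).get("modelId")
research_model = models.get("research", {}).get("modelId")
fallback_model = models.get("fallback", {}).get("modelId")
provider_ok = bool(main_model)

# The provider value is present in the config but never consulted by the check.
ignored_provider = models["main"]["provider"]

print(provider_ok, ignored_provider)  # True claude-code
```

Because `provider_ok` depends only on `modelId` being truthy, the check passes whether or not the provider can actually be invoked from the current environment.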

`preflight()` (`mcp-server/pipeline.py:179-234`, namespaced `mcp__atlas-engine__engine_preflight`) reads only `pipeline.json` + taskmaster tag state and recommends `parse_prd` (lines 205-206); it performs no provider or env inspection and does not load `config.json`.

The SETUP skill exit gate (`skills/setup/SKILL.md`, Step 4 probe + Exit gate) advances SETUP -> DISCOVER on `validate_setup` readiness, passing the result dict as evidence — so the bad config flows straight through to `parse_prd`. `parse_prd`'s actual `claude -p` spawn is in the external `task-master-ai` CLI, not in this repo, so the fix cannot live at the spawn site; it must be a fail-fast refusal in the SETUP gate.

(Naming note: there is no Python function literally named `engine_preflight`. The MCP tools are `preflight` (pipeline.py) and `validate_setup` (capabilities.py); the `atlas-engine` runtime namespaces them as `engine_preflight` / `validate_setup`.)

## Expected Behavior

When the `main` or `fallback` role uses a CLI-spawning provider (`claude-code` and siblings) AND a nested-Claude environment signal is present (`CLAUDECODE` and/or `CLAUDE_CODE_CHILD_SESSION` set), `validate_setup` must emit a **critical (non-warning) check** that fails. This flips `ready=false` via the existing `critical_failures` aggregation (lines 432-436) and blocks the SETUP -> DISCOVER advance. The check `detail` must explain the recursive-spawn refusal, and the `fix` must steer to a non-spawning provider (anthropic/perplexity) or to running from a plain shell outside Claude Code — never to `--set-main claude-code`. The `research` role (plain-HTTP, e.g. perplexity) must NOT trip the guard.
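How a failed critical check blocks the gate can be sketched as follows. The aggregation at lines 432-436 is not quoted in this issue, so the `warning` flag and the `summarize` helper are assumptions made for illustration; only `ready`, `critical_failures`, `passed`, and the check ids come from the text above. Marking `provider_research` as a warning is likewise an assumption, used purely to show the contrast.

```python
def summarize(checks):
    """Assumed aggregation: a failed check is critical unless flagged as a warning."""
    critical_failures = sum(
        1
        for check in checks
        if not check["passed"] and not check.get("warning", False)
    )
    return {"ready": critical_failures == 0, "critical_failures": critical_failures}


# provider_main passes on modelId alone; the new check fails in a nested session.
checks = [
    {"id": "provider_main", "passed": True},
    {"id": "provider_research", "passed": False, "warning": True},  # assumed warning
    {"id": "provider_spawnable", "passed": False},  # critical, so it blocks
]

result = summarize(checks)
print(result)  # {'ready': False, 'critical_failures': 1}
```

Under that assumption, a failed warning-level check leaves `ready` untouched, while the single failed critical check is enough to stop the SETUP -> DISCOVER advance.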

## Files to Touch

- `mcp-server/capabilities.py` — PRIMARY. In/near Check 5 (after config is loaded at line 393), read `models.main.provider` and `models.fallback.provider` and check `os.environ` for `CLAUDECODE` / `CLAUDE_CODE_CHILD_SESSION`. Add a new critical check `provider_spawnable` (keep the 6 existing ids intact). Add `import os` at the top (lines 11-15) — it is not currently imported.
- `mcp-server/server.py` — update the `validate_setup` tool docstring (line 73) from "Run the 6 Phase-0 SETUP checks" to 7 if a new check id is added.
- `tests/test_capabilities.py` — update the hard-coded id list at line 183 (`["binary","version","project","config","provider_main","provider_research"]`) to include the new id.
- `mcp-server/pipeline.py` — OPTIONAL secondary. `preflight()`'s recommended_action ladder (lines 203-219) could short-circuit `parse_prd` (line 205-206) via a shared helper for callers that bypass the SETUP gate. Only via a shared helper to avoid duplicating the config read.
- `skills/setup/SKILL.md` — OPTIONAL doc note that a nested-Claude claude-code provider is a hard stop (the gate already blocks automatically once the critical check exists).

## Researched Fix Approaches

### 1. [Recommended] — Env+config fail-fast inside validate_setup() (confidence: 92%)
- **Library/Config:** `task-master-ai@0.43.1` config key `models.<role>.provider`; Python `os.environ` (stdlib)
- **Pattern:** After config.json loads (capabilities.py:393), read `models.main.provider` / `models.fallback.provider` (NOT modelId). Define a spawning-provider set `{"claude-code","codex-cli","gemini-cli"}`. Compute `nested = bool(os.environ.get("CLAUDECODE") or os.environ.get("CLAUDE_CODE_CHILD_SESSION"))`. If a spawning provider is the main OR fallback role AND `nested`, append a NEW non-warning check `provider_spawnable` with `passed=False`. Leave `provider_main`'s modelId logic intact (the 6 ids keep meaning); ADD the 7th. The existing `critical_failures` aggregation (lines 432-436) flips `ready=false` and the SETUP skill gate blocks before parse_prd spawns. Exclude the research role so perplexity never trips.
- **Why:** Lowest-surface change — `validate_setup()` already owns the config.json parse and the `critical_failures` contract, and it auto-blocks the skill gate with no skill edit required.
- **Risk:** `tests/test_capabilities.py:183` hard-codes the 6-id list and `server.py:73` says "6 Phase-0 SETUP checks" — both must update to 7. `tests/test_integration.py` asserts no fix string contains `--set-main claude-code`; the new `fix` must steer to anthropic/perplexity or a non-Claude shell, NOT `--claude-code`. `CLAUDE_CODE_ENTRYPOINT` alone can be `sdk`/`mcp` in non-nested contexts, so gate on `CLAUDECODE` / `CLAUDE_CODE_CHILD_SESSION` as primary signals and treat `ENTRYPOINT` as corroborating only.
- **Implementation hint:**
```python
import os  # add at the top of capabilities.py; not currently imported

SPAWNING_PROVIDERS = {"claude-code", "codex-cli", "gemini-cli"}

# Inside validate_setup(), after config.json is loaded (`models` and `checks` in scope).
# Verified env dump from a real nested run: CLAUDECODE=1, CLAUDE_CODE_CHILD_SESSION=1
nested = bool(os.environ.get("CLAUDECODE") or os.environ.get("CLAUDE_CODE_CHILD_SESSION"))
main_provider = models.get("main", {}).get("provider")
fb_provider = models.get("fallback", {}).get("provider")
# Name every offending role so a fallback-only failure is not blamed on main.
offenders = [
    f"{role} provider '{provider}'"
    for role, provider in (("main", main_provider), ("fallback", fb_provider))
    if provider in SPAWNING_PROVIDERS
]
bad = nested and bool(offenders)
checks.append({
    "id": "provider_spawnable",
    "name": "Main/fallback provider can spawn in this environment",
    "passed": not bad,
    "detail": (
        f"{', '.join(offenders)} must spawn a CLI but a nested Claude Code "
        "session was detected (CLAUDECODE/CLAUDE_CODE_CHILD_SESSION set): recursive "
        "spawn is refused (exit 1, empty stderr)"
        if bad else f"main={main_provider}, fallback={fb_provider}; nested={nested}"
    ),
    "fix": (
        "Switch the main/fallback role off claude-code: set ANTHROPIC_API_KEY and run "
        "`task-master models --set-main sonnet --anthropic`, OR run task-master from a "
        "plain shell outside any Claude Code session."
        if bad else None
    ),
})
```

### 2. [Alternative] — Add the guard to preflight() (pipeline.py) for an earlier refusal (confidence: 70%)
- **Library/Config:** same key `models.<role>.provider`; `os.environ` (stdlib)
- **Pattern:** Add the nested+spawning-provider detection to `preflight()`'s recommended_action ladder (pipeline.py:203-219); before recommending `parse_prd` (lines 205-206), return `recommended_action="fix_provider"` with the diagnostic, so agents that call `engine_preflight` directly (bypassing the SETUP gate) still fail fast.
- **Why:** `preflight()` is the tool literally named in the bug and is the recommender that today points at `parse_prd` (including into a polluted prior tag that already holds done tasks). Belt-and-suspenders with Approach 1.
- **Risk:** `preflight()` does NOT currently load `.taskmaster/config.json` (only pipeline.json + tasks/state), so adding the check here duplicates the config read — DRY violation. Must be factored as a shared helper (e.g. `spawn_block_reason(models) -> str | None` in capabilities.py) imported by both. Do this IN ADDITION to Approach 1, not instead of it.
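A minimal sketch of the shared helper named in the Risk note, so that both `validate_setup()` and `preflight()` could call one function instead of duplicating the detection. The injectable `env` parameter, the constant names, and the exact wording of the returned message are assumptions added here for testability; the issue only proposes the name `spawn_block_reason` and the `str | None` return shape.

```python
import os
from typing import Mapping, Optional

SPAWNING_PROVIDERS = frozenset({"claude-code", "codex-cli", "gemini-cli"})
NESTED_ENV_VARS = ("CLAUDECODE", "CLAUDE_CODE_CHILD_SESSION")
GUARDED_ROLES = ("main", "fallback")  # research is deliberately excluded


def spawn_block_reason(
    models: Mapping[str, Mapping[str, str]],
    env: Optional[Mapping[str, str]] = None,  # hypothetical parameter, for tests
) -> Optional[str]:
    """Return a diagnostic if a guarded role cannot spawn here, else None."""
    env = os.environ if env is None else env
    signals = [name for name in NESTED_ENV_VARS if env.get(name)]
    if not signals:
        return None  # not nested: spawning providers are fine

    blocked = []
    for role in GUARDED_ROLES:
        provider = (models.get(role) or {}).get("provider")
        if provider in SPAWNING_PROVIDERS:
            blocked.append(f"{role}={provider}")
    if not blocked:
        return None

    return (
        f"{', '.join(blocked)} must spawn a CLI, but a nested Claude Code session "
        f"was detected ({', '.join(signals)} set); the recursive spawn is refused"
    )


models = {
    "main": {"provider": "anthropic"},
    "fallback": {"provider": "claude-code"},
    "research": {"provider": "perplexity"},
}
print(spawn_block_reason(models, env={"CLAUDECODE": "1"}))
print(spawn_block_reason(models, env={}))  # None
```

Taking `models` as an argument keeps the config read in a single place: the caller that already parsed `.taskmaster/config.json` passes the dict in, and `preflight()` would only need to load it once before calling the helper.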

### 3. [Fallback] — Active spawn-probe: actually run `claude -p "ok"` (confidence: 58%)
- **Library/Config:** the `claude` CLI `-p`/`--print` non-interactive flag (`task-master-ai` exposes NO doctor/health/test/dry-run command — verified against installed v0.43.1)
- **Pattern:** When main/fallback provider is claude-code, run `subprocess.run(["claude","-p","ok"], capture_output=True, text=True, timeout=10)` inheriting the current env, mirroring what task-master will do; the check passes only when the process return code is 0.
- **Why:** Strictly more accurate than env-sniffing — also catches missing CLI, version mismatch, broken auth.
- **Risk:** The probe ITSELF spawns `claude -p` inside the nested session — the very thing that crashes — and reports say a nested launch "will crash all active sessions", so probing could destabilise the parent. Adds latency/cost to every validate_setup. Only safe as a NON-nested augmentation: never probe when `CLAUDECODE`/`CLAUDE_CODE_CHILD_SESSION` is set (fail immediately per Approach 1); probe only to catch auth/version failures in a non-nested shell. Keep as a follow-on.
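The "non-nested augmentation" described in the Risk note could look roughly like this. It is a sketch under stated assumptions: the function name `probe_claude_cli`, the injectable `run` and `env` parameters, and the shape of the returned dict are all hypothetical, chosen so the guard can be exercised without launching the real CLI.

```python
import os
import subprocess
from typing import Callable, Mapping, Optional

NESTED_ENV_VARS = ("CLAUDECODE", "CLAUDE_CODE_CHILD_SESSION")


def probe_claude_cli(
    env: Optional[Mapping[str, str]] = None,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    timeout: float = 10.0,
) -> dict:
    """Run `claude -p "ok"` only when no nested-session signal is present."""
    env = dict(os.environ if env is None else env)
    if any(env.get(name) for name in NESTED_ENV_VARS):
        # Never spawn here: the probe would itself be the recursive launch.
        return {
            "probed": False,
            "passed": False,
            "detail": "nested Claude Code session detected; probe skipped",
        }
    try:
        result = run(
            ["claude", "-p", "ok"],
            capture_output=True,
            text=True,
            timeout=timeout,
            env=env,
        )
    except FileNotFoundError:
        return {"probed": True, "passed": False, "detail": "claude CLI not found on PATH"}
    except subprocess.TimeoutExpired:
        return {"probed": True, "passed": False, "detail": f"probe timed out after {timeout}s"}
    return {
        "probed": True,
        "passed": result.returncode == 0,
        "detail": (result.stderr or "").strip() or f"exit code {result.returncode}",
    }


# Safe to run anywhere: with a nested signal set, nothing is spawned.
print(probe_claude_cli(env={"CLAUDECODE": "1"}))
```

The env check comes first on purpose, so the failure path never reaches `subprocess.run`; the probe then only adds coverage for missing-CLI, timeout, and non-zero-exit cases in a plain shell.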

## Reference

How `task-master-ai` (claude-task-master by eyaltoledano) handles this today: it does NOT detect or guard against it at all. The `claude-code` provider is CLI-borrowing — for the `main`/`fallback` roles it spawns `claude -p "<prompt>"` as a child process to reuse the authenticated session (no HTTP/API call), which is why the perplexity research role (plain HTTP) is unaffected. There is NO provider-callability test surface: verified against installed `task-master-ai@0.43.1` — `task-master --help` exposes no doctor/health/validate/test/dry-run (only an unrelated `validate-dependencies`), and `task-master models --help` exposes only `--set-main`/`--set-research`/`--set-fallback`/`--setup` plus per-provider allow flags, no `--test`/`--dry-run`. The failure is a known, OPEN upstream bug: [eyaltoledano/claude-task-master#1509](https://github.com/eyaltoledano/claude-task-master/issues/1509) ("Claude Code and Codex CLI Provider Failures in parse-prd" — exit code 1, AI_APICallError, "fails silently", un-root-caused) and [#928](https://github.com/eyaltoledano/claude-task-master/issues/928). The mechanism is documented in [anthropics/claude-agent-sdk-python#573](https://github.com/anthropics/claude-agent-sdk-python/issues/573) ("Subprocess inherits CLAUDECODE=1 env var, preventing SDK usage from Claude Code hooks/plugins") and the Claude Code 2 nested-session guard ("Claude Code cannot be launched inside another Claude Code session... unset the CLAUDECODE environment variable"). Upstream's only workaround is to unset `CLAUDECODE` at the CLI spawn site — external to this plugin — so the correct in-repo fix is a fail-fast preflight refusal. Peers (claude-agent-sdk, clay #161, paperclip #560) all hit the identical nesting wall and handle it by detecting the env var and erroring early — exactly Approach 1.

## Acceptance Criteria

- [ ] When `.taskmaster/config.json` has `models.main.provider == "claude-code"` AND `CLAUDECODE=1` (or `CLAUDE_CODE_CHILD_SESSION=1`) is in the environment, `validate_setup()` returns `ready=false` with `critical_failures >= 1` and a failed check whose `detail` names the nested-Claude / recursive-spawn refusal.
- [ ] The same condition with `models.fallback.provider == "claude-code"` (main set to anthropic) also fails the new check.
- [ ] With NO nested-Claude env signal (`CLAUDECODE` and `CLAUDE_CODE_CHILD_SESSION` unset), a `claude-code` main provider passes the new check (`ready=true` if all other checks pass) — the guard fires only inside a nested session.
- [ ] A `research` role using a non-spawning provider (e.g. perplexity) never trips the new check, regardless of nested env.
- [ ] The remediation `fix` string for the failing check does NOT contain `--set-main claude-code` / `--set-research claude-code` / `--set-fallback claude-code` (passes `tests/test_integration.py` regression guard) and steers to anthropic/perplexity or a non-Claude shell.
- [ ] `tests/test_capabilities.py:183` id-list assertion is updated and passes; `server.py` validate_setup docstring count matches the new number of checks.
- [ ] The SETUP -> DISCOVER advance in `skills/setup/SKILL.md` is blocked (advance_phase not called / fails its evidence gate) when the new critical check fails.
- [ ] With a non-spawning provider (anthropic main) configured, `parse_prd` invoked from inside a Claude Code / MCP session succeeds and writes tasks to a fresh tag: the end-to-end SETUP -> parse_prd path completes without the `exit code 1` failure. (Inverse: with claude-code main in a nested session, setup refuses BEFORE parse_prd is reached.)
- [ ] `npm test` / the Python test suite (`pytest tests/`) passes after the id-list and docstring updates.

## Complexity: S

## Trust Level: HINT (not specification)
The researched approaches above are starting points. Before implementing:
1. Verify the library/config exists as stated (e.g. `task-master models`, read .taskmaster/config.json for the `models.main.provider` key).
2. Check that imports/keys match reality (`capabilities.py` does NOT currently import `os` — add it).
3. Try the recommended approach — if it works in 1-2 attempts, use it.
4. If it fails, do NOT keep retrying — research why, explore alternatives.
5. The acceptance criteria are the real spec, not the approach.
