Re-queue failed sandboxes: handle SandboxFailed with bounded retry #76

Description

@lsfera

When the orchestrator claims a ready-for-agent issue, it removes the label so the next tick won't re-pick it. If the sandbox then fails, the issue is silently dropped and never retried. The reducer's SandboxFailed branch is a stub (reduce.ts: "// SandboxFailed lands in a later slice."; default returns []), and main.ts has two un-retried failure paths:

  • Throw path: runner.runIssue throws (e.g. the transient Claude usage-limit AgentError that hit #74, "Review gate before auto-merge in /afk (AI Reviewer step)") → the catch logs [afk] #N failed: … and leaves the issue unlabelled: no comment, no relabel.
  • 0-commit path: sandbox finishes with commits.length === 0 → posts a comment but leaves the issue unlabelled (the comment itself says "Automated backoff/retry is later work, not slice 1").

Observed live: a /afk run on #74 hit the usage limit, the sandbox threw, and #74 had to be re-labelled ready-for-agent by hand to retry. This issue implements that bounded automated retry.
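
The two failure paths above could both be funnelled into a single SandboxFailed event before anything reaches the reducer. The following is only a sketch under stated assumptions: Runner, RunResult, SandboxEvent, the reason field, and runToEvent are hypothetical names for illustration, and the real runner.runIssue signature in main.ts may differ.

```typescript
// Hypothetical shapes; only runIssue, commits.length === 0, SandboxFailed,
// SandboxFinished and the "[afk] #N failed: …" log line come from the issue.
interface RunResult {
  commits: string[];
}

interface Runner {
  runIssue(issueId: number): Promise<RunResult>;
}

type SandboxEvent =
  | { type: "SandboxFinished"; issueId: number; commits: string[] }
  | { type: "SandboxFailed"; issueId: number; reason: string };

// Runs one sandbox and classifies the outcome. Both the throw path and the
// 0-commit path end up as the same SandboxFailed event.
async function runToEvent(runner: Runner, issueId: number): Promise<SandboxEvent> {
  try {
    const result = await runner.runIssue(issueId);
    if (result.commits.length === 0) {
      // 0-commit path: the sandbox finished but produced nothing to review.
      return { type: "SandboxFailed", issueId, reason: "sandbox produced 0 commits" };
    }
    return { type: "SandboxFinished", issueId, commits: result.commits };
  } catch (err) {
    // Throw path: e.g. a transient usage-limit error from the agent.
    const reason = err instanceof Error ? err.message : String(err);
    console.error(`[afk] #${issueId} failed: ${reason}`);
    return { type: "SandboxFailed", issueId, reason };
  }
}
```

With this shape, the orchestrator would dispatch whatever runToEvent returns and never needs to distinguish why a sandbox failed; the retry decision stays in the reducer.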

Goal (end-to-end value)

A failed sandbox is automatically re-queued (re-labelled ready-for-agent) up to a bounded number of attempts, so transient failures (usage limit, network blip) self-recover without a human. After the cap is reached, the issue is left unlabelled with an explanatory comment — never an infinite retry loop.

Acceptance criteria

  • Reducer gate (the real logic — see Testing): reduce.ts handles the SandboxFailed event (remove the stub). State gains per-issue failed-attempt tracking and Policy gains a maxRetries cap (default 2). SandboxFailed → if attempts-so-far < maxRetries: emit Relabel { issueId, label: READY_LABEL } (re-queue); else: do not relabel (drop) and signal exhaustion so the orchestrator can comment (a new Comment/GiveUp action, or an equivalent the implementer chooses — keep the reducer pure).
  • No infinite loop: a persistently-failing issue is retried at most maxRetries times within a run, then left unlabelled. This must be asserted.
  • main.ts wiring: on a sandbox throw (AgentError etc.) AND on the existing 0-commit path, dispatch SandboxFailed to the reducer and execute the resulting Relabel (re-add ready-for-agent) or exhaustion-comment. Attempt counts are tracked in the orchestrator's in-run State.
  • Exhaustion is visible: when retries are exhausted, post a comment explaining the issue was left unlabelled after N failed attempts (reuse the existing 0-commit comment style).
  • afk/hitl-agnostic: re-queue applies in both modes (the failure is orthogonal to the review gate).
  • Happy path untouched: SandboxFinished → review gate (#74) → merge is unchanged; only the failure branch changes.
  • Doc: a short note in CLAUDE.md (or a brief ADR) on the retry semantics + the known limitation below.
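
To make the reducer criterion concrete, here is a minimal sketch of a pure SandboxFailed branch. It is not the existing reduce.ts: the State, Policy, Event, and Action shapes, the failedAttempts map, the reason field, and the { state, actions } return value are all assumptions for illustration. It also assumes "attempts-so-far" counts failures recorded before the current one, so maxRetries = 2 allows two re-queues and gives up on the third failure.

```typescript
// Hypothetical types; only SandboxFailed, Relabel, GiveUp, maxRetries and
// READY_LABEL are names taken from the issue text.
const READY_LABEL = "ready-for-agent";

type IssueId = number;

interface State {
  // Failed attempts recorded so far in this run, keyed by issue.
  readonly failedAttempts: ReadonlyMap<IssueId, number>;
}

interface Policy {
  readonly maxRetries: number;
}

type Event =
  | { type: "Tick" }
  | { type: "SandboxFailed"; issueId: IssueId; reason: string };

type Action =
  | { type: "Relabel"; issueId: IssueId; label: string }
  | { type: "GiveUp"; issueId: IssueId; attempts: number; reason: string };

const defaultPolicy: Policy = { maxRetries: 2 };

// Pure: no I/O, no mutation of the incoming state.
function reduce(
  state: State,
  event: Event,
  policy: Policy = defaultPolicy,
): { state: State; actions: Action[] } {
  if (event.type !== "SandboxFailed") {
    return { state, actions: [] };
  }
  const soFar = state.failedAttempts.get(event.issueId) ?? 0;
  const failedAttempts = new Map(state.failedAttempts).set(event.issueId, soFar + 1);
  const next: State = { failedAttempts };

  if (soFar < policy.maxRetries) {
    // Below the cap: re-queue by re-adding the ready label.
    return {
      state: next,
      actions: [{ type: "Relabel", issueId: event.issueId, label: READY_LABEL }],
    };
  }
  // Cap reached: no Relabel; signal exhaustion so the orchestrator can comment.
  return {
    state: next,
    actions: [
      { type: "GiveUp", issueId: event.issueId, attempts: soFar + 1, reason: event.reason },
    ],
  };
}
```

Returning the next state alongside the actions keeps the counter update inside the reducer; if the real reducer only returns actions, the same comparison applies and the increment would live wherever State is updated today.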

Known limitation (state in the doc, don't over-build)

Attempt counts live in the orchestrator's in-run State; a fresh orchestrator start resets them (a persistently-failing issue could get another maxRetries attempts on the next run). In-run bounding is sufficient for v1 — do not add cross-run persistence.

Testing (highest, single seam — logic in TypeScript)

Reuse the reducer seam — reduce.test.ts (node:test + node:assert/strict), where the afk/hitl + review transitions are already tested:

  • SandboxFailed with attempts < maxRetries → Relabel(ready-for-agent).
  • SandboxFailed at the cap → no Relabel (exhausted) + the exhaustion signal; assert the issue is not re-queued (no infinite loop).
  • After a re-queue Relabel, the issue is isReady again on the next Tick.
  • Existing SandboxFinished/ReviewFinished/reviewVerdict/dependency tests still pass unchanged.

Prior art: the onPrMerged / onReviewFinished handlers and the existing Relabel assertions.
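
The cases listed above might look roughly like this in node:test. This is a sketch only: onSandboxFailed is a tiny hypothetical stand-in so the file runs on its own, whereas the tests in reduce.test.ts would call the real reducer and assert on its Relabel and exhaustion actions. As elsewhere, it assumes attempts-so-far counts failures recorded before the current one.

```typescript
import { test } from "node:test";
import assert from "node:assert/strict";

// Hypothetical stand-in for the reducer's SandboxFailed branch.
type Outcome = { relabel: boolean; attempts: number };

function onSandboxFailed(attemptsSoFar: number, maxRetries: number): Outcome {
  return { relabel: attemptsSoFar < maxRetries, attempts: attemptsSoFar + 1 };
}

test("SandboxFailed below the cap re-queues", () => {
  assert.deepEqual(onSandboxFailed(0, 2), { relabel: true, attempts: 1 });
});

test("SandboxFailed at the cap does not re-queue", () => {
  assert.deepEqual(onSandboxFailed(2, 2), { relabel: false, attempts: 3 });
});

test("a persistently failing issue stops after maxRetries re-queues", () => {
  const maxRetries = 2;
  let attempts = 0;
  let requeues = 0;
  // Bounded loop so a regression fails the assertion instead of hanging.
  for (let tick = 0; tick < 100; tick++) {
    const outcome = onSandboxFailed(attempts, maxRetries);
    attempts = outcome.attempts;
    if (!outcome.relabel) break;
    requeues++;
  }
  assert.equal(requeues, maxRetries);
  assert.equal(attempts, maxRetries + 1);
});
```

The last test is the "no infinite loop" assertion: it drives the same issue through repeated failures and checks that re-queues stop exactly at the cap.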

Unchanged (do not touch)

The review gate (#74 / ADR-0020), merge mechanics, and dependency unblocking (#2). Only the SandboxFailed/0-commit failure branch changes.

Constraints (per repo workflow)

Implement with /tdd per criterion, run shell via /exec, stay scoped to reduce.ts + reduce.test.ts + the main.ts wiring (+ the small State/Policy additions), no new dependencies, no pushing to main.
