Rebuild realistic end-to-end coverage #1155

Open
bmdavis419 wants to merge 1 commit into RhysSullivan:main from davis7dotsh:feat/realistic-e2e-system

@bmdavis419


Summary

  • Centralize 16 e2e projects in one capability matrix with fail-closed required coverage.
  • Replace shared and live test dependencies with scoped emulators or deterministic protocol fixtures while keeping product SDKs and clients real.
  • Add pinned Claude Code coverage with deterministic Anthropic wire replay, same-name account replacement, real MCP calls, and sanitized client evidence.
  • Add realistic cloud, self-host, CLI, and packaged desktop account and organization switching journeys.
  • Add Linux KVM desktop execution plus hardened Tart and EC2 lifecycle support, exact cleanup, and stale-runner recovery.
  • Make traces, recordings, ledgers, logs, test source, and publication provenance portable and reviewable through one sanitized viewer.
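
To illustrate what "fail-closed required coverage" could look like, here is a minimal sketch. It is not the code in this PR: `Capability`, `ProjectSpec`, `checkCoverage`, and `assertCoverage` are hypothetical names, and the capability values are placeholders. The idea is that a project declares the capabilities it requires, and a missing capability raises an error instead of turning into a skipped test.

```typescript
// Hypothetical sketch of a fail-closed capability matrix.
// None of these names are taken from the PR itself.
type Capability = "claude-code" | "desktop-gui" | "emulators" | "kvm";

interface ProjectSpec {
  name: string;
  required: Capability[];
}

interface CoverageGap {
  project: string;
  missing: Capability[];
}

// Return one entry per project that is missing at least one required capability.
function checkCoverage(
  projects: ProjectSpec[],
  available: Set<Capability>,
): CoverageGap[] {
  return projects
    .map((p) => ({
      project: p.name,
      missing: p.required.filter((c) => !available.has(c)),
    }))
    .filter((gap) => gap.missing.length > 0);
}

// Fail closed: any gap throws, so a lane cannot pass by silently skipping.
function assertCoverage(
  projects: ProjectSpec[],
  available: Set<Capability>,
): void {
  const gaps = checkCoverage(projects, available);
  if (gaps.length > 0) {
    const detail = gaps
      .map((g) => `${g.project}: ${g.missing.join(", ")}`)
      .join("; ");
    throw new Error(`Required capabilities missing (${detail})`);
  }
}
```

Keeping the check as a pure function over the matrix makes the fail-closed rule itself easy to unit test, independent of how a runner detects which capabilities are present.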

Product guarantees

  • Cloud organization switching proves distinct data, URL scope, multitab isolation, first-paint safety, and member access transitions.
  • Self-host invitation and first-owner claims are transaction-backed, and membership removal revokes browser, API, CLI, and MCP access.
  • CLI profiles support safe same-origin multi-account switching, atomic credential updates, unambiguous logout, and contention-safe recovery.
  • Desktop keeps local and remote profiles isolated, preserves a remote active profile across restart, and prevents late sidecar hydration from clobbering the active profile.
  • Real Claude Code switches identities without retaining stale MCP authorization or account data.
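
One common way to get atomic credential updates like those claimed for CLI profiles is to write a temporary file and rename it over the target. The sketch below assumes that approach; the PR description does not say how the CLI actually does it. The `Credentials` shape, the function names, and the file name are all hypothetical.

```typescript
import { randomUUID } from "node:crypto";
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Hypothetical credential shape; the real CLI format is not shown in the PR.
interface Credentials {
  activeProfile: string;
  tokens: Record<string, string>;
}

// Write to a temp file in the same directory, then rename over the target.
// On POSIX systems a rename within one filesystem replaces the target atomically,
// so a reader sees either the old file or the new one, never a partial write.
function writeCredentialsAtomically(path: string, creds: Credentials): void {
  const tmp = join(dirname(path), `.credentials.${randomUUID()}.tmp`);
  writeFileSync(tmp, JSON.stringify(creds, null, 2), { mode: 0o600 });
  renameSync(tmp, path);
}

function readCredentials(path: string): Credentials {
  return JSON.parse(readFileSync(path, "utf8")) as Credentials;
}

// Usage: switch the active profile by replacing the whole file in one step.
const dir = mkdtempSync(join(tmpdir(), "creds-"));
const file = join(dir, "credentials.json");
writeCredentialsAtomically(file, {
  activeProfile: "personal",
  tokens: { personal: "token-a", work: "token-b" },
});
writeCredentialsAtomically(file, {
  ...readCredentials(file),
  activeProfile: "work",
});
console.log(readCredentials(file).activeProfile);
```

The temp file is created next to the target rather than in the system temp directory because a rename across filesystems is not atomic. This sketch does not cover the contention-safe recovery that the bullet also mentions, which would need locking or retry logic on top.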

CI and evidence

  • Add hermetic PR lanes, production Docker, real Cloudflare Access JWT/JWKS coverage, Linux desktop GUI, native package smoke, and scheduled KVM, Tart, and EC2 lanes.
  • Fail when required clients, GUI, emulators, or VM capabilities disappear instead of silently skipping.
  • Publish immutable sanitized viewer bundles for trusted runs, with verified direct scenario links and an honest artifact fallback for fork runs.
  • Preserve exact per-run cleanup and add repository-scoped TTL sweepers for interrupted VM jobs.
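
A repository-scoped TTL sweeper could select its targets along these lines. This is a sketch with a hypothetical record shape and function name, not the PR's implementation, and it only covers the selection step. Actual deletion would go through the Tart or EC2 tooling.

```typescript
// Hypothetical record describing a VM created by a CI job.
interface VmRecord {
  id: string;
  repository: string; // owner/name of the repository that created the VM
  createdAt: Date;
}

// Select VMs that belong to this repository and have outlived the TTL.
// VMs from other repositories are never returned, whatever their age.
function selectStaleVms(
  vms: VmRecord[],
  repository: string,
  ttlMs: number,
  now: Date,
): VmRecord[] {
  return vms.filter(
    (vm) =>
      vm.repository === repository &&
      now.getTime() - vm.createdAt.getTime() > ttlMs,
  );
}
```

Passing `now` as a parameter keeps the selection deterministic under test, and the repository check is what stops the sweeper from touching VMs owned by other repositories that share the same account or host.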

Validation

  • bun run format:check (1,567 files)
  • bun run lint (1,305 files)
  • bun run typecheck (42/42 tasks)
  • bun run test (36/36 tasks)
  • E2E harness: 34/34
  • Real client suite: 6/6
  • VM provider and lifecycle suite: 40/40
  • Workflow YAML and actionlint checks
  • Diff, attribution, content, and provenance-forgery checks

Full app-booting and physical VM lanes were not launched from this workstation. The new CI matrix owns those environment-dependent executions and publishes their sanitized evidence.

@greptile-apps

greptile-apps Bot commented Jun 27, 2026


Too many files changed for review. (148 files found, 100 file limit)

@bmdavis419

Author

CI is waiting for maintainer approval because this pull request comes from a fork. Please approve the pending CI, end-to-end, preview, and package workflow runs so the checks and sanitized evidence matrix can execute.
