cavekit

compressed spec-driven development for claude code
_{one file · one loop · zero sub-agents}

what this is

Plan-then-execute forgets. SDD remembers — but most SDD frameworks bury that value under agent swarms, dashboards, and ceremony that costs more tokens than it saves.

Cavekit is the simplest full loop: grill → spec → research → review → build, over one SPEC.md file, no sub-agents. Three commands you run every time; four more you reach for only when the change earns it.

The spine is three properties that earn their tokens:

durable spec — SPEC.md at repo root survives context resets. It is the agent's long-term memory: lose the window, reload the spec, keep going.
caveman encoding — ~75% fewer tokens than prose. Symbols, fragments, pipe tables. All nine skill descriptions cost ~1.1k context — 16× lighter than spec-kit's 18.6k. That is the whole point.
backprop reflex — every test failure becomes a §B entry; classes of bug become §V invariants the spec never forgets.

And one rule that keeps it from bloating into the frameworks it replaces: right-size. A one-line fix is just /build. The full chain is for genuinely uncertain or high-blast-radius work — never for a typo.

commands

the loop — run these every time:

cmd	job
`/ck:spec`	create / amend / backprop `SPEC.md`. Sole mutator.
`/ck:build`	native plan → execute against spec. Names which test proves each `§V`. Auto-backprops on failure.
`/ck:check`	read-only drift report. Lists §V / §I / §T violations. The drift detector.

reach for these — only when the change earns the ceremony:

cmd	job
`/ck:grill`	interrogate a fuzzy idea into a sharp `§G`/`§C`, one question at a time, before you spec.
`/ck:research`	gather external knowledge into `§R` so build grounds in facts, not hallucinations. Every finding cites a source.
`/ck:review`	adversarial senior review of the spec before build. Refutes, hardens `§V`, ends in a go/no-go gate.
`/ck:deepen`	spare-budget design pass — make one shallow module deep. Behavior held, tests green before & after.

install

One line, via the skills CLI:

npx skills add JuliusBrussee/cavekit

Installs nine skills into ~/.claude/skills/: spec, build, check (the loop), grill, research, review, deepen (reach-for), plus caveman and backprop (the utilities). Claude activates each when its trigger context matches — e.g. "write a spec for…" invokes spec, a fuzzy idea invokes grill, a risky change before build invokes review. Claude Code picks them up on next launch.

Or via the Claude Code marketplace (also adds the /ck:spec, /ck:build, /ck:check, /ck:grill, /ck:research, /ck:review, /ck:deepen slash commands):

/plugin marketplace add juliusbrussee/cavekit
/plugin install ck@cavekit

Or clone directly:

git clone https://github.com/juliusbrussee/cavekit.git ~/.claude/plugins/cavekit

format

See FORMAT.md. Sections: §G goal, §C constraints, §I interfaces, §R research (optional, pipe table), §V invariants, §T tasks (pipe table), §B bugs (pipe table). Each verb owns specific sections — no verb rewrites a section it does not own.

files

FORMAT.md             spec schema + caveman encoding + sectioned ownership
commands/             seven thin slash-command entry points → the skills (loop + reach-for)
skills/spec           spec mutator — sole writer
skills/build          plan-execute, verification contract
skills/check          drift report
skills/grill          sharpen a fuzzy idea → §G/§C before spec
skills/research       external knowledge → §R, every finding sourced
skills/review         adversarial senior review of the spec → hardens §V
skills/deepen         spare-budget design pass — make one module deep
skills/caveman        encoding utility
skills/backprop       bug → spec protocol (six steps)

non-goals

no sub-agents. Main Claude does the work.
no dashboards. cat SPEC.md is the dashboard.
no parallel workers. One thread, one spec, one diff.
no JSON / YAML spec bodies. Markdown + pipe tables.
no hooks, no orchestration binaries, no TypeScript helpers.

older cavekit (the Hunt lifecycle, v3.1.0 and earlier)

The previous generation is not deprecated — it is frozen at tag v3.1.0 and remains a fully working plugin.

What it is:

Spec-driven AI development with an autonomous execution loop. Four-command Hunt lifecycle (/ck:sketch → /ck:map → /ck:make → /ck:check), plus /ck:ship, /ck:review, /ck:revise, /ck:status, /ck:design, /ck:research, /ck:init, /ck:config, /ck:resume, /ck:help — 16 slash commands total. 12 named sub-agents. Per-task token budgets, stop-hook state machine, model-tier routing, auto-backpropagation from test failures, tool-result caching, Codex peer review, Karpathy behavioral guardrails, caveman token compression, knowledge-graph integration, and design-system enforcement. Parallel wave execution and team mode.

Pick v3.1.0 if you want the full autonomous loop, parallel agents, peer review, or design-system workflow. Pick v4 if you want the distilled loop — one spec, no orchestration, right-sized ceremony.

install the older version

Marketplace:

/plugin marketplace add juliusbrussee/cavekit@v3.1.0
/plugin install ck@cavekit

Git:

git clone -b v3.1.0 https://github.com/juliusbrussee/cavekit.git

Full docs live at the tag — git checkout v3.1.0 and read the README there for command reference, skill catalog, and the Hunt lifecycle guide.

choosing, or moving

See UPGRADE.md. Honest framing:

Stay on v3.1.0 if your project has active context/kits/ investment.
Move to v4 if you want fewer moving parts and smaller token bills.
It is a two-way door — SPEC.md is plain markdown; nothing traps you in either direction.

ecosystem

Cavekit is one rock in the caveman family:

repo	what
caveman	output compression skill — why use many token when few do trick
cavemem	cross-agent persistent memory — why agent forget when agent can remember
cavekit (you here)	spec-driven build loop — why agent guess when agent can know
cavegemma	Gemma 4 31B fine-tuned on caveman pairs — why prompt every turn when weight remember

philosophy

The spec is the only artifact that earns its tokens. Everything else that costs tokens must either save more tokens later, or the user's attention, or it gets cut.

See CHANGELOG.md for the full v3 → v4 break.

license

MIT.

Name		Name	Last commit message	Last commit date
Latest commit History 138 Commits
.claude-plugin		.claude-plugin
.github		.github
commands		commands
skills		skills
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
FORMAT.md		FORMAT.md
LAUNCH-POST.md		LAUNCH-POST.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
UPGRADE.md		UPGRADE.md
plugin.json		plugin.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

cavekit

what this is

commands

install

format

files

non-goals

older cavekit (the Hunt lifecycle, v3.1.0 and earlier)

install the older version

choosing, or moving

ecosystem

philosophy

license

About

Uh oh!

Releases 4

Sponsor this project

Uh oh!

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

cavekit

what this is

commands

install

format

files

non-goals

older cavekit (the Hunt lifecycle, v3.1.0 and earlier)

install the older version

choosing, or moving

ecosystem

philosophy

license

About

Topics

Resources

License

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 4

Sponsor this project

Uh oh!

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Packages