orenlab/codeclone: Deterministic structural change controller for AI-assisted Python development. Bound, verify, and audit agent edits before the diff. One canonical report, every surface (CLI, MCP, SARIF, CI)

Structural Change Controller for AI-assisted Python development

Let agents move fast.
Keep structural change explicit, bounded, remembered, and verifiable.

Note

This repository and the documentation site track the unreleased v2.1.0 development line. For the current stable release, use CodeClone v2.0.2, which is available from PyPI.

CodeClone is a deterministic Structural Change Controller for AI-assisted Python development, built on one canonical structural analysis of the repository.

Before editing, an agent declares intent. CodeClone maps the structural blast radius, establishes explicit edit boundaries, and exposes the regression budget. After editing, it compares the actual patch with the declared scope, verifies structural changes, checks review claims against report facts, and leaves an auditable receipt.

intent → blast radius → bounded edit → patch check → review receipt

CodeClone does not use LLM judgment to classify structural regressions or authorize edits. Structural facts come from deterministic analysis; the same facts serve agents, human reviewers, IDEs, and CI.

Install and try

Stable release:

uv tool install codeclone
codeclone .
codeclone . --html --open-html-report

Run without installing:

uvx codeclone@latest .

Install the MCP server for local AI agents and IDE clients:

uv tool install "codeclone[mcp]"
codeclone-mcp --transport stdio

Install the in-development 2.1 line (alpha/beta prereleases). A plain install resolves the latest stable release; add a prerelease flag to get 2.1:

uv tool install --prerelease allow "codeclone[mcp]"   # uv
pip install --pre "codeclone[mcp]"                     # pip

Run the current development line from source:

git clone https://github.com/orenlab/codeclone.git
cd codeclone
uv sync --all-extras
uv run codeclone .

Why CodeClone

AI coding agents accelerate implementation, but they also make scope expansion easier to miss. A narrow task can quietly spread into shared helpers, tests, public APIs, configuration, and unrelated modules while the final diff still looks reasonable.

Most review tools start with the completed diff. CodeClone starts with the declared intent.

declare intent
  → inspect structural blast radius
  → establish edit boundaries
  → make the change
  → compare declared and actual scope
  → verify structural regressions
  → record the outcome

The agent still writes the code. CodeClone makes the declared scope explicit before editing and exposes undeclared expansion when the patch is verified.

Structural Change Controller

The controller reduces the governed agent workflow to four steps:

analyze → start → edit → finish

Start controlled change — start_controlled_change checks workspace state, records intent, maps blast radius, separates allowed paths from review context and do-not-touch boundaries, and returns the authoritative edit_allowed permission.
Finish controlled change — finish_controlled_change resolves the actual changed files once, checks scope, verifies the patch against the canonical report, validates optional review claims, and produces a review receipt.
Patch Trail — records declared, changed, untouched-in-declared, and boundary-held paths together with verification and audit anchors.
Multi-agent coordination — lease-bound intents, queues, recovery, and workspace hygiene make concurrent work visible without treating advisory ownership as structural truth.

Host integrations can enforce the permission model before file edits where the host supports hooks. Regardless of host enforcement, finish-time verification remains deterministic.
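
The path categories recorded by the Patch Trail can be pictured with plain set arithmetic. This is a minimal sketch, not CodeClone's implementation: `PatchTrail`, `build_patch_trail`, the `undeclared` field, and the meaning assigned to each category are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatchTrail:
    """Hypothetical record of how a patch relates to its declared scope."""

    declared: frozenset[str]
    changed: frozenset[str]
    untouched_in_declared: frozenset[str]
    boundary_held: frozenset[str]
    undeclared: frozenset[str]

    @property
    def in_scope(self) -> bool:
        # The patch stays in scope only if nothing outside the declaration changed.
        return not self.undeclared


def build_patch_trail(declared, changed, do_not_touch) -> PatchTrail:
    declared, changed, do_not_touch = map(frozenset, (declared, changed, do_not_touch))
    return PatchTrail(
        declared=declared,
        changed=changed,
        untouched_in_declared=declared - changed,  # declared but never edited
        boundary_held=do_not_touch - changed,      # protected paths left alone
        undeclared=changed - declared,             # scope expansion to review
    )


trail = build_patch_trail(
    declared={"pkg/parser.py", "tests/test_parser.py"},
    changed={"pkg/parser.py", "pkg/utils.py"},
    do_not_touch={"pkg/api.py"},
)
print(sorted(trail.undeclared))  # ['pkg/utils.py']
print(trail.in_scope)            # False
```

Because every category is derived from the same three inputs, the same inputs always produce the same record, which is the property a deterministic finish-time check relies on.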

Structural Change Controller documentation

One canonical report, every structural surface

CodeClone runs one deterministic structural analysis and renders its canonical report through CLI, HTML, JSON, Markdown, SARIF, MCP, IDE integrations, GitHub Action, and CI. There is no separate analysis engine for agents.

The report covers:

function clones through CFG fingerprints;
block clones through statement windows and report-only segment clones;
clone-cohort drift, duplicated branch families, and guard/exit divergence;
cyclomatic complexity, coupling, cohesion, dependency cycles, and dead code;
overloaded-module and other report-only design context;
type and docstring adoption;
public API inventory and baseline-aware API break detection;
external Cobertura coverage joined with structural hotspots;
report-only security capability boundaries without vulnerability claims;
deterministic structural health and review priorities.

codeclone . --json --html --md --sarif --text
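
The first item in the list above, function clones through CFG fingerprints, rests on the idea that two functions can share a structure even when their names and literals differ. The sketch below illustrates that idea with a deliberately simpler technique: it hashes a normalized AST rather than a control-flow graph. The `fingerprint` function and the normalization rules are assumptions for illustration and say nothing about how CodeClone computes its fingerprints.

```python
import ast
import hashlib


class _Normalize(ast.NodeTransformer):
    """Erase identifiers and literal values so only the structure remains."""

    def visit_Name(self, node: ast.Name) -> ast.Name:
        node.id = "_"
        return node

    def visit_arg(self, node: ast.arg) -> ast.arg:
        node.arg = "_"
        node.annotation = None
        return node

    def visit_Constant(self, node: ast.Constant) -> ast.Constant:
        node.value = None
        return node


def fingerprint(source: str) -> str:
    """Return a structural hash of the first function defined in `source`."""
    func = ast.parse(source).body[0]
    if not isinstance(func, ast.FunctionDef):
        raise ValueError("expected a function definition")
    func.name = "_"
    normalized = _Normalize().visit(func)
    dump = ast.dump(normalized, include_attributes=False)
    return hashlib.sha256(dump.encode()).hexdigest()


a = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
b = "def accumulate(items):\n    acc = 1\n    for item in items:\n        acc += item\n    return acc\n"
c = "def total(xs):\n    return sum(xs)\n"

print(fingerprint(a) == fingerprint(b))  # True: same shape, different names
print(fingerprint(a) == fingerprint(c))  # False: different structure
```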

How CodeClone works · Canonical report contract

Baseline-aware CI

CodeClone separates accepted legacy debt from new structural regressions.

# Create and commit the project baseline once
codeclone . --update-baseline

# Gate future changes against that baseline
codeclone . --ci

The baseline is a versioned, integrity-checked contract. CI can reject newly introduced clones and baseline-aware metric, API, and coverage regressions without requiring the existing codebase to be clean first. Absolute threshold gates remain opt-in.

codeclone . --fail-on-new-metrics
codeclone . --fail-complexity 20 --fail-coupling 10 --fail-cohesion 4
codeclone . --fail-cycles --fail-dead-code
codeclone . --coverage coverage.xml --fail-on-untested-hotspots
codeclone . --api-surface --fail-on-api-break
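
The separation between accepted legacy debt and new regressions can be sketched as an integrity-checked set difference. This is an illustrative model only: the `schema`, `findings`, and `sha256` field names, the finding identifiers, and the helper functions are hypothetical and do not describe CodeClone's baseline schema.

```python
import hashlib
import json


def payload_digest(findings) -> str:
    """Hash a canonical (sorted, compact) JSON encoding of the findings."""
    canonical = json.dumps(sorted(findings), separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


def make_baseline(findings) -> dict:
    return {
        "schema": 1,  # hypothetical version field
        "findings": sorted(findings),
        "sha256": payload_digest(findings),
    }


def gate(baseline: dict, current: set[str]) -> list[str]:
    """Return findings that are new relative to the baseline."""
    if payload_digest(baseline["findings"]) != baseline["sha256"]:
        raise ValueError("baseline integrity check failed")
    return sorted(current - set(baseline["findings"]))


baseline = make_baseline({"clone:a", "clone:b"})
print(gate(baseline, {"clone:a", "clone:b"}))             # []
print(gate(baseline, {"clone:a", "clone:b", "clone:c"}))  # ['clone:c']
```

Existing findings never fail the gate in this model; only identifiers absent from the baseline do, which is why the codebase does not have to be clean before gating starts.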

Metrics and quality gates · Baseline contract

Engineering Memory

Engineering Memory gives agents durable, repository-specific context without treating model output as project truth.

The local SQLite store contains typed, evidence-linked knowledge such as contracts, architecture decisions, risks, test anchors, public surfaces, git provenance, and prior controlled changes. Scope-aware retrieval supports the current change, while project-wide search can combine FTS5 with optional semantic retrieval.

Audit-derived trajectories preserve how work actually unfolded. Trajectory passports, anomaly profiles, Patch Trail evidence, and recurring advisory patterns called Experiences make previous successes and failures reusable. Agent-created records remain drafts until a human approves them.
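
A rough picture of full-text search over typed records with a draft/approved status, using Python's standard `sqlite3` module. The table layout, column names, sample rows, and the choice to hide drafts from search by default are all assumptions for illustration, not CodeClone's store schema or retrieval policy. The sketch also assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: `status` is stored but not full-text indexed.
conn.execute("CREATE VIRTUAL TABLE memory USING fts5(kind, body, status UNINDEXED)")
conn.executemany(
    "INSERT INTO memory (kind, body, status) VALUES (?, ?, ?)",
    [
        ("contract", "baseline schema is versioned and integrity-checked", "approved"),
        ("risk", "changing the baseline schema breaks older CI runs", "draft"),
        ("decision", "reports render from one canonical analysis", "approved"),
    ],
)


def search(query: str, approved_only: bool = True) -> list[tuple[str, str]]:
    """Match all query terms, optionally hiding records still in draft."""
    sql = "SELECT kind, body FROM memory WHERE memory MATCH ?"
    if approved_only:
        sql += " AND status = 'approved'"
    sql += " ORDER BY rank"
    return conn.execute(sql, (query,)).fetchall()


print(search("baseline schema"))
print(len(search("baseline schema", approved_only=False)))  # 2
```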

codeclone memory init --root .
codeclone memory search "baseline schema" --match all
codeclone memory approve mem-12345678 --i-know-what-im-doing

Memory can guide an agent. It cannot authorize edits, override blast radius, change a gate, or replace canonical report facts.

Engineering Memory documentation · Trajectories and Experiences

AI agents and IDE integrations

The MCP server is triage-first: analyze the repository, narrow the problem, inspect evidence, start a controlled change, and finish with verification. get_implementation_context projects bounded, drift-aware structural context for repo-relative paths from the existing run, with separate digests for the source artifact and exact response. It is evidence for planning, never edit authorization. Bounded tools and resources keep the full report out of agent context until deeper evidence is requested.

codeclone-mcp --transport stdio
codeclone-mcp --transport streamable-http

Structural analysis tools do not mutate source files, baselines, generated reports, or analysis cache. Controller and memory operations update only their explicit state stores.

Warning

Analysis tools require an absolute repository root. Keep stdio as the default transport for local clients. Exposing HTTP beyond loopback requires the explicit --allow-remote flag.
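
The two constraints in this warning can be expressed as small pre-flight checks on the client side. This is a sketch with hypothetical helper names (`require_absolute_root`, `is_loopback`, `check_bind`); it models the rules stated above and is not taken from the codeclone-mcp source.

```python
import ipaddress
from pathlib import Path


def require_absolute_root(root: str) -> Path:
    """Reject relative repository roots such as '.' before calling a tool."""
    path = Path(root)
    if not path.is_absolute():
        raise ValueError(f"repository root must be absolute: {root!r}")
    return path


def is_loopback(host: str) -> bool:
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal; treat unknown hostnames as non-loopback.
        return False


def check_bind(host: str, allow_remote: bool = False) -> None:
    """Allow non-loopback hosts only when remote access was requested explicitly."""
    if not is_loopback(host) and not allow_remote:
        raise PermissionError(f"{host} is not loopback; explicit opt-in required")


root = require_absolute_root(str(Path.cwd()))  # resolve '.' before passing it on
check_bind("127.0.0.1")                        # fine: loopback
check_bind("0.0.0.0", allow_remote=True)       # fine: explicit opt-in
```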

Surface	Install or source	Documentation
VS Code extension	VS Code Marketplace	Setup
Cursor plugin	Cursor storefront	Install
Claude Code plugin	Claude Code marketplace	Install
Codex plugin	Codex marketplace	Install
Claude Desktop bundle	Bundle repository	Setup

Every client uses the same codeclone-mcp interface and canonical structural facts.

MCP usage guide · MCP interface contract · Implementation-context tools

Quick workflows

Review only the current Git scope:

codeclone . --changed-only --diff-against main
codeclone . --paths-from-git-diff HEAD~1

Inspect structural blast radius or run a baseline-relative patch check:

codeclone . --blast-radius codeclone/analysis/parser.py
codeclone . --patch-verify

--patch-verify is a terminal-only controller query: it cannot be combined with --changed-only, --diff-against, or --paths-from-git-diff. Use the changed-scope flags for Git-selected review; use --patch-verify alone for a trusted-baseline budget check on the working tree. Patch-local before/after verification with explicit changed-file evidence belongs in MCP change control (check_patch_contract).
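
For wrapper scripts that assemble CodeClone command lines, the flag rule above can be checked before the command is run. The sketch uses a stand-in `argparse` parser named `codeclone-sketch`; only the four flag names come from this README, and the parser itself is hypothetical rather than CodeClone's own.

```python
import argparse

# Flags that select a Git-derived review scope (names taken from the README).
CHANGED_SCOPE_FLAGS = ("changed_only", "diff_against", "paths_from_git_diff")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="codeclone-sketch")
    parser.add_argument("root")
    parser.add_argument("--patch-verify", action="store_true")
    parser.add_argument("--changed-only", action="store_true")
    parser.add_argument("--diff-against", metavar="REF")
    parser.add_argument("--paths-from-git-diff", metavar="REF")
    return parser


def parse(argv: list[str]) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    conflicts = [name for name in CHANGED_SCOPE_FLAGS if getattr(args, name)]
    if args.patch_verify and conflicts:
        flags = ", ".join("--" + name.replace("_", "-") for name in conflicts)
        # parser.error prints a usage message and exits with status 2.
        parser.error(f"--patch-verify cannot be combined with {flags}")
    return args


args = parse([".", "--changed-only", "--diff-against", "main"])
print(args.diff_against)  # main
```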

Use CodeClone in GitHub Actions:

- uses: orenlab/codeclone/.github/actions/codeclone@v2
  with:
    fail-on-new: "true"
    sarif: "true"
    pr-comment: "true"

The Action can run baseline-aware gating, publish SARIF to GitHub Code Scanning, upload reports, and maintain a PR summary comment.

GitHub Action documentation

Platform Observability

Platform Observability is an opt-in diagnostics layer for developing CodeClone itself. It correlates CLI, MCP, analysis, database, semantic-index, and projection-worker execution and exposes timings, RSS/CPU, query shapes, payload pressure, causal worker chains, and costly no-ops.

It is disabled by default, stores no raw payload bodies, and cannot affect repository findings, gates, baselines, memory facts, or edit authorization.

CODECLONE_OBSERVABILITY_ENABLED=1 codeclone .
codeclone observability trace --root . --html /tmp/codeclone-observer.html

Platform Observability documentation

Configuration

Project configuration lives in pyproject.toml:

[tool.codeclone]
baseline = "codeclone.baseline.json"

min_loc = 10
min_stmt = 6

block_min_loc = 20
block_min_stmt = 8

Precedence is CLI flags > pyproject.toml > built-in defaults.

Configuration reference · Inline suppressions

Documentation

The documentation site contains user guides, interface contracts, report and baseline schemas, configuration reference, integration setup, and maintainer material:

orenlab.github.io/codeclone

License

Code: MPL-2.0 (LICENSE)
Documentation and docs-site content: MIT (LICENSE-MIT)

Links

Documentation: https://orenlab.github.io/codeclone/
PyPI: https://pypi.org/project/codeclone/
Issues: https://github.com/orenlab/codeclone/issues
Discussions: https://github.com/orenlab/codeclone/discussions
Licenses: MPL-2.0 · MIT documentation license · License scope map

