Evolve is a Python library and service which enables AI agents to improve through self-reflection.
- Trajectory: A recorded agent conversation
- Entity: Anything suited to storage in a vector database, such as a guideline, a policy, or another piece of knowledge.
- Namespace: Isolated storage for entities
- Conflict Resolution: LLM-based merging of duplicate/conflicting entities
- Guidelines: Instructions intended to assist an agent in completing some task
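
The concepts above can be pictured as a small data model. This is only an illustrative sketch: the class names mirror the terms in the list, but every field name here is an assumption and not taken from the actual `schema` package.

```python
from dataclasses import dataclass, field


@dataclass
class Trajectory:
    """A recorded agent conversation, as a list of role/content messages."""

    messages: list[dict[str, str]]


@dataclass
class Entity:
    """A piece of knowledge suited to storage in a vector database."""

    content: str
    kind: str = "guideline"  # hypothetical field; could also be "policy"


@dataclass
class Namespace:
    """Isolated storage: entities saved here are not visible elsewhere."""

    name: str
    entities: list[Entity] = field(default_factory=list)


# Two namespaces never share entities.
project_a = Namespace("project-a")
project_b = Namespace("project-b")
project_a.entities.append(Entity("Run tests with `uv run pytest -v`."))
```

Saving into `project_a` leaves `project_b` empty, which is the isolation property a namespace is meant to provide.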
- An agent completes a task, and the resulting trajectory is automatically saved to a logging framework such as Langfuse or Arize Phoenix.
- The agent can call the sync MCP function, or the user can sync manually; either way, Evolve processes the trajectory and saves any generated guidelines.
- Generated guidelines are stored as entities with conflict resolution applied.
- Future agents can query the Evolve MCP server to fetch guidelines for similar tasks
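
The workflow above can be sketched as a toy, in-memory loop. Everything here is a stand-in chosen for illustration: the function names are hypothetical, the "LLM" steps are replaced by simple string rules, and the vector search is replaced by word overlap. It is not how Evolve is implemented.

```python
def extract_guidelines(trajectory: list[str]) -> list[str]:
    """Stand-in for LLM reflection: keep messages tagged as lessons."""
    prefix = "LESSON:"
    return [m[len(prefix):].strip() for m in trajectory if m.startswith(prefix)]


def resolve_conflicts(existing: list[str], new: list[str]) -> list[str]:
    """Stand-in for LLM-based merging: drop case-insensitive duplicates."""
    seen = {g.lower() for g in existing}
    merged = list(existing)
    for guideline in new:
        if guideline.lower() not in seen:
            merged.append(guideline)
            seen.add(guideline.lower())
    return merged


def sync(store: dict[str, list[str]], namespace: str, trajectory: list[str]) -> None:
    """Process one trajectory and save its guidelines into a namespace."""
    new = extract_guidelines(trajectory)
    store[namespace] = resolve_conflicts(store.get(namespace, []), new)


def fetch_guidelines(store: dict[str, list[str]], namespace: str, task: str) -> list[str]:
    """Stand-in for similarity search: match guidelines sharing a word with the task."""
    words = set(task.lower().split())
    return [g for g in store.get(namespace, []) if words & set(g.lower().split())]


store: dict[str, list[str]] = {}
sync(store, "demo", ["user: deploy the app", "LESSON: Run the tests before you deploy"])
sync(store, "demo", ["LESSON: run the tests before you deploy", "LESSON: Pin dependency versions"])
relevant = fetch_guidelines(store, "demo", "deploy to staging")
```

After the second sync the duplicate lesson is merged away, and a later agent asking about a deployment task gets back only the guideline that matches it.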
.
├── demo (Files used by the Claude Code demo)
│ ├── filesystem
│ └── workdir
├── docs (Data used by README files)
├── explorations (Tangential projects for feeling out future work. Should be avoided unless otherwise prompted.)
│ └── claudecode
├── evolve (Primary Source Root)
│ ├── backend (Entity Database Backend implementations, primarily vector databases)
│ ├── cli (A CLI wrapper over the native Python client)
│ ├── config (All configurations which are derived from environment variables or instantiated as an object)
│ ├── db (A sqlite database for when vector databases are a poor fit for the data)
│ ├── frontend (Interfaces to interact with the backend)
│ │ ├── client (A native Python client which thinly wraps the configured backend)
│ │ └── mcp (An MCP server implementing some high-level methods useful for AI agents)
│ ├── llm (All code that prompts an LLM)
│ ├── schema (All well-defined datatypes used throughout the project)
│ ├── sync (Upstream data sources to be processed and stored in the backend)
│ └── utils (Small reusable code snippets)
├── tests (All tests for evolve)
├── .env.example (Environment variable template file)
└── .env (Environment variables used to configure evolve)
uv sync && source .venv/bin/activate
cp .env.example .env # Configure any environment variables, defined in `./altk_evolve/config`
pre-commit install

- This project is managed by `uv`, not `python` or `pip`, so any python commands need to go through `uv`. All dependencies are defined in `pyproject.toml`.
- Run pytest verbosely with the `-v` flag by default so that you have more context when tests fail.
- Use `uv run pytest tests/.../<test_name.py>` to run tests individually.
- We use the pytest markers `e2e` for end-to-end tests, and `unit` for unit tests, and `phoenix` to test integration with Phoenix.
- When running `uv run pytest` it will skip the tests marked with `phoenix`.
- To run specific markers: `uv run pytest -m e2e` or `uv run pytest -m unit`
- To run all tests: `uv run pytest -m "e2e or unit or phoenix"`
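
As a rough illustration of how the markers above attach to tests, a test module might look like the following. The helper and test names are hypothetical and exist only for this example; only the marker names `unit` and `phoenix` come from this project.

```python
import pytest


def normalize_namespace(name: str) -> str:
    """Hypothetical helper under test; not part of Evolve."""
    return name.strip().lower()


@pytest.mark.unit
def test_normalize_namespace_strips_and_lowercases():
    # Selected by `uv run pytest -m unit`.
    assert normalize_namespace("  Demo ") == "demo"


@pytest.mark.phoenix
def test_phoenix_roundtrip_placeholder():
    # Marked `phoenix`, so a plain `uv run pytest` would not run it.
    pytest.skip("requires a running Phoenix instance")
```

Each decorator only tags the test; which tags run is decided by the `-m` expression passed on the command line.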
- MCP Server: `get_entities()`, `get_guidelines()`, `save_trajectory()`
- CLI: Run `evolve --help` if details are needed about its subcommands. Available subcommands include `namespaces`, `entities`, and `sync`
- Python Client: `EvolveClient()` for programmatic access
- Use Ruff for linting and formatting (configured in pyproject.toml)
- Always run `uv run ruff format .` and `git add -u` before committing to avoid pre-commit stash conflicts with ruff auto-fixes
- Write commit messages in the Conventional Commits format expected by `python-semantic-release`:
  - `feat(scope): description`: new feature, triggers a minor version bump
  - `fix(scope): description`: bug fix, triggers a patch version bump
  - `perf(scope): description`: performance improvement, triggers a patch version bump
  - `test(scope): description`, `chore(scope): description`, `docs(scope): description`, etc.: no version bump
  - Breaking changes: append `!` after the type/scope (e.g. `feat!:`) or add `BREAKING CHANGE:` in the footer
- All new features need tests (unit + e2e where applicable)
- Use uv to run Python commands, including pip.
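
To make the commit message rules above concrete, here is a small sketch that maps a commit message to the version bump it would trigger. It is not the parser used by `python-semantic-release`; the function name and regular expression are assumptions made for illustration, and breaking changes are assumed to map to a major bump, as in semantic versioning.

```python
import re

# type, optional (scope), optional "!", then ": description"
HEADER = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[^)]+)\))?(?P<breaking>!)?: (?P<description>.+)$"
)
BUMP_BY_TYPE = {"feat": "minor", "fix": "patch", "perf": "patch"}


def version_bump(message: str) -> str:
    """Return 'major', 'minor', 'patch', or 'none' for a commit message."""
    lines = message.strip().splitlines()
    match = HEADER.match(lines[0]) if lines else None
    if match is None:
        raise ValueError(f"not a Conventional Commits header: {message!r}")
    # A "!" in the header or a BREAKING CHANGE footer marks a breaking change.
    has_footer = any(line.startswith("BREAKING CHANGE:") for line in lines[1:])
    if match["breaking"] or has_footer:
        return "major"  # assumption: breaking changes bump the major version
    return BUMP_BY_TYPE.get(match["type"], "none")
```

For example, `version_bump("feat(cli): add namespaces subcommand")` returns `"minor"`, while a `docs(...)` commit returns `"none"`.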