Decide and adopt a test object construction strategy (factory_boy vs scenario builders)

## Summary
The API test suite has no agreed way to construct DB-backed domain objects. It leans on pytest fixture cascades, which work for infrastructure but scale poorly for the domain's many related entities. This issue is a decision: adopt either `factory_boy` (SQLAlchemy model factories with subfactories) or explicit scenario-builder functions, then roll it out incrementally. Fixtures stay for genuine infrastructure; this is specifically about domain object construction.

## Problem
DB-backed setup currently flows through cascade fixtures (`setup_lib_db` → `setup_lib_db_with_score_set` → `setup_lib_db_with_variant`). The cascade:

- **Hides state.** A leaf fixture implies an unseen chain of entities (a variant fixture silently pulls in a score set, experiment, experiment set, users, licenses). The test signature doesn't reveal what world it runs in.
- **Fits poorly off the happy path.** When a test needs "most of scaffold A but one piece different," you either accept excess/irrelevant state or spawn yet another cascade variant.
- **Proliferates.** Each new shape becomes another fixture, and the construction logic gets reinvented per directory.

This is already visible in the recent annotation-pipeline allele refactor: two test files written together invented allele creation two different ways — `tests/models/test_annotation_event_model.py` defines an `allele` pytest fixture, while `tests/models/test_annotation_event_view.py` defines a plain `_allele(session, digest, level)` helper plus an `_event(...)` factory-style function. The instinct to write factory functions is already emerging ad hoc; the open question is whether to formalize it and how.

## Proposed behavior
Pick one construction strategy and document it as the convention for new DB-backed test objects:

**Option 1 — `factory_boy`** (`SQLAlchemyModelFactory` + `SubFactory` for relationship chains).
- Pros: declarative; a single `create()` builds the whole relationship chain with sensible defaults; override only the fields a test cares about; canonical, well-documented solution.
- Cons: new framework dependency and DSL to learn; SQLAlchemy session wiring per test has known friction; transitional period where factories coexist with cascade fixtures.

**Option 2 — explicit scenario-builder functions** (e.g. `build_annotation_scenario(session)` returning a named struct of the created pieces, called in the test's arrange block).
- Pros: explicit, readable, debuggable; no new dependency; directly extends the existing `tests/helpers/util/` pattern (e.g. the HTTP-level `create_seq_score_set`).
- Cons: less automation; every link in a deep relationship chain must be wired by hand inside the builder.
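
For comparison, a minimal sketch of the Option 2 shape follows. Only the name `build_annotation_scenario(session)` comes from this issue; the dataclass entities, their fields, the keyword arguments, and the `RecordingSession` stand-in are hypothetical and exist only so the example runs without a database. A real builder would construct the SQLAlchemy models and receive the test `session` fixture.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the real ORM models.
@dataclass
class ScoreSet:
    title: str


@dataclass
class Variant:
    hgvs: str
    score_set: ScoreSet


@dataclass
class AnnotationEvent:
    variant: Variant
    status: str


class RecordingSession:
    """Stand-in exposing only the two session methods the builder uses."""

    def __init__(self):
        self.added = []

    def add_all(self, objects):
        self.added.extend(objects)

    def flush(self):
        pass  # a real session would emit INSERTs here


@dataclass(frozen=True)
class AnnotationScenario:
    """Named struct of everything the builder created."""

    score_set: ScoreSet
    variant: Variant
    event: AnnotationEvent


def build_annotation_scenario(
    session, *, score_set_title="Default score set", hgvs="c.1A>G", status="pending"
):
    # Every link in the chain is wired explicitly, so the test's world is visible here.
    score_set = ScoreSet(title=score_set_title)
    variant = Variant(hgvs=hgvs, score_set=score_set)
    event = AnnotationEvent(variant=variant, status=status)
    session.add_all([score_set, variant, event])
    session.flush()
    return AnnotationScenario(score_set=score_set, variant=variant, event=event)


# Arrange block of a test: override only what this test is about.
session = RecordingSession()
scenario = build_annotation_scenario(session, status="failed")
```

Returning a named struct rather than a bare tuple keeps the arrange block readable: the test refers to `scenario.variant` or `scenario.event` and never has to remember positional order.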

Whichever is chosen, the convention must state that genuine infrastructure (`session`, `client`, base `setup_lib_db`) stays as fixtures — only domain object construction moves.

## Acceptance criteria
- [ ] A decision is recorded (in this issue or a short doc) naming the chosen approach and the reasoning.
- [ ] A documented example exists for constructing a non-trivial entity graph (e.g. score set → variant → annotation event) using the chosen approach.
- [ ] The convention explicitly states fixtures remain for infrastructure and the new approach is for domain objects.
- [ ] Migration is incremental: new tests use the chosen approach; existing cascade fixtures are migrated only as the tests touching them change. No big-bang migration is required or implied.
- [ ] At least one existing ad-hoc case (the allele construction duplicated across the two annotation-event test files) is unified onto the chosen approach as a reference implementation.

## Implementation notes
- Non-DB mock factories already exist in `tests/helpers/mocks/factories.py`; DB-backed construction is the new, missing piece. If `factory_boy` is chosen, keep DB factories distinct from those mock factories to avoid confusing the two layers.
- If `factory_boy` is chosen, resolve session injection up front (binding the active test `session` to factories) since that is the main friction point; verify it composes with the per-test PostgreSQL fixture.
- This issue pairs with (and may depend on) the narrower fixture-deduplication cleanup tracked separately: that one consolidates the existing cascade fixtures, while this one decides the longer-term construction strategy. Sequence them so the dedup work isn't redone.
- Scope guard: this is a direction-setting decision plus a reference implementation, not a suite-wide rewrite.
