refactor(src): dissolve basic.py, give every module an honest home#199

Merged
cailmdaley merged 16 commits into develop from refactor/src-module-layout
Jun 23, 2026

@cailmdaley cailmdaley commented Jun 20, 2026

Collaborator

refactor(src): dissolve basic.py, give every module an honest home

A behavior-preserving reshape of src/sp_validation/: no logic changed, only moves, renames, and dead-code removal. The grab-bag basic.py is dissolved, spatial masking gets its own module, and misnamed/dead files are cleaned up. Base: develop.

flowchart LR
    classDef new fill:#d6f5e0,stroke:#2a9d52,color:#0b3d20
    classDef gone fill:#fadbdd,stroke:#c0392b,color:#5b1a1f,stroke-dasharray:5 4

    basic["basic.py"]:::gone
    util["util.py"]
    rjc["run_joint_cat.py"]
    rho["rho_tau.py"]
    cat["cat.py"]

    calib["calibration.py"]
    stats["statistics.py"]:::new
    masks["masks.py"]:::new
    fmt["format.py"]
    cb["catalog_builders.py"]
    plots["plots.py"]
    catalog["catalog.py"]

    basic -->|"metacal + size/SNR cuts"| calib
    basic -->|"math helpers"| stats
    util  -->|rename| fmt
    rjc   -->|rename| cb
    cb    -->|"spatial Mask + algebra"| masks
    rho   -->|"SquareRootScale"| plots
    cat   -->|rename| catalog
    cb -.->|builds on| catalog

statistics.py and masks.py are new; basic.py is deleted. The catalogue layer now reads as a clear pair: catalog.py (data primitives — read/write/access/match) with catalog_builders.py (the runner-class pipeline built on it). Two re-exports keep shared importers working (SquareRootScale via rho_tau, the mask primitives via catalog_builders); the renames preserve each caller's local binding, so no call sites change.

Also folded in: deleted dead info.py, five unused functions, and a shadowed duplicate confusion_matrix; made __init__.__all__ honest and fixed a phantom version-import; and repaired two pre-existing broken imports — a vanished transform_nan, and 6 papers/harmonic files still pointing at the long-gone utils_cosmo_val.

Martin's #197 requests — all ✅

Calibration scripts clustered under scripts/calibration/ · analyse_matched_stars → scratch as a .py · demo_binned_mask / plot_binned_quantities → scratch · all 7 deletions done. (Sacha's namaster_utils and config-space items are reserved for his own next PR.)

— Claude on behalf of Cail

cailmdaley and others added 2 commits June 20, 2026 04:50
The module named "basic" was a grab-bag: the 546-line `metacal` response
class (the heart of shear calibration) plus galaxy-selection masks and a
handful of cosmology-independent statistics helpers — none of which "basic"
described. Its symbols now live where they belong, and basic.py is deleted.

- `metacal` class + `mask_gal_size`/`mask_gal_SNR` (galaxy selection) →
  calibration.py, joining the m/c routines that already consumed a
  `gal_metacal` instance. One subsystem, one module.
- `jackknif_weighted_average2`, `corr_from_cov`, `chi2_and_pte`,
  `cov_from_one_covariance` → new statistics.py (a clean leaf: numpy/scipy
  only; calibration imports the jackknife from it).
- Every importer repointed (papers, scripts, the two scratch/guerrini
  import lines — path-only, his logic untouched); dead `from sp_validation
  import basic` lines removed from calibration.py and cat.py; `__all__` and
  the architecture docs updated.
- Tests split: metacal + mask pins → test_calibration.py, jackknife pin →
  test_statistics.py; test_basic.py removed.

All moved code is byte-identical to the original (md5-verified); value-drift
pins (metacal R-matrix rtol 1e-12) and the full suite pass in-container,
except the pre-existing galaxy/cs_util.size old-sandbox gap. No circular
imports. Verified by an adversarial multi-agent pass (byte-identity,
no-stale-refs, value-pins, no-cycles).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
scratch/guerrini/ and the namaster_utils→source / Gaussian-sims work he
reserved for his next PR in the #197 review. So future workers don't touch it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@cailmdaley cailmdaley force-pushed the refactor/src-module-layout branch from 902377f to 1ff913c Compare June 20, 2026 02:51
…ation

Follow-up polish on the basic.py dissolution (#199).

Tests — characterization (value-drift) coverage for the three statistics.py
helpers that had none, with literals generated by running the real functions
in-container and teeth on each:
- corr_from_cov: unit diagonal + reconstruction from cov/outer(std,std)
- chi2_and_pte: diagonal reduces to sum((d/sigma)^2) with matching scipy PTE,
  plus a non-diagonal case exercising the full d^T C^-1 d path
- cov_from_one_covariance: gaussian(col 10) vs non-gaussian(col 9) selection
  and a row-major-layout check (a transpose would be caught)
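
The identities these pins rely on can be illustrated with a small NumPy sketch. The functions below are stand-ins written for this note (`corr_from_cov_sketch`, `chi2_sketch`), not the library's implementations, and their signatures are assumptions. The PTE half of chi2_and_pte is left out; it would typically come from the chi-squared survival function, for example `scipy.stats.chi2.sf`.

```python
import numpy as np


def corr_from_cov_sketch(cov):
    """Correlation matrix as cov / outer(std, std). Stand-in, not the library code."""
    std = np.sqrt(np.diag(cov))
    return cov / np.outer(std, std)


def chi2_sketch(data, cov):
    """chi^2 = d^T C^-1 d, using a linear solve rather than an explicit inverse."""
    return float(data @ np.linalg.solve(cov, data))


cov = np.array([[4.0, 1.0], [1.0, 9.0]])
std = np.sqrt(np.diag(cov))
corr = corr_from_cov_sketch(cov)

d = np.array([2.0, 3.0])

# Diagonal covariance: reduces to sum((d / sigma)^2) = (2/2)^2 + (3/3)^2 = 2.
chi2_diag = chi2_sketch(d, np.diag([4.0, 9.0]))

# Non-diagonal covariance: exercises the full d^T C^-1 d path (12/7 for these inputs).
chi2_full = chi2_sketch(d, cov)
```

A characterization test in this style checks structural properties (unit diagonal, reconstruction of cov from corr and std) alongside pinned literals, so a silent change in either formula or layout is caught.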

Calibration — strictly behavior-preserving dead-code removal:
- 3 unused module imports (util, io, get_footprint — verified unreferenced)
- an unused local (col_noshear) in metacal._read_data
- the uncallable metacal._return method (defined without self yet referencing
  self.* in its body, so any call would fail: TypeError through an instance,
  NameError through the class; referenced nowhere)
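
Why a method defined without self cannot be called is shown in the minimal sketch below. The class `Demo` is hypothetical and only mirrors the shape of the removed method. Called through an instance, Python rejects the implicit instance argument before the body runs; called through the class, the body runs and fails on the undefined name `self`.

```python
class Demo:
    def __init__(self):
        self.value = 42

    def _return():  # note: no `self` parameter
        return self.value


def error_name(call):
    """Run `call` and return the name of the exception it raises, or None."""
    try:
        call()
    except Exception as exc:
        return type(exc).__name__
    return None


via_instance = error_name(lambda: Demo()._return())  # extra positional argument
via_class = error_name(lambda: Demo._return())       # body runs, `self` is undefined
print(via_instance, via_class)
```

Either way the method can never return normally, which is why removing it is behavior-preserving.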

Value pins (metacal R-matrix, m/c bias) stay green; conservatively skipped any
change that would reorder float ops or restructure an estimator. Verified by an
adversarial behavior-preservation review.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@cailmdaley cailmdaley changed the base branch from cleanup/restructuring to develop June 20, 2026 12:26
cailmdaley and others added 9 commits June 20, 2026 14:50
info.py had zero importers; its only content was a redundant
__name__ = 'sp_validation'. cat.py imported __version__ and __name__
from the package root but used only __version__ (line 607); the
software name at line 606 is already hardcoded. Repoint to the
canonical home: from sp_validation.version import __version__.

Register the retired import path in the dangling-move guard.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The old __all__ listed modules nobody imports through the package
(io, plot_style, cosmo_val) and omitted the two genuinely public
diagnostic modules rho_tau and b_modes. Replace it with the real
public surface, alphabetised, and drop the stale commented-out
explicit-import block. Nothing does `from sp_validation import *`,
so this is purely a documentation fix.
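
The reason an `__all__` edit is documentation-only when nobody star-imports can be seen in a small sketch. The module `demo_pkg` and its attributes are hypothetical and built in memory: `__all__` restricts what `from pkg import *` binds, while explicit imports ignore it.

```python
import sys
import types

# Throwaway in-memory module (hypothetical names), registered so it can be imported.
pkg = types.ModuleType("demo_pkg")
pkg.__all__ = ["public_thing"]
pkg.public_thing = "exported"
pkg.other_thing = "not listed"
sys.modules["demo_pkg"] = pkg

# Star-import: only names listed in __all__ are bound.
star_ns = {}
exec("from demo_pkg import *", star_ns)

# Explicit import: __all__ is not consulted.
explicit_ns = {}
exec("from demo_pkg import other_thing", explicit_ns)

print(sorted(k for k in star_ns if not k.startswith("__")))
print(explicit_ns["other_thing"])
```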

(util and run_joint_cat are renamed in the Tier-2 commits.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Five methods each imported from .b_modes inside their bodies. b_modes
is import-time side-effect-light (it pulls only .cosmology) and has no
back-edge to cosmo_val, so the locals were defensive, not necessary.
Consolidate the union of imported names into one top-level block next
to the existing cosmology/rho_tau imports and drop the five inline
imports. test_cosmo_val: 11 passed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two module-level defs named confusion_matrix existed; the first
(mask, confidence_level=0.9) was a near-duplicate of correlation_matrix
and was unconditionally shadowed by the second
(prediction, observation) ~40 lines later. Every caller — in
scripts/calibration and scripts/examples — uses the
(prediction, observation) signature. Remove the dead first def.
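
The shadowing is ordinary Python name rebinding, sketched below with placeholder bodies (the real function bodies are not reproduced here). A second module-level def with the same name replaces the first, so the first can never be reached by name.

```python
import inspect


def confusion_matrix(mask, confidence_level=0.9):
    return "first definition"  # placeholder body


def confusion_matrix(prediction, observation):  # rebinds the same name
    return "second definition"  # placeholder body


result = confusion_matrix([1, 0], [1, 1])
params = list(inspect.signature(confusion_matrix).parameters)
print(result, params)
```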

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Five functions had zero callers anywhere in the repo (src, scripts,
papers, scratch, notebooks), including across the star-imports of
plots:
  cosmology.py: get_clusters, stack_mm3, gamma_T_tc, xi_gal_gal_tc
  plots.py:     plot_map_stacked
Removing the cosmology block also orphaned the imports that existed
solely for it (treecorr, fits, canfar, radec2xy, cKDTree, tqdm,
get_footprint); drop those too. test_cosmology + test_plots: 29 passed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
util.py held only millify / print_millified — number formatting, not a
grab-bag. Rename it to format.py and sweep all importers:
  internal: cat.py, run_joint_cat.py (util.millify -> format.millify)
  scripts:  apply_alpha.py, examples/demo_calibrate_minimal_cat.py,
            calibration/extract_info.py (star-import x2)
  papers:   catalog/hist_mag.py
Register the rename in the dangling-move guard; update package __all__.

B1: scripts/plot_leakage.py imported transform_nan from the old util
module — a symbol removed from the library long ago and never used in
the script. Drop the dead import.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
SquareRootScale is a matplotlib ScaleBase subclass — plotting
infrastructure, not rho/tau logic. Move the class and its
register_scale call into plots.py (which now carries the
matplotlib.scale/ticker/transforms imports it needs). rho_tau.py keeps
a compat re-export so existing
`from sp_validation.rho_tau import SquareRootScale` callers — in
cosmo_inference, scratch/guerrini, papers/harmonic — still resolve.
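
A compat re-export of this kind can be sketched with two in-memory modules. The names `demo_plots` and `demo_rho_tau` are hypothetical stand-ins for plots.py and rho_tau.py, and the class body is a placeholder. The old import path resolves to the very same class object that now lives in the new module.

```python
import sys
import types

# New home of the class (stand-in for plots.py).
plots = types.ModuleType("demo_plots")
exec("class SquareRootScale:\n    name = 'squareroot'\n", plots.__dict__)
sys.modules["demo_plots"] = plots

# Old home keeps a one-line re-export (stand-in for rho_tau.py).
rho_tau = types.ModuleType("demo_rho_tau")
exec("from demo_plots import SquareRootScale  # compat re-export\n", rho_tau.__dict__)
sys.modules["demo_rho_tau"] = rho_tau

from demo_rho_tau import SquareRootScale as via_old_path
from demo_plots import SquareRootScale as via_new_path

print(via_old_path is via_new_path, via_old_path.__module__)
```

Because both paths yield one object, isinstance checks and any registration done at import time in the new module are unaffected by which path a caller uses.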

workflow/scripts/plotting_utils.py holds a near-duplicate that
diverges (ScalarFormatter(useMathText=True); inverted transform method
named transform_non_affine not transform), so it is left in place.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Six harmonic-space scripts imported SquareRootScale from
sp_validation.utils_cosmo_val, a module that no longer exists. Repoint
to sp_validation.rho_tau (the compat re-export), matching the sibling
2026_03_17 script. get_params_rho_tau was already correctly imported
from rho_tau. No other utils_cosmo_val imports remain (the two
scratch/guerrini mentions are prose noting the module's removal).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The module holds the catalogue-builder runner classes (JointCat,
ApplyHspMasks, CalibrateCat) and their run_* entry-point functions;
catalog_builders names that role. Sweep all importers (the
`as sp_joint` alias is preserved, only the module name changes):
  scripts/apply_hsp_masks.py
  scripts/examples/{demo_check_footprint, create_binned_mask_comprehensive,
    demo_comprehensive_to_minimal_cat, demo_create_footprint_mask,
    demo_calibrate_minimal_cat}.py
  scripts/calibration/{create_joint_comprehensive_cat (direct symbol),
    demo_apply_hsp_masks, calibrate_comprehensive_cat}.py
  scratch/kilbinger/demo_binned_mask.py
  papers/catalog/hist_mag.py
Also update the two prose references in docs and update __all__ and
the dangling-move guard.

The OPTIONAL masks.py extraction (Mask + mask-algebra fns) is deferred:
those symbols are reached externally through the sp_joint.* module
alias (Mask, get_masks_from_config, print_mask_stats across 4 scripts +
papers/catalog), so splitting them out would require either re-exports
or sweeping the public call surface — beyond a behavior-preserving move.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@cailmdaley cailmdaley changed the title refactor(src): dissolve basic.py into calibration + statistics refactor(src): dissolve basic.py + make the module layout honest Jun 20, 2026
@cailmdaley cailmdaley marked this pull request as ready for review June 20, 2026 13:35
Move the healsparse-backed spatial-masking cluster out of
catalog_builders.py into a dedicated masks.py: the Mask class plus
get_masks_from_config, print_mask_stats, correlation_matrix, and
confusion_matrix. Bodies are byte-identical; the move carries the
numexpr/scipy.stats imports those helpers need (now removed from
catalog_builders, which no longer references them).

catalog_builders.py re-exports the five symbols from sp_validation.masks
so external code using `from sp_validation import catalog_builders as
sp_joint` keeps resolving sp_joint.Mask, sp_joint.get_masks_from_config,
etc. The *Cat runner classes (ApplyHspMasks, ReadCat, run_* entry points)
stay; ApplyHspMasks uses healsparse directly, not the Mask class.

masks added to __init__.__all__. No MOVE_MAP entry: this is an
extraction-in-place (catalog_builders survives), not a retired path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@cailmdaley cailmdaley changed the title refactor(src): dissolve basic.py + make the module layout honest refactor(src): dissolve basic.py, give every module an honest home Jun 20, 2026
The catalogue module pair now reads by role: catalog.py is the data
layer (read/write/column-access/matching free functions), catalog_builders.py
is the construction pipeline (runner classes built on it). Module docstrings
state this hierarchy explicitly.

Behaviour-preserving. Every importer of the local sp_validation.cat module is
swept to sp_validation.catalog, each preserving its local binding (bare
`import cat` forms gain `as cat` so function bodies are unchanged). The
cs_util `cat` import is a different module and is left untouched throughout.

The dangling-reference guard registers the retired flat-import form
`sp_validation.cat import` rather than the bare `sp_validation.cat` token,
which would false-positive on the live `sp_validation.catalog` /
`sp_validation.catalog_builders` modules (same prefix trap as glass_mock).
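
The prefix trap can be reproduced with a few lines of regex matching. This is an illustration only: how the guard actually matches is not shown in this PR, and the imported symbol `read_catalog` is a made-up name.

```python
import re

lines = [
    "from sp_validation.cat import read_catalog",       # retired path (hypothetical symbol)
    "from sp_validation.catalog import read_catalog",   # live module
    "from sp_validation import catalog_builders as sp_joint",
    "import sp_validation.catalog_builders",            # live module
]

bare_token = re.compile(re.escape("sp_validation.cat"))
flat_import = re.compile(re.escape("sp_validation.cat import"))

bare_hits = [line for line in lines if bare_token.search(line)]
flat_hits = [line for line in lines if flat_import.search(line)]

print(len(bare_hits))  # the bare token also matches the live catalog modules
print(flat_hits)       # only the retired flat-import form
```

A word-boundary pattern such as `sp_validation\.cat\b` would be another way to avoid the false positives; registering the full `sp_validation.cat import` form is the choice described above.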

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@martinkilbinger

Contributor

I am going through the files; some (my old ones) are obsolete, and I will delete them here.

@cailmdaley cailmdaley merged commit 5b8ffdc into develop Jun 23, 2026
2 checks passed

2 participants