HatPdotS · Jul 1, 2026
diff --git a/paper/FIGURE_MEDIANS.md b/paper/FIGURE_MEDIANS.md
@@ -2,34 +2,34 @@
 
 Auto-generated by `analysis/summarize_medians.py` from the metrics CSVs; numbers match the plotted figures. TorchRef = canonical arm `torchref` (xray 1 / geometry 0.2 / adp 0.02).
 
-## Figure 2A — R-factors (PHENIX-validated) (conserved set, n=733)
+## Figure 2A — R-factors (PHENIX-validated) (conserved set, n=722)
 
 | Model | median R-work | median R-free | n |
 |---|---|---|---|
-| AlphaFold (start) | 0.4078 | 0.4083 | 733 |
-| Refmac | 0.2726 | 0.3147 | 733 |
-| PHENIX | 0.2741 | 0.3161 | 733 |
-| TorchRef | 0.2687 | 0.3160 | 733 |
+| AlphaFold (start) | 0.4083 | 0.4085 | 722 |
+| Refmac | 0.2737 | 0.3161 | 722 |
+| PHENIX | 0.2746 | 0.3166 | 722 |
+| TorchRef | 0.2694 | 0.3167 | 722 |
 
-## Figure 2B — geometry RMSZ (median; ideal = 1.0) (conserved set, n=733)
+## Figure 2B — geometry RMSZ (median; ideal = 1.0) (conserved set, n=722)
 
 | Model | Bond RMSZ | Angle RMSZ | Chiral RMSZ | MC B-factor RMSZ | n |
 |---|---|---|---|---|---|
-| Refmac | 0.58 | 0.95 | 0.56 | 0.92 | 733 |
-| PHENIX | 0.83 | 0.92 | 0.33 | 1.03 | 733 |
-| TorchRef | 1.33 | 1.14 | 0.57 | 1.55 | 733 |
+| Refmac | 0.58 | 0.94 | 0.55 | 0.92 | 722 |
+| PHENIX | 0.83 | 0.92 | 0.33 | 1.03 | 722 |
+| TorchRef | 1.33 | 1.14 | 0.57 | 1.55 | 722 |
 
-## Figure 2C — wall-clock runtime (conserved set, n=733)
+## Figure 2C — wall-clock runtime (conserved set, n=722)
 
 | Engine | median runtime (min) | n |
 |---|---|---|
-| Refmac | 0.53 | 733 |
-| TorchRef | 1.55 | 732 |
-| PHENIX | 4.71 | 733 |
+| Refmac | 0.53 | 722 |
+| TorchRef | 1.65 | 722 |
+| PHENIX | 4.63 | 722 |
 
-TorchRef vs Refmac: 2.9× (slower); TorchRef vs PHENIX: 3.0× faster.
+TorchRef vs Refmac: 3.1× (slower); TorchRef vs PHENIX: 2.8× faster.
 
-## Figure 2 — run accounting (N=767 candidates; conserved set n=733)
+## Figure 2 — run accounting (N=767 candidates; conserved set n=722)
 
 Each engine refined the same set of Phaser-placed AlphaFold models. *Validated* = produced a PHENIX-scored R-factor; the headline medians use the conserved intersection where all four engines validated.
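
The conserved intersection described above can be sketched as a plain set intersection over the structure IDs each engine validated. This is only an illustration under stated assumptions: the helper names `conserved_set` and `conserved_median`, the toy IDs, and the toy R-free values are hypothetical and are not taken from `analysis/summarize_medians.py`.

```python
from statistics import median


def conserved_set(validated):
    """Return the IDs that every engine validated (hypothetical helper)."""
    id_sets = [set(ids) for ids in validated.values()]
    return set.intersection(*id_sets) if id_sets else set()


def conserved_median(values, conserved):
    """Median of a per-structure metric restricted to the conserved set."""
    return median(values[i] for i in conserved)


# Toy data (made up): structure IDs that produced a PHENIX-scored R-factor.
validated = {
    "alphafold": ["a", "b", "c", "d", "e"],
    "refmac": ["a", "b", "c", "d"],
    "phenix": ["a", "c", "d"],
    "torchref": ["a", "b", "c", "d"],
}
# Toy per-structure R-free values for one engine (made up).
r_free = {"a": 0.30, "b": 0.25, "c": 0.34, "d": 0.31}

conserved = conserved_set(validated)
headline = conserved_median(r_free, conserved)
```

Restricting every engine to the same IDs keeps the medians paired: a structure that only one engine failed on cannot shift that engine's median relative to the others.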
 
@@ -38,7 +38,7 @@ Each engine refined the same set of Phaser-placed AlphaFold models. *Validated*
 | AlphaFold (start) | 759 | 8 | 767 |
 | Refmac | 760 | 7 | 767 |
 | PHENIX | 734 | 33 | 767 |
-| TorchRef | 758 | 9 | 767 |
+| TorchRef | 747 | 20 | 767 |
 
 ### Failure reasons (grouped)
 
@@ -51,9 +51,9 @@ Each engine refined the same set of Phaser-placed AlphaFold models. *Validated*
   - 5 × refine: phenix.refine crash: special-position pathology
   - 1 × refine: phenix.refine transient error
   - 1 × refine: phenix.refine: incompatible data flags (Friedel mismatch)
-- **TorchRef** — 9 failed
+- **TorchRef** — 20 failed
+  - 12 × refine: TorchRef OOM at 8 GB (large structure)
   - 8 × scoring: PHENIX validator fails (special-position pathology)
-  - 1 × refine: TorchRef OOM at 8 GB (large structure)
 
 Most failures trace to a **special-position pathology** in some placed AF models: it breaks the uniform PHENIX scorer (so the structure is unscoreable for *every* engine) and makes phenix.refine itself abort, while REFMAC and TorchRef refine it without error. TorchRef's only engine-side failures are OOMs at the 8 GB job limit (large structures; would pass at 16 GB); REFMAC had zero refinement failures.
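
The run accounting above can be sanity-checked with a few lines of arithmetic: validated plus failed should equal the 767 candidates for every engine, and the grouped reasons should sum to the failed count. The sketch below hard-codes the updated counts from this diff; the variable and function names are hypothetical, and the check is not claimed to be part of the generating script.

```python
CANDIDATES = 767

# (validated, failed) per engine, copied from the run-accounting table.
runs = {
    "AlphaFold (start)": (759, 8),
    "Refmac": (760, 7),
    "PHENIX": (734, 33),
    "TorchRef": (747, 20),
}

# Grouped TorchRef failure reasons, copied from the list above.
torchref_reasons = {
    "refine: TorchRef OOM at 8 GB (large structure)": 12,
    "scoring: PHENIX validator fails (special-position pathology)": 8,
}


def accounting_errors(runs, total):
    """Return the engines whose validated + failed does not equal total."""
    return [name for name, (ok, bad) in runs.items() if ok + bad != total]


errors = accounting_errors(runs, CANDIDATES)
reasons_match = sum(torchref_reasons.values()) == runs["TorchRef"][1]
```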
 
@@ -69,21 +69,21 @@ Forward structure-factor calculation, TorchRef GPU (NVIDIA A100-PCIE-40GB) vs cc
 
 ## Extended Figure 2 — R-factor gap (TorchRef − reference, PHENIX-scored)
 
-n = 733 structures, resolution 1.40–3.00 Å.
+n = 722 structures, resolution 1.40–3.00 Å.
 
 | Reference | median ΔR-work (pp) | median ΔR-free (pp) |
 |---|---|---|
-| PHENIX | -0.36 | -0.03 |
-| REFMAC | -0.24 | +0.17 |
+| PHENIX | -0.37 | -0.04 |
+| REFMAC | -0.27 | +0.15 |
 
 ## Extended Figure 3 — scorer-consistency matrix (median R-free)
 
 | Model | by REFMAC | by PHENIX | by TorchRef | n(paired) |
 |---|---|---|---|---|
 | Refmac | 0.3217 | 0.3147 | 0.3292 | 760 |
 | PHENIX | 0.3247 | 0.3161 | 0.3329 | 734 |
-| TorchRef | 0.3243 | 0.3162 | 0.3315 | 756 |
-| AlphaFold (start) | 0.4158 | 0.4082 | 0.4572 | 758 |
+| TorchRef | 0.3249 | 0.3168 | 0.3327 | 747 |
+| AlphaFold (start) | 0.4158 | 0.4081 | 0.4571 | 759 |
 
 ## Extended Figure 1 — weight landscape, locked-default cell