Skip to content

feat: consume the shared metric-value contract (kind, reference_id, presentation attrs)#32

Open
lewisjared wants to merge 8 commits into
mainfrom
feat/layer-a-consume
Open

feat: consume the shared metric-value contract (kind, reference_id, presentation attrs)#32
lewisjared wants to merge 8 commits into
mainfrom
feat/layer-a-consume

Conversation

@lewisjared

@lewisjared lewisjared commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Summary

Makes ref-app consume the shared metric-value contract from climate-ref: the first-class kind (model/reference), the reference_id content hash for deduplicating reference series, and the typed presentation attributes (value_units, value_long_name, index_units, calendar). Role, deduplication, axis units and calendar are now driven by these fields instead of inferring roles from the source_id == "Reference" sentinel, which is removed from production code.

Dependency bump (please note)

The contract only exists on climate-ref main — no published release carries it (latest tag v0.15.0 predates it). This PR therefore pins climate-ref (and its workspace packages) to the unreleased main commit 383bd3f via a git source, and raises requires-python to >=3.12 to match. That bump also pulls in the diagnostic-versioning schema (promoted_version), so the committed sqlite test fixture is migrated through alembic to the new schema. Once the contract ships in a climate-ref release, the pin should be swapped back to a plain version constraint.

What changed

Backend

  • SeriesValue/ScalarValue DTOs surface kind, reference_id and the four presentation attributes. The presentation attributes are normalised from the persisted attributes blob with a provider key fallback (value_unitsvalue_units|units, value_long_namevalue_long_name|long_name), since ESMValTool and ILAMB use different attribute keys and the typed fields are not persisted as columns.
  • Outlier detection keys off kind == "reference" instead of the source_id sentinel; missing/None kind is treated as a model.
  • Test fixture database migrated for the climate-ref 0.15 schema.

Frontend

  • Regenerated API client for the new fields.
  • Model-vs-reference role is derived from kind; reference series are deduplicated by reference_id (falling back to label-based dedup only when reference_id is absent).
  • Per-series value_units/index_units drive the axis unit labels, falling back to the existing card units prop; calendar is surfaced for time handling.
  • Series are aligned by index value (union-join, sorted ascending) rather than by array position, so series with differing or partially overlapping index arrays no longer smear together; index-less series still render positionally.

Testing

  • Backend: uv run pytest src tests → 200 passed, 1 xfailed; mypy and ruff clean.
  • Frontend: npm run build (tsc -b && vite build) succeeds; vitest → 219 passed; Biome clean.

Follow-ups

  • The pre-commit TypeScript hook runs tsc --noEmit against the solution-style root tsconfig.json ("files": []), which type-checks nothing under src/. The effective check is tsc -b. Worth updating the hook so build-breaking type errors are caught locally.

Summary by CodeRabbit

  • New Features

    • Series charts now distinguish model and reference data using shared metadata, improving duplicate handling and alignment across differing index ranges.
    • Axis labels and tooltips now use per-series units and index units when available.
  • Bug Fixes

    • Reference rows are no longer identified by a naming convention, reducing गलत classifications in outlier detection and chart rendering.
  • Chores

    • Updated backend/runtime requirements and refreshed generated client schemas/types to match the new value metadata.

…ue DTOs

SeriesValue now carries kind, reference_id, and normalised presentation
attrs (value_units, value_long_name, index_units, calendar), and
build_scalar/build_series set kind from the row's dimensions instead of
leaving it at its default. Presentation attrs are normalised from the
attributes blob via a provider-key fallback map, since ESMValTool and
ILAMB use different attribute names for the same concepts.

This requires bumping the climate-ref dependency to an unreleased
commit on main, since the underlying kind/reference_id/presentation
fields don't exist in any published release yet. That bump also raises
the Python floor to 3.12 to match climate-ref's own requirement.

Note: bumping climate-ref surfaces a schema mismatch between the newer
package and the committed sqlite test fixtures (diagnostic.promoted_version
is missing from an unrelated versioning migration), breaking the
DB-backed integration tests in test_diagnostics.py, test_executions.py,
test_datasets.py and test_results.py. This is outside this slice's scope
to fix and needs a follow-up before merge.
…_id sentinel

calculate_iqr_bounds_by_source_id and detect_outliers_in_scalar_values
excluded reference rows from IQR bounds by matching source_id == "Reference".
That sentinel breaks down for providers where the reference series carries
a real source_id.

Both functions now key off the kind dimension (kind == "reference") that
sv.dimensions already spreads into the DataFrame, treating a missing/None/
empty kind as a model. If the kind column is absent entirely (older data),
fall back to treating all rows as models rather than raising.
The metric-value contract bump pulls in the diagnostic-versioning and
kind/reference_id migrations. Run the fixture database through alembic so the
committed test data matches the new schema (adds promoted_version and the kind
dimension column).
Surfaces kind, reference_id and the presentation attributes
(value_units, value_long_name, index_units, calendar) on the generated
SeriesValue and ScalarValue types.
…y reference_id

Consume the shared metric-value contract's kind and reference_id
fields on the frontend instead of inferring role from the
source_id === "Reference" sentinel.

- seriesChartContent.tsx splits series into model/reference buckets
  by kind, treating missing or "model" kind as a model series.
- createChartData derives isReference from kind instead of array
  position, and dedups reference series by reference_id, falling
  back to label-based dedup only when reference_id is absent.
- isReferenceItem keys off kind instead of the source_id sentinel.
- DimensionedData gains an optional kind field so grouping utilities
  can access it generically.
…alue

Series charts now prefer each series' own value_units/index_units for
axis labels, falling back to the card-config units prop when a series
doesn't carry them (ILAMB never sets value_units or calendar; ESMValTool
does). createChartData also aligns series by their actual index value
rather than by array position, so series with different or partially
overlapping index arrays no longer get smeared together -- each series
contributes its value to the row matching its own index, with gaps
left null. calendar metadata is surfaced from series attributes and
still relies on ISO-8601 string detection for time-axis selection,
since non-standard calendars (360_day, noleap) aren't fully supported
by JS Date parsing.
Address review findings on the index-value alignment:
- Coerce absent per-series value_units/index_units/calendar to undefined so
  the app type-checks under the real `tsc -b` build (the nullable client
  fields were being assigned into non-null return types).
- Restore the positional fallback for series that carry no index array, so
  index-less series render instead of collapsing to an empty chart.
- Sort the index-value union ascending so fully interleaved series indices
  render left-to-right instead of zigzagging.

Add regression tests for the index-less and interleaved-index cases.
@netlify

netlify Bot commented Jul 1, 2026

Copy link
Copy Markdown

Deploy Preview for climate-ref canceled.

Name Link
🔨 Latest commit a78fd73
🔍 Latest deploy log https://app.netlify.com/projects/climate-ref/deploys/6a450ee00705670009aebc2c

@coderabbitai

coderabbitai Bot commented Jul 1, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Introduces a kind field ("model"/"reference") across backend and frontend metric-value models, replacing source_id-based sentinel logic for reference identification. Updates outlier detection, series chart classification, deduplication, and index-based alignment accordingly. Pins climate-ref to a git revision and bumps Python requirement to 3.12.

Changes

Backend contract and outlier logic

Layer / File(s) Summary
Dependency pin and Python bump
backend/pyproject.toml, changelog/32.breaking.md
Requires Python 3.12+ and pins climate-ref to a specific git commit carrying the shared metric-value contract.
SeriesValue/ScalarValue kind and presentation metadata
backend/src/ref_backend/models.py, backend/tests/test_core/test_metric_values.py
Adds kind, reference_id, and normalised presentation fields (value_units, value_long_name, index_units, calendar) to SeriesValue, with _normalize_kind/_normalize_presentation_attributes helpers wired into build_scalar/build_series.
Kind-based reference exclusion in outliers
backend/src/ref_backend/core/outliers.py, backend/tests/test_core/test_outliers.py
Replaces source_id == "Reference" checks with kind == "reference" in IQR bounds calculation and outlier detection, with regression tests.

Frontend series rendering and alignment

Layer / File(s) Summary
Generated client types/schemas
frontend/src/client/schemas.gen.ts, frontend/src/client/types.gen.ts, changelog/32.feature.md
Adds kind and presentation metadata fields to ScalarValueSchema/SeriesValueSchema and matching TypeScript types.
Grouping reference detection by kind
frontend/src/components/explorer/grouping/types.ts, frontend/src/components/explorer/grouping/utils.ts, frontend/src/components/explorer/grouping/utils.test.ts
DimensionedData gains a kind field; isReferenceItem now checks kind === "reference" instead of source_id.
Series chart classification
frontend/src/components/explorer/content/seriesChartContent.tsx
Splits regular vs reference series using series.kind instead of dimensions.source_id.
Chart data alignment, dedup, axis metadata
frontend/src/components/execution/values/series/utils.ts, frontend/src/components/execution/values/series/utils.test.ts
Reworks createChartData to align rows by index-value union, deduplicate reference series by reference_id, and expose valueUnits/indexUnits/calendar; classifies references via kind.
Axis unit rendering
frontend/src/components/execution/values/series/seriesVisualization.tsx, frontend/src/components/execution/values/series/seriesCanvas.tsx
Threads indexUnits/valueUnits through components; X-axis labels now show units for non-time axes.

Sequence Diagram(s)

sequenceDiagram
  participant SeriesChartContent
  participant createChartData
  participant SeriesVisualization
  participant SeriesCanvas

  SeriesChartContent->>SeriesChartContent: split series by kind (reference/regular)
  SeriesChartContent->>SeriesVisualization: regularSeries, referenceSeries
  SeriesVisualization->>createChartData: seriesValues, referenceSeriesValues
  createChartData->>createChartData: dedupe references by reference_id
  createChartData->>createChartData: align rows by index-value union
  createChartData-->>SeriesVisualization: chartData, valueUnits, indexUnits, calendar
  SeriesVisualization->>SeriesCanvas: chartData, effectiveUnits, indexUnits
  SeriesCanvas->>SeriesCanvas: draw axes with unit labels
Loading

Compact metadata

  • Related issues: Not specified in provided information.
  • Related PRs: Not specified in provided information.
  • Suggested labels: backend, frontend, breaking-change
  • Suggested reviewers: Not specified in provided information.

Poem

A rabbit hopped through fields of code,
Where "kind" replaced the sentinel road,
Reference and model, now clearly named,
Charts align by index, never maimed,
Units on axes, calendars too —
A tidy warren, built anew. 🐇

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly matches the main change: adopting the shared metric-value contract with kind, reference_id, and presentation attributes.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/layer-a-consume

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/src/ref_backend/core/outliers.py (1)

141-165: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Fallback IQR path doesn't respect kind — reference rows can be misflagged.

The kind-aware reference exclusion is only applied in the source_id-aware branch (Lines 147-155). When source_id is absent or the group is too small (Lines 159-165), flag_outliers_iqr runs over all rows including kind == "reference" ones, and no override forces reference rows back to non-outlier. This means a reference row could be flagged an outlier in this fallback path, contradicting the intended contract that reference values are excluded from outlier logic.

🐛 Proposed fix to force reference rows non-outlier in the fallback branch
         else:
             # Fallback to original IQR method if no source_id or insufficient data
             if len(group_values) >= min_n:
                 iqr_flags = flag_outliers_iqr(group_values.value.to_list(), factor=factor)
+                if "kind" in group_values.columns:
+                    iqr_flags = [
+                        False if kind == "reference" else flag
+                        for flag, kind in zip(iqr_flags, group_values["kind"])
+                    ]
             else:
                 iqr_flags = [False] * len(group_values)
             source_id_flags = iqr_flags  # type: ignore

Note the new test suite only exercises groups where source_id is present, so this gap isn't currently covered by tests either.

🧹 Nitpick comments (1)
backend/tests/test_core/test_outliers.py (1)

114-127: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Weak assertion in test_none_or_empty_kind_treated_as_model.

The test only asserts bounds is not None, which would also pass if the None/empty-kind rows were wrongly excluded (as long as ≥min_n other rows remain). Consider asserting bounds equal those computed from the full 4-row set (as done in test_excludes_rows_tagged_kind_reference) to actually confirm inclusion.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: ba0fb19a-5a05-41ba-ab79-f0718c8ff985

📥 Commits

Reviewing files that changed from the base of the PR and between 61fb2dc and a78fd73.

⛔ Files ignored due to path filters (2)
  • backend/tests/test-data/tests.integration.test_cmip7_aft/test_solve_cmip7_aft/db/climate_ref.db is excluded by !**/*.db
  • backend/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (17)
  • backend/pyproject.toml
  • backend/src/ref_backend/core/outliers.py
  • backend/src/ref_backend/models.py
  • backend/tests/test_core/test_metric_values.py
  • backend/tests/test_core/test_outliers.py
  • changelog/32.breaking.md
  • changelog/32.feature.md
  • frontend/src/client/schemas.gen.ts
  • frontend/src/client/types.gen.ts
  • frontend/src/components/execution/values/series/seriesCanvas.tsx
  • frontend/src/components/execution/values/series/seriesVisualization.tsx
  • frontend/src/components/execution/values/series/utils.test.ts
  • frontend/src/components/execution/values/series/utils.ts
  • frontend/src/components/explorer/content/seriesChartContent.tsx
  • frontend/src/components/explorer/grouping/types.ts
  • frontend/src/components/explorer/grouping/utils.test.ts
  • frontend/src/components/explorer/grouping/utils.ts

Comment thread backend/pyproject.toml
"pydantic-settings<3.0.0,>=2.13.1",
"sentry-sdk[fastapi]>=2.0.0",
"climate-ref[aft-providers,postgres]>=0.13.1,<0.14",
"climate-ref[aft-providers,postgres]>=0.13.1",

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
fd -HI '(Dockerfile|pyproject\.toml|requirements.*|.*\.ya?ml)$' . -x sh -c '
  echo "== $1 ==";
  rg -n "uv sync|uv pip|pip install|poetry install|hatch build" "$1" || true
' sh {}

Repository: Climate-REF/ref-app

Length of output: 2336


🏁 Script executed:

sed -n '1,120p' backend/pyproject.toml

Repository: Climate-REF/ref-app

Length of output: 2739


Pin climate-ref in the published dependency metadata too.

[tool.uv.sources] only affects uv resolution; project.dependencies still advertises climate-ref[aft-providers,postgres]>=0.13.1, so pip/wheel/sdist installs can resolve a PyPI release that may not include the new contract this backend expects.

Comment on lines +563 to +572
def _normalize_kind(dimensions: dict[str, str]) -> Literal["model", "reference"]:
"""
Normalise the ``kind`` CV dimension to the model/reference role.

``kind`` is absent from ``dimensions`` for model rows (the committed
default is omitted at serialisation), so a missing or empty value is
treated as ``"model"``.
"""
kind = dimensions.get("kind")
return "reference" if kind == "reference" else "model"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Normalise dimensions as well when blank/null kind is meant to be supported.

The new helper maps missing/empty kind to "model", but both builders still pass the raw dimensions dict straight into the DTOs. If an older row literally carries {"kind": null}, the derived top-level kind is fine while the DTO payload is still invalid. Strip or rewrite invalid kind entries before constructing ScalarValue/SeriesValue, otherwise this path can still 500 on the data shape the helper claims to accept.

Also applies to: 639-639, 678-680

Comment on lines +291 to +299
if (!series.values) return;
// Series with an index align by value; index-less series fall back to
// positional lookup (rawIndex is the row position in that case).
const dataIdx =
series.index && series.index.length > 0
? series.index.indexOf(rawIndex)
: typeof rawIndex === "number"
? rawIndex
: -1;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Index-less series disappear when any indexed series is present.

When hasAnyIndex is true, rawIndex is the actual x value, not the row position. The fallback at Lines 297-299 then uses values like 2020 as array offsets, so mixed indexed/index-less inputs never render the index-less series. Use the row ordinal from .map((rawIndex, rowIdx) => ...) for that fallback, and add a regression test for the mixed case.

Suggested fix
-  const chartData: ChartDataPoint[] = indexValueOrder.map((rawIndex) => {
+  const chartData: ChartDataPoint[] = indexValueOrder.map((rawIndex, rowIdx) => {
@@
       const dataIdx =
         series.index && series.index.length > 0
           ? series.index.indexOf(rawIndex)
-          : typeof rawIndex === "number"
-            ? rawIndex
-            : -1;
+          : rowIdx;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (!series.values) return;
// Series with an index align by value; index-less series fall back to
// positional lookup (rawIndex is the row position in that case).
const dataIdx =
series.index && series.index.length > 0
? series.index.indexOf(rawIndex)
: typeof rawIndex === "number"
? rawIndex
: -1;
const chartData: ChartDataPoint[] = indexValueOrder.map((rawIndex, rowIdx) => {
if (!series.values) return;
// Series with an index align by value; index-less series fall back to
// positional lookup (rawIndex is the row position in that case).
const dataIdx =
series.index && series.index.length > 0
? series.index.indexOf(rawIndex)
: rowIdx;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant